Class: RDF::NTriples::Writer

Inherits:
Writer
  • Object
Includes:
Util::Logger
Defined in:
lib/rdf/ntriples/writer.rb

Overview

N-Triples serializer.

Output is serialized as UTF-8 by default. To serialize as ASCII (with Unicode escapes), pass encoding: Encoding::ASCII as an option to #initialize.

Examples:

Obtaining an NTriples writer class

RDF::Writer.for(:ntriples)     #=> RDF::NTriples::Writer
RDF::Writer.for("etc/test.nt")
RDF::Writer.for(file_name:      "etc/test.nt")
RDF::Writer.for(file_extension: "nt")
RDF::Writer.for(content_type:   "application/n-triples")

Serializing RDF statements into an NTriples file

RDF::NTriples::Writer.open("etc/test.nt") do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Serializing RDF statements into an NTriples string

RDF::NTriples::Writer.buffer do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

Serializing RDF statements into an NTriples string with escaped UTF-8

RDF::NTriples::Writer.buffer(encoding: Encoding::ASCII) do |writer|
  graph.each_statement do |statement|
    writer << statement
  end
end

See Also:

Direct Known Subclasses

RDF::NQuads::Writer

Constant Summary

ESCAPE_PLAIN =
/\A[\x20-\x21\x23-\x26\x28#{Regexp.escape '['}#{Regexp.escape ']'}-\x7E]*\z/m.freeze
ESCAPE_PLAIN_U =
/\A(?:#{Reader::IRI_RANGE}|#{Reader::UCHAR})*\z/.freeze

Constants included from Util::Logger

Util::Logger::IOWrapper

Instance Attribute Summary

Attributes inherited from Writer

#options

Class Method Summary

Instance Method Summary

Methods included from Util::Logger

#log_debug, #log_depth, #log_error, #log_fatal, #log_info, #log_recover, #log_recovering?, #log_statistics, #log_warn, #logger

Methods inherited from Writer

accept?, #base_uri, buffer, #canonicalize?, dump, each, #encoding, #flush, for, format, #format_list, #format_term, #node_id, open, options, #prefix, #prefixes, #prefixes=, #puts, #quoted, #to_sym, to_sym, #uri_for, #validate?, #write_epilogue, #write_prologue, #write_statement, #write_triples

Methods included from Util::Aliasing::LateBound

#alias_method

Methods included from Writable

#<<, #insert, #insert_graph, #insert_reader, #insert_statement, #insert_statements, #writable?

Methods included from Util::Coercions

#coerce_statements

Constructor Details

#initialize(output = $stdout, validate: true, **options) {|writer| ... } ⇒ Writer

Initializes the writer.

Parameters:

  • output (IO, File) (defaults to: $stdout)

    the output stream

  • validate (Boolean) (defaults to: true)

    (true) whether to validate terms when serializing

  • options (Hash{Symbol => Object})

    ({}) any additional options. See Writer#initialize

Yields:

  • (writer)

    self

Yield Parameters:

  • writer (RDF::Writer)

Yield Returns:

  • (void)


# File 'lib/rdf/ntriples/writer.rb', line 192

def initialize(output = $stdout, validate: true, **options, &block)
  super
end

Class Method Details

.escape(string, encoding = nil) ⇒ String

Escapes literal and URI content. If encoding is ASCII or nil, every non-ASCII character is escaped in addition to the ASCII characters that require it; for any other encoding, only the ASCII characters that must be escaped are escaped.

Parameters:

  • string (String)
  • encoding (Encoding) (defaults to: nil)

Returns:

  • (String)

See Also:



# File 'lib/rdf/ntriples/writer.rb', line 57

def self.escape(string, encoding = nil)
  ret = case
    when string.match?(ESCAPE_PLAIN) # a shortcut for the simple case
      string
    when string.ascii_only?
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        string.each_byte { |u| buffer << escape_ascii(u, encoding) }
        buffer.string
      end
    when encoding && encoding != Encoding::ASCII
      # Not encoding UTF-8 characters
      StringIO.open do |buffer|
        buffer.set_encoding(encoding)
        string.each_char do |u|
          buffer << case u.ord
          when (0x00..0x7F)
            escape_ascii(u, encoding)
          else
            u
          end
        end
        buffer.string
      end
    else
      # Encode ASCII && UTF-8 characters
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        string.each_codepoint { |u| buffer << escape_unicode(u, encoding) }
        buffer.string
      end
  end
  encoding ? ret.encode(encoding) : ret
end
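
The branches of .escape can be condensed into a standalone sketch that needs no gems. It illustrates the behaviour described above and is not the library's code: the module name EscapeSketch and the table SHORT are hypothetical, and the fast path through ESCAPE_PLAIN is left out.

```ruby
# Standalone sketch of the escaping rules; EscapeSketch and SHORT are
# hypothetical names and are not part of RDF.rb.
module EscapeSketch
  # Short escapes for selected ASCII control and punctuation characters.
  SHORT = {
    0x08 => '\b', 0x09 => '\t', 0x0A => '\n', 0x0C => '\f',
    0x0D => '\r', 0x22 => '\"', 0x5C => '\\\\'
  }.freeze

  def self.escape(string, encoding = nil)
    # Non-ASCII characters are kept only when a non-ASCII encoding is requested.
    keep_unicode = !encoding.nil? && encoding != Encoding::ASCII
    string.each_codepoint.map do |cp|
      if SHORT.key?(cp) then SHORT[cp]
      elsif cp < 0x20 || cp == 0x7F then format('\u%04X', cp) # other control characters
      elsif cp < 0x7F then cp.chr                             # printable ASCII
      elsif keep_unicode then [cp].pack('U')                  # leave as UTF-8
      elsif cp <= 0xFFFF then format('\u%04X', cp)            # BMP codepoint
      else format('\U%08X', cp)                               # above the BMP
      end
    end.join
  end
end

puts EscapeSketch.escape("caf\u00E9\n")                  # ASCII-only output
puts EscapeSketch.escape("caf\u00E9\n", Encoding::UTF_8) # accented character kept
```

With no encoding (or ASCII) the accented character is written as an escape; with UTF-8 only the newline is escaped.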

.escape_ascii(u, encoding) ⇒ String

Standard ASCII escape sequences. The original RDF Test Cases define a smaller set of escape sequences; the N-Triples recommendation additionally includes \b and \f. The listing below emits \b and \f in every case and does not consult the encoding argument.

Within STRING_LITERAL_QUOTE, only the characters U+0022, U+005C, U+000A, U+000D are encoded using ECHAR. ECHAR must not be used for characters that are allowed directly in STRING_LITERAL_QUOTE.

Parameters:

  • u (Integer, #ord)

Returns:

  • (String)

Raises:

  • (ArgumentError)

    if u is not an ASCII character in the range 0x00..0x7F

See Also:



# File 'lib/rdf/ntriples/writer.rb', line 126

def self.escape_ascii(u, encoding)
  case (u = u.ord)
  when (0x08)       then "\\b"
  when (0x09)       then "\\t"
  when (0x0A)       then "\\n"
  when (0x0C)       then "\\f"
  when (0x0D)       then "\\r"
  when (0x22)       then "\\\""
  when (0x5C)       then "\\\\"
  when (0x00..0x1F) then escape_utf16(u)
  when (0x7F)       then escape_utf16(u)
  when (0x20..0x7E) then u.chr
  else
    raise ArgumentError.new("expected an ASCII character in (0x00..0x7F), but got 0x#{u.to_s(16)}")
  end
end

.escape_unicode(u, encoding) ⇒ String

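Escapes ASCII and Unicode characters. ASCII characters are delegated to .escape_ascii, and every codepoint above 0x7F is written as a \u or \U escape. The UTF-8 case, in which only ASCII characters are escaped, is handled by .escape before this method is reached.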

Parameters:

  • u (Integer, #ord)
  • encoding (Encoding)

Returns:

  • (String)

Raises:

  • (ArgumentError)

    if u is not a valid Unicode codepoint

See Also:



# File 'lib/rdf/ntriples/writer.rb', line 101

def self.escape_unicode(u, encoding)
  case (u = u.ord)
    when (0x00..0x7F)        # ASCII 7-bit
      escape_ascii(u, encoding)
    when (0x80..0xFFFF)      # Unicode BMP
      escape_utf16(u)
    when (0x10000..0x10FFFF) # Unicode
      escape_utf32(u)
    else
      raise ArgumentError.new("expected a Unicode codepoint in (0x00..0x10FFFF), but got 0x#{u.to_s(16)}")
  end
end

.escape_utf16(u) ⇒ String

Parameters:

  • u (Integer, #ord)

Returns:

  • (String)

See Also:



# File 'lib/rdf/ntriples/writer.rb', line 147

def self.escape_utf16(u)
  sprintf("\\u%04X", u.ord)
end

.escape_utf32(u) ⇒ String

Parameters:

  • u (Integer, #ord)

Returns:

  • (String)

See Also:



# File 'lib/rdf/ntriples/writer.rb', line 155

def self.escape_utf32(u)
  sprintf("\\U%08X", u.ord)
end
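
Both helpers are thin wrappers around sprintf: \u is followed by four uppercase hex digits and \U by eight. The sketch below reproduces the two format strings in plain Ruby and decodes them again to check the round trip. The module UcharSketch and its unescape method are hypothetical helpers written for this illustration; they are not part of the writer.

```ruby
# Hypothetical module for illustration; not part of RDF.rb.
module UcharSketch
  # Same format strings as escape_utf16 and escape_utf32.
  def self.utf16(codepoint)
    format('\u%04X', codepoint)
  end

  def self.utf32(codepoint)
    format('\U%08X', codepoint)
  end

  # Turns \uXXXX and \UXXXXXXXX back into the characters they stand for.
  def self.unescape(text)
    text.gsub(/\\u(\h{4})|\\U(\h{8})/) do
      hex = Regexp.last_match(1) || Regexp.last_match(2)
      [hex.hex].pack('U')
    end
  end
end

bmp    = UcharSketch.utf16(0x00E9)   # fits in four hex digits
astral = UcharSketch.utf32(0x1F600)  # above 0xFFFF, needs eight

puts bmp
puts astral
puts UcharSketch.unescape(bmp + astral) == "\u00E9\u{1F600}"
```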

.serialize(value) ⇒ String

Returns the serialized N-Triples representation of the given RDF value.

Parameters:

  • value (RDF::Value)

Returns:

  • (String)

Raises:

  • (ArgumentError)

    if value is not an RDF::Statement or RDF::Term



# File 'lib/rdf/ntriples/writer.rb', line 166

def self.serialize(value)
  writer = (@serialize_writer_memo ||= self.new)
  case value
    when nil then nil
    when FalseClass then value.to_s
    when RDF::Statement
      writer.format_statement(value) + "\n"
    when RDF::Term
      writer.format_term(value)
    else
      raise ArgumentError, "expected an RDF::Statement or RDF::Term, but got #{value.inspect}"
  end
end

Instance Method Details

#format_literal(literal, **options) ⇒ String

Returns the N-Triples representation of a literal.

Parameters:

  • literal (RDF::Literal, String, #to_s)
  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 309

def format_literal(literal, **options)
  case literal
    when RDF::Literal
      # Note, escaping here is more robust than in Term
      text = quoted(escaped(literal.value))
      text << "@#{literal.language}" if literal.language?
      text << "--#{literal.direction}" if literal.direction?
      text << "^^<#{uri_for(literal.datatype)}>" if literal.datatype?
      text
    else
      quoted(escaped(literal.to_s))
  end
end
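
The order in which format_literal appends its suffixes can be seen in isolation with a stand-in for RDF::Literal. This is a simplified sketch under stated assumptions: LiteralStub and literal_sketch are hypothetical names, the value is assumed to need no escaping, and the datatype is emitted as given rather than through uri_for.

```ruby
# Hypothetical stand-in for RDF::Literal with only the fields used here.
LiteralStub = Struct.new(:value, :language, :direction, :datatype, keyword_init: true)

# Mirrors the suffix order of format_literal: language tag, then base
# direction, then datatype IRI. Escaping of the value is left out.
def literal_sketch(lit)
  text = "\"#{lit.value}\""
  text << "@#{lit.language}"    if lit.language
  text << "--#{lit.direction}"  if lit.direction
  text << "^^<#{lit.datatype}>" if lit.datatype
  text
end

puts literal_sketch(LiteralStub.new(value: "hello", language: "en"))
puts literal_sketch(LiteralStub.new(value: "1", datatype: "http://www.w3.org/2001/XMLSchema#integer"))
```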

#format_node(node, unique_bnodes: false, **options) ⇒ String

Returns the N-Triples representation of a blank node.

Parameters:

  • node (RDF::Node)
  • unique_bnodes (Boolean) (defaults to: false)

    (false) Serialize the node using a unique identifier rather than the identifier used to create it.

  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 255

def format_node(node, unique_bnodes: false, **options)
  unique_bnodes ? node.to_unique_base : node.to_s
end

#format_quotedTriple(statement, **options) ⇒ String

Returns the N-Triples representation of an RDF-star quoted triple.

Parameters:

  • statement (RDF::Statement)
  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 232

def format_quotedTriple(statement, **options)
  "<<%s %s %s>>" % statement.to_a.map { |value| format_term(value, **options) }
end

#format_statement(statement, **options) ⇒ String

Returns the N-Triples representation of a statement.

Parameters:

  • statement (RDF::Statement)
  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 222

def format_statement(statement, **options)
  format_triple(*statement.to_triple, **options)
end

#format_triple(subject, predicate, object, **options) ⇒ String

Returns the N-Triples representation of a triple.

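Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Term)
  • options (Hash{Symbol => Object})

    ({})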

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 243

def format_triple(subject, predicate, object, **options)
  "%s %s %s ." % [subject, predicate, object].map { |value| format_term(value, **options) }
end
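
format_triple relies on String#% with an array argument, which fills the three %s placeholders in order; format_quotedTriple uses the same idiom with a different template. A minimal sketch, assuming the terms are already formatted strings (the real methods first pass each one through format_term); triple_line and quoted_triple are hypothetical names.

```ruby
# Hypothetical helpers that reuse the two templates from the listings.
def triple_line(subject, predicate, object)
  "%s %s %s ." % [subject, predicate, object]
end

def quoted_triple(subject, predicate, object)
  "<<%s %s %s>>" % [subject, predicate, object]
end

inner = quoted_triple("<http://example.org/s>", "<http://example.org/p>", '"o"')

puts triple_line("<http://example.org/s>", "<http://example.org/p>", '"o"')
puts triple_line(inner, "<http://example.org/source>", "<http://example.org/doc>")
```

The second call shows a quoted triple used in subject position: the inner template is expanded first and its result is passed on as an ordinary string.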

#format_uri(uri, **options) ⇒ String

Returns the N-Triples representation of a URI reference, using the writer's encoding.

Parameters:

  • uri (RDF::URI)
  • options (Hash{Symbol => Object})

    ({})

Returns:

  • (String)


# File 'lib/rdf/ntriples/writer.rb', line 265

def format_uri(uri, **options)
  string = uri.to_s
  iriref = case
    when string.match?(ESCAPE_PLAIN_U) # a shortcut for the simple case
      string
    when string.ascii_only? || (encoding && encoding != Encoding::ASCII)
      StringIO.open do |buffer|
        buffer.set_encoding(encoding)
        string.each_char do |u|
          buffer << case u.ord
            when (0x00..0x20) then self.class.escape_utf16(u)
            when 0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d # "<>\^`{|}
              self.class.escape_utf16(u)
            else u
          end
        end
        buffer.string
      end
    else
      # Encode ASCII && UTF-8/16 characters
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        string.each_byte do |u|
          buffer << case u
            when (0x00..0x20) then self.class.escape_utf16(u)
            when 0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d # "<>\^`{|}
              self.class.escape_utf16(u)
            when (0x80..0xFFFF)                then self.class.escape_utf16(u)
            when (0x10000..0x10FFFF)           then self.class.escape_utf32(u)
            else u
          end
        end
        buffer.string
      end
  end
  encoding ? "<#{iriref}>".encode(encoding) : "<#{iriref}>"
end
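
The per-character rule in the first escaping branch of format_uri can be tried out without the library. This is a sketch under assumptions: iri_sketch and IRI_FORBIDDEN are hypothetical names, non-ASCII characters are passed through unchanged (as in the non-ASCII encoding branch), and the ESCAPE_PLAIN_U shortcut is skipped.

```ruby
# Codepoints that may not appear literally between < and >: "<>\^`{|}
IRI_FORBIDDEN = [0x22, 0x3C, 0x3E, 0x5C, 0x5E, 0x60, 0x7B, 0x7C, 0x7D].freeze

# Hypothetical helper: escapes control characters, space and the forbidden
# punctuation as \uXXXX, then wraps the result in angle brackets.
def iri_sketch(iri)
  body = iri.each_char.map do |ch|
    cp = ch.ord
    if cp <= 0x20 || IRI_FORBIDDEN.include?(cp)
      format('\u%04X', cp)
    else
      ch
    end
  end.join
  "<#{body}>"
end

puts iri_sketch("http://example.org/a b{c}")
```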

#write_comment(text)

This method returns an undefined value.

Outputs an N-Triples comment line.

Parameters:

  • text (String)


# File 'lib/rdf/ntriples/writer.rb', line 201

def write_comment(text)
  puts "# #{text.chomp}" # TODO: correctly output multi-line comments
end
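
The TODO in the listing notes that multi-line comments are not yet handled: text.chomp strips only the trailing newline, so an embedded newline would start an output line without the leading #. One possible approach, shown purely as a sketch and not as the library's behaviour, is to prefix every line; comment_lines is a hypothetical helper.

```ruby
# Hypothetical helper: returns one "# ..." line per line of input text.
def comment_lines(text)
  text.each_line.map { |line| "# #{line.chomp}" }
end

puts comment_lines("first line\nsecond line\n")
```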

#write_triple(subject, predicate, object)

This method returns an undefined value.

Outputs the N-Triples representation of a triple.

Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Term)



# File 'lib/rdf/ntriples/writer.rb', line 212

def write_triple(subject, predicate, object)
  puts format_triple(subject, predicate, object, **@options)
end