Class: RDF::NTriples::Writer
- Includes: Util::Logger
- Defined in: lib/rdf/ntriples/writer.rb
Overview
N-Triples serializer.
Output is serialized as UTF-8. To serialize as ASCII with Unicode escapes, pass encoding: Encoding::ASCII as an option to #initialize.
Direct Known Subclasses
Constant Summary
- ESCAPE_PLAIN =
/\A[\x20-\x21\x23-\x26\x28#{Regexp.escape '['}#{Regexp.escape ']'}-\x7E]*\z/m.freeze
- ESCAPE_PLAIN_U =
/\A(?:#{Reader::IRI_RANGE}|#{Reader::UCHAR})*\z/.freeze
Constants included from Util::Logger
Instance Attribute Summary
Attributes inherited from Writer
Class Method Summary
- .escape(string, encoding = nil) ⇒ String
  Escape Literal and URI content.
- .escape_ascii(u, encoding) ⇒ String
  Standard ASCII escape sequences.
- .escape_unicode(u, encoding) ⇒ String
  Escape ASCII and Unicode characters.
- .escape_utf16(u) ⇒ String
- .escape_utf32(u) ⇒ String
- .serialize(value) ⇒ String
  Returns the serialized N-Triples representation of the given RDF value.
Instance Method Summary
- #format_literal(literal, **options) ⇒ String
  Returns the N-Triples representation of a literal.
- #format_node(node, unique_bnodes: false, **options) ⇒ String
  Returns the N-Triples representation of a blank node.
- #format_quotedTriple(statement, **options) ⇒ String
  Returns the N-Triples representation of an RDF* reified statement.
- #format_statement(statement, **options) ⇒ String
  Returns the N-Triples representation of a statement.
- #format_triple(subject, predicate, object, **options) ⇒ String
  Returns the N-Triples representation of a triple.
- #format_uri(uri, **options) ⇒ String
  Returns the N-Triples representation of a URI reference using write encoding.
- #initialize(output = $stdout, validate: true, **options) {|writer| ... } ⇒ Writer (constructor)
  Initializes the writer.
- #write_comment(text)
  Outputs an N-Triples comment line.
- #write_triple(subject, predicate, object)
  Outputs the N-Triples representation of a triple.
Methods included from Util::Logger
#log_debug, #log_depth, #log_error, #log_fatal, #log_info, #log_recover, #log_recovering?, #log_statistics, #log_warn, #logger
Methods inherited from Writer
accept?, #base_uri, buffer, #canonicalize?, dump, each, #encoding, #flush, for, format, #format_list, #format_term, #node_id, open, options, #prefix, #prefixes, #prefixes=, #puts, #quoted, #to_sym, to_sym, #uri_for, #validate?, #write_epilogue, #write_prologue, #write_statement, #write_triples
Methods included from Util::Aliasing::LateBound
Methods included from Writable
#<<, #insert, #insert_graph, #insert_reader, #insert_statement, #insert_statements, #writable?
Methods included from Util::Coercions
Constructor Details
#initialize(output = $stdout, validate: true, **options) {|writer| ... } ⇒ Writer
Initializes the writer.
# File 'lib/rdf/ntriples/writer.rb', line 190

def initialize(output = $stdout, validate: true, **, &block)
  super
end
Class Method Details
.escape(string, encoding = nil) ⇒ String
Escape Literal and URI content. If encoding is nil or Encoding::ASCII, every non-ASCII character is written as a \uXXXX or \UXXXXXXXX escape; for any other encoding, only the ASCII characters that must be escaped are escaped.
# File 'lib/rdf/ntriples/writer.rb', line 57

def self.escape(string, encoding = nil)
  ret = case
    when string.match?(ESCAPE_PLAIN) # a shortcut for the simple case
      string
    when string.ascii_only?
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        string.each_byte { |u| buffer << escape_ascii(u, encoding) }
        buffer.string
      end
    when encoding && encoding != Encoding::ASCII
      # Not encoding UTF-8 characters
      StringIO.open do |buffer|
        buffer.set_encoding(encoding)
        string.each_char do |u|
          buffer << case u.ord
            when (0x00..0x7F)
              escape_ascii(u, encoding)
            else
              u
          end
        end
        buffer.string
      end
    else
      # Encode ASCII && UTF-8 characters
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        string.each_codepoint { |u| buffer << escape_unicode(u, encoding) }
        buffer.string
      end
  end
  encoding ? ret.encode(encoding) : ret
end
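The effect of the encoding argument can be illustrated without loading the library. The following is a minimal standalone sketch, not the library's implementation: escape_sketch is a hypothetical name, and it covers only the ECHAR characters and the non-ASCII decision, not the control-character handling.

```ruby
# Standalone sketch of the encoding decision; escape_sketch is a hypothetical
# name. It handles only the ECHAR and non-ASCII cases, not control characters.
def escape_sketch(string, encoding = nil)
  # Non-ASCII characters are kept only for an explicit non-ASCII encoding.
  keep_unicode = encoding && encoding != Encoding::ASCII
  string.each_char.map do |ch|
    cp = ch.ord
    case ch
    when "\n" then "\\n"
    when "\r" then "\\r"
    when '"'  then "\\\""
    when "\\" then "\\\\"
    else
      if cp <= 0x7F || keep_unicode
        ch
      elsif cp <= 0xFFFF
        format("\\u%04X", cp)   # BMP code point
      else
        format("\\U%08X", cp)   # supplementary-plane code point
      end
    end
  end.join
end

puts escape_sketch("caf\u00E9", Encoding::UTF_8) # character kept as-is
puts escape_sketch("caf\u00E9", Encoding::ASCII) # character written as an escape
puts escape_sketch("caf\u00E9")                  # nil behaves like ASCII here
```

Note that a nil encoding falls through to the fully escaped branch, which matches the final else clause of the source above.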
.escape_ascii(u, encoding) ⇒ String
Standard ASCII escape sequences. The N-Triples recommendation also defines \b and \f escape sequences, but as the source below shows, this method emits only \n, \r, \" and \\ as ECHAR escapes, writes the control characters in 0x00-0x07, 0x0E-0x1F and 0x7F as \uXXXX escapes, and passes every other ASCII character through unchanged. The encoding argument is accepted but not consulted.
Within STRING_LITERAL_QUOTE, only the characters U+0022, U+005C, U+000A, U+000D are encoded using ECHAR. ECHAR must not be used for characters that are allowed directly in STRING_LITERAL_QUOTE.
# File 'lib/rdf/ntriples/writer.rb', line 126

def self.escape_ascii(u, encoding)
  case (u = u.ord)
    when (0x00..0x07) then escape_utf16(u)
    when (0x0A)       then "\\n"
    when (0x0D)       then "\\r"
    when (0x0E..0x1F) then escape_utf16(u)
    when (0x22)       then "\\\""
    when (0x5C)       then "\\\\"
    when (0x7F)       then escape_utf16(u)
    when (0x00..0x7F) then u.chr
    else
      raise ArgumentError.new("expected an ASCII character in (0x00..0x7F), but got 0x#{u.to_s(16)}")
  end
end
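The case statement above amounts to a small lookup table plus two control-character ranges. A minimal table-driven sketch of the same mapping follows; ascii_escape and ECHAR_SKETCH are hypothetical names, and the code does not call into RDF.rb.

```ruby
# Standalone sketch mirroring the case statement above; ascii_escape and
# ECHAR_SKETCH are hypothetical names and do not call into RDF.rb.
ECHAR_SKETCH = {
  0x0A => "\\n",   # line feed
  0x0D => "\\r",   # carriage return
  0x22 => "\\\"",  # double quote
  0x5C => "\\\\"   # backslash
}.freeze

def ascii_escape(byte)
  unless (0x00..0x7F).cover?(byte)
    raise ArgumentError, "expected an ASCII byte, got 0x#{byte.to_s(16)}"
  end
  return ECHAR_SKETCH[byte] if ECHAR_SKETCH.key?(byte)
  # Control characters outside 0x08-0x0D, plus DEL, become \uXXXX.
  if (0x00..0x07).cover?(byte) || (0x0E..0x1F).cover?(byte) || byte == 0x7F
    return format("\\u%04X", byte)
  end
  byte.chr # everything else, including tab (0x09), passes through
end

puts "say \"hi\"\n".each_byte.map { |b| ascii_escape(b) }.join
```

A hash keeps the four ECHAR cases visible in one place; the original case statement expresses the same mapping inline.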
.escape_unicode(u, encoding) ⇒ String
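Escape ASCII and Unicode characters. In the source below, code points above 0x7F are always written as \uXXXX or \UXXXXXXXX escapes; leaving them unescaped for non-ASCII encodings such as UTF-8 is decided by the caller, .escape.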
# File 'lib/rdf/ntriples/writer.rb', line 101

def self.escape_unicode(u, encoding)
  case (u = u.ord)
    when (0x00..0x7F)        # ASCII 7-bit
      escape_ascii(u, encoding)
    when (0x80..0xFFFF)      # Unicode BMP
      escape_utf16(u)
    when (0x10000..0x10FFFF) # Unicode
      escape_utf32(u)
    else
      raise ArgumentError.new("expected a Unicode codepoint in (0x00..0x10FFFF), but got 0x#{u.to_s(16)}")
  end
end
.escape_utf16(u) ⇒ String
# File 'lib/rdf/ntriples/writer.rb', line 145

def self.escape_utf16(u)
  sprintf("\\u%04X", u.ord)
end
.escape_utf32(u) ⇒ String
# File 'lib/rdf/ntriples/writer.rb', line 153

def self.escape_utf32(u)
  sprintf("\\U%08X", u.ord)
end
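.escape_utf16 and .escape_utf32 differ only in the escape letter and the width of the hexadecimal field. A minimal standalone sketch of the same formatting, assuming nothing from the library (u4 and u8 are hypothetical names):

```ruby
# Standalone sketch; u4 and u8 are hypothetical names, not part of RDF.rb.

# Format a code point as \uXXXX (four uppercase hex digits, zero-padded).
def u4(codepoint)
  format("\\u%04X", codepoint)
end

# Format a code point as \UXXXXXXXX (eight uppercase hex digits, zero-padded).
def u8(codepoint)
  format("\\U%08X", codepoint)
end

puts u4("\u00E9".ord)    # code point U+00E9, inside the BMP
puts u8("\u{1F600}".ord) # code point U+1F600, outside the BMP
```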
.serialize(value) ⇒ String
Returns the serialized N-Triples representation of the given RDF value.
# File 'lib/rdf/ntriples/writer.rb', line 164

def self.serialize(value)
  writer = (@serialize_writer_memo ||= self.new)
  case value
    when nil then nil
    when FalseClass then value.to_s
    when RDF::Statement
      writer.format_statement(value) + "\n"
    when RDF::Term
      writer.format_term(value)
    else
      raise ArgumentError, "expected an RDF::Statement or RDF::Term, but got #{value.inspect}"
  end
end
Instance Method Details
#format_literal(literal, **options) ⇒ String
Returns the N-Triples representation of a literal.
# File 'lib/rdf/ntriples/writer.rb', line 307

def format_literal(literal, **)
  case literal
    when RDF::Literal
      # Note, escaping here is more robust than in Term
      text = quoted(escaped(literal.value))
      text << "@#{literal.language}" if literal.language?
      text << "^^<#{uri_for(literal.datatype)}>" if literal.datatype?
      text
    else
      quoted(escaped(literal.to_s))
  end
end
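The shape of the output is a quoted lexical form followed by an optional language tag or datatype suffix. The sketch below illustrates that shape only: Lit and literal_to_nt are hypothetical stand-ins for RDF::Literal and #format_literal, and the value is assumed to be escaped already.

```ruby
# Standalone sketch; Lit and literal_to_nt are hypothetical stand-ins for
# RDF::Literal and #format_literal, and the value is assumed already escaped.
Lit = Struct.new(:value, :language, :datatype, keyword_init: true)

def literal_to_nt(lit)
  text = "\"#{lit.value}\""                       # quoted lexical form
  text << "@#{lit.language}" if lit.language      # optional language tag
  text << "^^<#{lit.datatype}>" if lit.datatype   # optional datatype IRI
  text
end

puts literal_to_nt(Lit.new(value: "chat", language: "fr"))
puts literal_to_nt(Lit.new(value: "42", datatype: "http://www.w3.org/2001/XMLSchema#integer"))
puts literal_to_nt(Lit.new(value: "plain"))
```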
#format_node(node, unique_bnodes: false, **options) ⇒ String
Returns the N-Triples representation of a blank node.
# File 'lib/rdf/ntriples/writer.rb', line 253

def format_node(node, unique_bnodes: false, **)
  unique_bnodes ? node.to_unique_base : node.to_s
end
#format_quotedTriple(statement, **options) ⇒ String
Returns the N-Triples representation of an RDF* reified statement.
# File 'lib/rdf/ntriples/writer.rb', line 230

def format_quotedTriple(statement, **)
  "<<%s %s %s>>" % statement.to_a.map { |value| format_term(value, **) }
end
#format_statement(statement, **options) ⇒ String
Returns the N-Triples representation of a statement.
# File 'lib/rdf/ntriples/writer.rb', line 220

def format_statement(statement, **)
  format_triple(*statement.to_triple, **)
end
#format_triple(subject, predicate, object, **options) ⇒ String
Returns the N-Triples representation of a triple.
# File 'lib/rdf/ntriples/writer.rb', line 241

def format_triple(subject, predicate, object, **)
  "%s %s %s ." % [subject, predicate, object].map { |value| format_term(value, **) }
end
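Both #format_triple and #format_quotedTriple interpolate three formatted terms into a fixed template with String#%. A minimal standalone sketch of the two templates, assuming the terms are already formatted strings (triple_line and quoted_triple are hypothetical names, and the example IRIs are illustrative):

```ruby
# Standalone sketch: the terms are assumed to be already formatted strings,
# so no RDF.rb call is made. triple_line and quoted_triple are hypothetical.

# A complete N-Triples line: three terms, then a space and a full stop.
def triple_line(subject, predicate, object)
  "%s %s %s ." % [subject, predicate, object]
end

# A quoted triple: three terms wrapped in << and >>.
def quoted_triple(subject, predicate, object)
  "<<%s %s %s>>" % [subject, predicate, object]
end

subj = "<http://example.org/alice>"
pred = "<http://xmlns.com/foaf/0.1/name>"
obj  = '"Alice"'

puts triple_line(subj, pred, obj)
# A quoted triple can itself appear in the subject position of another triple.
puts triple_line(quoted_triple(subj, pred, obj),
                 "<http://example.org/source>",
                 "<http://example.org/doc>")
```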
#format_uri(uri, **options) ⇒ String
Returns the N-Triples representation of a URI reference using write encoding.
# File 'lib/rdf/ntriples/writer.rb', line 263

def format_uri(uri, **)
  string = uri.to_s
  iriref = case
    when string.match?(ESCAPE_PLAIN_U) # a shortcut for the simple case
      string
    when string.ascii_only? || (encoding && encoding != Encoding::ASCII)
      StringIO.open do |buffer|
        buffer.set_encoding(encoding)
        string.each_char do |u|
          buffer << case u.ord
            when (0x00..0x20) then self.class.escape_utf16(u)
            when 0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d # "<>\^`{|}
              self.class.escape_utf16(u)
            else u
          end
        end
        buffer.string
      end
    else
      # Encode ASCII && UTF-8/16 characters
      StringIO.open do |buffer|
        buffer.set_encoding(Encoding::ASCII)
        string.each_byte do |u|
          buffer << case u
            when (0x00..0x20) then self.class.escape_utf16(u)
            when 0x22, 0x3c, 0x3e, 0x5c, 0x5e, 0x60, 0x7b, 0x7c, 0x7d # "<>\^`{|}
              self.class.escape_utf16(u)
            when (0x80..0xFFFF) then self.class.escape_utf16(u)
            when (0x10000..0x10FFFF) then self.class.escape_utf32(u)
            else u
          end
        end
        buffer.string
      end
  end
  encoding ? "<#{iriref}>".encode(encoding) : "<#{iriref}>"
end
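In the ASCII-range cases above, the characters escaped inside an IRI are everything up to and including the space (0x00-0x20) plus the nine characters "<>\^`{|}, each written as a \uXXXX escape. A minimal standalone sketch of just that rule, ignoring the writer's encoding option (iri_to_nt and IRI_FORBIDDEN are hypothetical names):

```ruby
# Standalone sketch of the character classes escaped above; iri_to_nt and
# IRI_FORBIDDEN are hypothetical names, and the encoding option is ignored.
IRI_FORBIDDEN = [0x22, 0x3C, 0x3E, 0x5C, 0x5E, 0x60, 0x7B, 0x7C, 0x7D].freeze # "<>\^`{|}

def iri_to_nt(iri)
  body = iri.each_char.map do |ch|
    cp = ch.ord
    if cp <= 0x20 || IRI_FORBIDDEN.include?(cp)
      format("\\u%04X", cp) # control character, space, or forbidden punctuation
    else
      ch                    # everything else is copied unchanged
    end
  end.join
  "<#{body}>"
end

puts iri_to_nt("http://example.org/a b")
puts iri_to_nt("http://example.org/{id}")
```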
#write_comment(text)
This method returns an undefined value.
Outputs an N-Triples comment line.
# File 'lib/rdf/ntriples/writer.rb', line 199

def write_comment(text)
  puts "# #{text.chomp}" # TODO: correctly output multi-line comments
end
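As the TODO notes, multi-line comments are not handled: text.chomp removes only a trailing newline, so an embedded newline would start an output line without the # prefix. One way a caller might work around this is to split the text first and write each line separately. The helper below is a hypothetical sketch, not part of RDF.rb:

```ruby
# Hypothetical helper, not part of RDF.rb: prefix every line of a comment so
# that an embedded newline cannot start an uncommented line.
def comment_lines(text)
  text.each_line.map { |line| "# #{line.chomp}" }
end

# Each element could then be passed to the writer one line at a time.
puts comment_lines("Generated by an example script\nDo not edit")
```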
#write_triple(subject, predicate, object)
This method returns an undefined value.
Outputs the N-Triples representation of a triple.
# File 'lib/rdf/ntriples/writer.rb', line 210

def write_triple(subject, predicate, object)
  puts format_triple(subject, predicate, object, **@options)
end