Class: RDF::Turtle::Reader

Inherits:
Reader
  • Object
Includes:
EBNF::LL1::Parser, Terminals
Defined in:
lib/rdf/turtle/reader.rb

Overview

A parser for the Turtle 2

Defined Under Namespace

Classes: Recovery, SyntaxError

Constant Summary

Constants included from Terminals

Terminals::ANON, Terminals::BASE, Terminals::BLANK_NODE_LABEL, Terminals::DECIMAL, Terminals::DOUBLE, Terminals::ECHAR, Terminals::EXPONENT, Terminals::INTEGER, Terminals::IRIREF, Terminals::IRI_RANGE, Terminals::LANGTAG, Terminals::PERCENT, Terminals::PLX, Terminals::PNAME_LN, Terminals::PNAME_NS, Terminals::PN_CHARS, Terminals::PN_CHARS_BASE, Terminals::PN_CHARS_BODY, Terminals::PN_CHARS_U, Terminals::PN_LOCAL, Terminals::PN_LOCAL_BODY, Terminals::PN_LOCAL_ESC, Terminals::PN_PREFIX, Terminals::PREFIX, Terminals::STRING_LITERAL_LONG_QUOTE, Terminals::STRING_LITERAL_LONG_SINGLE_QUOTE, Terminals::STRING_LITERAL_QUOTE, Terminals::STRING_LITERAL_SINGLE_QUOTE, Terminals::UCHAR, Terminals::U_CHARS1, Terminals::U_CHARS2, Terminals::WS

Instance Attribute Summary

Class Method Summary

Instance Method Summary

Constructor Details

#initialize(input = nil, options = {}, &block) ⇒ RDF::Turtle::Reader

Initializes a new reader instance.

Note: the spec does not define a default mapping for the empty prefix, but it is used so commonly in examples that the reader maps it to the empty string anyway, except when validating.

Parameters:

  • input (String, #to_s) (defaults to: nil)
  • options (Hash{Symbol => Object}) (defaults to: {})

Options Hash (options):

  • :prefixes (Hash) — default: Hash.new

    the prefix mappings to use (for accessing intermediate parser productions)

  • :base_uri (#to_s) — default: nil

    the base URI to use when resolving relative URIs (for accessing intermediate parser productions)

  • :anon_base (#to_s) — default: "b0"

    Basis for generating anonymous Nodes

  • :validate (Boolean) — default: false

    whether to validate the parsed statements and values. If not validating, the parser will attempt to recover from errors.

  • :errors (Array)

    array for placing errors found when parsing

  • :warnings (Array)

    array for placing warnings found when parsing

  • :progress (Boolean)

    Show progress of parser productions

  • :debug (Boolean, Integer, Array)

    Detailed debug output. If set to an Integer, output is restricted to messages of that priority: `0` for errors, `1` for warnings, `2` for processor tracing, and anything else for various levels of debug. If set to an Array, information is collected in the array instead of being output to `$stderr`.

  • :freebase (Boolean) — default: false

    Use optimized Freebase reader
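
The precedence between the built-in defaults, the caller's options, and the empty-prefix mapping can be sketched in isolation. This is a simplified stand-in rather than the reader itself: `build_options` is a hypothetical helper, and the `whitespace` default is left out.

```ruby
# Hypothetical helper mirroring how the constructor layers its options.
def build_options(options = {})
  # Caller-supplied values win over the built-in defaults.
  merged = { anon_base: "b0", validate: false }.merge(options)
  # The empty prefix is only pre-defined when not validating, and a
  # caller-supplied :prefixes hash replaces that default entirely.
  merged = { prefixes: { nil => "" } }.merge(merged) unless merged[:validate]
  merged
end

lenient = build_options
strict  = build_options(validate: true)
custom  = build_options(prefixes: { ex: "http://example.org/" })
```

Under these assumptions, `lenient` carries the `nil => ""` mapping, `strict` has no `:prefixes` key at all, and `custom` keeps only the caller's mapping.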



# File 'lib/rdf/turtle/reader.rb', line 91

def initialize(input = nil, options = {}, &block)
  super do
    @options = {
      anon_base:  "b0",
      validate:  false,
      whitespace:  WS,
    }.merge(options)
    @options = {prefixes:  {nil => ""}}.merge(@options) unless @options[:validate]
    @errors = @options[:errors] || []
    @warnings = @options[:warnings] || []
    @depth = 0
    @prod_stack = []

    @options[:debug] ||= case
    when RDF::Turtle.debug? then true
    when @options[:progress] then 2
    when @options[:validate] then 1
    end

    @options[:base_uri] = RDF::URI(base_uri || "")
    debug("base IRI") {base_uri.inspect}
    
    debug("validate") {validate?.inspect}
    debug("canonicalize") {canonicalize?.inspect}
    debug("intern") {intern?.inspect}

    @lexer = EBNF::LL1::Lexer.new(input, self.class.patterns, @options)

    if block_given?
      case block.arity
        when 0 then instance_eval(&block)
        else block.call(self)
      end
    end
  end
end
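
The block handling at the end of the constructor dispatches on the block's arity. A minimal sketch of that idiom, using a hypothetical `Configurable` class in place of the reader:

```ruby
# Configurable is an illustrative stand-in, not part of RDF::Turtle.
class Configurable
  attr_accessor :label

  def initialize(&block)
    return unless block_given?
    case block.arity
    when 0 then instance_eval(&block) # block runs with self set to the new object
    else block.call(self)             # block receives the object as an argument
    end
  end
end

a = Configurable.new { self.label = "via instance_eval" }
b = Configurable.new { |c| c.label = "via argument" }
```

Both styles configure the same object; the zero-arity form trades an explicit receiver for a changed `self` inside the block.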

Instance Attribute Details

#errors ⇒ Array<String> (readonly)

Accumulated errors found during processing

Returns:

  • (Array<String>)


# File 'lib/rdf/turtle/reader.rb', line 36

def errors
  @errors
end

#warnings ⇒ Array<String> (readonly)

Accumulated warnings found during processing

Returns:

  • (Array<String>)


# File 'lib/rdf/turtle/reader.rb', line 41

def warnings
  @warnings
end

Class Method Details

.new(input = nil, options = {}, &block) ⇒ Object

Redirect for Freebase Reader



# File 'lib/rdf/turtle/reader.rb', line 47

def self.new(input = nil, options = {}, &block)
  klass = if options[:freebase]
    FreebaseReader
  else
    self
  end
  reader = klass.allocate
  reader.send(:initialize, input, options, &block)
  reader
end
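
The redirect relies on `allocate` plus an explicit call to `initialize`, because calling `klass.new` from inside an overridden `.new` would re-enter the override. A minimal sketch of the pattern with stand-in classes (`PlainReader`, `FastReader`, and the `:fast` option are hypothetical names):

```ruby
# Stand-in classes; the names are illustrative only.
class PlainReader
  attr_reader :input

  # Overriding .new lets the class choose which implementation to build.
  def self.new(input = nil, options = {}, &block)
    klass = options[:fast] ? FastReader : self
    # allocate creates the object without running initialize, so it is
    # invoked explicitly (initialize is private, hence send).
    reader = klass.allocate
    reader.send(:initialize, input, options, &block)
    reader
  end

  def initialize(input = nil, options = {})
    @input = input
  end
end

class FastReader < PlainReader; end

plain = PlainReader.new("data")
fast  = PlainReader.new("data", { fast: true })
```

Callers keep writing `PlainReader.new(...)` and receive whichever subclass the options select.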

Instance Method Details

#add_statement(production, statement) ⇒ RDF::Statement

Adds a statement. The object can be a literal, URI, or blank node.

Parameters:

  • production (Symbol)
  • statement (RDF::Statement)

    the statement to add

Returns:

  • (RDF::Statement)

    Added statement

Raises:

  • (RDF::ReaderError)

    Raised in validating mode when the statement is invalid.



# File 'lib/rdf/turtle/reader.rb', line 187

def add_statement(production, statement)
  error("Statement is invalid: #{statement.inspect}", production: production) if validate? && statement.invalid?
  @callback.call(statement) if statement.subject &&
                               statement.predicate &&
                               statement.object &&
                               (validate? ? statement.valid? : true)
end

#bnode(value = nil) ⇒ Object

Keep track of allocated BNodes



# File 'lib/rdf/turtle/reader.rb', line 249

def bnode(value = nil)
  return RDF::Node.new unless value
  @bnode_cache ||= {}
  @bnode_cache[value.to_s] ||= RDF::Node.new(value)
end
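
The effect of the cache is that a given label always maps to the same node, while unlabelled calls always produce a fresh one. A self-contained sketch, with a hypothetical `Node` class standing in for `RDF::Node`:

```ruby
# Node is a stand-in for RDF::Node; equality here is object identity.
class Node
  attr_reader :id

  def initialize(id = nil)
    @id = id
  end
end

class BNodeTable
  def bnode(value = nil)
    # No label: always a fresh anonymous node.
    return Node.new unless value
    # Labelled: reuse the node previously allocated for this label.
    @bnode_cache ||= {}
    @bnode_cache[value.to_s] ||= Node.new(value)
  end
end

table = BNodeTable.new
a1    = table.bnode("a")
a2    = table.bnode(:a) # same label once converted with to_s
anon1 = table.bnode
anon2 = table.bnode
```

Here `a1` and `a2` are the same object, whereas `anon1` and `anon2` are distinct.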

#each_statement {|statement| ... } ⇒ void

This method returns an undefined value.

Iterates the given block for each RDF statement in the input.

Yields:

  • (statement)

Yield Parameters:

  • statement (RDF::Statement)


# File 'lib/rdf/turtle/reader.rb', line 138

def each_statement(&block)
  if block_given?
    @recovering = false
    @callback = block

    begin
      while (@lexer.first rescue true)
        read_statement
      end
    rescue EBNF::LL1::Lexer::Error, SyntaxError, EOFError, Recovery
      # Terminate loop if EOF found while recovering
    end

    if validate?
      if !warnings.empty? && !@options[:warnings]
        $stderr.puts "Warnings: #{warnings.join("\n")}"
      end
      if !errors.empty?
        $stderr.puts "Errors: #{errors.join("\n")}" unless @options[:errors]
        raise RDF::ReaderError, "Errors found during processing"
      end
    end
  end
  enum_for(:each_statement)
end
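
The control flow above, in which errors are accumulated during iteration and only turned into an exception afterwards in validating mode, can be sketched without the parser. `TinyReader` is a hypothetical stand-in that treats each non-blank line as a statement and each blank line as an error:

```ruby
# TinyReader is illustrative only; it does not parse Turtle.
class TinyReader
  class ReaderError < StandardError; end

  attr_reader :errors

  def initialize(lines, validate: false, errors: nil)
    @lines = lines
    @validate = validate
    @errors = errors || [] # a caller-supplied array receives the messages
  end

  def each_statement(&block)
    if block_given?
      @lines.each do |line|
        if line.strip.empty?
          @errors << "empty statement" # record the problem and keep going
        else
          block.call(line)
        end
      end
      # Errors become an exception only when validating, and only after
      # the whole input has been consumed.
      raise ReaderError, "Errors found during processing" if @validate && !@errors.empty?
    end
    enum_for(:each_statement)
  end
end

collected = []
TinyReader.new(["a", "", "b"]).each_statement { |s| collected << s }

caller_errors = []
raised = begin
  TinyReader.new(["a", ""], validate: true, errors: caller_errors).each_statement { |s| s }
  false
rescue TinyReader::ReaderError
  true
end
```

Passing an array as `errors:` lets the caller inspect what went wrong after rescuing the exception.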

#each_triple {|subject, predicate, object| ... } ⇒ void

This method returns an undefined value.

Iterates the given block for each RDF triple in the input.

Yields:

  • (subject, predicate, object)

Yield Parameters:

  • subject (RDF::Resource)
  • predicate (RDF::URI)
  • object (RDF::Value)


# File 'lib/rdf/turtle/reader.rb', line 172

def each_triple(&block)
  if block_given?
    each_statement do |statement|
      block.call(*statement.to_triple)
    end
  end
  enum_for(:each_triple)
end
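
`each_triple` is a thin adapter over `each_statement`: it splats each statement's triple into the block's three parameters. A sketch with hypothetical `Statement` and `TripleSource` stand-ins:

```ruby
# Statement stands in for RDF::Statement; only to_triple is modelled.
Statement = Struct.new(:subject, :predicate, :object) do
  def to_triple
    [subject, predicate, object]
  end
end

class TripleSource
  def initialize(statements)
    @statements = statements
  end

  def each_statement(&block)
    @statements.each(&block)
  end

  def each_triple(&block)
    if block_given?
      # Splat the three terms so the block can name them individually.
      each_statement { |statement| block.call(*statement.to_triple) }
    end
    enum_for(:each_triple)
  end
end

source = TripleSource.new([Statement.new(:alice, :knows, :bob)])
seen = []
source.each_triple { |s, p, o| seen << "#{s} #{p} #{o}" }
```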

#inspect ⇒ Object



# File 'lib/rdf/turtle/reader.rb', line 128

def inspect
  sprintf("#<%s:%#0x(%s)>", self.class.name, __id__, base_uri.to_s)
end

#literal(value, options = {}) ⇒ Object

Create a literal



# File 'lib/rdf/turtle/reader.rb', line 209

def literal(value, options = {})
  debug("literal") do
    "value: #{value.inspect}, " +
    "options: #{options.inspect}, " +
    "validate: #{validate?.inspect}, " +
    "c14n?: #{canonicalize?.inspect}"
  end
  RDF::Literal.new(value, options.merge(validate:  validate?, canonicalize:  canonicalize?))
rescue ArgumentError => e
  error("Argument Error #{e.message}", production: :literal, token: @lexer.first)
end

#pname(prefix, suffix) ⇒ Object

Expand a PNAME using string concatenation



# File 'lib/rdf/turtle/reader.rb', line 235

def pname(prefix, suffix)
  # Prefixes must be defined, except special case for empty prefix being alias for current @base
  if prefix(prefix)
    base = prefix(prefix).to_s
  else
    error("undefined prefix", production: :pname, token: prefix)
    base = ''
  end
  suffix = suffix.to_s.sub(/^\#/, "") if base.index("#")
  debug("pname") {"base: '#{base}', suffix: '#{suffix}'"}
  process_iri(base + suffix.to_s)
end
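
The expansion rule can be illustrated with plain strings. The sketch below assumes a hypothetical `PREFIXES` table and an `expand_pname` helper that appends to an error list instead of calling the reader's `error` method; it also skips the final `process_iri` step.

```ruby
# Hypothetical prefix table; the real reader keeps its own prefix mappings.
PREFIXES = {
  "foaf" => "http://xmlns.com/foaf/0.1/",
  "ex"   => "http://example.org/vocab#"
}.freeze

def expand_pname(prefix, suffix, errors)
  base = PREFIXES[prefix]
  unless base
    errors << "undefined prefix #{prefix.inspect}"
    base = ""
  end
  # Avoid a doubled '#' when the namespace already contains one.
  suffix = suffix.to_s.sub(/^\#/, "") if base.index("#")
  base + suffix.to_s
end

errors  = []
name    = expand_pname("foaf", "name", errors)
term    = expand_pname("ex", "#term", errors)
unknown = expand_pname("nope", "x", errors)
```

With this table, `name` is `http://xmlns.com/foaf/0.1/name`, `term` is `http://example.org/vocab#term`, and the undefined prefix falls back to the bare suffix while recording one error.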

#prefix(prefix, iri = nil) ⇒ Object

Override #prefix to take a relative IRI

Prefix directives map a local name to an IRI, which is also resolved against the current in-scope base URI. The spec is unclear on this point, so an undefined empty prefix is presumed to have an empty relative IRI, which is combined with the in-scope IRI at the time of use by string concatenation.



# File 'lib/rdf/turtle/reader.rb', line 227

def prefix(prefix, iri = nil)
  # Relative IRIs are resolved against @base
  iri = process_iri(iri) if iri
  super(prefix, iri)
end

#process_iri(iri) ⇒ Object

Process a URI against base



# File 'lib/rdf/turtle/reader.rb', line 196

def process_iri(iri)
  iri = iri.value[1..-2] if iri === :IRIREF
  value = RDF::URI(iri)
  value = base_uri.join(value) if value.relative?
  value.validate! if validate?
  value.canonicalize! if canonicalize?
  value = RDF::URI.intern(value) if intern?
  value
rescue ArgumentError => e
  error("process_iri", e)
end
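
The core of this method is resolving a relative reference against the base. The sketch below shows that step alone, using Ruby's standard `uri` library in place of `RDF::URI`; `resolve_iri` is a hypothetical helper, and validation, canonicalization, and interning are omitted.

```ruby
require "uri"

# Resolve iri against base when it is relative; absolute IRIs pass through.
def resolve_iri(iri, base)
  value = URI.parse(iri)
  value = URI.join(base, iri) if value.relative?
  value.to_s
end

base     = "http://example.org/base/"
relative = resolve_iri("doc#frag", base)
absolute = resolve_iri("http://other.example/x", base)
parent   = resolve_iri("../up", "http://example.org/a/b/")
```

Relative references, including `..` segments, are resolved against the base, while an IRI that already has a scheme is returned unchanged.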