Class: RDF2JSON::Converter

Inherits:
Object
Defined in:
lib/rdf2json/rdf2json.rb

Overview

Class that takes an input file (RDF N-Triples/N-Quads) and appends JSON/JSON-LD to a possibly pre-existing output file. An optional namespace and prefix can be given; they correspond to the `--namespace` and `--prefix` parameters, which are used in conjunction with the `--minimize` parameter.

Constructor Details

#initialize(input_filename, output_filename, input_format, output_format, namespace = nil, prefix = nil, summary = nil) ⇒ Converter

Initializes a new converter instance.

input_filename

path/filename of the input file in RDF N-Triples/N-Quads

output_filename

path/filename of the output file to which JSON/JSON-LD is being appended

input_format

format of the input file (:ntriples or :nquads)

output_format

format of the output (:json or :jsonld)

namespace

a possible namespace for replacing “@id” keys (may be nil)

prefix

a possible prefix for shortening keys (may be nil)

summary

determines whether summary statistics should be printed (may be nil, which means no summary)



# File 'lib/rdf2json/rdf2json.rb', line 180

def initialize(input_filename, output_filename, input_format, output_format, namespace = nil, prefix = nil, summary = nil)
  @input_file = File.open(input_filename, 'r')
  @output_file = File.open(output_filename, 'a')
  @input_format = input_format
  @output_format = output_format
  @namespace = namespace
  @prefix = prefix
  @summary = summary
end
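
The constructor opens the output file in append mode ('a'), so documents already present in that file are kept. The following minimal sketch illustrates only that file-mode behavior with a temporary file; the file name stem and the JSON lines are made up for illustration and are not produced by rdf2json.

```ruby
require 'tempfile'

# Illustration only: write one line, then reopen the same path in append
# mode, as the constructor does for its output file, and add a second line.
APPEND_DEMO = Tempfile.create('rdf2json-demo') do |file|
  file.puts '{"existing":"document"}'
  file.flush

  File.open(file.path, 'a') { |out| out.puts '{"appended":"document"}' }

  # Both lines are present: the earlier content was not truncated.
  File.readlines(file.path, chomp: true)
end

puts APPEND_DEMO
```

A consequence is that running a conversion twice against the same output file yields the documents twice, so the output file should be removed first if a fresh result is wanted.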

Instance Method Details

#convert ⇒ Object

Convert the input file by appending the newly formatted data to the output file.

At the end of the conversion, short summary statistics are printed if a summary was requested: the number of lines read from the input file, the number of read errors in the N-Triples/N-Quads input, the number of statements captured in JSON/JSON-LD, and the number of JSON/JSON-LD documents appended to the output file (equivalent to the number of lines appended).



# File 'lib/rdf2json/rdf2json.rb', line 196

def convert
  no_of_lines = 0
  documents = 0
  no_of_statements = 0
  read_errors = 0
  last_subject = nil
  subject_block = ''

  @input_file.each_line { |line|
    no_of_lines += 1

    subject = "#{line.sub(/>.*/, '')}>"

    last_subject = subject unless last_subject

    if subject == last_subject then
      subject_block << line
    else
      stats = write_graph(subject_block)
      documents += stats[:documents]
      no_of_statements += stats[:no_of_statements]
      read_errors += stats[:read_errors]
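      # Start the next block with the current line, which carries the new
      # subject; resetting to an empty string would drop this statement.
      subject_block = line.dup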
    end

    last_subject = subject
  }

  stats = write_graph(subject_block)
  documents += stats[:documents]
  no_of_statements += stats[:no_of_statements]
  read_errors += stats[:read_errors]

  @input_file.close
  @output_file.close

  if @summary then
    puts "Total number of lines read                   : #{no_of_lines}"
    puts "Statement read errors (N-Quads or N-Triples) : #{read_errors}"
    puts "Statements that are captured in JSON/JSON-LD : #{no_of_statements}"
    puts "JSON/JSON-LD documents output                : #{documents}"
  end
end
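
The grouping step relies on the input being sorted so that statements with the same subject are adjacent. A minimal standalone sketch of that idea follows; `group_by_subject` and the sample data are hypothetical names introduced here, not part of rdf2json, and the sketch collects blocks in an array instead of writing them out.

```ruby
# Hypothetical helper: groups consecutive lines that share a subject key.
# The key is derived the same way as in #convert: everything before the
# first '>' on the line, with a '>' appended again.
def group_by_subject(lines)
  blocks = []
  last_subject = nil

  lines.each do |line|
    subject = "#{line.sub(/>.*/, '')}>"

    if subject == last_subject
      blocks.last << line   # same subject: extend the current block
    else
      blocks << line.dup    # new subject: start a block with this line
    end

    last_subject = subject
  end

  blocks
end

SAMPLE_LINES = [
  "<http://example.org/a> <http://example.org/p> \"1\" .\n",
  "<http://example.org/a> <http://example.org/q> \"2\" .\n",
  "<http://example.org/b> <http://example.org/p> \"3\" .\n"
]

SUBJECT_BLOCKS = group_by_subject(SAMPLE_LINES)
puts SUBJECT_BLOCKS.length
```

If the same subject reappears later in an unsorted file, it starts a separate block and therefore ends up in a separate output document.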

#minify(jsonld_hash) ⇒ Object

Minimize a JSON-LD hash to JSON.

jsonld_hash

a JSON-LD hash that should be rewritten to plain JSON



# File 'lib/rdf2json/rdf2json.rb', line 242

def minify(jsonld_hash)
  jsonld_hash.keys.each { |key|
    if key == '@type' then
      jsonld_hash.delete(key)
    elsif @prefix and key.match(@prefix) then
      shortened_key = key.sub(@prefix, '')
      jsonld_hash[shortened_key] = jsonld_hash.delete(key)
      key = shortened_key
    end

    if jsonld_hash[key].instance_of?(Array) then
      jsonld_hash[key].each_index { |index|
        if jsonld_hash[key][index].has_key?('@value') then
          jsonld_hash[key][index] = jsonld_hash[key][index]['@value']
        elsif jsonld_hash[key][index].has_key?('@id') then
          jsonld_hash[key][index] = jsonld_hash[key][index]['@id']
        end
      }
    elsif jsonld_hash[key].instance_of?(Hash) then
      minify(jsonld_hash[key])
    end
  }
end
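
To show what minification does to a document, here is a minimal standalone sketch under stated assumptions: `minify_sketch` is a hypothetical variant that takes the prefix as an argument instead of reading `@prefix`, matches the prefix only at the start of a key, and leaves non-hash array items untouched. The sample entity is invented for illustration.

```ruby
require 'json'

# Hypothetical standalone variant of the minification idea (not the
# library method): drops '@type', strips a leading prefix from keys, and
# replaces {'@value' => x} / {'@id' => x} array items by x.
def minify_sketch(hash, prefix = nil)
  hash.keys.each do |key|
    if key == '@type'
      hash.delete(key)
      next
    elsif prefix && key.start_with?(prefix)
      short = key[prefix.length..-1]
      hash[short] = hash.delete(key)
      key = short
    end

    value = hash[key]
    if value.is_a?(Array)
      hash[key] = value.map do |item|
        if !item.is_a?(Hash) then item
        elsif item.key?('@value') then item['@value']
        elsif item.key?('@id') then item['@id']
        else item
        end
      end
    elsif value.is_a?(Hash)
      minify_sketch(value, prefix)
    end
  end

  hash
end

ENTITY = {
  '@id' => 'http://example.org/a',
  '@type' => ['http://example.org/Thing'],
  'http://example.org/name' => [{ '@value' => 'Alice' }],
  'http://example.org/knows' => [{ '@id' => 'http://example.org/b' }]
}

MINIFIED = minify_sketch(ENTITY, 'http://example.org/')
puts JSON.generate(MINIFIED)
```

Note that the hash is modified in place, as in the method above, so callers that still need the JSON-LD form should pass a copy.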

#write_graph(block) ⇒ Object

Takes a block of RDF statements that share the same subject and creates a JSON/JSON-LD document from them, which is appended to the output file.

block

one or more lines that share the same subject in RDF N-Triples/N-Quads



# File 'lib/rdf2json/rdf2json.rb', line 270

def write_graph(block)
  return { :read_errors => 0, :no_of_statements => 0, :documents => 0 } unless block and not block.empty?

  # Virtuoso output error-handling:
  block.gsub!("\\'", "'")

  read_errors = 0
  no_of_statements = 0
  graph = RDF::Graph.new
  RDF::Reader.for(@input_format).new(block) { |reader|
    begin
      reader.each_statement { |statement|
        no_of_statements += 1
        graph.insert(statement)
      }
    rescue RDF::ReaderError
      read_errors += 1
    end
  }

  documents = 0
  JSON::LD::API::fromRdf(graph) { |document|
    document.each{ |entity|
      # Parsed JSON-LD representation:
      entity = JSON.parse(entity.to_json)

      entity[@namespace] = entity.delete('@id') if @namespace
      minify(entity) if @output_format == :json

      @output_file.puts entity.to_json
      documents += 1
    }
  }

  return { :read_errors => read_errors, :no_of_statements => no_of_statements, :documents => documents }
end
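
The output stage writes one JSON document per line and, when a namespace is given, moves the value of "@id" to a key of that name. A minimal sketch of just that stage, using only the standard library, could look as follows; `emit_entities`, the `'_id'` key, and the sample entities are assumptions made for illustration and do not come from rdf2json.

```ruby
require 'json'
require 'stringio'

# Hypothetical helper: optionally renames the '@id' key of each entity and
# writes every entity as a single JSON line to the given IO object.
# Returns the number of documents written.
def emit_entities(entities, io, id_key = nil)
  written = 0

  entities.each do |entity|
    entity = entity.dup # keep the caller's hash unchanged
    entity[id_key] = entity.delete('@id') if id_key
    io.puts JSON.generate(entity)
    written += 1
  end

  written
end

OUT = StringIO.new
COUNT = emit_entities(
  [
    { '@id' => 'http://example.org/a', 'name' => ['Alice'] },
    { '@id' => 'http://example.org/b', 'name' => ['Bob'] }
  ],
  OUT,
  '_id'
)

puts OUT.string
```

Because each document occupies exactly one line, the number of documents reported in the summary equals the number of lines appended to the output file.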