Class: Traject::MockReader

Inherits:
Object
Defined in:
lib/traject/mock_reader.rb

Overview

A mock reader, designed to do almost no work during a run so that reading input does not distort benchmarks.

It loads 20 records from the end of its own source file (after the `__END__` marker) and then returns them over and over again, up to the specified limit.

Specify it in a config file as follows:

require 'traject/mock_writer'
require 'traject/mock_reader'

settings do
  store "reader_class_name", "Traject::MockReader"
  store "writer_class_name", "Traject::MockWriter"
  store "mock_reader.limit", 4_000 # default is 10_000
end

Constructor Details

#initialize(input_stream, settings = {}) ⇒ MockReader

Returns a new instance of MockReader.

Parameters:

  • input_stream (Ignored)

    (ignored)

  • settings (Hash) (defaults to: {})

    (looks only for an integer in 'mock_reader.limit')


# File 'lib/traject/mock_reader.rb', line 27

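def initialize(input_stream, settings = {})
  # Fall back to 10_000 records when no limit is configured.
  @limit = (settings["mock_reader.limit"] || 10_000).to_i

  # Load the sample records stored after __END__ in this source file,
  # closing the file handle once they have been parsed.
  @records = File.open(__FILE__) { |file| load_ndjson(file) }

  # Freeze each record's field list for thread safety and performance.
  @records.each { |r| r.fields.freeze }
end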
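
The limit handling in the constructor can be tried in isolation. The following is a minimal sketch, not part of Traject: `resolve_limit` is a hypothetical helper that copies only the expression used above, to show that a missing setting falls back to 10_000 and that a string value is converted with `to_i`.

```ruby
# Hypothetical helper (not part of Traject) mirroring the expression
# used in MockReader#initialize.
def resolve_limit(settings)
  (settings["mock_reader.limit"] || 10_000).to_i
end

default_limit = resolve_limit({})                              # setting absent
integer_limit = resolve_limit("mock_reader.limit" => 4_000)    # integer value
string_limit  = resolve_limit("mock_reader.limit" => "4000")   # string value

puts default_limit
puts integer_limit
puts string_limit
```

Because `to_i` is applied last, an integer and a numeric string both end up as the same Integer.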

Instance Attribute Details

#limit ⇒ Object

Returns the value of attribute limit


# File 'lib/traject/mock_reader.rb', line 23

def limit
  @limit
end

Instance Method Details

#each ⇒ Object


# File 'lib/traject/mock_reader.rb', line 62

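def each
  # Without a block, return an Enumerator over the same sequence.
  return enum_for(:each) unless block_given?

  # Cycle through the loaded records until @limit have been yielded.
  size = @records.size
  @limit.times do |i|
    yield @records[i % size]
  end
end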
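
The cycling behavior of `#each` can be illustrated without MARC records. The sketch below is an assumption-laden stand-in: `CyclingSource` is a hypothetical class, and plain strings take the place of the bundled records. It uses the same `i % size` indexing and the same `enum_for` fallback when no block is given.

```ruby
# Hypothetical stand-in for MockReader; strings replace MARC records.
class CyclingSource
  include Enumerable

  def initialize(records, limit)
    @records = records.freeze
    @limit = limit
  end

  def each
    # Return an Enumerator when called without a block.
    return enum_for(:each) unless block_given?

    # Wrap the index around the fixed record set.
    size = @records.size
    @limit.times { |i| yield @records[i % size] }
  end
end

source = CyclingSource.new(%w[a b c], 5)
all_items = source.to_a           # block form, via Enumerable
first_two = source.each.first(2)  # Enumerator form

p all_items
p first_two
```

Note that with an empty record set `size` would be 0 and `i % size` would raise `ZeroDivisionError`, so this pattern relies on at least one record having been loaded.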

#load_ndjson(file_io) ⇒ Object

Parses newline-delimited JSON, assuming the JSON contains no internal unescaped newlines.


# File 'lib/traject/mock_reader.rb', line 38

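def load_ndjson(file_io)
  records = []
  lines = file_io.each_line

  # Skip the Ruby source, up to and including the __END__ marker line.
  # Kernel#loop also ends quietly if the file has no __END__ marker.
  loop do
    break if /^__END__/.match?(lines.next)
  end

  # Parse each remaining non-blank line as one MARC-in-JSON record.
  # Kernel#loop rescues the StopIteration raised at end of file.
  loop do
    json = lines.next
    next unless /\S/.match?(json)
    records << MARC::Record.new_from_hash(JSON.parse(json))
  end

  records
end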
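
The skip-then-parse approach can be exercised on an in-memory string. This is a minimal sketch under stated assumptions: `parse_ndjson_after_end` is a hypothetical helper, a `StringIO` stands in for the source file, and plain hashes from `JSON.parse` stand in for `MARC::Record` objects. The sample JSON lines are made up for illustration.

```ruby
require "json"
require "stringio"

# Hypothetical helper: returns the parsed JSON objects that follow the
# first line starting with __END__, skipping blank lines.
def parse_ndjson_after_end(io)
  after_marker = io.each_line
                   .drop_while { |line| !line.start_with?("__END__") }
                   .drop(1) # discard the marker line itself
  after_marker.select { |line| line.match?(/\S/) }
              .map { |line| JSON.parse(line) }
end

sample = StringIO.new(<<~TEXT)
  puts "ruby code above the marker"
  __END__
  {"leader": "00000nam", "fields": []}

  {"leader": "11111nam", "fields": []}
TEXT

parsed = parse_ndjson_after_end(sample)
puts parsed.size
```

If the input has no `__END__` line, this helper returns an empty array.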