Class: Traject::LineWriter

Inherits:
Object
Defined in:
lib/traject/line_writer.rb

Overview

A writer for Traject::Indexer that writes all output as serialized text using #puts.

Should be thread-safe (i.e., multiple worker threads can call #put concurrently), because the write to the actual output file is wrapped in a mutex synchronize block. This does not seem to affect performance much, as far as I could tell from benchmarking.
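The locking pattern described above can be sketched without the traject gem. SketchLineWriter below is a hypothetical stand-in, not the real class; it assumes only that serialization happens outside the lock and the #puts call inside it, as in #put further down.

```ruby
require "stringio"

# Hypothetical stand-in for the writer's locking pattern; not part of Traject.
class SketchLineWriter
  def initialize(output)
    @output      = output
    @write_mutex = Mutex.new
  end

  # Serialize outside the lock, write inside it, as LineWriter#put does.
  def put(record)
    serialized = record.to_s
    @write_mutex.synchronize { @output.puts(serialized) }
  end
end

buffer = StringIO.new
writer = SketchLineWriter.new(buffer)

# Four worker threads, each writing 250 records concurrently.
threads = 4.times.map do |t|
  Thread.new do
    250.times { |i| writer.put("thread=#{t} record=#{i}") }
  end
end
threads.each(&:join)

lines = buffer.string.lines
puts lines.size # 1000
```

Keeping serialization outside the synchronized block means only the I/O call is serialized across threads, which may be why the lock shows little measurable cost.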

Output is sent to the file at the settings["output_file"] string path if that is set, otherwise to settings["output_stream"] (a Ruby IO object), otherwise to stdout.
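As an illustration of that precedence, here is a minimal sketch. resolve_output is a hypothetical helper that mirrors the order described above; it is not a Traject method, and the temp file name is arbitrary.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper mirroring the documented precedence; not part of Traject.
def resolve_output(settings)
  if settings["output_file"]
    File.open(settings["output_file"], "w:UTF-8")
  elsif settings["output_stream"]
    settings["output_stream"]
  else
    $stdout
  end
end

stream = StringIO.new
tmp    = Tempfile.new("line_writer_sketch")

# "output_file" wins even when "output_stream" is also given.
from_file   = resolve_output({ "output_file" => tmp.path, "output_stream" => stream })
from_stream = resolve_output({ "output_stream" => stream })
from_stdout = resolve_output({})

puts from_file.is_a?(File)       # true
puts from_stream.equal?(stream)  # true
puts from_stdout.equal?($stdout) # true

from_file.close
tmp.close!
```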

This class can be subclassed to write out different serialized representations; subclasses just override the #serialize method. For instance, see JsonWriter.
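A rough sketch of the subclassing pattern follows. So that it runs without the traject gem, it uses hypothetical stand-ins: SketchContext plays the role of the context object, SketchBaseWriter copies the shape of #put and #serialize shown below, and SketchTabWriter is an invented subclass. With the real gem, the subclass would presumably inherit from Traject::LineWriter instead.

```ruby
require "stringio"

# Hypothetical stand-ins so the sketch runs without the traject gem.
SketchContext = Struct.new(:output_hash)

class SketchBaseWriter
  def initialize(settings)
    @output_file = settings["output_stream"] || $stdout
    @write_mutex = Mutex.new
  end

  def serialize(context)
    context.output_hash
  end

  def put(context)
    serialized = serialize(context)
    @write_mutex.synchronize { @output_file.puts(serialized) }
  end
end

# Only #serialize changes; #put and its locking are inherited.
class SketchTabWriter < SketchBaseWriter
  def serialize(context)
    context.output_hash.map do |field, values|
      "#{field}=#{Array(values).join('|')}"
    end.join("\t")
  end
end

out = StringIO.new
SketchTabWriter.new({ "output_stream" => out }).put(
  SketchContext.new({ "id" => ["1"], "title" => ["A", "B"] })
)
puts out.string
```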

Direct Known Subclasses

DebugWriter, DelimitedWriter, JsonWriter, YamlWriter

Constructor Details

#initialize(argSettings) ⇒ LineWriter

Returns a new instance of LineWriter.



# File 'lib/traject/line_writer.rb', line 21

def initialize(argSettings)
  @settings     = argSettings
  @write_mutex  = Mutex.new

  # trigger lazy loading now for thread-safety
  @output_file = open_output_file
end

Instance Attribute Details

#output_file ⇒ Object (readonly)

Returns the value of attribute output_file.



# File 'lib/traject/line_writer.rb', line 19

def output_file
  @output_file
end

#settings ⇒ Object (readonly)

Returns the value of attribute settings.



# File 'lib/traject/line_writer.rb', line 18

def settings
  @settings
end

#write_mutex ⇒ Object (readonly)

Returns the value of attribute write_mutex.



# File 'lib/traject/line_writer.rb', line 19

def write_mutex
  @write_mutex
end

Instance Method Details

#_write(data) ⇒ Object



# File 'lib/traject/line_writer.rb', line 29

def _write(data)
  output_file.puts(data)
end

#close ⇒ Object



# File 'lib/traject/line_writer.rb', line 59

def close
  @output_file.close unless (@output_file.nil? || @output_file.tty?)
end
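One consequence of the guard above is worth illustrating: any non-tty output is closed, including a stream the caller supplied through settings["output_stream"]. The sketch below uses a hypothetical SketchClosingWriter that reproduces only the #close guard; it is not the real class.

```ruby
require "stringio"

# Hypothetical stand-in reproducing only the #close guard; not part of Traject.
class SketchClosingWriter
  def initialize(output)
    @output_file = output
  end

  def close
    @output_file.close unless @output_file.nil? || @output_file.tty?
  end
end

stream = StringIO.new
stream.puts("record 1")

# A StringIO is not a tty, so the caller's stream gets closed.
SketchClosingWriter.new(stream).close

# A nil output is skipped by the guard rather than raising.
SketchClosingWriter.new(nil).close

puts stream.closed? # true
puts stream.string  # record 1
```

If the stream needs to stay open after indexing, it is probably safest not to rely on it once the writer has been closed.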

#open_output_file ⇒ Object



# File 'lib/traject/line_writer.rb', line 45

def open_output_file
  unless defined? @output_file
    of =
      if settings["output_file"]
        File.open(settings["output_file"], 'w:UTF-8')
      elsif settings["output_stream"]
        settings["output_stream"]
      else
        $stdout
      end
  end
  return of
end
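Reading the listing above, the local variable of is only assigned when @output_file has not yet been defined, so a second call after construction appears to return nil rather than the open output. The sketch below reproduces that shape with a hypothetical SketchOpener class (not the real one) to show the behavior.

```ruby
require "stringio"

# Hypothetical stand-in with the same control flow as #open_output_file.
class SketchOpener
  attr_reader :output_file

  def initialize(settings)
    @settings    = settings
    # First call: @output_file is not yet defined, so the branch runs.
    @output_file = open_output_file
  end

  def open_output_file
    unless defined? @output_file
      of = @settings["output_stream"] || $stdout
    end
    return of
  end
end

stream = StringIO.new
opener = SketchOpener.new({ "output_stream" => stream })

# Second call: @output_file is defined, the branch is skipped, `of` is nil.
second_call = opener.open_output_file

puts opener.output_file.equal?(stream) # true
puts second_call.inspect               # nil
```

Under that reading, callers would use the #output_file reader to get the open output and treat #open_output_file as a one-time helper for the constructor.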

#put(context) ⇒ Object



# File 'lib/traject/line_writer.rb', line 38

def put(context)
  serialized = serialize(context)
  write_mutex.synchronize do
    _write(serialized)
  end
end

#serialize(context) ⇒ Object



# File 'lib/traject/line_writer.rb', line 34

def serialize(context)
  context.output_hash
end