Class: Traject::Indexer::NokogiriIndexer

Inherits:
Traject::Indexer
Includes:
Macros::NokogiriMacros
Defined in:
lib/traject/indexer/nokogiri_indexer.rb

Overview

An indexer subclass for XML, where the source records in the pipeline are Nokogiri::XML::Document objects. It sets the default reader to Traject::NokogiriReader and includes Traject::Macros::NokogiriMacros (which provides extract_xpath).

See docs on XML use. (TODO)

Constant Summary

Constants inherited from Traject::Indexer

ArityError, CompletedStateError, NamingError

Instance Attribute Summary

Attributes inherited from Traject::Indexer

#logger, #reader_class, #writer, #writer_class

Class Method Summary

Instance Method Summary

Methods included from Macros::NokogiriMacros

#default_namespaces, #extract_xpath

Methods inherited from Traject::Indexer

#after_processing, apply_class_configure_block, #complete, #completed?, #configure, configure, #create_logger, #each_record, #initialize, legacy_marc_mode!, #load_config_file, #log_skip, #logger_format, #map_record, #map_to_context!, #process, #process_record, #process_with, #reader!, #run_after_processing_steps, #settings, #to_field, #writer!

Methods included from QualifiedConstGet

#qualified_const_get

Methods included from Macros::Transformation

#append, #default, #first_only, #gsub, #prepend, #split, #strip, #transform, #translation_map, #unique

Methods included from Macros::Basic

#literal

Constructor Details

This class inherits a constructor from Traject::Indexer

Class Method Details

.default_settings ⇒ Object

# File 'lib/traject/indexer/nokogiri_indexer.rb', line 15

def self.default_settings
  @default_settings ||= super.merge("reader_class_name" => "Traject::NokogiriReader")
end
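
The memoized `super.merge` pattern used here can be tried out without traject installed. The following is only a minimal sketch: `BaseIndexer` and `XmlIndexer` are hypothetical stand-ins (not traject classes), and the `"log.level"` default is invented for the illustration.

```ruby
# Hypothetical stand-in for Traject::Indexer.
class BaseIndexer
  def self.default_settings
    # Invented default, for illustration only.
    @default_settings ||= { "log.level" => "info" }
  end
end

# Hypothetical stand-in for Traject::Indexer::NokogiriIndexer.
class XmlIndexer < BaseIndexer
  def self.default_settings
    # super returns the inherited defaults; merge returns a new Hash
    # rather than mutating the one super returned. The merged result
    # is memoized in an instance variable of the XmlIndexer class.
    @default_settings ||= super.merge("reader_class_name" => "Traject::NokogiriReader")
  end
end

xml_defaults  = XmlIndexer.default_settings
base_defaults = BaseIndexer.default_settings

puts xml_defaults.inspect
puts base_defaults.inspect
```

Because `merge` builds a new Hash, the subclass gains the `reader_class_name` entry while the base class's defaults do not, and repeated calls on the subclass return the same memoized object.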

Instance Method Details

#source_record_id_proc ⇒ Object

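Overrides the base Indexer implementation. Returns a proc that, given a Nokogiri node, uses the record's id attribute as the source record id, or failing that the text of a child id element.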

# File 'lib/traject/indexer/nokogiri_indexer.rb', line 20

def source_record_id_proc
  @source_record_id_proc ||= lambda do |source_xml_record|
    if ( source_xml_record &&
         source_xml_record.kind_of?(Nokogiri::XML::Node) )
      # The assignment needs its own parentheses: `&&` binds tighter than `=`.
      source_xml_record['id'] || ((el = source_xml_record.at_xpath('./id')) && el.text)
    end
  end
end
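
The lookup order described for this method (an `id` attribute first, then the text of a child `id` element) can be sketched in plain Ruby. This is an illustration under stated assumptions, not traject's code: `FakeNode` and `FakeElement` are hypothetical stand-ins for Nokogiri objects, implementing only `[]`, `at_xpath`, and `text`. Note that the assignment is wrapped in its own parentheses; since `&&` binds tighter than `=`, an expression of the form `el = x && el.text` is parsed as `el = (x && el.text)` and would call `text` on a still-nil `el`.

```ruby
# Hypothetical stand-ins for Nokogiri objects (illustration only).
FakeElement = Struct.new(:text)

class FakeNode
  def initialize(attrs: {}, children: {})
    @attrs = attrs
    @children = children
  end

  # Attribute lookup, like node['id'].
  def [](name)
    @attrs[name]
  end

  # Returns the child registered under the given path, or nil.
  def at_xpath(path)
    @children[path]
  end
end

id_proc = lambda do |record|
  if record && record.kind_of?(FakeNode)
    # Attribute first; otherwise assign the child element (if any)
    # and, only when it exists, take its text.
    record["id"] || ((el = record.at_xpath("./id")) && el.text)
  end
end

from_attribute = id_proc.call(FakeNode.new(attrs: { "id" => "a1" }))
from_element   = id_proc.call(FakeNode.new(children: { "./id" => FakeElement.new("b2") }))
no_id          = id_proc.call(FakeNode.new)
not_a_node     = id_proc.call("plain string")

puts [from_attribute, from_element, no_id, not_a_node].inspect
```

A record with neither an attribute nor a child element, or an object that is not a node at all, yields nil rather than raising.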