Module: Jekyll::Algolia::Extractor

Includes:
Jekyll::Algolia
Defined in:
lib/jekyll/algolia/extractor.rb

Overview

Module to extract records from Jekyll files

Constant Summary

Constants included from Jekyll::Algolia

MissingCredentialsError, VERSION

Class Method Summary

Methods included from Jekyll::Algolia

init, load_overwrites, site

Class Method Details

.add_unique_object_id(record) ⇒ Object

Public: Adds a unique :objectID field to the hash, representing the current content of the record



# File 'lib/jekyll/algolia/extractor.rb', line 47

def self.add_unique_object_id(record)
  record[:objectID] = AlgoliaHTMLExtractor.uuid(record)
  record
end
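
The idea of an ID that represents the record's current content can be sketched as follows. This is a minimal illustration and not the gem's code: `content_id` and `add_content_id` are hypothetical names, and the MD5-over-sorted-JSON scheme is an assumption standing in for whatever `AlgoliaHTMLExtractor.uuid` actually computes.

```ruby
require 'digest'
require 'json'

# Hypothetical stand-in for AlgoliaHTMLExtractor.uuid: hashes the record's
# key/value pairs in sorted key order, so equal content yields an equal ID.
def content_id(record)
  ordered = record.sort_by { |key, _| key.to_s }
  Digest::MD5.hexdigest(JSON.generate(ordered))
end

# Mirrors the shape of add_unique_object_id, using the stand-in above.
def add_content_id(record)
  record[:objectID] = content_id(record)
  record
end

first   = add_content_id({ title: 'Hello', content: 'World' })
second  = add_content_id({ content: 'World', title: 'Hello' })
changed = add_content_id({ title: 'Hello', content: 'World!' })

# Same content in a different key order gives the same ID.
puts first[:objectID] == second[:objectID]
# Any change to the content gives a different ID.
puts first[:objectID] == changed[:objectID]
```

Because the ID is derived from the content, a record whose content has not changed keeps its ID between runs, while an edited record gets a new one.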

.extract_raw_records(content) ⇒ Object

Public: Extract raw records from the file, including content for each node and its headings

content - The HTML content to parse



# File 'lib/jekyll/algolia/extractor.rb', line 56

def self.extract_raw_records(content)
  records = AlgoliaHTMLExtractor.run(
    content,
    options: {
      css_selector: Configurator.algolia('nodes_to_index'),
      tags_to_exclude: 'script,style,iframe'
    }
  )
  # We remove objectIDs, as they will be added at the very end, after all
  # the hooks and shrinkage
  records.each do |record|
    record.delete(:objectID)
  end

  records
end
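
The reason for discarding the early objectID can be shown with a small sketch. It assumes a content-based ID; the `fingerprint` lambda is a hypothetical stand-in for the gem's real helper, and the record fields are made up for illustration.

```ruby
require 'digest'
require 'json'

# Hypothetical content hash, standing in for the gem's real ID helper.
fingerprint = lambda do |record|
  Digest::MD5.hexdigest(JSON.generate(record.sort_by { |key, _| key.to_s }))
end

record = { content: 'Some long paragraph', title: 'Intro' }
early_id = fingerprint.call(record)

# A later step (a user hook, or size reduction) changes the record.
record[:content] = 'Some long'

late_id = fingerprint.call(record)

# The ID computed before the change no longer describes the content.
puts early_id == late_id
```

Computing the ID only once, after every transformation has run, avoids storing an ID that describes an earlier state of the record.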

.run(file) ⇒ Object

Public: Extract records from the file

file - The Jekyll file to process



# File 'lib/jekyll/algolia/extractor.rb', line 14

def self.run(file)
  # Getting all nodes from the HTML input
  raw_records = extract_raw_records(file.content)
  # Getting file metadata
  metadata = FileBrowser.metadata(file)

  # If no content, we still index the metadata
  raw_records = [metadata] if raw_records.empty?

  # Building the list of records
  records = []
  raw_records.map do |record|
    # We do not need to pass the HTML node element to the final record
    node = record[:node]
    record.delete(:node)

    # Merging each record info with file info
    record = Utils.compact_empty(record.merge(metadata))

    # Apply custom user-defined hooks
    # Users can return `nil` from the hook to signal we should not index
    # such a record
    record = Hooks.apply_each(record, node, Jekyll::Algolia.site)
    next if record.nil?

    records << record
  end

  records
end
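
The flow of .run (drop the node, merge file metadata, remove empty values, let a hook veto the record) can be approximated without the gem. In this sketch `compact_empty` and `skip_drafts` are hypothetical stand-ins for `Utils.compact_empty` and `Hooks.apply_each`, the sample data is invented, and the hook takes only the record, whereas the real one also receives the node and the site. It uses `filter_map`, which assumes Ruby 2.7 or later.

```ruby
# Hypothetical stand-in for Utils.compact_empty: drops nil and empty values.
def compact_empty(hash)
  hash.reject do |_, value|
    value.nil? || (value.respond_to?(:empty?) && value.empty?)
  end
end

# Hypothetical hook: returning nil tells the pipeline to skip the record.
skip_drafts = lambda do |record|
  record[:content].start_with?('DRAFT') ? nil : record
end

metadata = { title: 'My page', url: '/my-page/', tags: [] }
raw_records = [
  { content: 'First paragraph', node: :placeholder_node },
  { content: 'DRAFT: unfinished', node: :placeholder_node }
]

# filter_map keeps only the non-nil results, which replaces the
# "next if record.nil?" plus "records << record" pair.
records = raw_records.filter_map do |raw|
  record = raw.reject { |key, _| key == :node } # the node is not indexed
  record = compact_empty(record.merge(metadata))
  skip_drafts.call(record)
end

puts records.length
puts records.first.keys.inspect
```

Only the first record survives: the second is vetoed by the hook, and the empty `tags` value is removed before the record is kept.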