Module: Jekyll::Algolia::Shrinker

Includes:
Jekyll::Algolia
Defined in:
lib/jekyll/algolia/shrinker.rb

Overview

Module to shrink a record so it fits in the plan quotas

Constant Summary

Constants included from Jekyll::Algolia

VERSION

Class Method Summary

Methods included from Jekyll::Algolia

init, load_overwrites, run, site

Class Method Details

.fit_to_size(raw_record, max_size) ⇒ Object

Public: Attempt to shrink the record so it fits within max_size by reducing its least essential attributes

  • raw_record: The record to attempt to reduce

  • max_size: The max size to achieve in bytes

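The excerpts are reduced first: the HTML excerpt is replaced with the plain-text one, then both excerpts are halved, and as a last resort both are removed. If the record is still too large, or has no excerpt to begin with, indexing stops with an error.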



# File 'lib/jekyll/algolia/shrinker.rb', line 24

def self.fit_to_size(raw_record, max_size)
  return raw_record if size(raw_record) <= max_size

  # No excerpt, we can't shrink it
  return stop_with_error(raw_record) unless raw_record.key?(:excerpt_html)

  record = raw_record.clone

  # We replace the HTML excerpt with the textual one
  record[:excerpt_html] = record[:excerpt_text]
  return record if size(record) <= max_size

  # We halve the excerpts
  excerpt_words = record[:excerpt_text].split(/\s+/)
  shortened_excerpt = excerpt_words[0...excerpt_words.size / 2].join(' ')
  record[:excerpt_text] = shortened_excerpt
  record[:excerpt_html] = shortened_excerpt
  return record if size(record) <= max_size

  # We remove the excerpts completely
  record.delete(:excerpt_text)
  record.delete(:excerpt_html)
  return record if size(record) <= max_size

  # Still too big, we fail
  stop_with_error(record)
end
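The cascade above can be tried outside of Jekyll with a minimal, standalone sketch. The names `shrink`, `json_size`, `STEPS` and `RecordTooBigError` are hypothetical and not part of the plugin; the sketch raises an error where the module calls `stop_with_error`, and it builds new hashes with `merge` rather than mutating a clone.

```ruby
require 'json'

# Hypothetical error, raised here where the module would call stop_with_error
class RecordTooBigError < StandardError; end

# Same measure as Shrinker.size: length of the JSON representation
def json_size(record)
  record.to_json.length
end

# The three reductions, in the order fit_to_size applies them
STEPS = [
  # 1. Replace the HTML excerpt with the textual one
  ->(r) { r.merge(excerpt_html: r[:excerpt_text]) },
  # 2. Keep only the first half of the excerpt words
  lambda do |r|
    words = r[:excerpt_text].split(/\s+/)
    short = words[0...words.size / 2].join(' ')
    r.merge(excerpt_text: short, excerpt_html: short)
  end,
  # 3. Remove both excerpts completely
  ->(r) { r.reject { |key, _| %i[excerpt_text excerpt_html].include?(key) } }
].freeze

def shrink(raw_record, max_size)
  return raw_record if json_size(raw_record) <= max_size
  # No excerpt, nothing can be reduced
  raise RecordTooBigError unless raw_record.key?(:excerpt_html)

  record = raw_record
  STEPS.each do |step|
    record = step.call(record)
    return record if json_size(record) <= max_size
  end
  raise RecordTooBigError
end

record = {
  title: 'Post',
  excerpt_text: 'one two three four',
  excerpt_html: '<p>one two three four</p>'
}
halved = shrink(record, 70)
puts halved[:excerpt_text]
```

Each step runs only when the previous one was not enough, so a record loses as little excerpt content as the limit allows.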

.readable_largest_record_keys(record) ⇒ Object

Public: Returns a string explaining which attributes are the largest in the record

  • record: The record hash to analyze



# File 'lib/jekyll/algolia/shrinker.rb', line 93

def self.readable_largest_record_keys(record)
  keys = Hash[record.map { |key, value| [key, value.to_s.length] }]
  largest_keys = keys.sort_by { |_, value| value }.reverse[0..2]
  output = []
  largest_keys.each do |key, size|
    size = Filesize.from("#{size} B").to_s('Kb')
    output << "#{key} (#{size})"
  end
  output.join(', ')
end
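To see what kind of ranking this produces, here is a simplified sketch that does not depend on the Filesize gem. The method name `largest_keys` and the `count` parameter are hypothetical, and sizes are printed as raw character counts of `value.to_s` rather than as formatted sizes.

```ruby
# Hypothetical stand-in for readable_largest_record_keys: ranks attributes by
# the length of their string representation and keeps the largest ones.
def largest_keys(record, count = 3)
  record
    .map { |key, value| [key, value.to_s.length] }
    .max_by(count) { |_, length| length } # largest first
    .map { |key, length| "#{key} (#{length})" }
    .join(', ')
end

record = {
  title: 'Post',
  url: '/blog/post.html',
  content: 'x' * 500,
  tags: %w[a b]
}
puts largest_keys(record)
```

Note that non-string values such as arrays are measured through `to_s`, so the figures are an approximation of each attribute's share of the JSON payload.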

.size(record) ⇒ Object

Public: Get the byte size of the object once converted to JSON

  • record: The record to estimate



# File 'lib/jekyll/algolia/shrinker.rb', line 12

def self.size(record)
  record.to_json.length
end
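The description speaks of a byte size, while `String#length` counts characters. The two agree for ASCII-only JSON but can differ once a record contains multi-byte characters. The short illustration below uses a made-up record and is not taken from the plugin; `String#bytesize` is the standard Ruby call that counts bytes.

```ruby
require 'json'

# Made-up record with one non-ASCII character (é, written as an escape)
record = { title: "Caf\u00e9" }
json = record.to_json

chars = json.length   # number of characters
bytes = json.bytesize # number of bytes in the string's encoding

puts "#{chars} characters, #{bytes} bytes"
```

For content that is mostly non-ASCII, a length-based estimate could therefore noticeably undercount the size actually sent.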

.stop_process ⇒ Object

Public: Stop the current process



# File 'lib/jekyll/algolia/shrinker.rb', line 105

def self.stop_process
  exit 1
end
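Because `Kernel#exit` works by raising `SystemExit`, a caller such as a test can observe the exit instead of being terminated by it. The snippet below is a standalone sketch with its own copy of the method; how the plugin's test suite actually handles this is not shown here.

```ruby
# Standalone copy of the method, for illustration only
def stop_process
  exit 1
end

status = nil
begin
  stop_process
rescue SystemExit => e
  # exit raises SystemExit; the requested exit code is available on it
  status = e.status
end

puts "would have exited with status #{status}"
```

Isolating the exit in its own method also makes it straightforward to stub when exercising `stop_with_error`.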

.stop_with_error(record) ⇒ Object

Public: Stop the current indexing process and display details about the record that is too big to be pushed

  • record: The record causing the error

This displays an error message and writes the offending record to a log file in the source directory



# File 'lib/jekyll/algolia/shrinker.rb', line 59

def self.stop_with_error(record)
  record_size = size(record)
  record_size_readable = Filesize.from("#{record_size}B").to_s('Kb')
  max_record_size = Configurator.algolia('max_record_size')
  max_record_size_readable = Filesize
                             .from("#{max_record_size}B").to_s('Kb')

  probable_wrong_keys = readable_largest_record_keys(record)

  # Writing the full record to disk for inspection
  record_log_path = Logger.write_to_file(
    'jekyll-algolia-record-too-big.log',
    JSON.pretty_generate(record)
  )

  details = {
    'object_title' => record[:title],
    'object_url' => record[:url],
    'probable_wrong_keys' => probable_wrong_keys,
    'record_log_path' => record_log_path,
    'nodes_to_index' => Configurator.algolia('nodes_to_index'),
    'record_size' => record_size_readable,
    'max_record_size' => max_record_size_readable
  }

  Logger.known_message('record_too_big', details)

  stop_process
end