Class: Picky::Bundle

Inherits:
Object
Defined in:
lib/picky/bundle.rb,
lib/picky/bundle_indexed.rb,
lib/picky/bundle_indexing.rb,
lib/picky/bundle_realtime.rb

Overview

A Bundle groups the indexes that belong to one [index, category] combination.

At most, there are three indexes:

  • core index (always used)

  • weights index (always used)

  • similarity index (used with similarity)

Picky separates indexing from the index handling itself by using two parallel structures.

Both use methods provided by this base class, but have very different goals:

  • Indexing::Bundle is only concerned with creating index files and providing helper functions, e.g. to check the indexes.

  • Indexed::Bundle is concerned with loading these index files into memory and looking up search data as fast as possible.

This is the indexing bundle.

It handles all the menial tasks that have nothing to do with running the actual index (for those, see Indexed::Bundle).

Constructor Details

#initialize(name, category, weight_strategy, partial_strategy, similarity_strategy, options = {}) ⇒ Bundle

TODO Move the strategies into options.



# File 'lib/picky/bundle.rb', line 49

def initialize name, category, weight_strategy, partial_strategy, similarity_strategy, options = {}
  @name     = name
  @category = category

  @weight_strategy     = weight_strategy
  @partial_strategy    = partial_strategy
  @similarity_strategy = similarity_strategy
  
  @hints      = options[:hints]
  @backend    = options[:backend]

  reset_backend
end

Instance Attribute Details

#backend_configuration ⇒ Object

Returns the value of attribute backend_configuration.



# File 'lib/picky/bundle.rb', line 28

def backend_configuration
  @backend_configuration
end

#backend_inverted ⇒ Object

Returns the value of attribute backend_inverted.



# File 'lib/picky/bundle.rb', line 28

def backend_inverted
  @backend_inverted
end

#backend_realtime ⇒ Object

Returns the value of attribute backend_realtime.



# File 'lib/picky/bundle.rb', line 28

def backend_realtime
  @backend_realtime
end

#backend_similarity ⇒ Object

Returns the value of attribute backend_similarity.



# File 'lib/picky/bundle.rb', line 28

def backend_similarity
  @backend_similarity
end

#backend_weights ⇒ Object

Returns the value of attribute backend_weights.



# File 'lib/picky/bundle.rb', line 28

def backend_weights
  @backend_weights
end

#category ⇒ Object (readonly)

Returns the value of attribute category.



# File 'lib/picky/bundle.rb', line 25

def category
  @category
end

#configuration ⇒ Object

Returns the value of attribute configuration.



# File 'lib/picky/bundle.rb', line 28

def configuration
  @configuration
end

#inverted ⇒ Object

Returns the value of attribute inverted.



# File 'lib/picky/bundle.rb', line 28

def inverted
  @inverted
end

#name ⇒ Object (readonly)

Returns the value of attribute name.



# File 'lib/picky/bundle.rb', line 25

def name
  @name
end

#partial_strategy ⇒ Object

Returns the value of attribute partial_strategy.



# File 'lib/picky/bundle.rb', line 28

def partial_strategy
  @partial_strategy
end

#realtime ⇒ Object

Returns the value of attribute realtime.



# File 'lib/picky/bundle.rb', line 28

def realtime
  @realtime
end

#similarity ⇒ Object

Returns the value of attribute similarity.



# File 'lib/picky/bundle.rb', line 28

def similarity
  @similarity
end

#similarity_strategy ⇒ Object

Returns the value of attribute similarity_strategy.



# File 'lib/picky/bundle.rb', line 28

def similarity_strategy
  @similarity_strategy
end

#weight_strategy ⇒ Object

Returns the value of attribute weight_strategy.



# File 'lib/picky/bundle.rb', line 28

def weight_strategy
  @weight_strategy
end

#weights ⇒ Object

Returns the value of attribute weights.



# File 'lib/picky/bundle.rb', line 28

def weights
  @weights
end

Instance Method Details

#[](str_or_sym) ⇒ Object

Get settings for this bundle.

Returns an object.



# File 'lib/picky/bundle_indexed.rb', line 57

def [] str_or_sym
  @configuration[str_or_sym]
end

#add(id, str_or_sym, method: :unshift, static: false, force_update: false) ⇒ Object

Returns a reference to the array where the id has been added.

Does not add to realtime if static.

TODO What does static do again?

TODO Why the realtime index? Is it really necessary? Not absolutely. It was for efficient deletion/replacement.


# File 'lib/picky/bundle_realtime.rb', line 55

def add id, str_or_sym, method: :unshift, static: false, force_update: false
  # If static, indexing will be slower, but will use less
  # space in the end.
  #
  if static
    ids = @inverted[str_or_sym] ||= []
    if force_update
      ids.delete id
      ids.send method, id
    else
      # TODO Adding should not change the array if it's already in.
      #
      if ids.include?(id)
        # Do nothing. Not forced, and already in.
      else
        ids.send method, id
      end
    end
  else
    # Use a generalized strategy.
    #
    str_or_syms = (@realtime[id] ||= []) # (static ? nil : []))

    # Inverted.
    #
    ids = if str_or_syms.include?(str_or_sym)
      ids = @inverted[str_or_sym] ||= []
      # If updates are forced or if it isn't in there already
      # then remove and add to the index.
      if force_update || !ids.include?(id)
        ids.delete id
        ids.send method, id
      end
      ids
    else
      # Update the realtime index.
      #
      str_or_syms << str_or_sym
      # TODO Add has_key? to index backends.
      # ids = if @inverted.has_key?(str_or_sym)
      #   @inverted[str_or_sym]
      # else
      #   @inverted[str_or_sym] = []
      # end
      ids = (@inverted[str_or_sym] ||= [])
      ids.send method, id
    end
  end
    
  # Weights.
  #
  @weights[str_or_sym] = self.weight_strategy.weight_for ids.size

  # Similarity.
  #
  add_similarity str_or_sym, method: method

  # Return reference.
  #
  ids
end
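
The bookkeeping in the non-static branch of add can be sketched with plain Hashes. The class SketchBundle, its Hash-based indexes, and the logarithmic weight are assumptions made for illustration only; the real method additionally handles static, force_update, and the similarity index.

```ruby
# Hypothetical stand-in for a bundle: plain Hashes replace the real backends.
class SketchBundle
  attr_reader :inverted, :realtime, :weights

  def initialize
    @inverted = {} # token => [ids]
    @realtime = {} # id    => [tokens]
    @weights  = {} # token => weight
  end

  # Simplified, non-static add: remember the token for the id,
  # then remember the id for the token.
  def add(id, token, method: :unshift)
    tokens = (@realtime[id] ||= [])
    tokens << token unless tokens.include?(token)

    ids = (@inverted[token] ||= [])
    ids.send(method, id) unless ids.include?(id)

    # Assumed weight function: natural logarithm of the number of ids.
    @weights[token] = Math.log(ids.size)

    # Like the original, return a reference to the ids array.
    ids
  end
end

bundle = SketchBundle.new
bundle.add 1, :picky
bundle.add 2, :picky
bundle.add 2, :search

bundle.inverted[:picky] # => [2, 1]
bundle.realtime[2]      # => [:picky, :search]
```

With the default method: :unshift, the most recently added id ends up at the front of the array.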

#add_partialized(id, text, method: :unshift, static: false, force_update: false) ⇒ Object

Partializes the text and then adds each.



# File 'lib/picky/bundle_realtime.rb', line 134

def add_partialized id, text, method: :unshift, static: false, force_update: false
  partialized text do |partial_text|
    add id, partial_text, method: method, static: static, force_update: force_update
  end
end
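
As a rough illustration of what "partializing" means here, the sketch below uses a hypothetical strategy, PrefixPartial, that yields every prefix of a text. It only mirrors the each_partial(text, &block) interface that partialized relies on; the partial strategies shipped with Picky may cut the text differently.

```ruby
# Hypothetical partial strategy: yields every prefix of the text,
# longest first. This is an assumption, not Picky's implementation.
class PrefixPartial
  def each_partial(text)
    text.length.downto(1) { |length| yield text[0, length] }
  end
end

partials = []
PrefixPartial.new.each_partial("pick") { |partial| partials << partial }
partials # => ["pick", "pic", "pi", "p"]
```

add_partialized would then call add once for each yielded partial, with the same id.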

#add_similarity(str_or_sym, method: :unshift) ⇒ Object

Add string/symbol to similarity index.



# File 'lib/picky/bundle_realtime.rb', line 119

def add_similarity str_or_sym, method: :unshift
  if encoded = self.similarity_strategy.encode(str_or_sym)
    similars = @similarity[encoded] ||= []

    # Not completely correct, as others will also be affected, but meh.
    #
    similars.delete str_or_sym if similars.include? str_or_sym
    similars << str_or_sym

    self.similarity_strategy.prioritize similars, str_or_sym
  end
end
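
The similarity index maps an encoded form of a text to all texts sharing that encoding. A minimal sketch, assuming a made-up encoder (ConsonantSimilarity, which drops vowels after the first character) and a plain Hash as the index; the prioritize step of the real method is left out:

```ruby
# Hypothetical encoder: keeps the first character and drops all
# following vowels. Returns nil for empty input.
class ConsonantSimilarity
  def encode(text)
    str = text.to_s
    return nil if str.empty?
    (str[0] + str[1..-1].delete("aeiou")).to_sym
  end
end

# Hypothetical holder for a Hash-based similarity index.
class SimilaritySketch
  attr_reader :similarity

  def initialize(strategy)
    @strategy   = strategy
    @similarity = {} # encoded => [texts]
  end

  # Moves the text to the end of the list stored under its encoding.
  def add_similarity(text)
    encoded = @strategy.encode(text)
    return unless encoded
    similars = (@similarity[encoded] ||= [])
    similars.delete text
    similars << text
  end

  # Looks up all texts that share the encoding of the given text.
  def similar(text)
    code = @strategy.encode(text)
    return [] unless code
    @similarity[code] || []
  end
end

sketch = SimilaritySketch.new(ConsonantSimilarity.new)
sketch.add_similarity "meier"
sketch.add_similarity "maier"
sketch.similar "meir" # => ["meier", "maier"]
```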

#backend ⇒ Object

If no specific backend has been set, uses the category’s backend.



# File 'lib/picky/bundle.rb', line 69

def backend
  @backend || category.backend
end

#build_realtime(symbol_keys) ⇒ Object

Builds the realtime mapping.

Note: Experimental feature. Might be removed in 5.0.

THINK Maybe load it and just replace the arrays with the corresponding ones.



# File 'lib/picky/bundle_realtime.rb', line 149

def build_realtime symbol_keys
  clear_realtime
  @inverted.each_pair do |str_or_sym, ids|
    ids.each do |id|
      str_or_syms = (@realtime[id] ||= [])
      str_or_sym = str_or_sym.to_sym if symbol_keys
      @realtime[id] << str_or_sym unless str_or_syms.include? str_or_sym
    end
  end
end
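
Building the realtime mapping amounts to inverting the inverted index: from token => ids to id => tokens. A standalone sketch on plain Hashes (the method name build_realtime_sketch and the keyword argument are assumptions for illustration):

```ruby
# Rebuilds an id => tokens mapping from a token => ids index.
def build_realtime_sketch(inverted, symbol_keys: false)
  realtime = {}
  inverted.each_pair do |token, ids|
    token = token.to_sym if symbol_keys
    ids.each do |id|
      tokens = (realtime[id] ||= [])
      tokens << token unless tokens.include?(token)
    end
  end
  realtime
end

inverted = { "picky" => [1, 2], "search" => [2] }
build_realtime_sketch(inverted, symbol_keys: true)
# => { 1 => [:picky], 2 => [:picky, :search] }
```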

#clear ⇒ Object

Clears all indexes.



# File 'lib/picky/bundle_indexed.rb', line 102

def clear
  clear_inverted
  clear_weights
  clear_similarity
  clear_configuration
  clear_realtime
end

#clear_configuration ⇒ Object

Clears the configuration.



# File 'lib/picky/bundle_indexed.rb', line 127

def clear_configuration
  configuration.clear
end

#clear_inverted ⇒ Object

Clears the core index.



# File 'lib/picky/bundle_indexed.rb', line 112

def clear_inverted
  inverted.clear
end

#clear_realtime ⇒ Object

Clears the realtime mapping.



# File 'lib/picky/bundle_indexed.rb', line 132

def clear_realtime
  realtime.clear
end

#clear_similarity ⇒ Object

Clears the similarity index.



# File 'lib/picky/bundle_indexed.rb', line 122

def clear_similarity
  similarity.clear
end

#clear_weights ⇒ Object

Clears the weights index.



# File 'lib/picky/bundle_indexed.rb', line 117

def clear_weights
  weights.clear
end

#create_backends ⇒ Object

Extract specific indexes from backend.

TODO Move @backend_ into the backend?



# File 'lib/picky/bundle.rb', line 84

def create_backends
  @backend_inverted      = backend.create_inverted self, @hints
  @backend_weights       = backend.create_weights self, @hints
  @backend_similarity    = backend.create_similarity self, @hints
  @backend_configuration = backend.create_configuration self, @hints
  @backend_realtime      = backend.create_realtime self, @hints
end

#delete ⇒ Object

Delete all index files.



# File 'lib/picky/bundle.rb', line 120

def delete
  @backend_inverted.delete       if @backend_inverted.respond_to? :delete
  # THINK about this. Perhaps the strategies should implement the backend methods?
  #
  @backend_weights.delete        if @backend_weights.respond_to?(:delete) && @weight_strategy.respond_to?(:saved?) && @weight_strategy.saved?
  @backend_similarity.delete     if @backend_similarity.respond_to? :delete
  @backend_configuration.delete  if @backend_configuration.respond_to? :delete
  @backend_realtime.delete       if @backend_realtime.respond_to? :delete
end

#dump ⇒ Object

Saves the indexes in a dump file.



# File 'lib/picky/bundle_indexing.rb', line 33

def dump
  @backend_inverted.dump @inverted
  # THINK about this. Perhaps the strategies should implement the backend methods? Or only the internal index ones?
  #
  @backend_weights.dump @weights if @weight_strategy.respond_to?(:saved?) && @weight_strategy.saved?
  @backend_similarity.dump @similarity if @similarity_strategy.respond_to?(:saved?) && @similarity_strategy.saved?
  @backend_configuration.dump @configuration
  @backend_realtime.dump @realtime
end

#empty ⇒ Object

“Empties” the index(es) by getting a new empty internal backend instance.



# File 'lib/picky/bundle.rb', line 104

def empty
  on_all_indexes_call :empty
end

#identifier ⇒ Object



# File 'lib/picky/bundle.rb', line 62

def identifier
  @identifier ||= :"#{category.identifier}:#{name}"
end

#ids(str_or_sym) ⇒ Object

Get the ids for the given symbol.

Returns a (potentially empty) array of ids.

Note: If the backend wants to return a special enumerable, the backend should do so.



# File 'lib/picky/bundle_indexed.rb', line 26

def ids str_or_sym
  @inverted[str_or_sym] || []
  # THINK Place the key_format conversion here – or move into the backend?
  #
  # if @key_format
  #   class << self
  #     def ids
  #       (@inverted[sym_or_string] || []).map &@key_format
  #     end
  #   end
  # else
  #   class << self
  #     def ids
  #       @inverted[sym_or_string] || []
  #     end
  #   end
  # end
end

#index_path(type = nil) ⇒ Object

Path and partial filename of a specific subindex.

Subindexes are:

  • inverted index

  • weights index

  • partial index

  • similarity index

Returns just the part without subindex type, if none given.



# File 'lib/picky/bundle.rb', line 166

def index_path type = nil
  ::File.join index_directory, "#{category.name}_#{name}#{ "_#{type}" if type }"
end
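
To make the resulting file names concrete, here is the same string construction as a free-standing method. The directory "index/development/books" and the names :title and :exact are invented values; the actual index_directory depends on the application's configuration.

```ruby
# Same construction as index_path, with all inputs passed explicitly.
def index_path_sketch(index_directory, category_name, bundle_name, type = nil)
  File.join index_directory, "#{category_name}_#{bundle_name}#{"_#{type}" if type}"
end

index_path_sketch("index/development/books", :title, :exact)
# => "index/development/books/title_exact"

index_path_sketch("index/development/books", :title, :exact, :inverted)
# => "index/development/books/title_exact_inverted"
```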

#initialize_backends ⇒ Object

Sets up the initial indexes.

Note that if the weights strategy doesn’t need to be saved, the strategy itself pretends to be an index.



# File 'lib/picky/bundle.rb', line 97

def initialize_backends
  on_all_indexes_call :initial
end
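
How a strategy can "pretend to be an index" can be sketched as follows. ConstantWeight and HashBackend are hypothetical classes made up for this example; the only parts taken from the source are the saved? check and the fact that an index is looked up via [].

```ruby
# Hypothetical strategy that computes weights on the fly and is never saved.
class ConstantWeight
  def initialize(value)
    @value = value
  end

  def saved?
    false
  end

  # Responds to [] like an index would.
  def [](_token)
    @value
  end
end

# Hypothetical backend whose initial index is an empty Hash.
class HashBackend
  def initial
    {}
  end
end

# Mirrors the weights selection: use the strategy itself if it is
# not saved, otherwise ask the backend for an initial index.
def weights_index_for(strategy, backend)
  if strategy.respond_to?(:saved?) && !strategy.saved?
    strategy
  else
    backend.initial
  end
end

weights = weights_index_for(ConstantWeight.new(3.0), HashBackend.new)
weights[:picky] # => 3.0
```

Because both objects answer [], code that reads weights does not need to know which of the two it is talking to.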

#key_format ⇒ Object

If a key format is set, use it, else forward to the category.



# File 'lib/picky/bundle.rb', line 151

def key_format
  @key_format ||= @category.key_format
end

#load(symbol_keys = false) ⇒ Object

Loads all indexes.

Loading fetches the index objects from the backend. Each of them should respond to [] and return something appropriate.



# File 'lib/picky/bundle_indexed.rb', line 66

def load symbol_keys = false
  load_inverted symbol_keys
  load_weights symbol_keys
  load_similarity symbol_keys
  load_configuration
  load_realtime
end

#load_configuration ⇒ Object

Loads the configuration.



# File 'lib/picky/bundle_indexed.rb', line 91

def load_configuration
  self.configuration = @backend_configuration.load false
end

#load_inverted(symbol_keys) ⇒ Object

Loads the core index.



# File 'lib/picky/bundle_indexed.rb', line 76

def load_inverted symbol_keys
  self.inverted = @backend_inverted.load symbol_keys
end

#load_realtime ⇒ Object

Loads the realtime mapping.



# File 'lib/picky/bundle_indexed.rb', line 96

def load_realtime
  self.realtime = @backend_realtime.load false
end

#load_similarity(symbol_keys) ⇒ Object

Loads the similarity index.



# File 'lib/picky/bundle_indexed.rb', line 86

def load_similarity symbol_keys
  self.similarity = @backend_similarity.load symbol_keys unless @similarity_strategy.respond_to?(:saved?) && !@similarity_strategy.saved?
end

#load_weights(symbol_keys) ⇒ Object

Loads the weights index.



# File 'lib/picky/bundle_indexed.rb', line 81

def load_weights symbol_keys
  self.weights = @backend_weights.load symbol_keys unless @weight_strategy.respond_to?(:saved?) && !@weight_strategy.saved?
end

#on_all_indexes_call(method_name) ⇒ Object

Extracted to avoid duplicate code.



# File 'lib/picky/bundle.rb', line 110

def on_all_indexes_call method_name
  @inverted      = @backend_inverted.send method_name
  @weights       = @weight_strategy.respond_to?(:saved?) && !@weight_strategy.saved? ? @weight_strategy : @backend_weights.send(method_name)
  @similarity    = @backend_similarity.send method_name
  @configuration = @backend_configuration.send method_name
  @realtime      = @backend_realtime.send method_name
end

#partialized(text, &block) ⇒ Object



# File 'lib/picky/bundle_realtime.rb', line 139

def partialized text, &block
  self.partial_strategy.each_partial text, &block
end

#remove(id) ⇒ Object

Removes the given id from the indexes.

TODO Simplify (and slow) this again – remove the realtime index.



# File 'lib/picky/bundle_realtime.rb', line 14

def remove id
  # Is it anywhere?
  #
  str_or_syms = @realtime[id]

  return if str_or_syms.blank?

  str_or_syms.each do |str_or_sym|
    ids = @inverted[str_or_sym]
    ids.delete id

    if ids.empty?
      @inverted.delete str_or_sym
      @weights.delete  str_or_sym

      # Since no element uses this sym anymore, we can delete the similarity for it.
      #
      # TODO Not really. Since multiple syms can point to the same encoded.
      # In essence, we don't know if and when we can remove it.
      # (One idea is to add an array of ids and remove from that)
      #
      @similarity.delete self.similarity_strategy.encode(str_or_sym)
    else
      @weights[str_or_sym] = self.weight_strategy.weight_for ids.size
      # @weights[str_or_sym] = self.weight_strategy.respond_to?(:[]) &&
      #                        self.weight_strategy[str_or_sym] ||
      #                        self.weight_strategy.weight_for(ids.size)
    end
  end

  @realtime.delete id
end
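
The reason the realtime mapping makes removal efficient is that it lists exactly which tokens an id appears under, so only those entries of the inverted index need to be touched. A reduced sketch on plain Hashes (remove_sketch is a hypothetical name; weights and similarity handling are omitted):

```ruby
# Removes an id from the inverted index, guided by the realtime mapping.
def remove_sketch(inverted, realtime, id)
  tokens = realtime[id]
  return if tokens.nil? || tokens.empty?

  tokens.each do |token|
    ids = inverted[token]
    ids.delete id
    # Drop the token entirely once no id refers to it anymore.
    inverted.delete token if ids.empty?
  end

  realtime.delete id
end

inverted = { picky: [2, 1], search: [2] }
realtime = { 1 => [:picky], 2 => [:picky, :search] }

remove_sketch inverted, realtime, 2

inverted # => { picky: [1] }
realtime # => { 1 => [:picky] }
```

Without the realtime mapping, every entry of the inverted index would have to be scanned for the id.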

#reset_backend ⇒ Object

Initializes all necessary indexes from the backend.



# File 'lib/picky/bundle.rb', line 75

def reset_backend
  create_backends
  initialize_backends
end

#similar(str_or_sym) ⇒ Object

Get a list of similar texts.

Note: The given text itself is not filtered out, so it can appear in the result.



# File 'lib/picky/bundle.rb', line 134

def similar str_or_sym
  code = similarity_strategy.encode str_or_sym
  return [] unless code
  @similarity[code] || []
  
  # similar_codes = @similarity[code]
  # if similar_codes.blank?
  #   [] # Return a simple array.
  # else
  #   similar_codes = similar_codes.dup
  #   similar_codes.delete text # Remove itself.
  #   similar_codes
  # end
end

#to_s ⇒ Object



# File 'lib/picky/bundle.rb', line 182

def to_s
  "#{self.class}(#{identifier})"
end

#to_tree_s(indent = 0, &block) ⇒ Object



# File 'lib/picky/bundle.rb', line 170

def to_tree_s indent = 0, &block
  s = <<-TREE
#{' ' * indent}#{self.class.name.gsub('Picky::','')}(#{name})
#{' ' * indent}    Inverted(#{inverted.size})[#{backend_inverted}]#{block && block.call(inverted)}
#{' ' * indent}    Weights (#{weights.size})[#{backend_weights}]#{block && block.call(weights)}
#{' ' * indent}    Similari(#{similarity.size})[#{backend_similarity}]#{block && block.call(similarity)}
#{' ' * indent}    Realtime(#{realtime.size})[#{backend_realtime}]#{block && block.call(realtime)}
#{' ' * indent}    Configur(#{configuration.size})[#{backend_configuration}]#{block && block.call(configuration)}
TREE
  s.chomp
end

#weight(str_or_sym) ⇒ Object

Get a weight for the given symbol.

Returns a number, or nil.



# File 'lib/picky/bundle_indexed.rb', line 49

def weight str_or_sym
  @weights[str_or_sym]
end