Class: Picky::Bundle
Defined in:
- lib/picky/bundle.rb
- lib/picky/bundle_indexed.rb
- lib/picky/bundle_indexing.rb
- lib/picky/bundle_realtime.rb
Overview
A Bundle is a number of indexes per [index, category] combination.
At most, there are three indexes:
- core index (always used)
- weights index (always used)
- similarity index (used with similarity)
In Picky, indexing is separated from the index handling itself through a parallel structure.
Both use methods provided by this base class, but have very different goals:
- Indexing::Bundle is just concerned with creating index files and providing helper functions to e.g. check the indexes.
- Index::Bundle is concerned with loading these index files into memory and looking up search data as fast as possible.
This is the indexing bundle.
It handles all the menial tasks that have nothing to do with actually running the index (find those in Indexed::Bundle).
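As a rough mental model, the three indexes can be pictured as plain Ruby hashes. The sketch below is illustrative only: the variable names, the logarithmic weight and the "first three letters" encoding are assumptions made for the example, not Picky's backend classes or strategies.

```ruby
# Illustrative model only: plain hashes standing in for a bundle's indexes.
inverted   = {} # token => ids (core index)
weights    = {} # token => weight
similarity = {} # code  => tokens that share this code

inverted[:picky] = [3, 1]
inverted[:pick]  = [3, 1, 7]

# Assumed weighting: the logarithm of the number of ids per token.
inverted.each { |token, ids| weights[token] = Math.log(ids.size).round(3) }

# Assumed encoding: tokens sharing their first three letters count as similar.
inverted.each_key do |token|
  code = token.to_s[0, 3].to_sym
  (similarity[code] ||= []) << token
end

p inverted[:pick]
p weights[:picky]
p similarity[:pic]
```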
Instance Attribute Summary
- #backend_configuration ⇒ Object
  Returns the value of attribute backend_configuration.
- #backend_inverted ⇒ Object
  Returns the value of attribute backend_inverted.
- #backend_realtime ⇒ Object
  Returns the value of attribute backend_realtime.
- #backend_similarity ⇒ Object
  Returns the value of attribute backend_similarity.
- #backend_weights ⇒ Object
  Returns the value of attribute backend_weights.
- #category ⇒ Object (readonly)
  Returns the value of attribute category.
- #configuration ⇒ Object
  Returns the value of attribute configuration.
- #inverted ⇒ Object
  Returns the value of attribute inverted.
- #name ⇒ Object (readonly)
  Returns the value of attribute name.
- #partial_strategy ⇒ Object
  Returns the value of attribute partial_strategy.
- #realtime ⇒ Object
  Returns the value of attribute realtime.
- #similarity ⇒ Object
  Returns the value of attribute similarity.
- #similarity_strategy ⇒ Object
  Returns the value of attribute similarity_strategy.
- #weight_strategy ⇒ Object
  Returns the value of attribute weight_strategy.
- #weights ⇒ Object
  Returns the value of attribute weights.
Instance Method Summary
- #[](str_or_sym) ⇒ Object
  Get settings for this bundle.
- #add(id, str_or_sym, method: :unshift, static: false, force_update: false) ⇒ Object
  Returns a reference to the array where the id has been added.
- #add_partialized(id, text, method: :unshift, static: false, force_update: false) ⇒ Object
  Partializes the text and then adds each.
- #add_similarity(str_or_sym, method: :unshift) ⇒ Object
  Add string/symbol to similarity index.
- #backend ⇒ Object
  If no specific backend has been set, uses the category’s backend.
- #build_realtime(symbol_keys) ⇒ Object
  Builds the realtime mapping.
- #clear ⇒ Object
  Clears all indexes.
- #clear_configuration ⇒ Object
  Clears the configuration.
- #clear_inverted ⇒ Object
  Clears the core index.
- #clear_realtime ⇒ Object
  Clears the realtime mapping.
- #clear_similarity ⇒ Object
  Clears the similarity index.
- #clear_weights ⇒ Object
  Clears the weights index.
- #create_backends ⇒ Object
  Extract specific indexes from backend.
- #delete ⇒ Object
  Delete all index files.
- #dump ⇒ Object
  Saves the indexes in a dump file.
- #empty ⇒ Object
  “Empties” the index(es) by getting a new empty internal backend instance.
- #identifier ⇒ Object
- #ids(str_or_sym) ⇒ Object
  Get the ids for the given symbol.
- #index_path(type = nil) ⇒ Object
  Path and partial filename of a specific subindex.
- #initialize(name, category, weight_strategy, partial_strategy, similarity_strategy, options = {}) ⇒ Bundle (constructor)
  TODO Move the strategies into options.
- #initialize_backends ⇒ Object
  Initial indexes.
- #key_format ⇒ Object
  If a key format is set, use it, else forward to the category.
- #load(symbol_keys = false) ⇒ Object
  Loads all indexes.
- #load_configuration ⇒ Object
  Loads the configuration.
- #load_inverted(symbol_keys) ⇒ Object
  Loads the core index.
- #load_realtime ⇒ Object
  Loads the realtime mapping.
- #load_similarity(symbol_keys) ⇒ Object
  Loads the similarity index.
- #load_weights(symbol_keys) ⇒ Object
  Loads the weights index.
- #on_all_indexes_call(method_name) ⇒ Object
  Extracted to avoid duplicate code.
- #partialized(text, &block) ⇒ Object
- #remove(id) ⇒ Object
  Removes the given id from the indexes.
- #reset_backend ⇒ Object
  Initializes all necessary indexes from the backend.
- #similar(str_or_sym) ⇒ Object
  Get a list of similar texts.
- #to_s ⇒ Object
- #to_tree_s(indent = 0, &block) ⇒ Object
- #weight(str_or_sym) ⇒ Object
  Get a weight for the given symbol.
Constructor Details
#initialize(name, category, weight_strategy, partial_strategy, similarity_strategy, options = {}) ⇒ Bundle
TODO Move the strategies into options.
# File 'lib/picky/bundle.rb', line 49
def initialize name, category, weight_strategy, partial_strategy, similarity_strategy, options = {}
  @name = name
  @category = category

  @weight_strategy = weight_strategy
  @partial_strategy = partial_strategy
  @similarity_strategy = similarity_strategy

  @hints = options[:hints]
  @backend = options[:backend]

  reset_backend
end
Instance Attribute Details
#backend_configuration ⇒ Object
Returns the value of attribute backend_configuration.
# File 'lib/picky/bundle.rb', line 28
def backend_configuration
  @backend_configuration
end
#backend_inverted ⇒ Object
Returns the value of attribute backend_inverted.
# File 'lib/picky/bundle.rb', line 28
def backend_inverted
  @backend_inverted
end
#backend_realtime ⇒ Object
Returns the value of attribute backend_realtime.
# File 'lib/picky/bundle.rb', line 28
def backend_realtime
  @backend_realtime
end
#backend_similarity ⇒ Object
Returns the value of attribute backend_similarity.
# File 'lib/picky/bundle.rb', line 28
def backend_similarity
  @backend_similarity
end
#backend_weights ⇒ Object
Returns the value of attribute backend_weights.
# File 'lib/picky/bundle.rb', line 28
def backend_weights
  @backend_weights
end
#category ⇒ Object (readonly)
Returns the value of attribute category.
# File 'lib/picky/bundle.rb', line 25
def category
  @category
end
#configuration ⇒ Object
Returns the value of attribute configuration.
# File 'lib/picky/bundle.rb', line 28
def configuration
  @configuration
end
#inverted ⇒ Object
Returns the value of attribute inverted.
# File 'lib/picky/bundle.rb', line 28
def inverted
  @inverted
end
#name ⇒ Object (readonly)
Returns the value of attribute name.
# File 'lib/picky/bundle.rb', line 25
def name
  @name
end
#partial_strategy ⇒ Object
Returns the value of attribute partial_strategy.
# File 'lib/picky/bundle.rb', line 28
def partial_strategy
  @partial_strategy
end
#realtime ⇒ Object
Returns the value of attribute realtime.
# File 'lib/picky/bundle.rb', line 28
def realtime
  @realtime
end
#similarity ⇒ Object
Returns the value of attribute similarity.
# File 'lib/picky/bundle.rb', line 28
def similarity
  @similarity
end
#similarity_strategy ⇒ Object
Returns the value of attribute similarity_strategy.
# File 'lib/picky/bundle.rb', line 28
def similarity_strategy
  @similarity_strategy
end
#weight_strategy ⇒ Object
Returns the value of attribute weight_strategy.
# File 'lib/picky/bundle.rb', line 28
def weight_strategy
  @weight_strategy
end
#weights ⇒ Object
Returns the value of attribute weights.
# File 'lib/picky/bundle.rb', line 28
def weights
  @weights
end
Instance Method Details
#[](str_or_sym) ⇒ Object
Get settings for this bundle.
Returns an object.
# File 'lib/picky/bundle_indexed.rb', line 57
def [] str_or_sym
  @configuration[str_or_sym]
end
#add(id, str_or_sym, method: :unshift, static: false, force_update: false) ⇒ Object
Returns a reference to the array where the id has been added.
Does not add to realtime if static.
TODO What does static do again?
TODO Why the realtime index? Is it really necessary? Not absolutely. It was for efficient deletion/replacement.
# File 'lib/picky/bundle_realtime.rb', line 55
def add id, str_or_sym, method: :unshift, static: false, force_update: false
  # If static, indexing will be slower, but will use less
  # space in the end.
  #
  if static
    ids = @inverted[str_or_sym] ||= []
    if force_update
      ids.delete id
      ids.send method, id
    else
      # TODO Adding should not change the array if it's already in.
      #
      if ids.include?(id)
        # Do nothing. Not forced, and already in.
      else
        ids.send method, id
      end
    end
  else
    # Use a generalized strategy.
    #
    str_or_syms = (@realtime[id] ||= []) # (static ? nil : []))

    # Inverted.
    #
    ids = if str_or_syms.include?(str_or_sym)
      ids = @inverted[str_or_sym] ||= []
      # If updates are forced or if it isn't in there already
      # then remove and add to the index.
      if force_update || !ids.include?(id)
        ids.delete id
        ids.send method, id
      end
      ids
    else
      # Update the realtime index.
      #
      str_or_syms << str_or_sym
      # TODO Add has_key? to index backends.
      # ids = if @inverted.has_key?(str_or_sym)
      #   @inverted[str_or_sym]
      # else
      #   @inverted[str_or_sym] = []
      # end
      ids = (@inverted[str_or_sym] ||= [])
      ids.send method, id
    end
  end

  # Weights.
  #
  @weights[str_or_sym] = self.weight_strategy.weight_for ids.size

  # Similarity.
  #
  add_similarity str_or_sym, method: method

  # Return reference.
  #
  ids
end
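To make the non-static branch easier to follow, here is a minimal standalone sketch of the same bookkeeping using plain hashes. TinyBundle is a hypothetical class invented for this example; it leaves out weights, similarity and the static branch, and is not part of Picky.

```ruby
# Hypothetical stand-in for the non-static branch of Bundle#add.
class TinyBundle
  attr_reader :inverted, :realtime

  def initialize
    @inverted = {} # token => ids
    @realtime = {} # id => tokens
  end

  def add(id, token, method: :unshift, force_update: false)
    tokens = (@realtime[id] ||= [])
    ids    = (@inverted[token] ||= [])
    if tokens.include?(token)
      # Already known: only reposition the id when forced or missing.
      if force_update || !ids.include?(id)
        ids.delete id
        ids.send method, id
      end
    else
      # New token for this id: record it in both mappings.
      tokens << token
      ids.send method, id
    end
    ids
  end
end

bundle = TinyBundle.new
bundle.add 1, :picky
bundle.add 2, :picky
bundle.add 1, :picky                     # no change: already indexed
bundle.add 1, :picky, force_update: true # moves id 1 back to the front
p bundle.inverted # {:picky=>[1, 2]} (exact formatting depends on the Ruby version)
```

Because the id-to-tokens mapping records what has been added already, a repeated add is a no-op unless force_update is given.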
#add_partialized(id, text, method: :unshift, static: false, force_update: false) ⇒ Object
Partializes the text and then adds each.
# File 'lib/picky/bundle_realtime.rb', line 134
def add_partialized id, text, method: :unshift, static: false, force_update: false
  partialized text do |partial_text|
    add id, partial_text, method: method, static: static, force_update: force_update
  end
end
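The method relies on the partial strategy yielding each partial text to a block. The following sketch shows that contract with a hypothetical PrefixPartial class (the class, its from: option and the prefix rule are assumptions for illustration; only the each_partial call shape comes from the source above).

```ruby
# Hypothetical partial strategy: yields the text and its prefixes,
# longest first, down to a minimum length.
class PrefixPartial
  def initialize(from: 1)
    @from = from
  end

  def each_partial(text)
    text.size.downto(@from) { |length| yield text[0, length] }
  end
end

index = Hash.new { |hash, key| hash[key] = [] } # partial text => ids

# Add id 1 under every partial of "picky" that has at least 3 characters.
PrefixPartial.new(from: 3).each_partial("picky") do |partial_text|
  index[partial_text].unshift 1
end

p index.keys # ["picky", "pick", "pic"]
```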
#add_similarity(str_or_sym, method: :unshift) ⇒ Object
Add string/symbol to similarity index.
# File 'lib/picky/bundle_realtime.rb', line 119
def add_similarity str_or_sym, method: :unshift
  if encoded = self.similarity_strategy.encode(str_or_sym)
    similars = @similarity[encoded] ||= []

    # Not completely correct, as others will also be affected, but meh.
    #
    similars.delete str_or_sym if similars.include? str_or_sym
    similars << str_or_sym

    self.similarity_strategy.prioritize similars, str_or_sym
  end
end
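The similarity index groups tokens under an encoded key and lets the strategy reorder each group. A small sketch of that flow follows, assuming a made-up FirstLettersSimilarity strategy; only the encode and prioritize call shapes are taken from the source above, the rules inside them are invented.

```ruby
# Hypothetical strategy: the method names mirror the calls made in
# add_similarity, but the encoding and ordering rules are invented.
class FirstLettersSimilarity
  # Returns nil for empty input, which means "do not index".
  def encode(token)
    text = token.to_s
    text.empty? ? nil : text[0, 2].to_sym
  end

  # Keeps the shortest tokens first.
  def prioritize(similars, _token)
    similars.sort_by! { |similar| similar.size }
  end
end

similarity = {} # code => similar tokens
strategy   = FirstLettersSimilarity.new

%i[picky pi pick picky].each do |token|
  next unless (code = strategy.encode(token))

  similars = (similarity[code] ||= [])
  similars.delete token # avoid duplicates when a token is re-added
  similars << token
  strategy.prioritize similars, token
end

p similarity[:pi] # [:pi, :pick, :picky]
```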
#backend ⇒ Object
If no specific backend has been set, uses the category’s backend.
# File 'lib/picky/bundle.rb', line 69
def backend
  @backend || category.backend
end
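The fallback can be tried out in isolation. In this sketch, SketchCategory and SketchBundle are hypothetical stand-ins and the symbols :memory and :file are placeholder values rather than real Picky backends.

```ruby
# Hypothetical stand-ins to show the "own backend, else category's" fallback.
SketchCategory = Struct.new(:backend)

class SketchBundle
  def initialize(category, backend: nil)
    @category = category
    @backend  = backend
  end

  def backend
    @backend || @category.backend
  end
end

category = SketchCategory.new(:memory)
p SketchBundle.new(category).backend                 # :memory
p SketchBundle.new(category, backend: :file).backend # :file
```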
#build_realtime(symbol_keys) ⇒ Object
Builds the realtime mapping.
Note: Experimental feature. Might be removed in 5.0.
THINK Maybe load it and just replace the arrays with the corresponding ones.
# File 'lib/picky/bundle_realtime.rb', line 149
def build_realtime symbol_keys
  clear_realtime
  @inverted.each_pair do |str_or_sym, ids|
    ids.each do |id|
      str_or_syms = (@realtime[id] ||= [])
      str_or_sym = str_or_sym.to_sym if symbol_keys
      @realtime[id] << str_or_sym unless str_or_syms.include? str_or_sym
    end
  end
end
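Building the realtime mapping amounts to inverting the inverted index. The sketch below does the same with plain hashes; rebuild_realtime and its arguments are names chosen for this example only.

```ruby
# Rebuilds an id => tokens mapping from a token => ids index.
def rebuild_realtime(inverted, symbol_keys)
  realtime = {}
  inverted.each_pair do |token, ids|
    token = token.to_sym if symbol_keys
    ids.each do |id|
      tokens = (realtime[id] ||= [])
      tokens << token unless tokens.include?(token)
    end
  end
  realtime
end

inverted = { "picky" => [1, 2], "search" => [2] }
p rebuild_realtime(inverted, true)[2] # [:picky, :search]
```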
#clear ⇒ Object
Clears all indexes.
# File 'lib/picky/bundle_indexed.rb', line 102
def clear
  clear_inverted
  clear_weights
  clear_similarity
  clear_configuration
  clear_realtime
end
#clear_configuration ⇒ Object
Clears the configuration.
# File 'lib/picky/bundle_indexed.rb', line 127
def clear_configuration
  configuration.clear
end
#clear_inverted ⇒ Object
Clears the core index.
# File 'lib/picky/bundle_indexed.rb', line 112
def clear_inverted
  inverted.clear
end
#clear_realtime ⇒ Object
Clears the realtime mapping.
# File 'lib/picky/bundle_indexed.rb', line 132
def clear_realtime
  realtime.clear
end
#clear_similarity ⇒ Object
Clears the similarity index.
# File 'lib/picky/bundle_indexed.rb', line 122
def clear_similarity
  similarity.clear
end
#clear_weights ⇒ Object
Clears the weights index.
# File 'lib/picky/bundle_indexed.rb', line 117
def clear_weights
  weights.clear
end
#create_backends ⇒ Object
Extract specific indexes from backend.
TODO Move @backend_ into the backend?
# File 'lib/picky/bundle.rb', line 84
def create_backends
  @backend_inverted = backend.create_inverted self, @hints
  @backend_weights = backend.create_weights self, @hints
  @backend_similarity = backend.create_similarity self, @hints
  @backend_configuration = backend.create_configuration self, @hints
  @backend_realtime = backend.create_realtime self, @hints
end
#delete ⇒ Object
Delete all index files.
# File 'lib/picky/bundle.rb', line 120
def delete
  @backend_inverted.delete if @backend_inverted.respond_to? :delete
  # THINK about this. Perhaps the strategies should implement the backend methods?
  #
  @backend_weights.delete if @backend_weights.respond_to?(:delete) && @weight_strategy.respond_to?(:saved?) && @weight_strategy.saved?
  @backend_similarity.delete if @backend_similarity.respond_to? :delete
  @backend_configuration.delete if @backend_configuration.respond_to? :delete
  @backend_realtime.delete if @backend_realtime.respond_to? :delete
end
#dump ⇒ Object
Saves the indexes in a dump file.
# File 'lib/picky/bundle_indexing.rb', line 33
def dump
  @backend_inverted.dump @inverted
  # THINK about this. Perhaps the strategies should implement the backend methods? Or only the internal index ones?
  #
  @backend_weights.dump @weights if @weight_strategy.respond_to?(:saved?) && @weight_strategy.saved?
  @backend_similarity.dump @similarity if @similarity_strategy.respond_to?(:saved?) && @similarity_strategy.saved?
  @backend_configuration.dump @configuration
  @backend_realtime.dump @realtime
end
#empty ⇒ Object
“Empties” the index(es) by getting a new empty internal backend instance.
# File 'lib/picky/bundle.rb', line 104
def empty
  on_all_indexes_call :empty
end
#identifier ⇒ Object
# File 'lib/picky/bundle.rb', line 62
def identifier
  @identifier ||= :"#{category.identifier}:#{name}"
end
#ids(str_or_sym) ⇒ Object
Get the ids for the given symbol.
Returns a (potentially empty) array of ids.
Note: If the backend wants to return a special enumerable, the backend should do so.
# File 'lib/picky/bundle_indexed.rb', line 26
def ids str_or_sym
  @inverted[str_or_sym] || []
  # THINK Place the key_format conversion here – or move into the backend?
  #
  # if @key_format
  #   class << self
  #     def ids
  #       (@inverted[sym_or_string] || []).map &@key_format
  #     end
  #   end
  # else
  #   class << self
  #     def ids
  #       @inverted[sym_or_string] || []
  #     end
  #   end
  # end
end
#index_path(type = nil) ⇒ Object
Path and partial filename of a specific subindex.
Subindexes are:
* inverted index
* weights index
* partial index
* similarity index
Returns just the part without subindex type, if none given.
# File 'lib/picky/bundle.rb', line 166
def index_path type = nil
  ::File.join index_directory, "#{category.name}_#{name}#{ "_#{type}" if type }"
end
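The path expression can be exercised on its own. The helper below is a standalone sketch; the directory, category and bundle names are made-up example values, not paths Picky is guaranteed to use.

```ruby
# Standalone version of the path-building expression.
def sketch_index_path(index_directory, category_name, bundle_name, type = nil)
  File.join index_directory, "#{category_name}_#{bundle_name}#{"_#{type}" if type}"
end

puts sketch_index_path("index/test/books", "title", "exact")
# index/test/books/title_exact
puts sketch_index_path("index/test/books", "title", "exact", :inverted)
# index/test/books/title_exact_inverted
```

Without a type, the interpolated `if` evaluates to nil, which interpolates as an empty string, so only the shared prefix is returned.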
#initialize_backends ⇒ Object
Initial indexes.
Note that if the weights strategy doesn’t need to be saved, the strategy itself pretends to be an index.
# File 'lib/picky/bundle.rb', line 97
def initialize_backends
  on_all_indexes_call :initial
end
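The note about the strategy pretending to be an index can be illustrated as follows. ConstantWeight and weights_index are hypothetical names for this sketch; the only behaviour taken from the source is the check `respond_to?(:saved?) && !saved?` and the expectation that an index responds to [].

```ruby
# Hypothetical strategy that never needs saving: it answers [] itself.
class ConstantWeight
  def initialize(weight = 0.0)
    @weight = weight
  end

  def [](_token)
    @weight
  end

  def saved?
    false
  end
end

# Picks what acts as the weights index: the strategy itself when it does
# not need saving, otherwise the index provided by the backend.
def weights_index(strategy, backend_index)
  if strategy.respond_to?(:saved?) && !strategy.saved?
    strategy
  else
    backend_index
  end
end

stored = { picky: 0.693 }
p weights_index(ConstantWeight.new(1.5), stored)[:picky] # 1.5
p weights_index(Object.new, stored)[:picky]              # 0.693
```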
#key_format ⇒ Object
If a key format is set, use it, else forward to the category.
# File 'lib/picky/bundle.rb', line 151
def key_format
  @key_format ||= @category.key_format
end
#load(symbol_keys = false) ⇒ Object
Loads all indexes.
Loading loads index objects from the backend. They should each respond to [] and return something appropriate.
# File 'lib/picky/bundle_indexed.rb', line 66
def load symbol_keys = false
  load_inverted symbol_keys
  load_weights symbol_keys
  load_similarity symbol_keys
  load_configuration
  load_realtime
end
#load_configuration ⇒ Object
Loads the configuration.
# File 'lib/picky/bundle_indexed.rb', line 91
def load_configuration
  self.configuration = @backend_configuration.load false
end
#load_inverted(symbol_keys) ⇒ Object
Loads the core index.
# File 'lib/picky/bundle_indexed.rb', line 76
def load_inverted symbol_keys
  self.inverted = @backend_inverted.load symbol_keys
end
#load_realtime ⇒ Object
Loads the realtime mapping.
# File 'lib/picky/bundle_indexed.rb', line 96
def load_realtime
  self.realtime = @backend_realtime.load false
end
#load_similarity(symbol_keys) ⇒ Object
Loads the similarity index.
# File 'lib/picky/bundle_indexed.rb', line 86
def load_similarity symbol_keys
  self.similarity = @backend_similarity.load symbol_keys unless @similarity_strategy.respond_to?(:saved?) && !@similarity_strategy.saved?
end
#load_weights(symbol_keys) ⇒ Object
Loads the weights index.
# File 'lib/picky/bundle_indexed.rb', line 81
def load_weights symbol_keys
  self.weights = @backend_weights.load symbol_keys unless @weight_strategy.respond_to?(:saved?) && !@weight_strategy.saved?
end
#on_all_indexes_call(method_name) ⇒ Object
Extracted to avoid duplicate code.
# File 'lib/picky/bundle.rb', line 110
def on_all_indexes_call method_name
  @inverted = @backend_inverted.send method_name
  @weights = @weight_strategy.respond_to?(:saved?) && !@weight_strategy.saved? ? @weight_strategy : @backend_weights.send(method_name)
  @similarity = @backend_similarity.send method_name
  @configuration = @backend_configuration.send method_name
  @realtime = @backend_realtime.send method_name
end
#partialized(text, &block) ⇒ Object
# File 'lib/picky/bundle_realtime.rb', line 139
def partialized text, &block
  self.partial_strategy.each_partial text, &block
end
#remove(id) ⇒ Object
Removes the given id from the indexes.
TODO Simplify (and slow) this again – remove the realtime index.
# File 'lib/picky/bundle_realtime.rb', line 14
def remove id
  # Is it anywhere?
  #
  str_or_syms = @realtime[id]

  return if str_or_syms.blank?

  str_or_syms.each do |str_or_sym|
    ids = @inverted[str_or_sym]
    ids.delete id

    if ids.empty?
      @inverted.delete str_or_sym
      @weights.delete str_or_sym

      # Since no element uses this sym anymore, we can delete the similarity for it.
      #
      # TODO Not really. Since multiple syms can point to the same encoded.
      # In essence, we don't know if and when we can remove it.
      # (One idea is to add an array of ids and remove from that)
      #
      @similarity.delete self.similarity_strategy.encode(str_or_sym)
    else
      @weights[str_or_sym] = self.weight_strategy.weight_for ids.size
      # @weights[str_or_sym] = self.weight_strategy.respond_to?(:[]) &&
      #                        self.weight_strategy[str_or_sym] ||
      #                        self.weight_strategy.weight_for(ids.size)
    end
  end

  @realtime.delete id
end
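The realtime mapping is what keeps removal cheap: it lists exactly which tokens an id was indexed under. Here is a sketch of that idea with plain hashes, leaving out the similarity index; remove_id is a name made up for this example and the Math.log weighting is an assumption.

```ruby
# Removes an id using the id => tokens mapping, so only the affected
# token entries are touched. Weighting by Math.log is an assumption.
def remove_id(id, inverted, weights, realtime)
  tokens = realtime[id]
  return if tokens.nil? || tokens.empty?

  tokens.each do |token|
    ids = inverted[token]
    ids.delete id
    if ids.empty?
      # No id uses this token anymore: drop it entirely.
      inverted.delete token
      weights.delete token
    else
      # Fewer ids remain: recompute the weight.
      weights[token] = Math.log(ids.size).round(3)
    end
  end
  realtime.delete id
end

inverted = { picky: [1, 2], search: [1] }
weights  = { picky: 0.693, search: 0.0 }
realtime = { 1 => [:picky, :search], 2 => [:picky] }

remove_id 1, inverted, weights, realtime
p inverted.keys # [:picky]
```

Without the id-to-tokens mapping, every entry of the inverted index would have to be scanned to find the id.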
#reset_backend ⇒ Object
Initializes all necessary indexes from the backend.
# File 'lib/picky/bundle.rb', line 75
def reset_backend
  create_backends
  initialize_backends
end
#similar(str_or_sym) ⇒ Object
Get a list of similar texts.
Note: Also checks for itself.
# File 'lib/picky/bundle.rb', line 134
def similar str_or_sym
  code = similarity_strategy.encode str_or_sym
  return [] unless code
  @similarity[code] || []

  # similar_codes = @similarity[code]
  # if similar_codes.blank?
  #   [] # Return a simple array.
  # else
  #   similar_codes = similar_codes.dup
  #   similar_codes.delete text # Remove itself.
  #   similar_codes
  # end
end
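A lookup sketch, assuming a toy encoder passed in as a lambda (the two-letter encoding and the helper name similar_tokens are invented for illustration). It shows the two empty-array cases and that the queried token itself can appear in the result.

```ruby
# Looks up similar tokens; returns [] when the token cannot be encoded
# or when nothing is stored under its code.
def similar_tokens(token, similarity, encoder)
  code = encoder.call(token)
  return [] unless code

  similarity[code] || []
end

# Toy encoder: the first two letters, or nil for very short tokens.
encoder = lambda do |token|
  text = token.to_s
  text.size < 2 ? nil : text[0, 2].to_sym
end

similarity = { pi: [:pick, :picky] }

p similar_tokens(:picky, similarity, encoder) # [:pick, :picky]
p similar_tokens(:x, similarity, encoder)     # []
p similar_tokens(:zebra, similarity, encoder) # []
```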
#to_s ⇒ Object
# File 'lib/picky/bundle.rb', line 182
def to_s
  "#{self.class}(#{identifier})"
end
#to_tree_s(indent = 0, &block) ⇒ Object
# File 'lib/picky/bundle.rb', line 170
def to_tree_s indent = 0, &block
  s = <<-TREE
#{' ' * indent}#{self.class.name.gsub('Picky::','')}(#{name})
#{' ' * indent} Inverted(#{inverted.size})[#{backend_inverted}]#{block && block.call(inverted)}
#{' ' * indent} Weights (#{weights.size})[#{backend_weights}]#{block && block.call(weights)}
#{' ' * indent} Similari(#{similarity.size})[#{backend_similarity}]#{block && block.call(similarity)}
#{' ' * indent} Realtime(#{realtime.size})[#{backend_realtime}]#{block && block.call(realtime)}
#{' ' * indent} Configur(#{configuration.size})[#{backend_configuration}]#{block && block.call(configuration)}
  TREE
  s.chomp
end
#weight(str_or_sym) ⇒ Object
Get a weight for the given symbol.
Returns a number, or nil.
# File 'lib/picky/bundle_indexed.rb', line 49
def weight str_or_sym
  @weights[str_or_sym]
end