Class: Queries::Otu::Autocomplete
- Inherits: Query::Autocomplete
  - Object
  - Query
  - Query::Autocomplete
  - Queries::Otu::Autocomplete
- Defined in: lib/queries/otu/autocomplete.rb
Overview
See Query::Autocomplete for the optimization strategy applied per name. There are four classes of name, each of which uses the same strategy: OTU name, original TaxonName, TaxonName, and CommonName. A global priority is then applied, pulling the best names from each sub-strategy to the top.
Constant Summary
- QUERIES =
Keys are method names. Existence of method is checked before requesting the query
{
  # OTU
  otu_name_exact: {priority: 1},
  autocomplete_exact_id: {priority: 1},
  autocomplete_identifier_cached_exact: {priority: 1},
  otu_name_start_match: {priority: 200},
  otu_name_similarity: {priority: 220},

  # TaxonName
  autocomplete_taxon_name: {priority: nil}, # Priority is slotted from 10 .. 20

  # These are all approximately covered in the blanket taxon_name autocomplete
  # taxon_name_name_exact: {priority: 10},
  # taxon_name_identifier_exact: {priority: 10},
  # taxon_name_name_start_match: {priority: 100},
  # taxon_name_name_high_cuttoff: {priority: 200},

  # CommonName
  # These should all be covered/moved to common_name_autocomplete,
  autocomplete_common_name_exact: {priority: 100},
  autocomplete_common_name_like: {priority: 1000}

  # common_name_identifier_exact: {priority: 10},
  # common_name_name_start_match: {priority: 100},
  # common_name_name_similarity: {priority: 200},
}.freeze
Instance Attribute Summary
-
#exact ⇒ Boolean
Set via &exact=<"true"|"false">. If "true", only results where #name = query_string are returned (no fuzzy matching).
-
#having_taxon_name_only ⇒ Object
Boolean, nil. true: only return Otus with `name` = nil; false, nil: no effect.
-
#with_taxon_name ⇒ Object
Boolean, nil. true: OTU must have a taxon name; false: OTU must not have a taxon name; nil: ignored.
Attributes inherited from Query::Autocomplete
#dynamic_limit, #project_id, #query_string
Attributes inherited from Query
Instance Method Summary
-
#api_autocomplete ⇒ Object
Maintains valid_taxon_name_id needed for API.
- #autocomplete ⇒ Object
- #autocomplete_base ⇒ Object
-
#autocomplete_taxon_name ⇒ Scope
Pull the result of a TaxonName autocomplete.
- #base_query ⇒ Object
- #compact_priorities(otus) ⇒ Object
-
#initialize(string, project_id: nil, having_taxon_name_only: false, with_taxon_name: nil, exact: 'false') ⇒ Autocomplete
constructor
A new instance of Autocomplete.
- #otu_name_exact ⇒ Object
-
#otu_name_similarity ⇒ Object
All records that meet the similarity cutoff. This is intended as a generic replacement for wildcarded results.
- #otu_name_start_match ⇒ Object
Methods inherited from Query::Autocomplete
#autocomplete_cached, #autocomplete_cached_wildcard_anywhere, #autocomplete_common_name_exact, #autocomplete_common_name_like, #autocomplete_exact_id, #autocomplete_exactly_named, #autocomplete_named, #autocomplete_ordered_wildcard_pieces_in_cached, #combine_or_clauses, #common_name_name, #common_name_table, #common_name_wild_pieces, #exactly_named, #fragments, #integers, #least_levenshtein, #match_wildcard_end_in_cached, #match_wildcard_in_cached, #named, #only_ids, #only_integers?, #parent, #parent_child_join, #parent_child_where, #pieces, #scope, #string_fragments, #wildcard_wrapped_integers, #wildcard_wrapped_years, #with_cached, #with_cached_like, #with_id, #with_project_id, #year_letter, #years
Methods inherited from Query
#alphabetic_strings, #alphanumeric_strings, base_name, #base_name, #build_terms, #cached_facet, #end_wildcard, #levenshtein_distance, #match_ordered_wildcard_pieces_in_cached, #no_terms?, #referenced_klass, referenced_klass, #referenced_klass_except, #referenced_klass_intersection, #referenced_klass_union, #start_and_end_wildcard, #start_wildcard, #table, #wildcard_pieces
Constructor Details
#initialize(string, project_id: nil, having_taxon_name_only: false, with_taxon_name: nil, exact: 'false') ⇒ Autocomplete
Returns a new instance of Autocomplete.
# File 'lib/queries/otu/autocomplete.rb', line 56

def initialize(string, project_id: nil, having_taxon_name_only: false, with_taxon_name: nil, exact: 'false')
  super(string, project_id:)
  @having_taxon_name_only = boolean_param({having_taxon_name_only:}, :having_taxon_name_only)
  @with_taxon_name = boolean_param({with_taxon_name:}, :with_taxon_name)

  # TODO: move to mode
  @exact = boolean_param({exact:}, :exact)
end
Instance Attribute Details
#exact ⇒ Boolean
Set via &exact=<"true"|"false">. Returns true if only results where #name = query_string are returned (no fuzzy matching).

# File 'lib/queries/otu/autocomplete.rb', line 27

def exact
  @exact
end
#having_taxon_name_only ⇒ Object
Returns Boolean, nil. true: only return Otus with `name` = nil; false, nil: no effect.

# File 'lib/queries/otu/autocomplete.rb', line 16

def having_taxon_name_only
  @having_taxon_name_only
end
#with_taxon_name ⇒ Object
Returns Boolean, nil. true: OTU must have a taxon name; false: OTU must not have a taxon name; nil: ignored.

# File 'lib/queries/otu/autocomplete.rb', line 22

def with_taxon_name
  @with_taxon_name
end
Instance Method Details
#api_autocomplete ⇒ Object
Maintains valid_taxon_name_id needed for API.
Considerations:
The path otus -> taxon names -> valid taxon name_id <- otu can return more OTUs than the original query,
because there can be multiple OTUs for the valid name of an invalid original result.
Right now the first valid OTU for the name is picked with distinct on().
# File 'lib/queries/otu/autocomplete.rb', line 118

def api_autocomplete
  @with_taxon_name = true

  # This limit() has more impact now. Since all
  # names are loaded large matches can swamp exact names
  # before priority ordering is applied. May require tuning.
  otus = compact_priorities( autocomplete_base.limit(30) )

  otu_order = otus.map(&:id).uniq

  f = ::Otu.where(id: otu_order)
    .joins('left join taxon_names t1 on otus.taxon_name_id = t1.id')
    .joins('left join otus o2 on t1.cached_valid_taxon_name_id = o2.taxon_name_id')
    .select('distinct on (otus.id) otus.id, otus.name, otus.taxon_name_id, COALESCE(o2.id, otus.id) as otu_valid_id')

  f.sort_by.with_index { |item, idx| [(otu_order.index(item.id) || 999), (idx || 999)] }
end
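The final sort_by.with_index call restores the priority order that the where(id: ...) lookup does not guarantee. A minimal plain-Ruby sketch of that re-ordering step follows; the `Record` struct and the sample ids are hypothetical stand-ins for the ActiveRecord rows.

```ruby
# Hypothetical stand-in for the rows returned by the ::Otu query.
Record = Struct.new(:id, :name)

# Order established by the prioritized autocomplete.
otu_order = [42, 7, 19]

# Rows as a database might return them, in no particular order.
# Id 99 is not in otu_order, so it falls back to the sentinel 999.
fetched = [
  Record.new(7, 'b'),
  Record.new(19, 'c'),
  Record.new(42, 'a'),
  Record.new(99, 'z')
]

# Sort by position in otu_order, using the fetch index as a tie-breaker.
sorted = fetched.sort_by.with_index do |item, idx|
  [(otu_order.index(item.id) || 999), idx]
end

puts sorted.map(&:id).inspect
```

Because the sort key is an array, records with the same position in `otu_order` keep their fetch order relative to one another.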
#autocomplete ⇒ Object
# File 'lib/queries/otu/autocomplete.rb', line 149

def autocomplete
  compact_priorities( autocomplete_base.limit(40) )
end
#autocomplete_base ⇒ Object
# File 'lib/queries/otu/autocomplete.rb', line 153

def autocomplete_base
  queries = []

  QUERIES.each do |q, p|
    if self.respond_to?(q)
      a = send(q)

      next if a.nil? # query has returned nil

      y = p[:priority]

      a = a.joins(:taxon_name) if with_taxon_name
      a = a.where.missing(:taxon_name) if with_taxon_name == false
      a = a.joins(:taxon_name).where(otus: {name: nil}) if having_taxon_name_only

      a = a.select("otus.*, #{y} as priority") unless y.nil?

      queries.push a
    end
  end

  queries.compact!

  referenced_klass_union(queries).order('priority')
end
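The dispatch pattern here (look up each key of QUERIES as a method, skip missing methods and nil results, tag each result with its priority, then order by priority) can be sketched without a database. In this sketch, arrays of ids stand in for ActiveRecord scopes, and the class name, method names, and priorities are all hypothetical.

```ruby
# Plain-Ruby sketch of the dispatch loop; arrays of ids replace scopes.
class SketchAutocomplete
  QUERIES = {
    exact_match: {priority: 1},
    start_match: {priority: 200},
    not_implemented: {priority: 300}, # no such method, so it is skipped
    empty_result: {priority: 400}     # returns nil, so it is skipped
  }.freeze

  def exact_match
    [7]
  end

  def start_match
    [7, 9]
  end

  def empty_result
    nil
  end

  def autocomplete_base
    rows = []
    QUERIES.each do |name, options|
      next unless respond_to?(name)

      ids = send(name)
      next if ids.nil?

      ids.each { |id| rows << {id: id, priority: options[:priority]} }
    end

    # Order by priority; the index keeps ties in insertion order.
    rows.each_with_index.sort_by { |row, i| [row[:priority], i] }.map(&:first)
  end
end

result = SketchAutocomplete.new.autocomplete_base
puts result.inspect
```

Note that id 7 appears twice, once per strategy that found it; removing such repeats is the job of #compact_priorities.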
#autocomplete_taxon_name ⇒ Scope
Returns the result of a TaxonName autocomplete, re-cast in terms of an OTU query with the returned order maintained. Expensive, but maintaining order is key.

# File 'lib/queries/otu/autocomplete.rb', line 96

def autocomplete_taxon_name
  taxon_names = Queries::TaxonName::Autocomplete.new(query_string, exact:, project_id:).autocomplete # an array, not a query

  ids = taxon_names.map(&:id) # TODO: Experiment with :cached_valid_taxon_name_id) # We assume we want to land on Valid OTUs, but see #

  return nil if ids.empty?

  min = 10.0
  max = 20.0
  scale = (max - min) / ids.count.to_f

  base_query.select("otus.*, ((#{min} + row_number() OVER ())::float * #{scale}) as priority") # small incrementing numbers for priority
    .joins("INNER JOIN ( SELECT unnest(ARRAY[#{ids.join(',')}]) AS id, row_number() OVER () AS row_num ) AS id_order ON otus.taxon_name_id = id_order.id")
    .order('id_order.row_num')
end
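Read literally, the select expression computes the priority as (min + row_number) * scale. The arithmetic can be checked in plain Ruby; this is a minimal sketch, assuming five matching taxon names. The `linear` variant (min + row_number * scale) is a hypothetical alternative included only for comparison and does not appear in the source.

```ruby
min = 10.0
max = 20.0
count = 5 # assumed number of matching taxon names
scale = (max - min) / count.to_f

# The expression as written in the SQL: (min + row_number) * scale
as_written = (1..count).map { |n| (min + n) * scale }

# Hypothetical linear slotting between min and max, for comparison only
linear = (1..count).map { |n| min + n * scale }

puts "as written: #{as_written.inspect}"
puts "linear:     #{linear.inspect}"
```

Both variants increase with the row number, so the order returned by the TaxonName autocomplete is preserved either way; only the absolute range of the values differs.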
#base_query ⇒ Object
# File 'lib/queries/otu/autocomplete.rb', line 65

def base_query
  q = ::Otu.all
  q = q.where(project_id:) if project_id.any?
  q
end
#compact_priorities(otus) ⇒ Object
# File 'lib/queries/otu/autocomplete.rb', line 136

def compact_priorities(otus)
  # Mmmmarg!
  # We may have the same name at different priorities, strike all but the highest/first.
  r = []
  i = {}
  otus.each do |o|
    next if i[o.id]
    r.push o
    i[o.id] = true
  end
  r
end
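The de-duplication keeps the first occurrence of each id, which is the best-priority one when the input is already ordered by priority. A small sketch on plain structs follows; the `Row` struct and sample values are hypothetical, and the comparison with Array#uniq is an observation about plain arrays, not a claim about how the library should be written.

```ruby
# Hypothetical stand-in for an OTU row carrying a computed priority.
Row = Struct.new(:id, :priority)

# Same logic as the method above: keep the first row seen for each id.
def compact_priorities(rows)
  seen = {}
  rows.each_with_object([]) do |row, kept|
    next if seen[row.id]

    seen[row.id] = true
    kept << row
  end
end

# Input already ordered by priority; id 7 appears at priorities 1 and 200.
rows = [Row.new(7, 1), Row.new(9, 15.0), Row.new(7, 200), Row.new(3, 220)]

compacted = compact_priorities(rows)
puts compacted.map(&:id).inspect

# On a plain array, uniq with a block also keeps the first occurrence.
same_as_uniq = compacted == rows.uniq(&:id)
puts same_as_uniq
```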
#otu_name_exact ⇒ Object
# File 'lib/queries/otu/autocomplete.rb', line 71

def otu_name_exact
  base_query.where(otus: {name: query_string})
end
#otu_name_similarity ⇒ Object
All records that meet the similarity cutoff. This is intended as a generic replacement for wildcarded results.
Observations:
- was similarity(), experimenting with word_similarity
- 3 letter matches are going to be low probability, matches kick in at 4
# File 'lib/queries/otu/autocomplete.rb', line 86

def otu_name_similarity
  base_query
    .where('otus.name % ?', query_string)
    .where( ApplicationRecord.sanitize_sql_array(["word_similarity('%s', otus.name) > 0.33", query_string]))
    .order('otus.name, length(otus.name)')
end
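The % operator and word_similarity come from PostgreSQL's pg_trgm extension. As a rough illustration of how a short query shares only a few trigrams with a longer name, here is a simplified pure-Ruby approximation of pg_trgm's plain similarity() measure (shared trigrams divided by total distinct trigrams). The helper names `trigrams` and `similarity` are hypothetical, and word_similarity, which the method also filters on, scores differently and is not reproduced here.

```ruby
require 'set'

# Approximates pg_trgm trigram extraction: lowercase, split on
# non-alphanumerics, pad each word with two leading spaces and one
# trailing space, then take every 3-character window.
def trigrams(string)
  grams = string.downcase.scan(/[[:alnum:]]+/).flat_map do |word|
    padded = "  #{word} "
    (0..padded.length - 3).map { |i| padded[i, 3] }
  end
  grams.to_set
end

# Shared trigrams divided by all distinct trigrams of both strings.
def similarity(a, b)
  ta = trigrams(a)
  tb = trigrams(b)
  union = ta | tb
  return 0.0 if union.empty?

  (ta & tb).size.to_f / union.size
end

puts trigrams('cat').to_a.inspect
puts similarity('cat', 'cat')
puts similarity('abc', 'abcdefgh')
```

A three-letter query yields only four trigrams, so its score against a longer name is bounded by how few of them can be shared.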
#otu_name_start_match ⇒ Object
# File 'lib/queries/otu/autocomplete.rb', line 75

def otu_name_start_match
  base_query.where('otus.name ilike ?', query_string + '%')
end