Class: XapianDb::Database

Inherits:
Object
Includes:
Utilities
Defined in:
lib/xapian_db/database.rb

Overview

Base class for a Xapian database

Direct Known Subclasses

InMemoryDatabase, PersistentDatabase

Instance Attribute Summary

Instance Method Summary

Methods included from Utilities

#assert_valid_keys, #camelize, #constantize

Instance Attribute Details

#reader ⇒ Object (readonly)


# File 'lib/xapian_db/database.rb', line 13

def reader
  @reader
end

Instance Method Details

#delete_doc_with_unique_term(term) ⇒ Object

Delete a document identified by a unique term. This method is used by the ORM adapters.


# File 'lib/xapian_db/database.rb', line 33

def delete_doc_with_unique_term(term)
  writer.delete_document("Q#{term}")
  true
end

#delete_docs_of_class(klass) ⇒ Object

Delete all docs of a specific class.

If `klass` tracks its descendants, then docs of any subclasses will be deleted, too. (ActiveRecord does this by default; the gem 'descendants_tracker' offers an alternative.)


# File 'lib/xapian_db/database.rb', line 44

def delete_docs_of_class(klass)
  writer.delete_document("C#{klass}")
  if klass.respond_to? :descendants
    klass.descendants.each do |subclass|
      writer.delete_document("C#{subclass}")
    end
  end
  true
end
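
To illustrate how descendant tracking widens the deletion, here is a minimal sketch. `RecordingWriter` and the `Animal` hierarchy are hypothetical stand-ins that are not part of XapianDb, and the standalone `delete_docs_of_class` below copies the logic above but takes the writer as an argument so that it runs without a Xapian database.

```ruby
# Hypothetical stand-in for the Xapian writer: it only records the terms
# it is asked to delete.
class RecordingWriter
  attr_reader :deleted_terms

  def initialize
    @deleted_terms = []
  end

  def delete_document(term)
    @deleted_terms << term
  end
end

# Hypothetical base class with hand-rolled descendant tracking.
class Animal
  def self.descendants
    @descendants ||= []
  end

  def self.inherited(subclass)
    super
    # Register the new class with this class and every tracked ancestor.
    ancestors.each do |ancestor|
      ancestor.descendants << subclass if ancestor.is_a?(Class) && ancestor <= Animal
    end
  end
end

class Dog < Animal; end
class Puppy < Dog; end

# A class without descendant tracking.
class Plain; end

# Same logic as the method above, with the writer passed in explicitly.
def delete_docs_of_class(writer, klass)
  writer.delete_document("C#{klass}")
  if klass.respond_to?(:descendants)
    klass.descendants.each { |subclass| writer.delete_document("C#{subclass}") }
  end
  true
end

writer = RecordingWriter.new
delete_docs_of_class(writer, Animal)
p writer.deleted_terms # prints ["CAnimal", "CDog", "CPuppy"]
```

For a class that does not respond to `descendants`, such as `Plain` here, only the single term `CPlain` would be deleted.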

#facets(attribute, expression) ⇒ Hash<Class, Integer>

A very simple implementation of facets using the Xapian collapse key. Returns an empty hash if no search expression is given.


# File 'lib/xapian_db/database.rb', line 156

def facets(attribute, expression)
  # Return an empty hash if no search expression is given
  return {} if expression.nil? || expression.strip.empty?
  value_number         = XapianDb::DocumentBlueprint.value_number_for(attribute)
  @query_parser        ||= QueryParser.new(self)
  query                = @query_parser.parse(expression)
  enquiry              = Xapian::Enquire.new(reader)
  enquiry.query        = query
  enquiry.collapse_key = value_number
  facets = {}
  enquiry.mset(0, size).matches.each do |match|
    facet_value = match.document.value(value_number)
    # We must add 1 to the collapse_count since collapse_count means
    # "how many other matches are there?"
    facets[facet_value] = match.collapse_count + 1
  end
  facets
end
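
The `+ 1` in the loop above is easy to overlook, so here is a small sketch of the counting rule on plain Ruby data. `CollapsedMatch`, `collapse` and `facets_from` are hypothetical helpers for illustration only; they imitate what collapsing yields (one representative per distinct value plus a count of the other matches) and do not call Xapian.

```ruby
# Hypothetical stand-in for a match after collapsing: it carries the facet
# value and the number of *other* matches that were collapsed into it.
CollapsedMatch = Struct.new(:facet_value, :collapse_count)

# Imitate collapsing: keep one representative per distinct value and
# count the remaining matches with the same value.
def collapse(values)
  values.group_by { |value| value }.map do |value, group|
    CollapsedMatch.new(value, group.size - 1)
  end
end

# Build the facet hash the same way the method above does.
def facets_from(matches)
  matches.each_with_object({}) do |match, facets|
    # The representative itself counts too, hence the + 1.
    facets[match.facet_value] = match.collapse_count + 1
  end
end

values       = %w[Person Address Person Person]
facet_counts = facets_from(collapse(values))
p facet_counts # prints {"Person"=>3, "Address"=>1}
```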

#find_similar_to(docs, options = {}) ⇒ XapianDb::Resultset

Find documents that are similar to one or more reference documents. This implements the approach suggested at trac.xapian.org/wiki/FAQ/FindSimilar.

Options Hash (options):

  • :class (Class)

    an indexed class; if a class is passed, the result will contain objects of this class only


# File 'lib/xapian_db/database.rb', line 128

def find_similar_to(docs, options={})
  docs = [docs].flatten
  reference = Xapian::RSet.new
  docs.each { |doc| reference.add_document doc.docid }
  pk_terms    = docs.map { |doc| "Q#{doc.data}" }
  class_terms = docs.map { |doc| "C#{doc.indexed_class}" }

  relevant_terms = Xapian::Enquire.new(reader).eset(40, reference).terms.map {|e| e.name } - pk_terms - class_terms
  relevant_terms.reject! { |term| term =~ /INDEXED_CLASS/ }

  reference_query = Xapian::Query.new Xapian::Query::OP_OR, pk_terms
  terms_query     = Xapian::Query.new Xapian::Query::OP_OR, relevant_terms
  final_query     = Xapian::Query.new Xapian::Query::OP_AND_NOT, terms_query, reference_query
  if options[:class]
    class_scope = "indexed_class:#{options[:class].name.downcase}"
    @query_parser ||= QueryParser.new(self)
    class_query   = @query_parser.parse(class_scope)
    final_query   = Xapian::Query.new Xapian::Query::OP_AND, class_query, final_query
  end
  enquiry       = Xapian::Enquire.new(reader)
  enquiry.query = final_query
  Resultset.new(enquiry, :db_size => self.size, :limit => options[:limit])
end
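
The term filtering step in the method above can be tried in isolation. The sketch below works on plain arrays: `ReferenceDoc`, the `relevant_terms` helper and the sample term list (including the term `XINDEXED_CLASSperson`) are made up for illustration and do not come from a real expansion set.

```ruby
# Hypothetical stand-in for a reference document.
ReferenceDoc = Struct.new(:data, :indexed_class)

# Remove the terms that only identify the reference documents themselves
# (unique "Q" terms, class "C" terms and anything mentioning INDEXED_CLASS).
def relevant_terms(expanded_terms, docs)
  pk_terms    = docs.map { |doc| "Q#{doc.data}" }
  class_terms = docs.map { |doc| "C#{doc.indexed_class}" }
  terms       = expanded_terms - pk_terms - class_terms
  terms.reject { |term| term =~ /INDEXED_CLASS/ }
end

docs          = [ReferenceDoc.new("Person-1", "Person")]
expanded      = ["QPerson-1", "CPerson", "XINDEXED_CLASSperson", "ruby", "search"]
similar_terms = relevant_terms(expanded, docs)
p similar_terms # prints ["ruby", "search"]
```

What remains is the list that the method combines with OP_OR, while the "Q" terms are used to exclude the reference documents from the result via OP_AND_NOT.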

#search(expression, options = {}) ⇒ XapianDb::Resultset

Perform a search

Examples:

Simple Query

resultset = db.search("foo")

Wildcard Query

resultset = db.search("fo*")

Boolean Query

resultset = db.search("foo or baz")

Field Query

resultset = db.search("name:foo")

Options Hash (options):

  • :per_page (Integer)

    How many docs per page?

  • :sort_indices (Array<Integer>) — default: nil

    An array of attribute indices to sort by. This option is used internally by the search method implemented on configured classes. Do not use it directly unless you know what you are doing. It cannot be combined with the :order option; passing both raises an ArgumentError.

  • :sort_decending (Boolean) — default: false

    Reverse the sort order?
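
As a rough sketch of how the sort options interact, the helper below mirrors the option handling in the method body that follows. `resolve_sort_indices` and the `VALUE_NUMBERS` table are hypothetical; in XapianDb the attribute lookup is done by `XapianDb::DocumentBlueprint.value_number_for`, which is replaced here by a fixed hash so the example runs on its own.

```ruby
# Hypothetical attribute-to-value-number mapping, standing in for
# XapianDb::DocumentBlueprint.value_number_for.
VALUE_NUMBERS = { name: 1, date_of_birth: 2 }.freeze

# Returns the value numbers to sort by, or nil if no sort option was given.
def resolve_sort_indices(options)
  sort_indices = options[:sort_indices]
  order        = options[:order]
  if sort_indices && order
    raise ArgumentError, "you can't use sort_indices and order, only one of them"
  end
  return sort_indices unless order

  # Attribute names may be given as strings or symbols.
  order.map { |attr_name| VALUE_NUMBERS.fetch(attr_name.to_sym) }
end

p resolve_sort_indices(order: ["name", :date_of_birth]) # prints [1, 2]
p resolve_sort_indices(sort_indices: [3])               # prints [3]
```

When neither option is present the helper returns nil, which corresponds to the branch that sorts by relevance first.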


# File 'lib/xapian_db/database.rb', line 71

def search(expression, options={})
  opts          = {:sort_decending => false}.merge(options)
  @query_parser ||= QueryParser.new(self)
  query         = @query_parser.parse(expression)

  # If we do not have a valid query we return an empty result set
  return Resultset.new(nil, opts) unless query

  start = Time.now

  enquiry        = Xapian::Enquire.new(reader)
  enquiry.query  = query
  sort_indices   = opts.delete :sort_indices
  sort_decending = opts.delete :sort_decending
  order          = opts.delete :order
  raise ArgumentError.new "you can't use sort_indices and order, only one of them" if sort_indices && order

  if order
    sort_indices = order.map{ |attr_name| XapianDb::DocumentBlueprint.value_number_for attr_name.to_sym }
  end

  sorter = Xapian::MultiValueKeyMaker.new
  if sort_indices
    sort_indices.each { |index| sorter.add_value index }
    enquiry.set_sort_by_key_then_relevance(sorter, sort_decending)
  else
    sorter.add_value DocumentBlueprint.value_number_for(:natural_sort_order)
    enquiry.set_sort_by_relevance_then_key sorter, false
  end

  opts[:spelling_suggestion] = @query_parser.spelling_suggestion
  opts[:db_size]             = self.size

  retries = 0
  begin
    result = Resultset.new(enquiry, opts)
  rescue IOError => ex
    raise unless ex.message =~ /DatabaseModifiedError: /
    raise if retries >= 5
    sleep 0.1
    retries += 1
    Rails.logger.warn "XapianDb: DatabaseModifiedError, retry #{retries}" if defined?(Rails)
    @reader.reopen
    retry
  end

  Rails.logger.debug "XapianDb search (#{(Time.now - start) * 1000}ms) #{expression}" if defined?(Rails)
  result
end
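
The retry loop near the end of the method can be looked at separately. The following is a minimal sketch of the same pattern as a reusable helper; `with_reopen_retry` and `FakeReader` are hypothetical names, and the `max_retries` and `delay` parameters are assumptions that default to the values hard-coded above.

```ruby
# Hypothetical stand-in for the Xapian reader: it only counts reopen calls.
class FakeReader
  attr_reader :reopen_count

  def initialize
    @reopen_count = 0
  end

  def reopen
    @reopen_count += 1
  end
end

# Run the block; if it fails because the database was modified in the
# meantime, reopen the reader and try again, up to max_retries times.
def with_reopen_retry(reader, max_retries: 5, delay: 0.1)
  retries = 0
  begin
    yield
  rescue IOError => ex
    # Any other IOError is not ours to handle.
    raise unless ex.message =~ /DatabaseModifiedError: /
    raise if retries >= max_retries
    sleep delay
    retries += 1
    reader.reopen
    retry
  end
end

reader   = FakeReader.new
attempts = 0
result = with_reopen_retry(reader, delay: 0) do
  attempts += 1
  # Simulate two failed attempts before the query succeeds.
  raise IOError, "DatabaseModifiedError: the database changed" if attempts < 3
  :ok
end
```

In this run the block fails twice, the reader is reopened after each failure, and the third attempt returns `:ok`.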

#size ⇒ Integer

Size of the database (number of docs)


# File 'lib/xapian_db/database.rb', line 17

def size
  reader.doccount
end

#store_doc(doc) ⇒ Object

Store a Xapian document


# File 'lib/xapian_db/database.rb', line 24

def store_doc(doc)
  # We always replace; Xapian adds the document automatically if
  # it is not found
  writer.replace_document("Q#{doc.data}", doc)
end