Class: QueryParser

Inherits:
Object
Defined in:
lib/queryparser.rb

Overview

Takes a query in plain English and turns it into a string suitable for passing to Lucene or Solr.

The examples below assume a Lucene / Solr database that holds the body of the data in the content field, the entry heading in a title field, and subheadings in a subheading field.

p = QueryParser.new('content')
l = p.parse("apple")
  => "content:apple"

l = p.parse("apple and banana")
  => "+(+content:apple +content:banana)"

l = p.parse('apple not banana or cherry') 
  => "+((+content:apple -content:banana) content:cherry)"
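
The strings returned above are meant to be handed to Lucene or Solr. As a minimal sketch of that last step, the following builds a Solr select URL that carries the generated query in the q parameter. The endpoint (host, port and core name) and the solr_url helper are assumptions made for illustration; they are not part of QueryParser.

```ruby
require 'uri'

# Hypothetical Solr endpoint: host, port and core name are assumptions.
SOLR_SELECT = 'http://localhost:8983/solr/mycore/select'

# Build a select URL with the generated query form-encoded in the q parameter.
# Characters such as +, ( and : are escaped by URI.encode_www_form.
def solr_url(lucene_query, rows = 10)
  params = URI.encode_www_form('q' => lucene_query, 'rows' => rows.to_s)
  "#{SOLR_SELECT}?#{params}"
end

puts solr_url('+(+content:apple +content:banana)')
```

Encoding matters here because a literal + in a URL query string would otherwise be read as a space.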

Here we boost the score of those queries that also match the title field of the document

p = QueryParser.new("content", nil, 'title' => '^10')
l = p.parse("apple")
  => "content:apple title:apple^10"

Now with an extra boosting for subheadings

p = QueryParser.new("content", nil, 'title' => '^10', 'subheading' => '^5')
l = p.parse("apple")
  => "content:apple title:apple^10 subheading:apple^5"
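
To make the shape of the boosted output easier to see, here is a small stand-alone sketch that reproduces it for a single bare term. It is not the library's implementation: single_term_query is a hypothetical helper that ignores and / or / not entirely and only shows how one clause per boost field is appended after the main clause.

```ruby
# Sketch only: handles one bare term, not the full and/or/not grammar.
def single_term_query(term, field, similarity = nil, boosts = {})
  # Main clause against the default field (nil similarity adds nothing).
  clauses = ["#{field}:#{term}#{similarity}"]
  boosts.each_pair do |boost_field, boost|
    # The similarity suffix comes first, then the boost, e.g. "~0.6^10".
    clauses << "#{boost_field}:#{term}#{similarity}#{boost}"
  end
  clauses.join(' ')
end

puts single_term_query('apple', 'content', nil, 'title' => '^10', 'subheading' => '^5')
# prints: content:apple title:apple^10 subheading:apple^5
```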

We can also change the similarity of the match. In Lucene terms a similarity of 1.0 means that ‘banana’ will only match ‘banana’. A similarity of 0.6 (entered as ~0.6) allows ‘banana’ to match ‘canada’, which differs by only two letters. A bare ~ with no value falls back to Lucene's own default, which the classic query parser documents as 0.5.

p = QueryParser.new("content", '~0.6', 'title' => '^10')
l = p.parse("apple not banana")
  => "+(+content:apple~0.6 -content:banana~0.6) title:apple~0.6^10"
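
The ‘banana’ / ‘canada’ claim can be checked with a short worked example. The sketch below assumes the classic (pre-4.0) Lucene fuzzy formula, similarity = 1 - edit distance / length of the shorter term; newer Lucene versions express fuzziness as a maximum number of edits instead. Both edit_distance and similarity are hypothetical helpers written for this illustration.

```ruby
# Levenshtein edit distance, computed one row at a time.
def edit_distance(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      # Cheapest of: deletion, insertion, substitution (or match).
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Assumed classic Lucene formula: 1 - distance / shorter term length.
def similarity(a, b)
  1.0 - edit_distance(a, b).to_f / [a.length, b.length].min
end

puts edit_distance('banana', 'canada')     # prints 2
puts similarity('banana', 'canada') >= 0.6 # prints true
```

Under that assumption the pair scores 1 - 2/6, roughly 0.67, which clears a 0.6 threshold but would fail a stricter one such as 0.7.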

Defined Under Namespace

Modules: Exceptions
Classes: And, Not, Or, Set, Term

Constant Summary

VERSION = '1.0.1'

Instance Method Summary

Constructor Details

#initialize(field, similarity = nil, boosts = {}) ⇒ QueryParser

Returns a new instance of QueryParser.



# File 'lib/queryparser.rb', line 44

def initialize(field, similarity = nil, boosts = {})
  @field = field
  @similarity = similarity
  @boosts = boosts
end

Instance Method Details

#parse(text) ⇒ Object

Takes a plain English query and converts it into a string that can be fed into Lucene or Solr. It applies the similarity and boosts set in the constructor.



# File 'lib/queryparser.rb', line 53

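def parse(text)
  # Tokenise and expand the input, then check it (brackets, non-empty content)
  a = tokenise(text)
  b = expand(a)
  check_braces(b)
  has_content(b)

  c = add_implicit_and(b)

  # Build the expression tree, making sure it is always an Array
  d = maketree(c)
  if d.class != Array then
    d = [d]
  end

  # Apply the operators in order: not, then and, then or
  f = process_not(d)
  g = process_and_or(f, 'and')
  h = process_and_or(g, 'or')

  # Wrap everything in an and
  s = QueryParser::And.new
  s.add(h)

  t = reduce(s)

  # Collect the boostable terms into an or (b is reused for this)
  b = QueryParser::Or.new
  b.add(t.boostable())

  # Main clause against the default field; a bracketed clause gets a leading +
  a = Array.new
  x = t.lucene(@field, @similarity)
  if x[0].chr == '(' then
    x = "+#{x}"
  end
  a << x

  # One extra clause per boost field, with the similarity placed before the boost
  @boosts.each_pair do |k, v|
    x = [@similarity, v].join('')
    a << b.lucene(k, x)
  end

  return a.join(' ')
end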
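
The helper methods called by parse are not shown on this page. As a rough illustration of what a step like add_implicit_and could do, the sketch below inserts an explicit and wherever two operands sit side by side. This is a guess at the behaviour, not the library's code: add_implicit_and_sketch is a hypothetical name, the input is a flat list of lower-case tokens, and brackets are not handled.

```ruby
# Hypothetical stand-in for an implicit-and pass over a flat token list.
def add_implicit_and_sketch(tokens)
  tokens.each_with_object([]) do |token, out|
    previous = out.last
    # An operand is anything that is not one of the three operator words.
    operand_before = previous && !%w[and or not].include?(previous)
    # A term or a "not" starts a new operand; "and" / "or" do not.
    starts_operand = !%w[and or].include?(token)
    out << 'and' if operand_before && starts_operand
    out << token
  end
end

p add_implicit_and_sketch('apple banana not cherry'.split)
# prints: ["apple", "and", "banana", "and", "not", "cherry"]
```

Treating not as the start of an operand is consistent with the documented result for "apple not banana", where the negated term is combined with apple inside a single required group.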