Module: ParsingNesting::Tree

Defined in:
lib/parsing_nesting/tree.rb

Defined Under Namespace

Classes: AndList, ExcludedClause, List, MandatoryClause, Node, NotExpression, OrList, Phrase, Term

Class Method Summary

Class Method Details

.parse(string, query_parser = 'dismax') ⇒ Object

Parses string with Parslet, whose output is JSON-like nested hashes and arrays, and transforms that output into an actual abstract syntax tree made up of more semantic Ruby objects (Node instances). The top-level node will always be a List.

Call #to_query on the resulting Node to transform it into a Solr query, optionally passing in Solr params to be used as LocalParams in nested dismax queries.

Our approach here works, but as special cases accumulate it starts getting messy. Ideally we would transform the object graph (the abstract syntax tree) itself instead of trying to handle special cases in #to_query. For instance, the object graph for a problematic pure-negative clause could be rewritten to an equivalent graph without that problem, (-a AND -b) ==> (NOT (a OR b)), and (NOT NOT a) could be rewritten to (a). That would probably be more robust. Instead, we handle special cases in #to_query, which means they tend to multiply and need to be handled at multiple levels. But it’s working for now.

The #negate method was an experiment in transforming the parse tree in place. It isn’t being used, but it’s left as a signpost.
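A minimal sketch of the tree-rewriting idea described above, not code from this library. The TreeSketch module, its Struct node types and the simplify method are hypothetical stand-ins. In particular, a single Not node stands in for both -a and NOT a, which is an assumption made to keep the example short.

```ruby
# Hypothetical illustration only: these Structs are NOT the real
# ParsingNesting::Tree node classes, which have their own constructors.
module TreeSketch
  Term    = Struct.new(:value)
  Not     = Struct.new(:operand)
  AndList = Struct.new(:items)
  OrList  = Struct.new(:items)

  module_function

  # Rewrite a tree bottom-up so that double negation and pure-negative
  # AND lists are gone before any query string would be generated.
  def simplify(node)
    case node
    when Not
      inner = simplify(node.operand)
      # (NOT NOT a) ==> (a)
      inner.is_a?(Not) ? inner.operand : Not.new(inner)
    when AndList
      items = node.items.map { |i| simplify(i) }
      if !items.empty? && items.all? { |i| i.is_a?(Not) }
        # (NOT a AND NOT b) ==> (NOT (a OR b)), by De Morgan's law
        Not.new(OrList.new(items.map(&:operand)))
      else
        AndList.new(items)
      end
    when OrList
      OrList.new(node.items.map { |i| simplify(i) })
    else
      # Terms and anything unrecognized pass through unchanged.
      node
    end
  end
end
```

With rewriting done up front like this, a query generator would only need to handle a single top-level negation rather than special-casing negatives at each level.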



# File 'lib/parsing_nesting/tree.rb', line 24

def self.parse(string, query_parser = 'dismax')
  to_node_tree(ParsingNesting::Grammar.new.parse(string), query_parser)
end

.to_node_tree(tree, query_parser) ⇒ Object

Theoretically, Parslet’s Transform could be used for this, but I think the labelled hash I’m parsing to isn’t exactly what Parslet Transform is set up to work with, and I couldn’t figure it out. It is easy enough to do ‘manually’.



# File 'lib/parsing_nesting/tree.rb', line 32

def self.to_node_tree(tree, query_parser)
  if tree.is_a? Array
    # at one point I was normalizing top-level lists of one item to just
    # be that item, no list wrapper. But having the list wrapper
    # at the top level is actually useful for Solr output.
    List.new(tree.collect { |i| to_node_tree(i, query_parser) }, query_parser)
  elsif tree.is_a? Hash
    if list = tree[:list]
      List.new(list.collect { |i| to_node_tree(i, query_parser) }, query_parser)
    elsif tree.has_key?(:and_list)
      AndList.new(tree[:and_list].collect { |i| to_node_tree(i, query_parser) }, query_parser)
    elsif tree.has_key?(:or_list)
      OrList.new(tree[:or_list].collect { |i| to_node_tree(i, query_parser) }, query_parser)
    elsif not_payload = tree[:not_expression]
      NotExpression.new(to_node_tree(not_payload, query_parser))
    elsif tree.has_key?(:mandatory)
      MandatoryClause.new(to_node_tree(tree[:mandatory], query_parser))
    elsif tree.has_key?(:excluded)
      ExcludedClause.new(to_node_tree(tree[:excluded], query_parser))
    elsif phrase = tree[:phrase]
      Phrase.new(phrase)
    elsif tree.has_key?(:token)
      Term.new(tree[:token].to_s)
    end
  end
end
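
To make the mapping above concrete, here is a small self-contained sketch of the same dispatch run on a hand-written input. The NodeTreeSketch module and its Struct types are hypothetical stand-ins for the real node classes, only three of the hash keys are handled, and the shape of the sample hash is an assumption about what the grammar emits, not captured Parslet output.

```ruby
# Hypothetical illustration only: simplified stand-ins for the real
# ParsingNesting::Tree classes, without the query_parser argument.
module NodeTreeSketch
  List     = Struct.new(:items)
  AndList  = Struct.new(:items)
  Excluded = Struct.new(:operand)
  Term     = Struct.new(:value)

  module_function

  def to_node_tree(tree)
    case tree
    when Array
      # An Array always becomes a List wrapper, even for a single item.
      List.new(tree.map { |i| to_node_tree(i) })
    when Hash
      if tree.key?(:and_list)
        AndList.new(tree[:and_list].map { |i| to_node_tree(i) })
      elsif tree.key?(:excluded)
        Excluded.new(to_node_tree(tree[:excluded]))
      elsif tree.key?(:token)
        Term.new(tree[:token].to_s)
      end
    end
  end
end

# Assumed shape for a query along the lines of "a AND -b".
parsed = [{ and_list: [{ token: "a" }, { excluded: { token: "b" } }] }]
sketch_tree = NodeTreeSketch.to_node_tree(parsed)
```

As in the method above, a Hash with none of the recognized keys, or an input that is neither an Array nor a Hash, falls through and yields nil.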