Class: InterMine::PathQuery::Query

Inherits:
Object
Defined in:
lib/intermine/query.rb

Overview

A class representing a structured query against an InterMine data warehouse.

Queries represent structured requests for data from an InterMine data warehouse. They consist of the output columns you select and a set of constraints on the results to return. These are known as the “view” and the “constraints”. In a nod to the SQL origins of the queries, and to the syntax of ActiveRecord, there is both a method-chaining, SQL-like DSL and a more isolating, common InterMine DSL.

query = service.query("Gene").select("*").where("proteins.molecularWeight" => {">" => 10000})
query.each_result do |gene|
  puts gene.symbol
end

OR:

query = service.query("Gene")
query.add_views("*")
query.add_constraint("proteins.molecularWeight", ">", 10000)
...

The main difference from SQL is that joining between tables is implicit and automatic. Simply by naming the column “Gene.proteins.molecularWeight” we gain access to the protein table joined onto the gene table. (A consequence of this is that all queries must have a unique root that all paths descend from, and right outer joins are not permitted.)

You can define the following features of a query:

* The output columns
* The filtering constraints (what values certain columns must or must not have)
* The sort order of the results
* The way constraints are combined (AND or OR)

In processing results, there are two powerful result formats available, depending on whether you want to process results row by row, or whether you would like the information grouped into logically coherent records. The latter is more similar to the ORM model, and can be seen above. The mechanisms we offer for row access allow accessing cell values of the result table transparently by index or column-name.


Direct Known Subclasses

Template

Constant Summary

LOWEST_CODE = "A"

The first possible constraint code

HIGHEST_CODE = "Z"

The last possible constraint code


Constructor Details

#initialize(model, root = nil, service = nil) ⇒ Query

Construct a new query object. You should not use this directly. Instead use the factory methods in Service.

query = service.query("Gene")


# File 'lib/intermine/query.rb', line 284

def initialize(model, root=nil, service=nil)
    @model = model
    @service = service
    @url = (@service.nil?) ? nil : @service.root + Service::QUERY_RESULTS_PATH
    @list_upload_uri = (@service.nil?) ? nil : @service.root + Service::QUERY_TO_LIST_PATH
    @list_append_uri = (@service.nil?) ? nil : @service.root + Service::QUERY_APPEND_PATH
    if root
        @root = InterMine::Metadata::Path.new(root, model).rootClass
    end
    @constraints = []
    @joins = []
    @views = []
    @sort_order = []
    @used_codes = []
    @logic_parser = LogicParser.new(self)
    @constraint_factory = ConstraintFactory.new(self)
end

Instance Attribute Details

#constraints ⇒ Object (readonly)

All the current constraints on the query

# File 'lib/intermine/query.rb', line 262

def constraints
  @constraints
end

#joins ⇒ Object (readonly)

All the current Join objects on the query

# File 'lib/intermine/query.rb', line 259

def joins
  @joins
end

#list_append_uri ⇒ Object (readonly)

URLs for internal consumption.

# File 'lib/intermine/query.rb', line 277

def list_append_uri
  @list_append_uri
end

#list_upload_uri ⇒ Object (readonly)

URLs for internal consumption.

# File 'lib/intermine/query.rb', line 277

def list_upload_uri
  @list_upload_uri
end

#logic ⇒ Object (readonly)

The current logic (as a LogicGroup)

# File 'lib/intermine/query.rb', line 271

def logic
  @logic
end

#model ⇒ Object (readonly)

The data model associated with the query

# File 'lib/intermine/query.rb', line 256

def model
  @model
end

#name ⇒ Object

The (optional) name of the query. Used in automatic access (eg: “query1”)

# File 'lib/intermine/query.rb', line 247

def name
  @name
end

#root ⇒ Object

The root class of the query.

# File 'lib/intermine/query.rb', line 253

def root
  @root
end

#service ⇒ Object (readonly)

The service this query is associated with

# File 'lib/intermine/query.rb', line 274

def service
  @service
end

#sort_order ⇒ Object (readonly)

The current sort-order.

# File 'lib/intermine/query.rb', line 268

def sort_order
  @sort_order
end

#title ⇒ Object

A human readable title of the query (eg: “Gene –> Protein Domain”)

# File 'lib/intermine/query.rb', line 250

def title
  @title
end

#views ⇒ Object (readonly)

All the columns currently selected for output.

# File 'lib/intermine/query.rb', line 265

def views
  @views
end

Class Method Details

.is_valid_code(str) ⇒ Object

Whether or not the argument is a valid constraint code.

To be valid, it must be a one-character string between A and Z inclusive.



# File 'lib/intermine/query.rb', line 807

def self.is_valid_code(str)
    return (str.length == 1) && (str >= LOWEST_CODE) && (str <= HIGHEST_CODE)
end
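
The check above can be tried on a few edge cases with a standalone sketch. The method name valid_code? and the SKETCH_ constants are hypothetical stand-ins, not part of the library, and the argument is assumed to be a String.

```ruby
SKETCH_LOWEST_CODE = "A"
SKETCH_HIGHEST_CODE = "Z"

# Hypothetical standalone equivalent of Query.is_valid_code.
def valid_code?(str)
  str.length == 1 && str >= SKETCH_LOWEST_CODE && str <= SKETCH_HIGHEST_CODE
end

# Lower-case letters, multi-character strings and the empty string all fail.
["A", "M", "Z", "a", "AA", ""].each do |s|
  puts "#{s.inspect}: #{valid_code?(s)}"
end
```

Only single upper-case letters pass, because the comparison is an ordinary String comparison against "A" and "Z".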

.parser(model) ⇒ Object

Return a parser for deserialising queries.

parser = Query.parser(service.model)
query = parser.parse(string)
query.each_row do |r|
  puts r.to_h
end


# File 'lib/intermine/query.rb', line 310

def self.parser(model)
    return QueryLoader.new(model)
end

Instance Method Details

#add_constraint(*parameters) ⇒ Object

Add a constraint to the query matching the given parameters, and return the created constraint.

con = query.add_constraint("length", ">", 500)

Note that (at least for now) the argument styles used by where and add_constraint are not compatible. This is on the TODO list.



# File 'lib/intermine/query.rb', line 664

def add_constraint(*parameters)
    con = @constraint_factory.make_constraint(parameters)
    @constraints << con
    return con
end

#add_join(path, style = "OUTER") ⇒ Object Also known as: join

Declare how a particular join should be treated.

The default join style is INNER, but joins can optionally be declared to be LEFT OUTER joins. With an inner join, each join in the query implicitly constrains the values of that path to be non-null, whereas an outer join allows null values in the joined path. If the path passed to this method has a chain of joins, the last section is the one the join is applied to.

query = service.query("Gene")
# Allow genes without proteins
query.add_join("proteins") 
# Demand the results contain only those genes that have interactions that have interactingGenes, 
# but allow those interactingGenes to not have any proteins.
query.add_join("interactions.interactingGenes.proteins")

The valid join styles are OUTER and INNER (case-insensitive). There is never any need to declare a join to be INNER, as it is inner by default. Consider using Query#outerjoin which is more explicitly declarative.



# File 'lib/intermine/query.rb', line 604

def add_join(path, style="OUTER")
    p = InterMine::Metadata::Path.new(add_prefix(path), @model, subclasses)
    if @root.nil?
        @root = p.rootClass
    end
    @joins << Join.new(p, style)
    return self
end
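
The difference between the two join styles can be illustrated without a web service. The following is only a sketch over in-memory stand-in data (the gene and protein names are made up), not a description of how the server evaluates joins.

```ruby
# Stand-in data: each gene has a (possibly empty) list of protein names.
genes = {
  "geneA" => ["protein1", "protein2"],
  "geneB" => []
}

# INNER: a gene without proteins contributes no rows.
inner = genes.flat_map { |gene, prots| prots.map { |prot| [gene, prot] } }

# OUTER: a gene without proteins still contributes one row, with nil.
outer = genes.flat_map do |gene, prots|
  prots.empty? ? [[gene, nil]] : prots.map { |prot| [gene, prot] }
end

p inner
p outer
```

In the outer-join result, geneB appears once with a nil protein; in the inner-join result it does not appear at all.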

#add_prefix(x) ⇒ Object

Adds the root prefix to the given string.

Arguments:

x

An object with a #to_s method

Returns the prefixed string.



# File 'lib/intermine/query.rb', line 817

def add_prefix(x)
    x = x.to_s
    if @root && !x.start_with?(@root.name)
        return @root.name + "." + x
    else 
        return x
    end
end
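
A standalone sketch of the same prefixing rule may make the behaviour easier to see. The helper name with_root_prefix is hypothetical, and root_name stands in for @root.name (nil when the query has no root yet).

```ruby
# Hypothetical standalone version of the prefixing rule.
def with_root_prefix(root_name, x)
  x = x.to_s
  return x if root_name.nil? || x.start_with?(root_name)
  "#{root_name}.#{x}"
end

puts with_root_prefix("Gene", "proteins.name")  # Gene.proteins.name
puts with_root_prefix("Gene", "Gene.symbol")    # Gene.symbol
puts with_root_prefix("Gene", :length)          # Gene.length
puts with_root_prefix(nil, "Gene.symbol")       # Gene.symbol
```

Because the test is a plain prefix match, any string that merely begins with the root name is returned unchanged.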

#add_sort_order(path, direction = "ASC") ⇒ Object Also known as: order_by, order

Add an element to the query's sort order. A sort order element consists of the name of an output column and (optionally) the direction to sort in. The default direction is “ASC”. The valid directions are “ASC” and “DESC” (case-insensitive).

query.add_sort_order("length")
query.add_sort_order("proteins.primaryIdentifier", "desc")


# File 'lib/intermine/query.rb', line 628

def add_sort_order(path, direction="ASC") 
    p = self.path(path)
    if !@views.include? p
        raise ArgumentError, "Sort order (#{p}) not in view (#{@views.map {|v| v.to_s}.inspect} in #{self.name || 'unnamed query'})"
    end
    @sort_order << SortOrder.new(p, direction)
    return self
end

#add_views(*views) ⇒ Object Also known as: add_to_select

Add the given views (output columns) to the query.

Any columns ending in “*” will be interpreted as a request to add all attribute columns from that table to the query

Any columns that name a class or reference will add the id of that object to the query. This is helpful for creating lists and other specialist services.

query = service.query("Gene")
query.add_views("*")
query.add_to_select("*")
query.add_views("proteins.*")
query.add_views("pathways.*", "organism.shortName")
query.add_views("proteins", "exons")


# File 'lib/intermine/query.rb', line 523

def add_views(*views)
    views.flatten.map do |x| 
        y = add_prefix(x)
        if y.end_with?("*")
            prefix = y.chomp(".*")
            path = InterMine::Metadata::Path.new(prefix, @model, subclasses)
            attrs = path.end_cd.attributes.map {|x| prefix + "." + x.name}
            add_views(attrs)
        else
            path = InterMine::Metadata::Path.new(y, @model, subclasses)
            path = InterMine::Metadata::Path.new(y.to_s + ".id", @model, subclasses) unless path.is_attribute?
            if @root.nil?
                @root = path.rootClass
            end
            @views << path
        end
    end
    return self
end

#all ⇒ Object

Return all result record objects returned by running this query.

# File 'lib/intermine/query.rb', line 457

def all
    return self.results
end

#all_rows ⇒ Object

Return all the rows returned by running the query

# File 'lib/intermine/query.rb', line 462

def all_rows
    return self.rows
end

#coded_constraints ⇒ Object

Return all the constraints that have codes and can thus participate in logic.

# File 'lib/intermine/query.rb', line 316

def coded_constraints
    return @constraints.select {|x| !x.is_a?(SubClassConstraint)}
end

#count ⇒ Object

Return the number of result rows this query will return in its current state. This makes a very small request to the webservice, and is the most efficient method of getting the size of the result set.

# File 'lib/intermine/query.rb', line 417

def count
    return results_reader.get_size
end

#each_result(start = 0, size = nil) ⇒ Object

Iterate over the results, one record at a time.

query.each_result do |gene|
  puts gene.symbol
  gene.proteins.each do |prot|
    puts prot.primaryIdentifier
  end
end


# File 'lib/intermine/query.rb', line 408

def each_result(start=0, size=nil)
    results_reader(start, size).each_result {|row|
        yield row
    }
end

#each_row(start = 0, size = nil) ⇒ Object

Iterate over the results of this query one row at a time.

Rows support both array-like index-based access and hash-like key-based access. For key-based access you can use either the full path or the headless short version:

query.each_row do |row|
  puts row["Gene.symbol"], row["proteins.primaryIdentifier"]
  puts row[0]
  puts row.to_a # Materialize the row as an Array
  puts row.to_h # Materialize the row as a Hash
end


# File 'lib/intermine/query.rb', line 393

def each_row(start=0, size=nil)
    results_reader(start, size).each_row {|row|
        yield row
    }
end
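
To show how index-based and key-based access can coexist on one row object, here is a minimal sketch. The class SketchRow and the sample values are hypothetical; the library's own row class may be implemented quite differently.

```ruby
# Hypothetical minimal row: values plus the full column paths of the view.
class SketchRow
  def initialize(values, columns)
    @values = values
    @columns = columns
    @root = columns.first.split(".").first
  end

  # Integer -> positional access; String -> lookup by full or headless path.
  def [](key)
    return @values[key] if key.is_a?(Integer)
    full = key.start_with?("#{@root}.") ? key : "#{@root}.#{key}"
    idx = @columns.index(full)
    raise IndexError, "No such column: #{key}" if idx.nil?
    @values[idx]
  end

  def to_a
    @values.dup
  end

  def to_h
    @columns.zip(@values).to_h
  end
end

row = SketchRow.new(
  ["eve", "some-protein-id"],
  ["Gene.symbol", "Gene.proteins.primaryIdentifier"]
)
puts row["Gene.symbol"]                 # full path
puts row["proteins.primaryIdentifier"]  # headless path
puts row[0]                             # index
p row.to_a
p row.to_h
```

The headless form is resolved by prepending the root class name before the column lookup, so both spellings reach the same cell.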

#eql?(other) ⇒ Boolean

Return true if the other query has exactly the same configuration, and belongs to the same service.

Returns:

  • (Boolean)


# File 'lib/intermine/query.rb', line 362

def eql?(other)
    if other.is_a? Query
        return self.service == other.service && self.to_xml.to_s == other.to_xml.to_s
    else
        return false
    end
end

#first(start = 0) ⇒ Object

Get the first result record from the query, starting at the given offset. If the offset is large, then this is not an efficient way to retrieve this data, and you may wish to consider a looping approach or row-based access instead.



# File 'lib/intermine/query.rb', line 470

def first(start=0)
    current_row = 0
    # Have to iterate as start refers to row count
    results_reader.each_result { |r|
        if current_row == start
            return r
        end
        current_row += 1
    }
    return nil
end

#first_row(start = 0) ⇒ Object

Get the first row of results from the query, starting at the given offset.



# File 'lib/intermine/query.rb', line 483

def first_row(start = 0)
    return self.results(start, 1).first
end

#get_constraint(code) ⇒ Object

Get the constraint on the query with the given code. Raises an error if there is no such constraint.

Raises:

  • (ArgumentError)


# File 'lib/intermine/query.rb', line 489

def get_constraint(code)
    @constraints.each do |x|
        if x.respond_to?(:code) and x.code == code
            return x
        end
    end
    raise ArgumentError, "#{code} not in query"
end

#inspect ⇒ Object

Return an informative textual representation of the query.

# File 'lib/intermine/query.rb', line 841

def inspect
    return "<#{self.class.name} query=#{self.to_s.inspect}>"
end

#next_code ⇒ Object

Get the next available code for the query.

Raises:

  • (RuntimeError)


# File 'lib/intermine/query.rb', line 786

def next_code
    c = LOWEST_CODE
    while Query.is_valid_code(c)
        return c unless used_codes.include?(c)
        c = c.next
    end
    raise RuntimeError, "Maximum number of codes reached - all 26 have been allocated"
end
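
The allocation loop relies on String#next, which turns "Z" into "AA" and so ends the search. A standalone sketch, with the hypothetical helper next_free_code taking the list of used codes as an argument:

```ruby
# Hypothetical standalone version: used is the list of codes already taken.
def next_free_code(used)
  code = "A"
  # "Z".next is "AA", which is two characters long, so the loop stops there.
  while code.length == 1 && code <= "Z"
    return code unless used.include?(code)
    code = code.next
  end
  raise RuntimeError, "Maximum number of codes reached - all 26 have been allocated"
end

puts next_free_code([])          # A
puts next_free_code(%w[A B D])   # C
begin
  next_free_code(("A".."Z").to_a)
rescue RuntimeError => e
  puts "Error: #{e.message}"
end
```

The first unused letter is always chosen, so a code freed by removing a constraint is reused before later letters.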

#outerjoin(path) ⇒ Object

Explicitly declare a join to be an outer join.



# File 'lib/intermine/query.rb', line 616

def outerjoin(path)
    return add_join(path)
end

#params ⇒ Object

Return the parameter hash for running this query in its current state.

# File 'lib/intermine/query.rb', line 827

def params
    hash = {"query" => self.to_xml}
    if @service and @service.token
        hash["token"] = @service.token
    end
    return hash
end

#path(pathstr) ⇒ Object

Returns a Path object constructed from the given path-string, taking the current state of the query into account (its data-model and subclass constraints).



# File 'lib/intermine/query.rb', line 673

def path(pathstr)
    return InterMine::Metadata::Path.new(add_prefix(pathstr), @model, subclasses)
end

#remove_constraint(code) ⇒ Object

Remove the constraint with the given code from the query. If no such constraint exists, no error will be raised.



# File 'lib/intermine/query.rb', line 501

def remove_constraint(code)
    @constraints.reject! do |x|
        x.respond_to?(:code) and x.code == code
    end
end

#results(start = 0, size = nil) ⇒ Object

Return objects corresponding to the type of data requested, starting at the given row offset.

genes = query.results
genes.last.symbol
=> "eve"


# File 'lib/intermine/query.rb', line 448

def results(start=0, size=nil)
    res = []
    results_reader(start, size).each_result {|row|
        res << row
    }
    res
end

#results_reader(start = 0, size = nil) ⇒ Object

Get your own result reader for handling the results at a low level. If no columns have been selected for output before requesting results, all attribute columns will be selected.



# File 'lib/intermine/query.rb', line 373

def results_reader(start=0, size=nil)
    if @views.empty?
        select("*")
    end
    return Results::ResultsReader.new(@url, self, start, size)
end

#rows(start = 0, size = nil) ⇒ Object

Returns an Array of ResultRow objects containing the data returned by running this query, starting at the given offset and containing up to the given maximum size.

The webservice enforces a maximum page size of 10,000,000 rows, independent of any size you specify. For large result sets, this limit can be worked around by paging.

rows = query.rows
rows.last["symbol"]
=> "eve"


# File 'lib/intermine/query.rb', line 433

def rows(start=0, size=nil)
    res = []
    results_reader(start, size).each_row {|row|
        res << row
    }
    res
end
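
One possible way to page through a large result set is to request successive (start, size) windows until a short or empty page comes back. The helper each_page below is a hypothetical sketch, not part of the library; a lambda over an in-memory array stands in for the web service so that the example runs on its own.

```ruby
# Hypothetical helper: fetch_page is any callable taking (start, size) and
# returning an Array of rows, e.g. ->(start, size) { query.rows(start, size) }.
def each_page(fetch_page, page_size)
  start = 0
  loop do
    page = fetch_page.call(start, page_size)
    break if page.empty?
    yield page
    # A short page means there is nothing left to fetch.
    break if page.length < page_size
    start += page_size
  end
end

# Stand-in data source so the sketch runs without a web service.
data = (1..25).to_a
fake_fetch = ->(start, size) { data[start, size] || [] }

each_page(fake_fetch, 10) { |page| puts "Got #{page.length} rows" }
```

With 25 stand-in rows and a page size of 10, the block is called three times, with 10, 10 and 5 rows.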

#set_logic(value) ⇒ Object Also known as: constraintLogic=

Set the logic to the given value.

The value will be parsed for consistency if it is a logic string.

Returns self to support chaining.



# File 'lib/intermine/query.rb', line 774

def set_logic(value)
    if value.is_a?(LogicGroup)
        @logic = value
    else
        @logic = @logic_parser.parse_logic(value)
    end
    return self
end

#sortOrder=(so) ⇒ Object

Set the sort order completely, replacing the current sort order.

query.sortOrder = "Gene.length asc Gene.proteins.length desc"

The sort order expression will be parsed and checked for conformity with the current state of the query.



# File 'lib/intermine/query.rb', line 643

def sortOrder=(so)
    if so.is_a?(Array)
        sos = so
    else
        sos = so.split(/(ASC|DESC|asc|desc)/).map {|x| x.strip}.every(2)
    end
    sos.each do |args|
        add_sort_order(*args)
    end
end
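
The string form is broken into path and direction pairs by splitting on the direction keywords. The sketch below reproduces that step in isolation; parse_sort_order is a hypothetical name, and each_slice(2) is used on the assumption that the every(2) helper in the source groups elements into pairs.

```ruby
# Hypothetical standalone parser mirroring the split used by sortOrder=.
# each_slice(2) is assumed to be equivalent to the every(2) helper.
def parse_sort_order(expr)
  expr.split(/(ASC|DESC|asc|desc)/).map(&:strip).each_slice(2).to_a
end

p parse_sort_order("Gene.length asc Gene.proteins.length desc")
# [["Gene.length", "asc"], ["Gene.proteins.length", "desc"]]
```

Note that in this sketch the pattern matches anywhere in the string, so a path that itself contains one of the keywords (for example “Gene.description”) would be split in the middle. Passing an Array of pairs avoids the string parsing altogether.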

#subclass_constraints ⇒ Object

Return all the constraints that restrict the class of paths in the query.

# File 'lib/intermine/query.rb', line 322

def subclass_constraints
    return @constraints.select {|x| x.is_a?(SubClassConstraint)}
end

#subclasses ⇒ Object

Get the current sub-class map for this query.

This contains information about which fields of this query have been declared to be restricted to contain only a subclass of their normal type.

> query = service.query("Gene")
> query.where(:microArrayResults => service.model.table("FlyAtlasResult"))
> query.subclasses
=> {"Gene.microArrayResults" => "FlyAtlasResult"}


# File 'lib/intermine/query.rb', line 574

def subclasses
    subclasses = {}
    @constraints.each do |con|
        if con.is_a?(SubClassConstraint)
            subclasses[con.path.to_s] = con.sub_class.to_s
        end
    end
    return subclasses
end
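
The map is built by filtering the constraint list by type. A standalone sketch of that step follows; the two Struct classes are hypothetical stand-ins for the library's constraint classes.

```ruby
# Hypothetical stand-ins for the library's constraint classes.
SketchSubClassConstraint = Struct.new(:path, :sub_class)
SketchValueConstraint = Struct.new(:path, :op, :value)

# Collect path => subclass for every subclass constraint, ignoring the rest.
def subclass_map(constraints)
  constraints.each_with_object({}) do |con, map|
    next unless con.is_a?(SketchSubClassConstraint)
    map[con.path.to_s] = con.sub_class.to_s
  end
end

constraints = [
  SketchValueConstraint.new("Gene.symbol", "=", "eve"),
  SketchSubClassConstraint.new("Gene.microArrayResults", "FlyAtlasResult")
]
p subclass_map(constraints)
```

Only the subclass constraint contributes an entry; the value constraint on Gene.symbol is ignored.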

#to_s ⇒ Object

Return the textual representation of the query. Here it returns the Query XML

# File 'lib/intermine/query.rb', line 836

def to_s
    return to_xml.to_s
end

#to_xml ⇒ Object

Return an XML document node representing the XML form of the query.

This is the canonical serialisable form of the query.



# File 'lib/intermine/query.rb', line 330

def to_xml
    doc = REXML::Document.new

    if @sort_order.empty?
        so = SortOrder.new(@views.first, "ASC")
    else
        so = @sort_order.join(" ")
    end

    query = doc.add_element("query", {
        "name" => @name, 
        "model" => @model.name, 
        "title" => @title, 
        "sortOrder" => so,
        "view" => @views.join(" "),
        "constraintLogic" => @logic
    }.delete_if { |k, v | !v })
    @joins.each { |join| 
        query.add_element("join", join.attrs) 
    }
    subclass_constraints.each { |con|
        query.add_element(con.to_elem) 
    }
    coded_constraints.each { |con|
        query.add_element(con.to_elem) 
    }
    return doc
end
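
Two details of the listing above are easy to miss: unset attributes are dropped from the query element, and the sort order falls back to the first view when none has been set. The sketch below shows only the attribute hash, not the XML. The helper query_attributes is hypothetical, the model name "genomic" is a sample value, and the "path ASC" string form of a sort order is an assumption.

```ruby
# Hypothetical helper: builds only the attribute hash, not the XML itself.
def query_attributes(name:, title:, model_name:, views:, sort_order:, logic:)
  # Assumption: a default sort order renders as "<first view> ASC".
  so = sort_order.empty? ? "#{views.first} ASC" : sort_order.join(" ")
  {
    "name" => name,
    "model" => model_name,
    "title" => title,
    "sortOrder" => so,
    "view" => views.join(" "),
    "constraintLogic" => logic
  }.delete_if { |_k, v| !v }  # drop attributes that are nil or false
end

attrs = query_attributes(
  name: nil, title: nil, model_name: "genomic",
  views: ["Gene.symbol", "Gene.length"], sort_order: [], logic: nil
)
p attrs
```

Here name, title and constraintLogic are absent from the result because their values are nil.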

#used_codes ⇒ Object

Return the list of currently used codes by the query.

# File 'lib/intermine/query.rb', line 796

def used_codes
    if @constraints.empty?
        return []
    else
        return @constraints.select {|x| !x.is_a?(SubClassConstraint)}.map {|x| x.code}
    end
end

#view=(*view) ⇒ Object Also known as: select

Replace any currently existing views with the given view list. If the view is not already an Array, it will be split by commas and whitespace.



# File 'lib/intermine/query.rb', line 548

def view=(*view)
    @views = []
    view.each do |v|
        if v.is_a?(Array)
            views = v
        else
            views = v.to_s.split(/(?:,\s*|\s+)/)
        end
        add_views(*views)
    end
    return self
end
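
The splitting rule for non-Array arguments can be tried in isolation. The helper split_views is a hypothetical sketch of the normalisation step only; it does not resolve paths against a data model.

```ruby
# Hypothetical helper mirroring the normalisation step in view=.
def split_views(*view)
  view.flat_map do |v|
    # Arrays are taken as-is; anything else is split on commas or whitespace.
    v.is_a?(Array) ? v : v.to_s.split(/(?:,\s*|\s+)/)
  end
end

p split_views("Gene.symbol, Gene.length")
p split_views("Gene.symbol Gene.length")
p split_views(["Gene.symbol", "Gene.length"], "organism.shortName")
```

The first two calls produce the same two-element list, so comma-separated and space-separated strings are interchangeable.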

#where(*wheres) ⇒ Object

Add a constraint clause to the query.

query.where(:symbol => "eve")
query.where(:symbol => %w{eve h bib zen})
query.where(:length => {:le => 100}, :symbol => "eve*")

Interprets the arguments in a style similar to that of ActiveRecord constraints, and adds them to the query. If multiple constraints are supplied in a single hash (as in the third example), then the order in which they are applied to the query (and thus the codes they will receive) is not predictable. To control the order, use chained where clauses or multiple hashes:

query.where({:length => {:le => 100}}, {:symbol => "eve*"})

Returns self to support method chaining



# File 'lib/intermine/query.rb', line 695

def where(*wheres)
   if @views.empty?
       self.select('*')
   end
   wheres.each do |w|
     w.each do |k,v|
        if v.is_a?(Hash)
            parameters = {:path => k}
            v.each do |subk, subv|
                normalised_k = subk.to_s.upcase.gsub(/_/, " ")
                if subk == :with
                    parameters[:extra_value] = subv
                elsif subk == :sub_class
                    parameters[subk] = subv
                elsif subk == :code
                    parameters[:code] = subv
                elsif LoopConstraint.valid_ops.include?(normalised_k)
                    parameters[:op] = normalised_k
                    parameters[:loopPath] = subv
                else
                    if subv.nil?
                        if subk == "="
                            parameters[:op] = "IS NULL"
                        elsif subk == "!="
                            parameters[:op] = "IS NOT NULL"
                        else
                            parameters[:op] = normalised_k
                        end
                    elsif subv.is_a?(Range) or subv.is_a?(Array)
                        if subk == "="
                            parameters[:op] = "ONE OF"
                        elsif subk == "!="
                            parameters[:op] = "NONE OF"
                        else
                            parameters[:op] = normalised_k
                        end
                        parameters[:values] = subv.to_a
                    elsif subv.is_a?(Lists::List)
                        if subk == "="
                            parameters[:op] = "IN"
                        elsif subk == "!="
                            parameters[:op] = "NOT IN"
                        else
                            parameters[:op] = normalised_k
                        end
                        parameters[:value] = subv.name
                    else
                        parameters[:op] = normalised_k
                        parameters[:value] = subv
                    end
                end
            end
            add_constraint(parameters)
        elsif v.is_a?(Range) or v.is_a?(Array)
            add_constraint(k.to_s, 'ONE OF', v.to_a)
        elsif v.is_a?(InterMine::Metadata::ClassDescriptor)
            add_constraint(:path => k.to_s, :sub_class => v.name)
        elsif v.is_a?(InterMine::Lists::List)
            add_constraint(k.to_s, 'IN', v.name)
        elsif v.nil?
            add_constraint(k.to_s, "IS NULL")
        else
            if path(k.to_s).is_attribute?
                add_constraint(k.to_s, '=', v)
            else
                add_constraint(k.to_s, 'LOOKUP', v)
            end
        end
     end
   end
   return self
end
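
Two parts of the dispatch above can be sketched on their own: how an operator key is normalised, and which operator is chosen for a plain (non-Hash) value. Both helper names are hypothetical, the ClassDescriptor and List cases are left out, and the attribute flag stands in for the data-model lookup done by the library.

```ruby
# Hypothetical helper mirroring how where turns a hash key into an operator.
def normalise_op(key)
  key.to_s.upcase.gsub(/_/, " ")
end

# Hypothetical helper for the simple (non-Hash) value cases. attribute says
# whether the path ends in an attribute; the library asks the data model.
def simple_constraint(path, value, attribute: true)
  case value
  when Range, Array then [path.to_s, "ONE OF", value.to_a]
  when nil then [path.to_s, "IS NULL"]
  else [path.to_s, attribute ? "=" : "LOOKUP", value]
  end
end

puts normalise_op(:le)       # LE
puts normalise_op(:one_of)   # ONE OF
puts normalise_op("!=")      # !=
p simple_constraint(:symbol, %w[eve h bib zen])
p simple_constraint(:length, 1..3)
p simple_constraint(:symbol, nil)
p simple_constraint(:symbol, "eve")
p simple_constraint(:proteins, "some-id", attribute: false)
```

Each returned array has the same shape as the positional arguments passed to add_constraint in the listing above.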