Class: Sphinx::Client

Inherits:
Object
  • Object
Defined in:
lib/sphinx/sphinx/client.rb

Direct Known Subclasses

Zinx::Client

Constant Summary

SEARCHD_COMMAND_SEARCH =

search command

0
SEARCHD_COMMAND_EXCERPT =

excerpt command

1
SEARCHD_COMMAND_UPDATE =

update command

2
SEARCHD_COMMAND_KEYWORDS =

keywords command

3
VER_COMMAND_SEARCH =

search command version

0x119
VER_COMMAND_EXCERPT =

excerpt command version

0x102
VER_COMMAND_UPDATE =

update command version

0x102
VER_COMMAND_KEYWORDS =

keywords command version

0x100
SEARCHD_OK =

general success, command-specific reply follows

0
SEARCHD_ERROR =

general failure, command-specific reply may follow

1
SEARCHD_RETRY =

temporary failure, client should retry later

2
SEARCHD_WARNING =

general success, warning message and command-specific reply follow

3
SPH_MATCH_ALL =

match all query words

0
SPH_MATCH_ANY =

match any query word

1
SPH_MATCH_PHRASE =

match this exact phrase

2
SPH_MATCH_BOOLEAN =

match this boolean query

3
SPH_MATCH_EXTENDED =

match this extended query

4
SPH_MATCH_FULLSCAN =

match all document IDs w/o fulltext query, apply filters

5
SPH_MATCH_EXTENDED2 =

extended engine V2 (TEMPORARY, WILL BE REMOVED IN 0.9.8-RELEASE)

6
SPH_RANK_PROXIMITY_BM25 =

default mode, phrase proximity major factor and BM25 minor one

0
SPH_RANK_BM25 =

statistical mode, BM25 ranking only (faster but worse quality)

1
SPH_RANK_NONE =

no ranking, all matches get a weight of 1

2
SPH_RANK_WORDCOUNT =

simple word-count weighting, rank is a weighted sum of per-field keyword occurrence counts

3
SPH_RANK_PROXIMITY =

phrase proximity

4
SPH_SORT_RELEVANCE =

sort by document relevance desc, then by date

0
SPH_SORT_ATTR_DESC =

sort by document date desc, then by relevance desc

1
SPH_SORT_ATTR_ASC =

sort by document date asc, then by relevance desc

2
SPH_SORT_TIME_SEGMENTS =

sort by time segments (hour/day/week/etc) desc, then by relevance desc

3
SPH_SORT_EXTENDED =

sort by SQL-like expression (e.g. “@relevance DESC, price ASC, @id DESC”)

4
SPH_SORT_EXPR =

sort by arithmetic expression in descending order (e.g. “@id + max(@weight,1000)*boost + log(price)”)

5
SPH_FILTER_VALUES =

filter by integer values set

0
SPH_FILTER_RANGE =

filter by integer range

1
SPH_FILTER_FLOATRANGE =

filter by float range

2
SPH_ATTR_INTEGER =

this attr is just an integer

1
SPH_ATTR_TIMESTAMP =

this attr is a timestamp

2
SPH_ATTR_ORDINAL =

this attr is an ordinal string number (integer at search time, specially handled at indexing time)

3
SPH_ATTR_BOOL =

this attr is a boolean bit field

4
SPH_ATTR_FLOAT =

this attr is a float

5
SPH_ATTR_BIGINT =

signed 64-bit integer

6
SPH_ATTR_STRING =

string

7
SPH_ATTR_MULTI =

this attr has multiple values (0 or more)

0x40000001
SPH_ATTR_MULTI64 =
0x40000002
SPH_GROUPBY_DAY =

group by day

0
SPH_GROUPBY_WEEK =

group by week

1
SPH_GROUPBY_MONTH =

group by month

2
SPH_GROUPBY_YEAR =

group by year

3
SPH_GROUPBY_ATTR =

group by attribute value

4
SPH_GROUPBY_ATTRPAIR =

group by sequential attrs pair

5

Constructor Details

#initialize ⇒ Client

Constructs the Sphinx::Client object and sets options to their default values.



# File 'lib/sphinx/sphinx/client.rb', line 172

def initialize
  # per-client-object settings
  @host          = 'localhost'             # searchd host (default is "localhost")
  @port          = 9312                    # searchd port (default is 9312)
  
  # per-query settings
  @offset        = 0                       # how many records to seek from result-set start (default is 0)
  @limit         = 20                      # how many records to return from result-set starting at offset (default is 20)
  @mode          = SPH_MATCH_ALL           # query matching mode (default is SPH_MATCH_ALL)
  @weights       = []                      # per-field weights (default is 1 for all fields)
  @sort          = SPH_SORT_RELEVANCE      # match sorting mode (default is SPH_SORT_RELEVANCE)
  @sortby        = ''                      # attribute to sort by (default is "")
  @min_id        = 0                       # min ID to match (default is 0, which means no limit)
  @max_id        = 0                       # max ID to match (default is 0, which means no limit)
  @filters       = []                      # search filters
  @groupby       = ''                      # group-by attribute name
  @groupfunc     = SPH_GROUPBY_DAY         # function to pre-process group-by attribute value with
  @groupsort     = '@group desc'           # group-by sorting clause (to sort groups in result set with)
  @groupdistinct = ''                      # group-by count-distinct attribute
  @maxmatches    = 1000                    # max matches to retrieve
  @cutoff        = 0                       # cutoff to stop searching at (default is 0)
  @retrycount    = 0                       # distributed retries count
  @retrydelay    = 0                       # distributed retries delay
  @anchor        = []                      # geographical anchor point
  @indexweights  = []                      # per-index weights
  @ranker        = SPH_RANK_PROXIMITY_BM25 # ranking mode (default is SPH_RANK_PROXIMITY_BM25)
  @maxquerytime  = 0                       # max query time, milliseconds (default is 0, do not limit) 
  @fieldweights  = {}                      # per-field-name weights
  @overrides     = []                      # per-query attribute values overrides
  @select        = '*'                     # select-list (attributes or expressions, with optional aliases)

  # per-reply fields (for single-query case)
  @error         = ''                      # last error message
  @warning       = ''                      # last warning message
  
  @reqs          = []                      # requests storage (for multi-query case)
  @mbenc         = ''                      # stored mbstring encoding
end

Instance Method Details

#AddQuery(query, index = '*', comment = '') ⇒ Object

Add query to batch.

Batch queries enable searchd to perform internal optimizations, if possible; and reduce network connection overheads in all cases.

For instance, running exactly the same query with different groupby settings will enable searchd to perform the expensive full-text search and ranking operation only once, but compute multiple groupby results from its output.

Parameters are exactly the same as in Query call. Returns index to results array returned by RunQueries call.



# File 'lib/sphinx/sphinx/client.rb', line 565

def AddQuery(query, index = '*', comment = '')
  # build request
  
  # mode and limits
  request = Request.new
  request.put_int @offset, @limit, @mode, @ranker, @sort
  request.put_string @sortby
  # query itself
  request.put_string query
  # weights
  request.put_int_array @weights
  # indexes
  request.put_string index
  # id64 range marker
  request.put_int 1
  # id64 range
  request.put_int64 @min_id.to_i, @max_id.to_i 
  
  # filters
  request.put_int @filters.length
  @filters.each do |filter|
    request.put_string filter['attr']
    request.put_int filter['type']

    case filter['type']
      when SPH_FILTER_VALUES
        request.put_int64_array filter['values']
      when SPH_FILTER_RANGE
        request.put_int64 filter['min'], filter['max']
      when SPH_FILTER_FLOATRANGE
        request.put_float filter['min'], filter['max']
      else
        raise SphinxInternalError, 'Internal error: unhandled filter type'
    end
    request.put_int filter['exclude'] ? 1 : 0
  end
  
  # group-by clause, max-matches count, group-sort clause, cutoff count
  request.put_int @groupfunc
  request.put_string @groupby
  request.put_int @maxmatches
  request.put_string @groupsort
  request.put_int @cutoff, @retrycount, @retrydelay
  request.put_string @groupdistinct
  
  # anchor point
  if @anchor.empty?
    request.put_int 0
  else
    request.put_int 1
    request.put_string @anchor['attrlat'], @anchor['attrlong']
    request.put_float @anchor['lat'], @anchor['long']
  end
  
  # per-index weights
  request.put_int @indexweights.length
  @indexweights.each do |idx, weight|
    request.put_string idx
    request.put_int weight
  end
  
  # max query time
  request.put_int @maxquerytime
  
  # per-field weights
  request.put_int @fieldweights.length
  @fieldweights.each do |field, weight|
    request.put_string field
    request.put_int weight
  end
  
  # comment
  request.put_string comment
  
  # attribute overrides
  request.put_int @overrides.length
  for entry in @overrides do
    request.put_string entry['attr']
    request.put_int entry['type'], entry['values'].size
    entry['values'].each do |id, val|
      assert { id.instance_of?(Fixnum) || id.instance_of?(Bignum) }
      assert { val.instance_of?(Fixnum) || val.instance_of?(Bignum) || val.instance_of?(Float) }
      
      request.put_int64 id
      case entry['type']
        when SPH_ATTR_FLOAT
          request.put_float val
        when SPH_ATTR_BIGINT
          request.put_int64 val
        else
          request.put_int val
      end
    end
  end
  
  # select-list
  request.put_string @select
  
  # store request to requests array
  @reqs << request.to_s;
  return @reqs.length - 1
end
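
The put_int and put_string calls in the source above append fields to a binary request in a fixed order. The Request helper itself is not shown on this page, so the following is only a sketch under an assumption: that integers are written as 32-bit big-endian values and strings are prefixed with a 32-bit length, as the searchd binary protocol is commonly described. The names pack_int, pack_string, and request_head are hypothetical stand-ins, not methods of this library.

```ruby
# Hypothetical stand-ins for Request#put_int and Request#put_string.
# Assumption: integers are 32-bit big-endian, strings carry a 32-bit length prefix.
def pack_int(*ints)
  ints.pack('N*')
end

def pack_string(str)
  bytes = str.b # binary copy, so concatenation with the packed length is encoding-safe
  [bytes.bytesize].pack('N') + bytes
end

# Head of a search request using the constructor defaults:
# offset=0, limit=20, mode=0 (SPH_MATCH_ALL), ranker=0, sort=0,
# followed by an empty sort-by string and the query text.
def request_head(query)
  pack_int(0, 20, 0, 0, 0) + pack_string('') + pack_string(query)
end

head = request_head('test query')
puts head.bytesize              # 20 + 4 + (4 + 10) = 38
puts head.unpack('N5').inspect  # [0, 20, 0, 0, 0]
```

Because every field is positional, the order of the put_* calls is the wire format: a reader on the other side can only decode the request if both sides agree on that order.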

#BuildExcerpts(docs, index, words, opts = {}) ⇒ Object

Connect to searchd server and generate excerpts from given documents.

  • docs – an array of strings which represent the documents’ contents

  • index – a string specifying the index whose settings will be used for stemming, lexing and case folding

  • words – a string which contains the words to highlight

  • opts is a hash which contains additional optional highlighting parameters.

You can use following parameters:

  • 'before_match' – a string to insert before a set of matching words, default is “<b>”

  • 'after_match' – a string to insert after a set of matching words, default is “</b>”

  • 'chunk_separator' – a string to insert between excerpts chunks, default is “ … ”

  • 'limit' – max excerpt size in symbols (codepoints), default is 256

  • 'around' – how many words to highlight around each match, default is 5

  • 'exact_phrase' – whether to highlight exact phrase matches only, default is false

  • 'single_passage' – whether to extract single best passage only, default is false

  • 'use_boundaries' – whether to extract passages by phrase boundaries setup in tokenizer

  • 'weight_order' – whether to order best passages in document (default) or weight order

Returns false on failure. Returns an array of string excerpts on success.



# File 'lib/sphinx/sphinx/client.rb', line 830

def BuildExcerpts(docs, index, words, opts = {})
  assert { docs.instance_of? Array }
  assert { index.instance_of? String }
  assert { words.instance_of? String }
  assert { opts.instance_of? Hash }

  # fixup options
  opts['before_match'] ||= '<b>';
  opts['after_match'] ||= '</b>';
  opts['chunk_separator'] ||= ' ... ';
  opts['html_strip_mode'] ||= 'index';
  opts['limit'] ||= 256;
  opts['limit_passages'] ||= 0;
  opts['limit_words'] ||= 0;
  opts['around'] ||= 5;
  opts['start_passage_id'] ||= 1;
  opts['exact_phrase'] ||= false
  opts['single_passage'] ||= false
  opts['use_boundaries'] ||= false
  opts['weight_order'] ||= false
  opts['load_files'] ||= false
  opts['allow_empty'] ||= false
  
  # build request
  
  # v.1.0 req
  flags = 1
  flags |= 2  if opts['exact_phrase']
  flags |= 4  if opts['single_passage']
  flags |= 8  if opts['use_boundaries']
  flags |= 16 if opts['weight_order']
  flags |= 32 if opts['query_mode']
  flags |= 64 if opts['force_all_words']
  flags |= 128 if opts['load_files']
  flags |= 256 if opts['allow_empty']
  
  request = Request.new
  request.put_int 0, flags # mode=0, flags=1 (remove spaces)
  # req index
  request.put_string index
  # req words
  request.put_string words
  
  # options
  request.put_string opts['before_match']
  request.put_string opts['after_match']
  request.put_string opts['chunk_separator']
  request.put_int opts['limit'].to_i, opts['around'].to_i
	  
  # options v1.2
  request.put_int opts['limit_passages'].to_i
  request.put_int opts['limit_words'].to_i
  request.put_int opts['start_passage_id'].to_i
  request.put_string opts['html_strip_mode']
  
  # documents
  request.put_int docs.size
  docs.each do |doc|
    assert { doc.instance_of? String }

    request.put_string doc
  end
  
  response = PerformRequest(:excerpt, request)
  
  # parse response
  begin
    res = []
    docs.each do |doc|
      res << response.get_string
    end
  rescue EOFError
    @error = 'incomplete reply'
    raise SphinxResponseError, @error
  end
  return res
end
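
The source above folds the boolean options into a single integer bit mask before sending it. As a minimal sketch of that step in isolation, the following reproduces the same bit values in a lookup table. EXCERPT_FLAG_BITS and excerpt_flags are hypothetical names introduced for illustration; only the bit values come from the source.

```ruby
# Bit values copied from the BuildExcerpts source above.
EXCERPT_FLAG_BITS = {
  'exact_phrase'    => 2,
  'single_passage'  => 4,
  'use_boundaries'  => 8,
  'weight_order'    => 16,
  'query_mode'      => 32,
  'force_all_words' => 64,
  'load_files'      => 128,
  'allow_empty'     => 256
}.freeze

# Fold truthy options into one integer; bit 1 ("remove spaces") is always set.
def excerpt_flags(opts)
  EXCERPT_FLAG_BITS.reduce(1) do |flags, (name, bit)|
    opts[name] ? flags | bit : flags
  end
end

puts excerpt_flags({})                                                  # 1
puts excerpt_flags({ 'exact_phrase' => true, 'weight_order' => true })  # 1 | 2 | 16 = 19
```

Note that 'query_mode' and 'force_all_words' have no default assigned in the option fixup, so they stay unset unless the caller passes them.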

#BuildKeywords(query, index, hits) ⇒ Object

Connect to searchd server, and generate keyword list for a given query.

Returns an array of words on success.



# File 'lib/sphinx/sphinx/client.rb', line 911

def BuildKeywords(query, index, hits)
  assert { query.instance_of? String }
  assert { index.instance_of? String }
  assert { hits.instance_of?(TrueClass) || hits.instance_of?(FalseClass) }
  
  # build request
  request = Request.new
  # v.1.0 req
  request.put_string query # req query
  request.put_string index # req index
  request.put_int hits ? 1 : 0

  response = PerformRequest(:keywords, request)
  
  # parse response
  begin
    res = []
    nwords = response.get_int
    0.upto(nwords - 1) do |i|
      tokenized = response.get_string
      normalized = response.get_string
      
      entry = { 'tokenized' => tokenized, 'normalized' => normalized }
      entry['docs'], entry['hits'] = response.get_ints(2) if hits
      
      res << entry
    end
  rescue EOFError
    @error = 'incomplete reply'
    raise SphinxResponseError, @error
  end
  
  return res
end

#GetLastError ⇒ Object

Get last error message.



# File 'lib/sphinx/sphinx/client.rb', line 212

def GetLastError
  @error
end

#GetLastWarning ⇒ Object

Get last warning message.



# File 'lib/sphinx/sphinx/client.rb', line 217

def GetLastWarning
  @warning
end

#Query(query, index = '*', comment = '') ⇒ Object

index is the index name (or names) to query. The default value is “*”, which means to query all indexes. Accepted characters for index names are letters, numbers, dash, and underscore; everything else is considered a separator. Therefore, all the following calls are valid and will search two indexes:

sphinx.Query('test query', 'main delta')
sphinx.Query('test query', 'main;delta')
sphinx.Query('test query', 'main, delta')

Index order matters. If identical IDs are found in two or more indexes, weight and attribute values from the very last matching index will be used for sorting and returning to client. Therefore, in the example above, matches from “delta” index will always “win” over matches from “main”.
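
The separator rule can be illustrated with a short sketch. The splitting itself is done by searchd, not by this client, so split_index_names is a hypothetical helper that only mirrors the stated rule, and it assumes "letters" means ASCII letters.

```ruby
# Illustration only: searchd does the actual splitting. This mirrors the
# stated rule that anything other than letters, digits, dash, and underscore
# separates index names. Assumption: "letters" means ASCII letters.
def split_index_names(spec)
  spec.split(/[^A-Za-z0-9_-]+/).reject(&:empty?)
end

['main delta', 'main;delta', 'main, delta'].each do |spec|
  puts split_index_names(spec).inspect
end
```

All three specifications yield the same two names in the same order, which is why “delta” is the last matching index in each case.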

Returns false on failure. Returns hash which has the following keys on success:

  • 'matches' – array of hashes, one per match, with keys ‘id’ (the document ID), ‘weight’, and ‘attrs’

  • 'total' – total amount of matches retrieved (up to SPH_MAX_MATCHES, see sphinx.h)

  • 'total_found' – total amount of matching documents in index

  • 'time' – search time in seconds, formatted as a string with three decimal places

  • 'words' – hash which maps query terms (stemmed!) to (‘docs’, ‘hits’) hash
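
To make the shape of the returned hash concrete, here is a sketch that walks a hand-written sample. The sample values, the attribute name ‘group_id’, and the summarize helper are all invented for illustration; only the key names follow the RunQueries source on this page.

```ruby
# Hand-written sample shaped like the hash RunQueries builds; the values
# are invented for illustration.
SAMPLE_RESULT = {
  'matches' => [
    { 'id' => 7, 'weight' => 2, 'attrs' => { 'group_id' => 1 } },
    { 'id' => 3, 'weight' => 1, 'attrs' => { 'group_id' => 2 } }
  ],
  'total'       => 2,
  'total_found' => 2,
  'time'        => '0.004',
  'words'       => { 'test' => { 'docs' => 2, 'hits' => 5 } }
}.freeze

# Render a short, human-readable summary of one result hash.
def summarize(result)
  lines = ["#{result['total']} of #{result['total_found']} matches in #{result['time']} sec"]
  result['matches'].each do |m|
    lines << "  doc #{m['id']} (weight #{m['weight']}) #{m['attrs'].inspect}"
  end
  result['words'].each do |word, stats|
    lines << "  '#{word}': #{stats['docs']} docs, #{stats['hits']} hits"
  end
  lines
end

puts summarize(SAMPLE_RESULT)
```

Since Query returns false on failure, a caller would check the return value before passing it to a helper like this one.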



# File 'lib/sphinx/sphinx/client.rb', line 536

def Query(query, index = '*', comment = '')
  assert { @reqs.empty? }
  @reqs = []
  
  self.AddQuery(query, index, comment)
  results = self.RunQueries
  
  # probably network error; error message should be already filled
  return false unless results.instance_of?(Array)
  
  @error = results[0]['error']
  @warning = results[0]['warning']
  
  return false if results[0]['status'] == SEARCHD_ERROR
  return results[0]
end

#ResetFilters ⇒ Object

Clear all filters (for multi-queries).



# File 'lib/sphinx/sphinx/client.rb', line 492

def ResetFilters
  @filters = []
  @anchor = []
end

#ResetGroupBy ⇒ Object

Clear groupby settings (for multi-queries).



# File 'lib/sphinx/sphinx/client.rb', line 498

def ResetGroupBy
  @groupby       = ''
  @groupfunc     = SPH_GROUPBY_DAY
  @groupsort     = '@group desc'
  @groupdistinct = ''
end

#ResetOverrides ⇒ Object

Clear all attribute value overrides (for multi-queries).



# File 'lib/sphinx/sphinx/client.rb', line 506

def ResetOverrides
  @overrides = []
end

#RunQueries ⇒ Object

Run queries batch.

Returns an array of result sets on success. Returns false on network IO failure.

Each result set in the returned array is a hash which contains the same keys as the hash returned by Query, plus:

  • 'error' – search error for this query

  • 'words' – hash which maps query terms (stemmed!) to ( “docs”, “hits” ) hash
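
Because each result set carries its own status, one failed query does not invalidate the rest of the batch. The sketch below separates usable result sets from failed ones. partition_results and the sample batch (including the error text) are hypothetical; the status codes are copied from the Constant Summary, and the ‘status’, ‘error’, and ‘warning’ keys follow the source below.

```ruby
# Status codes copied from the Constant Summary.
SEARCHD_OK      = 0
SEARCHD_ERROR   = 1
SEARCHD_RETRY   = 2
SEARCHD_WARNING = 3

# Split a batch into usable result sets and [index, message] failures.
# A warning still comes with a full reply, so it counts as usable.
def partition_results(results)
  ok = []
  failed = []
  results.each_with_index do |res, i|
    if res['status'] == SEARCHD_OK || res['status'] == SEARCHD_WARNING
      ok << res
    else
      failed << [i, res['error']]
    end
  end
  [ok, failed]
end

# Invented sample: query 0 succeeded, query 1 failed.
batch = [
  { 'status' => SEARCHD_OK, 'error' => '', 'warning' => '', 'matches' => [] },
  { 'status' => SEARCHD_ERROR, 'error' => 'unknown index', 'warning' => '' }
]
ok, failed = partition_results(batch)
puts ok.length
puts failed.inspect
```

The index stored with each failure is the position in the batch, which is the same value AddQuery returned when the query was queued.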



# File 'lib/sphinx/sphinx/client.rb', line 678

def RunQueries
  if @reqs.empty?
    @error = 'No queries defined, issue AddQuery() first'
    return false
  end

  req = @reqs.join('')
  nreqs = @reqs.length
  @reqs = []
  response = PerformRequest(:search, req, nreqs)
 
  # parse response
  begin
    results = []
    ires = 0
    while ires < nreqs
      ires += 1
      result = {}
      
      result['error'] = ''
      result['warning'] = ''
      
      # extract status
      status = result['status'] = response.get_int
      if status != SEARCHD_OK
        message = response.get_string
        if status == SEARCHD_WARNING
          result['warning'] = message
        else
          result['error'] = message
          results << result
          next
        end
      end
  
      # read schema
      fields = []
      attrs = {}
      attrs_names_in_order = []
      
      nfields = response.get_int
      while nfields > 0
        nfields -= 1
        fields << response.get_string
      end
      result['fields'] = fields
  
      nattrs = response.get_int
      while nattrs > 0
        nattrs -= 1
        attr = response.get_string
        type = response.get_int
        attrs[attr] = type
        attrs_names_in_order << attr
      end
      result['attrs'] = attrs
      
      # read match count
      count = response.get_int
      id64 = response.get_int
      
      # read matches
      result['matches'] = []
      while count > 0
        count -= 1
        
        if id64 != 0
          doc = response.get_int64
          weight = response.get_int
        else
          doc, weight = response.get_ints(2)
        end
  
        r = {} # This is a single result put in the result['matches'] array
        r['id'] = doc
        r['weight'] = weight
        attrs_names_in_order.each do |a|
          r['attrs'] ||= {}
  
          case attrs[a]
            when SPH_ATTR_BIGINT
              # handle 64-bit ints
              r['attrs'][a] = response.get_int64
            when SPH_ATTR_FLOAT
              # handle floats
              r['attrs'][a] = response.get_float
            when SPH_ATTR_STRING
              # handle string
              r['attrs'][a] = response.get_string
            else
              # handle everything else as unsigned ints
              val = response.get_int
              if attrs[a]==SPH_ATTR_MULTI
                r['attrs'][a] = []
                1.upto(val) do
                  r['attrs'][a] << response.get_int
                end
              elsif attrs[a]==SPH_ATTR_MULTI64
                r['attrs'][a] = []
                val = val/2
                1.upto(val) do
                  r['attrs'][a] << response.get_int64
                end
              else
                r['attrs'][a] = val
              end
          end
        end
        result['matches'] << r
      end
      result['total'], result['total_found'], msecs, words = response.get_ints(4)
      result['time'] = '%.3f' % (msecs / 1000.0)
  
      result['words'] = {}
      while words > 0
        words -= 1
        word = response.get_string
        docs, hits = response.get_ints(2)
        result['words'][word] = { 'docs' => docs, 'hits' => hits }
      end
      
      results << result
    end
  #rescue EOFError
  #  @error = 'incomplete reply'
  #  raise SphinxResponseError, @error
  end
  
  return results
end

#SetFieldWeights(weights) ⇒ Object

Bind per-field weights by name.

Takes a hash that maps field names (strings) to field weights (integers) as an argument.

  • Takes precedence over SetWeights().

  • Unknown names will be silently ignored.

  • Unbound fields will be silently given a weight of 1.
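
The last two rules can be sketched as a small resolution step. The real resolution happens inside searchd, so effective_field_weights is a hypothetical helper, and the field names are made up for the example.

```ruby
# Illustration of the documented rules; the real resolution happens in searchd.
# Fields present in the index but absent from the hash get weight 1;
# names in the hash that the index does not have are ignored.
def effective_field_weights(index_fields, weights)
  index_fields.each_with_object({}) do |field, out|
    out[field] = weights.fetch(field, 1)
  end
end

fields  = %w[title body]                    # hypothetical index fields
weights = { 'title' => 10, 'author' => 5 }  # 'author' is not in the index
puts effective_field_weights(fields, weights).inspect
```

Here ‘title’ keeps its weight of 10, ‘body’ falls back to 1, and ‘author’ has no effect.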



# File 'lib/sphinx/sphinx/client.rb', line 311

def SetFieldWeights(weights)
  assert { weights.instance_of? Hash }
  weights.each do |name, weight|
    assert { name.instance_of? String }
    assert { weight.instance_of? Fixnum }
  end

  @fieldweights = weights
end

#SetFilter(attribute, values, exclude = false) ⇒ Object

Set values filter.

Only match those records where attribute column values are in specified set.



# File 'lib/sphinx/sphinx/client.rb', line 348

def SetFilter(attribute, values, exclude = false)
  assert { attribute.instance_of? String }
  assert { values.instance_of? Array }
  assert { !values.empty? }

  if values.instance_of?(Array) && values.size > 0
    values.each do |value|
      assert { value.instance_of? Fixnum }
    end
  
    @filters << { 'type' => SPH_FILTER_VALUES, 'attr' => attribute, 'exclude' => exclude, 'values' => values }
  end
end

#SetFilterFloatRange(attribute, min, max, exclude = false) ⇒ Object

Set float range filter.

Only match those records where attribute column value is between min and max (including min and max).



# File 'lib/sphinx/sphinx/client.rb', line 379

def SetFilterFloatRange(attribute, min, max, exclude = false)
  assert { attribute.instance_of? String }
  assert { min.instance_of? Float }
  assert { max.instance_of? Float }
  assert { min <= max }

  @filters << { 'type' => SPH_FILTER_FLOATRANGE, 'attr' => attribute, 'exclude' => exclude, 'min' => min, 'max' => max }
end

#SetFilterRange(attribute, min, max, exclude = false) ⇒ Object

Set range filter.

Only match those records where attribute column value is between min and max (including min and max).
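
The three filter methods store hashes of a common shape, and the matching itself is done by searchd. As a sketch of the intended semantics, the following predicate evaluates such a hash against a single attribute value. passes_filter? is a hypothetical helper, and it assumes that exclude inverts the test, which is how the exclude flag is usually described for Sphinx filters.

```ruby
# Filter type codes copied from the Constant Summary.
SPH_FILTER_VALUES     = 0
SPH_FILTER_RANGE      = 1
SPH_FILTER_FLOATRANGE = 2

# Does an attribute value pass a filter hash shaped like those stored by
# SetFilter, SetFilterRange, and SetFilterFloatRange? Illustration only:
# the real check runs inside searchd.
def passes_filter?(filter, value)
  inside =
    case filter['type']
    when SPH_FILTER_VALUES
      filter['values'].include?(value)
    when SPH_FILTER_RANGE, SPH_FILTER_FLOATRANGE
      value >= filter['min'] && value <= filter['max'] # bounds are inclusive
    else
      raise ArgumentError, 'unhandled filter type'
    end
  filter['exclude'] ? !inside : inside # assumption: exclude inverts the test
end

range = { 'type' => SPH_FILTER_RANGE, 'attr' => 'price', 'exclude' => false,
          'min' => 10, 'max' => 20 }
puts passes_filter?(range, 20)                           # true
puts passes_filter?(range.merge('exclude' => true), 20)  # false
```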



# File 'lib/sphinx/sphinx/client.rb', line 366

def SetFilterRange(attribute, min, max, exclude = false)
  assert { attribute.instance_of? String }
  assert { min.instance_of? Fixnum or min.instance_of? Bignum }
  assert { max.instance_of? Fixnum or max.instance_of? Bignum }
  assert { min <= max }

  @filters << { 'type' => SPH_FILTER_RANGE, 'attr' => attribute, 'exclude' => exclude, 'min' => min, 'max' => max }
end

#SetGeoAnchor(attrlat, attrlong, lat, long) ⇒ Object

Setup anchor point for geosphere distance calculations.

Required to use @geodist in filters and sorting; distance will be computed to this point. Latitude and longitude must be in radians.

  • attrlat – is the name of latitude attribute

  • attrlong – is the name of longitude attribute

  • lat – is anchor point latitude, in radians

  • long – is anchor point longitude, in radians
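
Coordinates are often stored in decimal degrees, so a conversion is usually needed first. The sketch below shows one way to do it; deg_to_rad is a hypothetical helper and the coordinates are arbitrary example values.

```ruby
# Convert decimal degrees to radians. The result is always a Float, which
# matters because SetGeoAnchor asserts lat.instance_of?(Float).
def deg_to_rad(degrees)
  degrees.to_f * Math::PI / 180.0
end

lat  = deg_to_rad(52.52)   # example coordinates, chosen arbitrarily
long = deg_to_rad(13.405)
puts lat.class
puts long.class
```

The converted values could then be passed as the lat and long arguments. An Integer such as 0 would fail the Float assertion in the source, so 0.0 is needed for a coordinate of zero.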



# File 'lib/sphinx/sphinx/client.rb', line 398

def SetGeoAnchor(attrlat, attrlong, lat, long)
  assert { attrlat.instance_of? String }
  assert { attrlong.instance_of? String }
  assert { lat.instance_of? Float }
  assert { long.instance_of? Float }

  @anchor = { 'attrlat' => attrlat, 'attrlong' => attrlong, 'lat' => lat, 'long' => long }
end

#SetGroupBy(attribute, func, groupsort = '@group desc') ⇒ Object

Set grouping attribute and function.

In grouping mode, all matches are assigned to different groups based on grouping function value.

Each group keeps track of the total match count, and the best match (in this group) according to current sorting function.

The final result set contains one best match per group, with grouping function value and matches count attached.

Groups in result set could be sorted by any sorting clause, including both document attributes and the following special internal Sphinx attributes:

  • @id - match document ID;

  • @weight, @rank, @relevance - match weight;

  • @group - groupby function value;

  • @count - amount of matches in group.

The default mode is to sort by groupby value in descending order, i.e. by ‘@group desc’.

‘total_found’ would contain total amount of matching groups over the whole index.

WARNING: grouping is done in fixed memory and thus its results are only approximate; so there might be more groups reported in total_found than actually present. @count might also be underestimated.

For example, if sorting by relevance and grouping by “published” attribute with SPH_GROUPBY_DAY function, then the result set will contain one most relevant match per each day when there were any matches published, with day number and per-day match count attached, and sorted by day number in descending order (ie. recent days first).
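
The day-grouping example can be modeled in plain Ruby to show what the result set looks like. This is a toy model, not what searchd does internally: it is exact rather than approximate, it assumes the day number has the form YYYYMMDD, and it uses UTC so that the output is deterministic. group_by_day and day_number are hypothetical helpers, and the sample matches are invented.

```ruby
# Day number in YYYYMMDD form. UTC is used here to keep the example
# deterministic; that choice is an assumption of this sketch.
def day_number(timestamp)
  Time.at(timestamp).utc.strftime('%Y%m%d').to_i
end

# Keep the best (highest-weight) match per day, attach the group value and
# match count, and order groups by day, newest first ('@group desc').
def group_by_day(matches)
  groups = matches.group_by { |m| day_number(m['published']) }
  rows = groups.map do |day, members|
    best = members.max_by { |m| m['weight'] }
    best.merge('@group' => day, '@count' => members.size)
  end
  rows.sort_by { |row| -row['@group'] }
end

matches = [
  { 'id' => 1, 'weight' => 3, 'published' => Time.utc(2024, 1, 1, 9).to_i },
  { 'id' => 2, 'weight' => 7, 'published' => Time.utc(2024, 1, 1, 18).to_i },
  { 'id' => 3, 'weight' => 5, 'published' => Time.utc(2024, 1, 2, 12).to_i }
]
group_by_day(matches).each do |row|
  puts "#{row['@group']}: doc #{row['id']}, #{row['@count']} match(es)"
end
```

Three matches collapse into two rows: document 3 for the later day, and document 2 (the heavier of the two) for the earlier day with a count of 2.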



# File 'lib/sphinx/sphinx/client.rb', line 443

def SetGroupBy(attribute, func, groupsort = '@group desc')
  assert { attribute.instance_of? String }
  assert { groupsort.instance_of? String }
  assert { func == SPH_GROUPBY_DAY \
        || func == SPH_GROUPBY_WEEK \
        || func == SPH_GROUPBY_MONTH \
        || func == SPH_GROUPBY_YEAR \
        || func == SPH_GROUPBY_ATTR \
        || func == SPH_GROUPBY_ATTRPAIR }

  @groupby = attribute
  @groupfunc = func
  @groupsort = groupsort
end

#SetGroupDistinct(attribute) ⇒ Object

Set count-distinct attribute for group-by queries.



# File 'lib/sphinx/sphinx/client.rb', line 459

def SetGroupDistinct(attribute)
  assert { attribute.instance_of? String }
  @groupdistinct = attribute
end

#SetIDRange(min, max) ⇒ Object

Set IDs range to match.

Only match records if document ID is between min_id and max_id (inclusive).



# File 'lib/sphinx/sphinx/client.rb', line 335

def SetIDRange(min, max)
  assert { min.instance_of?(Fixnum) or min.instance_of?(Bignum) }
  assert { max.instance_of?(Fixnum) or max.instance_of?(Bignum) }
  assert { min <= max }

  @min_id = min
  @max_id = max
end

#SetIndexWeights(weights) ⇒ Object

Bind per-index weights by name.



# File 'lib/sphinx/sphinx/client.rb', line 322

def SetIndexWeights(weights)
  assert { weights.instance_of? Hash }
  weights.each do |index, weight|
    assert { index.instance_of? String }
    assert { weight.instance_of? Fixnum }
  end
  
  @indexweights = weights
end

#SetLimits(offset, limit, max = 0, cutoff = 0) ⇒ Object

Set offset and count into result set, and optionally set max-matches and cutoff limits.
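
A common use of offset and limit is pagination. The following sketch converts a 1-based page number into the pair of values to pass as the first two arguments; limits_for_page is a hypothetical helper, not part of the client.

```ruby
# Translate a 1-based page number into the (offset, limit) pair SetLimits
# expects. Hypothetical helper, not part of the client.
def limits_for_page(page, per_page)
  raise ArgumentError, 'page must be >= 1' if page < 1
  raise ArgumentError, 'per_page must be > 0' if per_page <= 0

  [(page - 1) * per_page, per_page]
end

offset, limit = limits_for_page(3, 20)
puts offset  # 40
puts limit   # 20
```

As the source below shows, passing 0 for max or cutoff leaves the previously set value untouched rather than resetting it.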



# File 'lib/sphinx/sphinx/client.rb', line 232

def SetLimits(offset, limit, max = 0, cutoff = 0)
  assert { offset.instance_of? Fixnum }
  assert { limit.instance_of? Fixnum }
  assert { max.instance_of? Fixnum }
  assert { offset >= 0 }
  assert { limit > 0 }
  assert { max >= 0 }

  @offset = offset
  @limit = limit
  @maxmatches = max if max > 0
  @cutoff = cutoff if cutoff > 0
end

#SetMatchMode(mode) ⇒ Object

Set matching mode.



# File 'lib/sphinx/sphinx/client.rb', line 255

def SetMatchMode(mode)
  assert { mode == SPH_MATCH_ALL \
        || mode == SPH_MATCH_ANY \
        || mode == SPH_MATCH_PHRASE \
        || mode == SPH_MATCH_BOOLEAN \
        || mode == SPH_MATCH_EXTENDED \
        || mode == SPH_MATCH_FULLSCAN \
        || mode == SPH_MATCH_EXTENDED2 }

  @mode = mode
end

#SetMaxQueryTime(max) ⇒ Object

Set maximum query time per index, in milliseconds. The value must be an integer; 0 means “do not limit”.



# File 'lib/sphinx/sphinx/client.rb', line 248

def SetMaxQueryTime(max)
  assert { max.instance_of? Fixnum }
  assert { max >= 0 }
  @maxquerytime = max
end

#SetOverride(attrname, attrtype, values) ⇒ Object

There can be only one override per attribute. values must be a hash that maps document IDs to attribute values.



# File 'lib/sphinx/sphinx/client.rb', line 477

def SetOverride(attrname, attrtype, values)
   assert { attrname.instance_of? String }
   assert { [SPH_ATTR_INTEGER, SPH_ATTR_TIMESTAMP, SPH_ATTR_BOOL, SPH_ATTR_FLOAT, SPH_ATTR_BIGINT].include?(attrtype) }
   assert { values.instance_of? Hash }

   @overrides << { 'attr' => attrname, 'type' => attrtype, 'values' => values }
end

#SetRankingMode(ranker) ⇒ Object

Set ranking mode.



# File 'lib/sphinx/sphinx/client.rb', line 268

def SetRankingMode(ranker)
  assert { ranker == SPH_RANK_PROXIMITY_BM25 \
        || ranker == SPH_RANK_BM25 \
        || ranker == SPH_RANK_NONE \
        || ranker == SPH_RANK_WORDCOUNT \
        || ranker == SPH_RANK_PROXIMITY }

  @ranker = ranker
end

#SetRetries(count, delay = 0) ⇒ Object

Set distributed retries count and delay.



# File 'lib/sphinx/sphinx/client.rb', line 465

def SetRetries(count, delay = 0)
  assert { count.instance_of? Fixnum }
  assert { delay.instance_of? Fixnum }
  
  @retrycount = count
  @retrydelay = delay
end

#SetSelect(select) ⇒ Object

Set select-list (attributes or expressions), SQL-like syntax.



# File 'lib/sphinx/sphinx/client.rb', line 486

def SetSelect(select)
  assert { select.instance_of? String }
  @select = select
end

#SetServer(host, port) ⇒ Object

Set searchd host name (string) and port (integer).



# File 'lib/sphinx/sphinx/client.rb', line 222

def SetServer(host, port)
  assert { host.instance_of? String }
  assert { port.instance_of? Fixnum }

  @host = host
  @port = port
end

#SetSortMode(mode, sortby = '') ⇒ Object

Set matches sorting mode.



# File 'lib/sphinx/sphinx/client.rb', line 279

def SetSortMode(mode, sortby = '')
  assert { mode == SPH_SORT_RELEVANCE \
        || mode == SPH_SORT_ATTR_DESC \
        || mode == SPH_SORT_ATTR_ASC \
        || mode == SPH_SORT_TIME_SEGMENTS \
        || mode == SPH_SORT_EXTENDED \
        || mode == SPH_SORT_EXPR }
  assert { sortby.instance_of? String }
  assert { mode == SPH_SORT_RELEVANCE || !sortby.empty? }

  @sort = mode
  @sortby = sortby
end

#SetWeights(weights) ⇒ Object

Bind per-field weights by order.

DEPRECATED; use SetFieldWeights() instead.



# File 'lib/sphinx/sphinx/client.rb', line 296

def SetWeights(weights)
  assert { weights.instance_of? Array }
  weights.each do |weight|
    assert { weight.instance_of? Fixnum }
  end

  @weights = weights
end

#UpdateAttributes(index, attrs, values, mva = false) ⇒ Object

Batch update given attributes in given rows in given indexes.

  • index is the name of the index to be updated

  • attrs is an array of attribute name strings

  • values is a hash where each key is a document ID and each value is an array of new attribute values

  • mva indicates whether to update multi-valued attributes (MVA)

Returns number of actually updated documents (0 or more) on success. Returns -1 on failure.

Usage example:

sphinx.UpdateAttributes('test1', ['group_id'], { 1 => [456] })
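
The shape rules for values are easy to get wrong, so a pre-flight check can help. The sketch below mirrors the assertions in the source: one value per attribute in each row, and arrays of integers when mva is true. valid_update? and valid_row? are hypothetical helpers, and Integer is used in place of Fixnum, which newer Ruby versions no longer define.

```ruby
# One row must hold exactly one value per attribute. With mva, each value
# is itself an array of integers; otherwise each value is an integer.
def valid_row?(row, width, mva)
  return false unless row.is_a?(Array) && row.length == width

  row.all? do |v|
    if mva
      v.is_a?(Array) && v.all? { |x| x.is_a?(Integer) }
    else
      v.is_a?(Integer)
    end
  end
end

# values maps document IDs to rows; every ID must be an integer.
def valid_update?(attrs, values, mva = false)
  values.all? { |id, row| id.is_a?(Integer) && valid_row?(row, attrs.length, mva) }
end

puts valid_update?(['group_id'], { 1 => [456] })           # true
puts valid_update?(['group_id'], { 1 => [456, 7] })        # false: two values, one attribute
puts valid_update?(['tags'], { 1 => [[10, 20]] }, true)    # true
```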


# File 'lib/sphinx/sphinx/client.rb', line 959

def UpdateAttributes(index, attrs, values, mva = false)
  # verify everything
  assert { index.instance_of? String }
  assert { mva.instance_of?(TrueClass) || mva.instance_of?(FalseClass) }
  
  assert { attrs.instance_of? Array }
  attrs.each do |attr|
    assert { attr.instance_of? String }
  end
  
  assert { values.instance_of? Hash }
  values.each do |id, entry|
    assert { id.instance_of? Fixnum }
    assert { entry.instance_of? Array }
    assert { entry.length == attrs.length }
    entry.each do |v|
      if mva
        assert { v.instance_of? Array }
        v.each { |vv| assert { vv.instance_of? Fixnum } }
      else
        assert { v.instance_of? Fixnum }
      end
    end
  end
  
  # build request
  request = Request.new
  request.put_string index
  
  request.put_int attrs.length
  for attr in attrs
    request.put_string attr
    request.put_int mva ? 1 : 0
  end
  
  request.put_int values.length
  values.each do |id, entry|
    request.put_int64 id
    if mva
      entry.each { |v| request.put_int_array v }
    else
      request.put_int(*entry)
    end
  end
  
  response = PerformRequest(:update, request)
  
  # parse response
  begin
    return response.get_int
  rescue EOFError
    @error = 'incomplete reply'
    raise SphinxResponseError, @error
  end
end