Class: IndexBuilder

Inherits:
Object
Defined in:
lib/index_builder.rb

Overview

Class containing methods for building a search index.

Constructor Details

#initialize(samples, mismatches_max) ⇒ IndexBuilder

Internal: Constructor for IndexBuilder objects. The given Array of samples and mismatches_max are saved as instance variables.

samples        - Array of Sample objects.
mismatches_max - Integer denoting the maximum number of mismatches allowed in
                 an index sequence.

Examples

IndexBuilder.new(samples, 2)
  # => <IndexBuilder>

Returns an IndexBuilder object.



# File 'lib/index_builder.rb', line 64

def initialize(samples, mismatches_max)
  @samples        = samples
  @mismatches_max = mismatches_max
end

Class Method Details

.build(samples, mismatches_max) ⇒ Object

Internal: Class method that builds a search index from a given Array of samples. The index is a Google Hash, which is not managed by Ruby's garbage collector and is therefore much more efficient. Each Hash key is derived from index1 and index2 concatenated. Furthermore, if mismatches_max is given, index1 and index2 are permutated accordingly and a key is added for every combination of the resulting variants. The Hash values are the sample numbers (positions in the samples Array).

samples        - Array of samples (Sample objects with id, index1 and index2).
mismatches_max - Integer denoting the maximum number of mismatches allowed in
                 an index sequence.

Examples

IndexBuilder.build(samples, 2)
  # => <Google Hash>

Returns a Google Hash where the key is derived from the index and the value is the sample number.



# File 'lib/index_builder.rb', line 45

def self.build(samples, mismatches_max)
  index_builder = new(samples, mismatches_max)
  index_hash    = index_builder.index_init
  index_builder.index_populate(index_hash)
end
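
The permutate method used by the index builder is not shown in this listing. As a rough illustration of how an index sequence might be expanded for a given mismatches_max, the sketch below enumerates every sequence that differs from the original in at most that many positions. The helper name mismatch_variants and the four-letter A/C/G/T alphabet are assumptions made for this example, not part of IndexBuilder.

```ruby
# Hypothetical helper (not part of IndexBuilder): returns every sequence that
# differs from +seq+ in at most +mismatches_max+ positions.
# The default alphabet is an assumption for illustration.
def mismatch_variants(seq, mismatches_max, alphabet = %w[A C G T])
  variants = [seq]
  frontier = [seq]

  mismatches_max.times do
    # Substitute one position in every sequence produced by the previous round.
    frontier = frontier.flat_map do |s|
      s.each_char.with_index.flat_map do |char, pos|
        (alphabet - [char]).map do |sub|
          variant = s.dup
          variant[pos] = sub
          variant
        end
      end
    end.uniq

    # Array#| keeps the result free of duplicates.
    variants |= frontier
  end

  variants
end

variants = mismatch_variants("AC", 1)
puts variants.size # 1 original + 2 positions * 3 substitutions
```

The number of variants grows quickly with the sequence length and with mismatches_max, and so does the number of keys in the index, since each sample contributes one key per pair of index1 and index2 variants.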

Instance Method Details

#index_init ⇒ Object

Internal: Method to initialize the index. If @mismatches_max is <= 1 then GoogleHashSparseLongToInt is used, else GoogleHashDenseLongToInt, for memory and performance reasons.

Returns a Google Hash.



# File 'lib/index_builder.rb', line 74

def index_init
  if @mismatches_max <= 1
    index_hash = GoogleHashSparseLongToInt.new
  else
    index_hash = GoogleHashDenseLongToInt.new
  end

  index_hash
end

#index_populate(index_hash) ⇒ Object

Internal: Method to populate the index.

index_hash - Google Hash with initialized index.

Returns a Google Hash.



# File 'lib/index_builder.rb', line 89

def index_populate(index_hash)
  @samples.each_with_index do |sample, i|
    index_list1 = permutate([sample.index1], @mismatches_max)
    index_list2 = permutate([sample.index2], @mismatches_max)

    index_list1.product(index_list2).each do |index1, index2|
      key = "#{index1}#{index2}".hash

      index_check_existing(index_hash, key, sample, index1, index2)

      index_hash[key] = i
    end
  end

  index_hash
end
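
The population loop above can be tried out without the google_hash gem. The following is a minimal sketch, assuming a plain Ruby Hash as a stand-in for the Google Hash, a hypothetical Sample Struct with made-up index sequences, and no mismatch expansion (each index list holds only the original sequence). It also shows how a lookup against the finished index might work.

```ruby
# Hypothetical stand-in for the Sample class, with the three fields the
# documentation mentions.
Sample = Struct.new(:id, :index1, :index2)

# Made-up samples for illustration only.
samples = [
  Sample.new("sample_a", "ATCG", "GGTA"),
  Sample.new("sample_b", "TTAG", "CCAT")
]

# A plain Hash is assumed here in place of the Google Hash.
index = {}

samples.each_with_index do |sample, i|
  # Without mismatch expansion each list contains just the original index.
  index_list1 = [sample.index1]
  index_list2 = [sample.index2]

  # Every index1/index2 combination becomes one Integer key.
  index_list1.product(index_list2).each do |index1, index2|
    index["#{index1}#{index2}".hash] = i
  end
end

# Lookup: rebuild the key the same way and fetch the sample number.
hit = index["ATCGGGTA".hash]
puts samples[hit].id if hit
```

Note that String#hash values differ between Ruby processes, so keys built this way are only meaningful within the process that created the index.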