Class: PEROBS::BigArray

Inherits:
Object show all
Defined in:
lib/perobs/BigArray.rb

Overview

The BigArray class implements an Array that stores the data in segments. It only loads the currently needed parts of the Array into memory. To provide an efficient access to the data by index a B+Tree like data structure is used. Each segment is stored in a leaf node of the B+Tree.

Defined Under Namespace

Classes: Stats

Constant Summary

Constants inherited from ObjectBase

ObjectBase::NATIVE_CLASSES

Instance Attribute Summary

Attributes inherited from Object

#attributes

Attributes inherited from ObjectBase

#_id, #myself, #store

Instance Method Summary collapse

Methods inherited from Object

#_delete_reference_to_id, #_deserialize, #_referenced_object_ids, #attr_init, attr_persist, #init_attr, #inspect, #mark_as_modified

Methods inherited from ObjectBase

#==, #_check_assignment_value, _finalize, #_initialize, #_restore, #_stash, #_sync, #_transfer, read, #restore

Constructor Details

#initialize(p, node_size = 150) ⇒ BigArray

Internal constructor. Use Store.new() instead.

Parameters:

  • p (Handle)
  • node_size (Integer) (defaults to: 150)

    The size of the tree nodes. This determines how many entries must be read/written for each operation. The default of 150 was emperically found to be a performance sweet spot. Smaller values will improve write operations. Larger values will improve read operations. 20 - 500 is a reasonable range to try.



54
55
56
57
58
59
60
61
62
63
64
65
# File 'lib/perobs/BigArray.rb', line 54

def initialize(p, node_size = 150)
  super(p)
  unless node_size > 3
    PEROBS.log.fatal "Node size (#{node_size}) must be larger than 3"
  end
  unless node_size % 2 == 0
    PEROBS.log.fatal "Node size (#{node_size}) must be an even number"
  end

  self.node_size = node_size
  clear
end

Instance Method Details

#<<(value) ⇒ Object



100
101
102
# File 'lib/perobs/BigArray.rb', line 100

def <<(value)
  self[@entry_counter] = value
end

#[](index) ⇒ Integer or nil

Return the value stored at the given index.

Parameters:

  • index (Integer)

    Position in the array

Returns:

  • (Integer or nil)

    found value or nil



125
126
127
128
129
130
131
# File 'lib/perobs/BigArray.rb', line 125

def [](index)
  index = validate_index_range(index)

  return nil if index >= @entry_counter

  @root.get(index)
end

#[]=(index, value) ⇒ Object

Store the value at the given index. If the index already exists the old value will be overwritten.

Parameters:

  • index (Integer)

    Position in the array

  • value (Integer)

    value



78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
# File 'lib/perobs/BigArray.rb', line 78

def []=(index, value)
  index = validate_index_range(index)

  @store.transaction do
    if index < @entry_counter
      # Overwrite of an existing element
      @root.set(index, value)
    elsif index == @entry_counter
      # Append right at the end
      @root.insert(index, value)
      self.entry_counter += 1
    else
      # Append with nil padding
      @entry_counter.upto(index - 1) do |i|
        @root.insert(i, nil)
      end
      @root.insert(index, value)
      self.entry_counter = index + 1
    end
  end
end

#check(&block) ⇒ Boolean

Check if the tree file contains any errors.

Returns:

  • (Boolean)

    true if no erros were found, false otherwise



237
238
239
# File 'lib/perobs/BigArray.rb', line 237

def check(&block)
  @root.check(&block)
end

#clearObject

Remove all entries from the BigArray.



68
69
70
71
72
# File 'lib/perobs/BigArray.rb', line 68

def clear
  self.root = self.first_leaf = self.last_leaf =
    @store.new(BigArrayNode, myself, true)
  self.entry_counter = 0
end

#delete_at(index) ⇒ Object

Delete the element at the specified index, returning that element, or nil if the index is out of range.

Parameters:

  • index (Integer)

    Index in the BigArray

Returns:

  • (Object)

    found value or nil



144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
# File 'lib/perobs/BigArray.rb', line 144

def delete_at(index)
  if index < 0
    index = @entry_counter + index
  end

  return nil if index < 0 || index >= @entry_counter

  deleted_value = nil
  @store.transaction do
    deleted_value = @root.delete_at(index)
    self.entry_counter -= 1

    # Eliminate single entry nodes at the top.
    while !@root.is_leaf? && @root.size == 1
      @root = @root.children.first
      @root.parent = nil
    end
  end

  deleted_value
end

#delete_if {|key, value| ... } ⇒ Object

Delete all entries for which the passed block yields true. The implementation is optimized for large bulk deletes. It rebuilds a new BTree for the elements to keep. If only few elements are deleted the overhead of rebuilding the BTree is rather high.

Yields:

  • (key, value)


171
172
173
174
175
176
177
178
179
# File 'lib/perobs/BigArray.rb', line 171

def delete_if
  old_root = @root
  clear
  old_root.each do |k, v|
    if !yield(k, v)
      insert(k, v)
    end
  end
end

#each {|key, value| ... } ⇒ Object

Iterate over all entries in the tree. Entries are always sorted by the key.

Yields:

  • (key, value)


196
197
198
199
200
201
202
# File 'lib/perobs/BigArray.rb', line 196

def each(&block)
  node = @first_leaf
  while node
    break unless node.each(&block)
    node = node.next_sibling
  end
end

#empty?Boolean

Return true if the BigArray has no stored entries.

Returns:

  • (Boolean)


189
190
191
# File 'lib/perobs/BigArray.rb', line 189

def empty?
  @entry_counter == 0
end

#has_key?(key) ⇒ Boolean

Check if there is an entry for the given key.

Parameters:

  • key (Integer)

    Unique key

Returns:

  • (Boolean)

    True if key is present, false otherwise.



136
137
138
# File 'lib/perobs/BigArray.rb', line 136

def has_key?(key)
  @root.has_key?(key)
end

#insert(index, value) ⇒ Object

Insert the value at the given index. If the index already exists the old value will be overwritten.

Parameters:

  • index (Integer)

    Position in the array

  • value (Integer)

    value



108
109
110
111
112
113
114
115
116
117
118
119
120
# File 'lib/perobs/BigArray.rb', line 108

def insert(index, value)
  index = validate_index_range(index)

  if index < @entry_counter
    # Insert in between existing elements
    @store.transaction do
      @root.insert(index, value)
      self.entry_counter += 1
    end
  else
    self[index] = value
  end
end

#lengthInteger Also known as: size

Returns The number of entries stored in the tree.

Returns:

  • (Integer)

    The number of entries stored in the tree.



182
183
184
# File 'lib/perobs/BigArray.rb', line 182

def length
  @entry_counter
end

#reverse_each {|key, value| ... } ⇒ Object

Iterate over all entries in the tree in reverse order. Entries are always sorted by the key.

Yields:

  • (key, value)


207
208
209
210
211
212
213
# File 'lib/perobs/BigArray.rb', line 207

def reverse_each(&block)
  node = @last_leaf
  while node
    break unless node.reverse_each(&block)
    node = node.prev_sibling
  end
end

#statisticsStats

Gather some statistics regarding the tree structure.

Returns:

  • (Stats)

    Structs with gathered data



243
244
245
246
247
# File 'lib/perobs/BigArray.rb', line 243

def statistics
  stats = Stats.new(0, 0, nil, nil)
  @root.statistics(stats)
  stats
end

#to_aObject

Convert the BigArray into a Ruby Array. This is primarily intended for debugging as real-world BigArray objects are likely too big to fit into memory.



218
219
220
221
222
223
224
225
226
227
# File 'lib/perobs/BigArray.rb', line 218

def to_a
  ary = []
  node = @first_leaf
  while node do
    ary += node.values
    node = node.next_sibling
  end

  ary
end

#to_sString

for debugging and should only be used with small BigArray objects.

Returns:

  • (String)

    Human reable form of the tree. This is only intended



231
232
233
# File 'lib/perobs/BigArray.rb', line 231

def to_s
  @root.to_s
end