Class: PEROBS::BigArray

Inherits:
Object show all
Defined in:
lib/perobs/BigArray.rb

Overview

The BigArray class implements an Array that stores the data in segments. It only loads the currently needed parts of the Array into memory. To provide an efficient access to the data by index a B+Tree like data structure is used. Each segment is stored in a leaf node of the B+Tree.

Defined Under Namespace

Classes: Stats

Constant Summary

Constants inherited from ObjectBase

ObjectBase::NATIVE_CLASSES

Instance Attribute Summary

Attributes inherited from Object

#attributes

Attributes inherited from ObjectBase

#_id, #myself, #store

Instance Method Summary collapse

Methods inherited from Object

#_delete_reference_to_id, #_deserialize, #_referenced_object_ids, #attr_init, attr_persist, #init_attr, #inspect, #mark_as_modified

Methods inherited from ObjectBase

#==, #_check_assignment_value, _finalize, #_initialize, #_restore, #_stash, #_sync, #_transfer, read, #restore

Constructor Details

#initialize(p, node_size = 150) ⇒ BigArray

Internal constructor. Use Store.new() instead.

Parameters:

  • p (Handle)
  • node_size (Integer) (defaults to: 150)

    The size of the tree nodes. This determines how many entries must be read/written for each operation. The default of 150 was emperically found to be a performance sweet spot. Smaller values will improve write operations. Larger values will improve read operations. 20 - 500 is a reasonable range to try.



50
51
52
53
54
55
56
57
# File 'lib/perobs/BigArray.rb', line 50

def initialize(p, node_size = 150)
  super(p)
  PEROBS.log.fatal "Node size (#{node_size}) must be larger than 3" unless node_size > 3
  PEROBS.log.fatal "Node size (#{node_size}) must be an even number" unless (node_size % 2).zero?

  self.node_size = node_size
  clear
end

Instance Method Details

#<<(value) ⇒ Object



92
93
94
# File 'lib/perobs/BigArray.rb', line 92

def <<(value)
  self[@entry_counter] = value
end

#[](index) ⇒ Integer or nil

Return the value stored at the given index.

Parameters:

  • index (Integer)

    Position in the array

Returns:

  • (Integer or nil)

    found value or nil



117
118
119
120
121
122
123
124
125
126
127
# File 'lib/perobs/BigArray.rb', line 117

def [](index)
  begin
    index = validate_index_range(index)
  rescue IndexError
    return nil
  end

  return nil if index >= @entry_counter

  @root.get(index)
end

#[]=(index, value) ⇒ Object

Store the value at the given index. If the index already exists the old value will be overwritten.

Parameters:

  • index (Integer)

    Position in the array

  • value (Integer)

    value



70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
# File 'lib/perobs/BigArray.rb', line 70

def []=(index, value)
  index = validate_index_range(index)

  @store.transaction do
    if index < @entry_counter
      # Overwrite of an existing element
      @root.set(index, value)
    elsif index == @entry_counter
      # Append right at the end
      @root.insert(index, value)
      self.entry_counter += 1
    else
      # Append with nil padding
      @entry_counter.upto(index - 1) do |i|
        @root.insert(i, nil)
      end
      @root.insert(index, value)
      self.entry_counter = index + 1
    end
  end
end

#check(&block) ⇒ Boolean

Check if the tree file contains any errors.

Returns:

  • (Boolean)

    true if no erros were found, false otherwise



245
246
247
# File 'lib/perobs/BigArray.rb', line 245

def check(&block)
  @root.check(&block)
end

#clearObject

Remove all entries from the BigArray.



60
61
62
63
64
# File 'lib/perobs/BigArray.rb', line 60

def clear
  self.root = self.first_leaf = self.last_leaf =
    @store.new(BigArrayNode, myself, true)
  self.entry_counter = 0
end

#delete_at(index) ⇒ Object

Delete the element at the specified index, returning that element, or nil if the index is out of range.

Parameters:

  • index (Integer)

    Index in the BigArray

Returns:

  • (Object)

    found value or nil



140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
# File 'lib/perobs/BigArray.rb', line 140

def delete_at(index)
  index = @entry_counter + index if index.negative?

  return nil if index.negative? || index >= @entry_counter

  deleted_value = nil
  @store.transaction do
    deleted_value = @root.delete_at(index)
    self.entry_counter -= 1

    # Eliminate single entry nodes at the top.
    while !@root.is_leaf? && @root.size == 1
      @root = @root.children.first
      @root.parent = nil
    end
  end

  deleted_value
end

#delete_if {|key, value| ... } ⇒ Object

Delete all entries for which the passed block yields true. The implementation is optimized for large bulk deletes. It rebuilds a new BTree for the elements to keep. If only few elements are deleted the overhead of rebuilding the BTree is rather high.

Yields:

  • (key, value)


165
166
167
168
169
170
171
# File 'lib/perobs/BigArray.rb', line 165

def delete_if
  old_root = @root
  clear
  old_root.each do |k, v|
    insert(k, v) unless yield(k, v)
  end
end

#each {|key, value| ... } ⇒ Object

Iterate over all entries in the tree. Entries are always sorted by the key.

Yields:

  • (key, value)


202
203
204
205
206
207
208
209
# File 'lib/perobs/BigArray.rb', line 202

def each(&block)
  node = @first_leaf
  while node
    break unless node.each(&block)

    node = node.next_sibling
  end
end

#empty?Boolean

Return true if the BigArray has no stored entries.

Returns:

  • (Boolean)


181
182
183
# File 'lib/perobs/BigArray.rb', line 181

def empty?
  @entry_counter.zero?
end

#firstObject

Return the first entry of the Array.



186
187
188
189
190
# File 'lib/perobs/BigArray.rb', line 186

def first
  return nil unless @first_leaf

  @first_leaf.values.first
end

#has_key?(key) ⇒ Boolean

Check if there is an entry for the given key.

Parameters:

  • key (Integer)

    Unique key

Returns:

  • (Boolean)

    True if key is present, false otherwise.



132
133
134
# File 'lib/perobs/BigArray.rb', line 132

def has_key?(key)
  @root.has_key?(key)
end

#insert(index, value) ⇒ Object

Insert the value at the given index. If the index already exists the old value will be overwritten.

Parameters:

  • index (Integer)

    Position in the array

  • value (Integer)

    value



100
101
102
103
104
105
106
107
108
109
110
111
112
# File 'lib/perobs/BigArray.rb', line 100

def insert(index, value)
  index = validate_index_range(index)

  if index < @entry_counter
    # Insert in between existing elements
    @store.transaction do
      @root.insert(index, value)
      self.entry_counter += 1
    end
  else
    self[index] = value
  end
end

#lastObject

Return the last entry of the Array.



193
194
195
196
197
# File 'lib/perobs/BigArray.rb', line 193

def last
  return nil unless @last_leaf

  @last_leaf.values.last
end

#lengthInteger Also known as: size

Returns The number of entries stored in the tree.

Returns:

  • (Integer)

    The number of entries stored in the tree.



174
175
176
# File 'lib/perobs/BigArray.rb', line 174

def length
  @entry_counter
end

#reverse_each {|key, value| ... } ⇒ Object

Iterate over all entries in the tree in reverse order. Entries are always sorted by the key.

Yields:

  • (key, value)


214
215
216
217
218
219
220
221
# File 'lib/perobs/BigArray.rb', line 214

def reverse_each(&block)
  node = @last_leaf
  while node
    break unless node.reverse_each(&block)

    node = node.prev_sibling
  end
end

#statisticsStats

Gather some statistics regarding the tree structure.

Returns:

  • (Stats)

    Structs with gathered data



251
252
253
254
255
# File 'lib/perobs/BigArray.rb', line 251

def statistics
  stats = Stats.new(0, 0, nil, nil)
  @root.statistics(stats)
  stats
end

#to_aObject

Convert the BigArray into a Ruby Array. This is primarily intended for debugging as real-world BigArray objects are likely too big to fit into memory.



226
227
228
229
230
231
232
233
234
235
# File 'lib/perobs/BigArray.rb', line 226

def to_a
  ary = []
  node = @first_leaf
  while node
    ary += node.values
    node = node.next_sibling
  end

  ary
end

#to_sString

for debugging and should only be used with small BigArray objects.

Returns:

  • (String)

    Human reable form of the tree. This is only intended



239
240
241
# File 'lib/perobs/BigArray.rb', line 239

def to_s
  @root.to_s
end