Class: PEROBS::BTree

Inherits:
Object
  • Object
show all
Defined in:
lib/perobs/BTree.rb

Overview

This BTree class is very similar to a classic B+Tree implementation. It manages a tree that is always balanced. The BTree is stored in the specified directory and partially kept in memory to speed up operations. The order of the tree specifies how many keys each node will be able to hold. Leaf nodes will have a value associated with each key. Branch nodes have N + 1 references to child nodes instead.

Instance Attribute Summary collapse

Instance Method Summary collapse

Constructor Details

#initialize(dir, name, order, progressmeter) ⇒ BTree

Create a new BTree object.

Parameters:

  • dir (String)

    Directory to store the tree file

  • name (String)

    Base name of the BTree related files in ‘dir’

  • order (Integer)

    The maximum number of keys per node. This number must be odd and larger than 2 and smaller than 2**16 - 1.

  • progressmeter (ProgressMeter)

    reference to a ProgressMeter object



51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
# File 'lib/perobs/BTree.rb', line 51

def initialize(dir, name, order, progressmeter)
  @dir = dir
  @name = name
  @progressmeter = progressmeter

  unless order > 2
    PEROBS.log.fatal "BTree order must be larger than 2, not #{order}"
  end
  unless order % 2 == 1
    PEROBS.log.fatal "BTree order must be an uneven number, not #{order}"
  end
  unless order < 2 ** 16 - 1
    PEROBS.log.fatal "BTree order must be smaller than #{2**16 - 1}"
  end
  @order = order

  # This EquiBlobsFile contains the nodes of the BTree.
  @nodes = EquiBlobsFile.new(@dir, @name, @progressmeter,
                             BTreeNode::node_bytes(@order))
  @nodes.register_custom_data('first_leaf')
  @nodes.register_custom_data('last_leaf')
  @nodes.register_custom_data('btree_size')
  @node_cache = PersistentObjectCache.new(16384, 5000, BTreeNode, self)
  @root = @first_leaf = @last_leaf = nil
  @size = 0

  # This BTree implementation uses a write cache to improve write
  # performance of multiple successive read/write operations. This also
  # means that data may not be written on the backing store until the
  # sync() or close() methods have been called. A bug in the program or a
  # premature program termination can lead to data loss. To detect such
  # situations, we use a lock file whenever there are pending writes.
  @is_dirty = false
  @dirty_flag = LockFile.new(File.join(@dir, name + '.dirty'),
                             { :timeout_secs => 0 })
end

Instance Attribute Details

#first_leafObject (readonly)

Returns the value of attribute first_leaf.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def first_leaf
  @first_leaf
end

#last_leafObject (readonly)

Returns the value of attribute last_leaf.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def last_leaf
  @last_leaf
end

#node_cacheObject (readonly)

Returns the value of attribute node_cache.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def node_cache
  @node_cache
end

#nodesObject (readonly)

Returns the value of attribute nodes.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def nodes
  @nodes
end

#orderObject (readonly)

Returns the value of attribute order.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def order
  @order
end

#sizeObject (readonly)

Returns the value of attribute size.



43
44
45
# File 'lib/perobs/BTree.rb', line 43

def size
  @size
end

Instance Method Details

#check(&block) ⇒ Boolean

Check if the tree file contains any errors.

Returns:

  • (Boolean)

    true if no erros were found, false otherwise



163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
# File 'lib/perobs/BTree.rb', line 163

def check(&block)
  sync
  return false unless @nodes.check

  entries = 0
  res = true
  @progressmeter.start('Checking index structure', @size) do |pm|
    res = @root.check do |k, v|
      pm.update(entries += 1)
      block_given? ? yield(k, v) : true
    end
  end

  unless entries == @size
    PEROBS.log.error "The BTree size (#{@size}) and the number of " +
      "found entries (#{entries}) don't match"
    return false
  end

  res
end

#clearObject

Clear all pools and forget any registered spaces.



136
137
138
139
140
141
# File 'lib/perobs/BTree.rb', line 136

def clear
  @node_cache.clear
  @nodes.clear
  @size = 0
  set_root(BTreeNode::create(self))
end

#closeObject

Close the tree file.



119
120
121
122
123
124
125
126
127
128
# File 'lib/perobs/BTree.rb', line 119

def close

  def val_perc(value, total)
    "#{value} (#{(value.to_f / total*100.0).to_i}%)"
  end

  sync
  @nodes.close
  @root = nil
end

#delete_node(address) ⇒ Object

Delete the node at the given address in the BTree file.

Parameters:

  • address (Integer)

    address in file



251
252
253
254
# File 'lib/perobs/BTree.rb', line 251

def delete_node(address)
  @node_cache.delete(address)
  @nodes.delete_blob(address)
end

#each {|key, value| ... } ⇒ Object

Iterate over all key/value pairs that are stored in the tree.

Yields:

  • (key, value)


245
246
247
# File 'lib/perobs/BTree.rb', line 245

def each(&block)
  @root.each(&block)
end

#entries_countInteger

Returns The number of entries stored in the tree.

Returns:

  • (Integer)

    The number of entries stored in the tree.



257
258
259
# File 'lib/perobs/BTree.rb', line 257

def entries_count
  @size
end

#eraseObject

Erase the backing store of the BTree. This method should only be called when not having the BTree open. And it obviously and permanently erases all stored data from the BTree.



146
147
148
149
150
151
# File 'lib/perobs/BTree.rb', line 146

def erase
  @nodes.erase
  @size = 0
  @root = nil
  @dirty_flag.forced_unlock
end

#get(key) ⇒ Integer or nil

Retrieve the value associated with the given key. If no entry was found, return nil.

Parameters:

  • key (Integer)

    Unique key

Returns:

  • (Integer or nil)

    found value or nil



219
220
221
# File 'lib/perobs/BTree.rb', line 219

def get(key)
  @root.get(key)
end

#insert(key, value) ⇒ Object

Insert a new value into the tree using the key as a unique index. If the key already exists the old value will be overwritten.

Parameters:

  • key (Integer)

    Unique key

  • value (Integer)

    value



210
211
212
213
# File 'lib/perobs/BTree.rb', line 210

def insert(key, value)
  @size += 1 if @root.insert(key, value)
  @node_cache.flush
end

#is_open?Boolean

Returns true if file is currently open.

Returns:

  • (Boolean)

    true if file is currently open



131
132
133
# File 'lib/perobs/BTree.rb', line 131

def is_open?
  !@root.nil?
end

#open(file_must_exist = false) ⇒ Object

Open the tree file.



89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
# File 'lib/perobs/BTree.rb', line 89

def open(file_must_exist = false)
  if @dirty_flag.is_locked?
    PEROBS.log.fatal "Index file #{@nodes.file_name} is already " +
      "locked"
  end
  if file_must_exist && !@nodes.file_exist?
    PEROBS.log.fatal "Index file #{@nodes.file_name} does not exist"
  end

  @node_cache.clear
  @nodes.open

  if @nodes.total_entries == 0
    # We've created a new nodes file
    node = BTreeNode::create(self)
  else
    # We are loading an existing tree.
    node = BTreeNode::load_and_link(self, @nodes.first_entry)
    @first_leaf = BTreeNode::load_and_link(
      self, @nodes.get_custom_data('first_leaf'))
    @last_leaf = BTreeNode::load_and_link(
      self, @nodes.get_custom_data('last_leaf'))
  end
  set_root(node)

  # Get the total number of entries that are stored in the tree.
  @size = @nodes.get_custom_data('btree_size')
end

#remove(key) ⇒ Object

Find and remove the value associated with the given key. If no entry was found, return nil, otherwise the found value.



225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
# File 'lib/perobs/BTree.rb', line 225

def remove(key)
  @size -= 1 unless (removed_value = @root.remove(key)).nil?

  # Check if the root node only contains one child link after the delete
  # operation. Then we can delete that node and pull the tree one level
  # up. This could happen for a sequence of nodes that all got merged to
  # single child nodes.
  while !@root.is_leaf && @root.children.size == 1
    old_root = @root
    set_root(@root.children.first)
    @root.parent = nil
    delete_node(old_root.node_address)
  end

  @node_cache.flush
  removed_value
end

#set_first_leaf(node) ⇒ Object

Set the address of the first leaf node.

Parameters:



194
195
196
197
# File 'lib/perobs/BTree.rb', line 194

def set_first_leaf(node)
  @first_leaf = node
  @nodes.set_custom_data('first_leaf', node.node_address)
end

#set_last_leaf(node) ⇒ Object

Set the address of the last leaf node.

Parameters:



201
202
203
204
# File 'lib/perobs/BTree.rb', line 201

def set_last_leaf(node)
  @last_leaf = node
  @nodes.set_custom_data('last_leaf', node.node_address)
end

#set_root(node) ⇒ Object

Register a new node as root node of the tree.

Parameters:



187
188
189
190
# File 'lib/perobs/BTree.rb', line 187

def set_root(node)
  @root = node
  @nodes.first_entry = node.node_address
end

#syncObject

Flush all pending modifications into the tree file.



154
155
156
157
158
159
# File 'lib/perobs/BTree.rb', line 154

def sync
  @node_cache.flush(true)
  @nodes.set_custom_data('btree_size', @size)
  @nodes.sync
  @dirty_flag.unlock if @dirty_flag.is_locked?
end

#to_sString

Returns Human reable form of the tree.

Returns:

  • (String)

    Human reable form of the tree.



262
263
264
# File 'lib/perobs/BTree.rb', line 262

def to_s
  @root.to_s
end

#val_perc(value, total) ⇒ Object



121
122
123
# File 'lib/perobs/BTree.rb', line 121

def val_perc(value, total)
  "#{value} (#{(value.to_f / total*100.0).to_i}%)"
end