Class: PEROBS::BTreeDB

Inherits:
DataBase
Defined in:
lib/perobs/BTreeDB.rb

Overview

This class implements a BTree database that uses filesystem directories as nodes and blob files as leaves. The BTree grows with the number of stored entries. Each leaf blob can hold a fixed number of entries. If more entries need to be stored, the blob is replaced by a node with multiple new leaves that store the entries of the previous leaf. The leaves are implemented by the BTreeBlob class.
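The growth scheme described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the PEROBS implementation: `ToyTree`, `Leaf`, `Node` and `levels_to` are hypothetical names, hashes stand in for directories and blob files, and the choice of which ID bits select the child at each level is assumed.

```ruby
# Toy in-memory model of a tree whose leaves split when they are full.
# Hypothetical illustration only; it does not touch the filesystem.
class ToyTree
  # Stands in for a blob file holding a bounded number of entries.
  class Leaf
    attr_reader :entries

    def initialize
      @entries = {}
    end
  end

  # Stands in for a directory holding child nodes or leaves.
  class Node
    attr_reader :children

    def initialize
      @children = {}
    end
  end

  def initialize(dir_bits: 4, max_blob_size: 4)
    @dir_bits = dir_bits
    @mask = 2**dir_bits - 1
    @max_blob_size = max_blob_size
    @root = Node.new
  end

  def put(id, value)
    insert(@root, 0, id, value)
  end

  def get(id)
    leaf, = find(id)
    leaf && leaf.entries[id]
  end

  # Number of tree levels walked to reach the slot for this ID.
  def levels_to(id)
    find(id).last
  end

  private

  # Assumption: level N is selected by the N-th group of dir_bits bits.
  def child_key(id, level)
    (id >> (level * @dir_bits)) & @mask
  end

  def find(id)
    node = @root
    level = 0
    while node.is_a?(Node)
      node = node.children[child_key(id, level)]
      level += 1
    end
    [node, level]
  end

  def insert(node, level, id, value)
    key = child_key(id, level)
    child = node.children[key] ||= Leaf.new
    if child.is_a?(Leaf) && !child.entries.key?(id) &&
       child.entries.size >= @max_blob_size
      # The leaf is full: replace it with a node and redistribute entries.
      old_entries = child.entries
      child = node.children[key] = Node.new
      old_entries.each { |i, v| insert(child, level + 1, i, v) }
    end
    if child.is_a?(Node)
      insert(child, level + 1, id, value)
    else
      child.entries[id] = value
    end
  end
end
```

With `dir_bits: 4` and `max_blob_size: 4`, the IDs 0, 16, 32 and 48 share one leaf. Storing ID 64 overflows that leaf, so it is replaced by a node and the entries are spread over new leaves one level down.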

Instance Attribute Summary

Class Method Summary

Instance Method Summary

Methods inherited from DataBase

#check_option, #close, #deserialize, #open, #serialize

Constructor Details

#initialize(db_name, options = {}) ⇒ BTreeDB

Create a new BTreeDB object.

Parameters:

  • db_name (String)

    name of the DB directory

  • options (Hash) (defaults to: {})

    Options to customize the behavior. Currently only the following options are supported:

    :serializer : Can be :marshal, :json or :yaml.

    :dir_bits : The number of bits to use for the BTree nodes. The value must be between 4 and 14. The larger the number, the more back-end directories are used. The default is 12, which results in 4096 directories per node.

    :max_blob_size : The maximum number of entries in the BTree leaf nodes. The value must be between 4 and 128; the default is 32. The insert/find/delete time grows linearly with the size.


# File 'lib/perobs/BTreeDB.rb', line 61

def initialize(db_name, options = {})
  super(options)

  @db_dir = db_name
  # Create the database directory if it doesn't exist yet.
  ensure_dir_exists(@db_dir)

  # Read the existing DB config.
  @config = get_hash('config')
  check_option('serializer')

  # Check and set @dir_bits, the number of bits used for each tree level.
  @dir_bits = options[:dir_bits] || 12
  if @dir_bits < 4 || @dir_bits > 14
    PEROBS.log.fatal "dir_bits option (#{@dir_bits}) must be between 4 " +
      "and 14"
  end
  check_option('dir_bits')

  @max_blob_size = options[:max_blob_size] || 32
  if @max_blob_size < 4 || @max_blob_size > 128
    PEROBS.log.fatal "max_blob_size option (#{@max_blob_size}) must be " +
      "between 4 and 128"
  end
  check_option('max_blob_size')

  put_hash('config', @config)

  # This format string is used to create the directory name.
  @dir_format_string = "%0#{(@dir_bits / 4) +
                            (@dir_bits % 4 == 0 ? 0 : 1)}X"
  # Bit mask to extract the dir_bits LSBs.
  @dir_mask = 2 ** @dir_bits - 1
end
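The last two assignments in the constructor are easier to follow with concrete numbers. The sketch below recomputes the format string and the mask in the same way and then applies them to an ID. `dir_names_for` is a hypothetical helper, and consuming the ID in `dir_bits`-sized groups, lowest bits first, is an assumption, since `find_blob` is not shown on this page.

```ruby
# Hypothetical helper: derive per-level directory names from an ID.
def dir_names_for(id, dir_bits, levels)
  # One hex digit covers 4 bits, so round the digit count up.
  digits = (dir_bits / 4) + (dir_bits % 4 == 0 ? 0 : 1)
  format_string = "%0#{digits}X"
  # Bit mask that keeps the dir_bits least significant bits.
  mask = 2**dir_bits - 1

  names = []
  levels.times do
    names << format(format_string, id & mask)
    # Assumption: the next level uses the next group of bits.
    id >>= dir_bits
  end
  names
end

p dir_names_for(0x12345678, 12, 2)
p dir_names_for(0x7F, 6, 2)
```

With the default of 12 bits, each name has three hex digits and can take 2 ** 12 = 4096 distinct values, which matches the 4096 directories per node mentioned in the option description.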

Instance Attribute Details

#max_blob_size ⇒ Object (readonly)

Returns the value of attribute max_blob_size.



# File 'lib/perobs/BTreeDB.rb', line 46

def max_blob_size
  @max_blob_size
end

Class Method Details

.delete_db(db_name) ⇒ Object



# File 'lib/perobs/BTreeDB.rb', line 102

def BTreeDB::delete_db(db_name)
  FileUtils.rm_rf(db_name)
end

Instance Method Details

#check(id, repair) ⇒ TrueClass/FalseClass

Check if the stored object is syntactically correct.

Parameters:

  • id (Integer)

    Object ID

  • repair (TrueClass/FalseClass)

    True if a repair attempt should be made.

Returns:

  • (TrueClass/FalseClass)

    True if the object is OK, otherwise false.



# File 'lib/perobs/BTreeDB.rb', line 198

def check(id, repair)
  begin
    get_object(id)
  rescue => e
    $stderr.puts "Cannot read object with ID #{id}: #{e.message}"
    return false
  end

  true
end

#check_db(repair = false) ⇒ Object

Basic consistency check.

Parameters:

  • repair (TrueClass/FalseClass) (defaults to: false)

    True if found errors should be repaired.

Returns:

  • (Integer)

    Number of errors found. The implementation shown here always returns 0.



# File 'lib/perobs/BTreeDB.rb', line 187

def check_db(repair = false)
  each_blob { |blob| blob.check(repair) }
  0
end

#clear_marks ⇒ Object

This method must be called to initiate the marking process.



# File 'lib/perobs/BTreeDB.rb', line 156

def clear_marks
  each_blob { |blob| blob.clear_marks }
end

#delete_database ⇒ Object

Delete the entire database. The database is no longer usable after this method has been called.



# File 'lib/perobs/BTreeDB.rb', line 98

def delete_database
  FileUtils.rm_rf(@db_dir)
end

#delete_unmarked_objects(&block) ⇒ Array

Permanently delete all objects that have not been marked. Unmarked objects are orphaned: no actively used object references them any longer.

Returns:

  • (Array)

    List of IDs that have been removed from the DB.



# File 'lib/perobs/BTreeDB.rb', line 163

def delete_unmarked_objects(&block)
  deleted_ids = []
  each_blob { |blob| deleted_ids += blob.delete_unmarked_entries(&block) }
  deleted_ids
end
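The marking methods on this page are meant to be used in sequence: clear all marks, mark every object that is still reachable, then delete whatever remains unmarked. The sketch below shows that order on a toy in-memory store. `ToyMarkStore` is a hypothetical stand-in that only mirrors the method names documented here; it is not the PEROBS implementation, and it leaves out the optional block that `delete_unmarked_objects` accepts.

```ruby
# Hypothetical in-memory store that mimics the mark-and-sweep methods.
class ToyMarkStore
  def initialize
    @objects = {}
    @marked = {}
  end

  def put_object(obj, id)
    @objects[id] = obj
  end

  def include?(id)
    @objects.key?(id)
  end

  # Step 1: forget all marks from a previous run.
  def clear_marks
    @marked.clear
  end

  # Step 2: mark each object that is still in use.
  def mark(id)
    @marked[id] = true if @objects.key?(id)
  end

  def is_marked?(id)
    @marked.key?(id)
  end

  # Step 3: remove everything that was not marked and report the IDs.
  def delete_unmarked_objects
    deleted_ids = @objects.keys.reject { |id| @marked.key?(id) }
    deleted_ids.each { |id| @objects.delete(id) }
    deleted_ids
  end
end

store = ToyMarkStore.new
store.put_object({ 'name' => 'a' }, 1)
store.put_object({ 'name' => 'b' }, 2)
store.put_object({ 'name' => 'c' }, 3)

store.clear_marks
store.mark(1)
store.mark(3)
removed = store.delete_unmarked_objects
puts "removed IDs: #{removed.inspect}"
```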

#get_hash(name) ⇒ Hash

Load the Hash with the given name.

Parameters:

  • name (String)

    Name of the hash.

Returns:

  • (Hash)

    A Hash that maps String objects to strings or numbers.



# File 'lib/perobs/BTreeDB.rb', line 128

def get_hash(name)
  file_name = File.join(@db_dir, name + '.json')
  return ::Hash.new unless File.exist?(file_name)

  begin
    json = File.read(file_name)
  rescue => e
    PEROBS.log.fatal "Cannot read hash file '#{file_name}': #{e.message}"
  end
  JSON.parse(json, :create_additions => true)
end

#get_object(id) ⇒ Hash

Load the given object from the filesystem.

Parameters:

  • id (Integer)

    object ID

Returns:

  • (Hash)

    Object as defined by PEROBS::ObjectBase, or nil if the ID does not exist.



# File 'lib/perobs/BTreeDB.rb', line 150

def get_object(id)
  return nil unless (blob = find_blob(id)) && (obj = blob.read_object(id))
  deserialize(obj)
end

#include?(id) ⇒ Boolean

Return true if the object with the given ID exists.

Parameters:

  • id (Integer)

Returns:

  • (Boolean)


# File 'lib/perobs/BTreeDB.rb', line 108

def include?(id)
  !(blob = find_blob(id)).nil? && !blob.find(id).nil?
end

#is_marked?(id, ignore_errors = false) ⇒ Boolean

Check if the object is marked.

Parameters:

  • id (Integer)

    ID of the object to check

  • ignore_errors (Boolean) (defaults to: false)

    If set to true no errors will be raised for non-existing objects.

Returns:

  • (Boolean)


# File 'lib/perobs/BTreeDB.rb', line 179

def is_marked?(id, ignore_errors = false)
  (blob = find_blob(id)) && blob.is_marked?(id, ignore_errors)
end

#mark(id) ⇒ Object

Mark an object.

Parameters:

  • id (Integer)

    ID of the object to mark



# File 'lib/perobs/BTreeDB.rb', line 171

def mark(id)
  (blob = find_blob(id)) && blob.mark(id)
end

#put_hash(name, hash) ⇒ Object

Store a simple Hash as a JSON encoded file into the DB directory.

Parameters:

  • name (String)

    Name of the hash. Will be used as file name.

  • hash (Hash)

    A Hash that maps String objects to strings or numbers.



# File 'lib/perobs/BTreeDB.rb', line 116

def put_hash(name, hash)
  file_name = File.join(@db_dir, name + '.json')
  begin
    RobustFile.write(file_name, hash.to_json)
  rescue IOError => e
    PEROBS.log.fatal "Cannot write hash file '#{file_name}': #{e.message}"
  end
end
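A hash written with #put_hash is meant to be read back with #get_hash. The sketch below reproduces that round trip with standard library calls only. `put_hash_sketch`, `get_hash_sketch` and `round_trip_config` are hypothetical names, a temporary directory stands in for the DB directory, and plain `File.write` is used in place of the `RobustFile.write` call shown above.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical stand-in for #put_hash: write the hash as <name>.json.
def put_hash_sketch(dir, name, hash)
  File.write(File.join(dir, name + '.json'), hash.to_json)
end

# Hypothetical stand-in for #get_hash: an absent file yields an empty Hash.
def get_hash_sketch(dir, name)
  file_name = File.join(dir, name + '.json')
  return {} unless File.exist?(file_name)

  JSON.parse(File.read(file_name), create_additions: true)
end

# Write a config hash into a temporary directory and read it back.
def round_trip_config(config)
  Dir.mktmpdir do |dir|
    put_hash_sketch(dir, 'config', config)
    get_hash_sketch(dir, 'config')
  end
end

p round_trip_config('serializer' => 'json', 'dir_bits' => 12)
```

Because JSON object keys are always strings, a hash with Symbol keys comes back with String keys. This is one reason to stick to the documented shape of String keys mapping to strings or numbers.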

#put_object(obj, id) ⇒ Object

Store the given object into the cluster files.

Parameters:

  • obj (Hash)

    Object as defined by PEROBS::ObjectBase

  • id (Integer)

    Object ID



# File 'lib/perobs/BTreeDB.rb', line 142

def put_object(obj, id)
  find_blob(id, true).write_object(id, serialize(obj))
end

#put_raw_object(raw, id) ⇒ Object

Store the given serialized object into the cluster files. This method is for internal use only!

Parameters:

  • raw (String)

    Serialized Object as defined by PEROBS::ObjectBase

  • id (Integer)

    Object ID



# File 'lib/perobs/BTreeDB.rb', line 213

def put_raw_object(raw, id)
  find_blob(id, true).write_object(id, raw)
end