Class: PEROBS::BTreeDB

Inherits:
DataBase
Defined in:
lib/perobs/BTreeDB.rb

Overview

This class implements a BTree database using filesystem directories as nodes and blob files as leaves. The BTree grows with the number of stored entries. Each leaf blob can hold a fixed maximum number of entries. If more entries need to be stored, the blob is replaced by a node with multiple new leaves that store the entries of the replaced blob. The leaves are implemented by the BTreeBlob class.
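
The growth rule described above can be illustrated with a toy in-memory model. This is only a sketch, not the PEROBS implementation: `ToyTree`, `Leaf`, `Node` and `slot` are hypothetical names, Hashes stand in for directories and blob files, and the choice to pick the child slot from successive groups of `dir_bits` low-order ID bits is an assumption made for the example.

```ruby
# Toy model: a Leaf stands in for a blob file, a Node for a directory.
Leaf = Struct.new(:entries)   # entries: Hash of id => value
Node = Struct.new(:children)  # children: Hash of slot => Leaf or Node

class ToyTree
  attr_reader :root

  def initialize(dir_bits: 4, max_blob_size: 4)
    @dir_bits = dir_bits
    @mask = 2**dir_bits - 1
    @max_blob_size = max_blob_size
    @root = Leaf.new({})
  end

  def put(id, value)
    @root = insert(@root, id, value, 0)
  end

  def get(id)
    node = @root
    level = 0
    while node.is_a?(Node)
      node = node.children[slot(id, level)]
      return nil if node.nil?
      level += 1
    end
    node.entries[id]
  end

  # Number of Node levels above the deepest leaf.
  def depth(node = @root)
    return 0 if node.is_a?(Leaf)
    1 + node.children.values.map { |child| depth(child) }.max
  end

  private

  # Assumed addressing: level 0 uses the lowest dir_bits bits of the ID,
  # level 1 the next dir_bits bits, and so on.
  def slot(id, level)
    (id >> (level * @dir_bits)) & @mask
  end

  def insert(node, id, value, level)
    if node.is_a?(Node)
      s = slot(id, level)
      child = node.children[s] || Leaf.new({})
      node.children[s] = insert(child, id, value, level + 1)
      return node
    end

    node.entries[id] = value
    return node if node.entries.size <= @max_blob_size

    # The leaf is over capacity: replace it by a node and move its
    # entries into new leaves below that node.
    new_node = Node.new({})
    node.entries.each { |k, v| insert(new_node, k, v, level) }
    new_node
  end
end

tree = ToyTree.new(dir_bits: 4, max_blob_size: 4)
(0..3).each { |id| tree.put(id, "value #{id}") }
puts tree.root.class   # Leaf
tree.put(4, 'value 4') # fifth entry exceeds max_blob_size
puts tree.root.class   # Node
```

Lookups stay cheap in this model because each level only inspects a fixed group of ID bits before reaching a small leaf.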

Methods inherited from DataBase

#check_option, #close, #deserialize, #open, #serialize

Constructor Details

#initialize(db_name, options = {}) ⇒ BTreeDB

Create a new BTreeDB object.

Parameters:

  • db_name (String)

    name of the DB directory

  • options (Hash) (defaults to: {})

    Options to customize the behavior. Currently only the following options are supported:

    :serializer : Can be :marshal, :json or :yaml.

    :dir_bits : The number of bits to use for the BTree nodes. The value must be between 4 and 14. The larger the number, the more back-end directories are being used. The default is 12, which results in 4096 directories per node.

    :max_blob_size : The maximum number of entries in the BTree leaf nodes. The insert/find/delete time grows linearly with the size.


# File 'lib/perobs/BTreeDB.rb', line 60

def initialize(db_name, options = {})
  super(options)

  @db_dir = db_name
  # Create the database directory if it doesn't exist yet.
  ensure_dir_exists(@db_dir)

  # Read the existing DB config.
  @config = get_hash('config')
  check_option('serializer')

  # Check and set @dir_bits, the number of bits used for each tree level.
  @dir_bits = options[:dir_bits] || 12
  if @dir_bits < 4 || @dir_bits > 14
    PEROBS.log.fatal "dir_bits option (#{@dir_bits}) must be between 4 " +
      "and 14"
  end
  check_option('dir_bits')

  @max_blob_size = options[:max_blob_size] || 32
  if @max_blob_size < 4 || @max_blob_size > 128
    PEROBS.log.fatal "max_blob_size option (#{@max_blob_size}) must be " +
      "between 4 and 128"
  end
  check_option('max_blob_size')

  put_hash('config', @config)

  # This format string is used to create the directory name.
  @dir_format_string = "%0#{(@dir_bits / 4) +
                            (@dir_bits % 4 == 0 ? 0 : 1)}X"
  # Bit mask to extract the dir_bits LSBs.
  @dir_mask = 2 ** @dir_bits - 1
end
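
The last two assignments in the constructor are easier to follow with concrete numbers. The sketch below reuses the same two expressions inside hypothetical helper methods (`dir_format_string`, `dir_mask`, `dir_name`). That a directory name is the masked ID rendered with the format string is an assumption for illustration; the constructor only shows how the two values are computed.

```ruby
# Same expression as in the constructor: one hex digit covers 4 bits,
# so the digit count is dir_bits / 4, rounded up.
def dir_format_string(dir_bits)
  "%0#{(dir_bits / 4) + (dir_bits % 4 == 0 ? 0 : 1)}X"
end

# Same expression as in the constructor: a mask with the lowest
# dir_bits bits set.
def dir_mask(dir_bits)
  2**dir_bits - 1
end

# Assumption: the masked ID is formatted to obtain a directory name.
def dir_name(id, dir_bits)
  format(dir_format_string(dir_bits), id & dir_mask(dir_bits))
end

puts dir_format_string(12)  # %03X
puts dir_mask(12)           # 4095
puts dir_name(0x12345, 12)  # 345
puts dir_name(0x12345, 14)  # 2345
```

With the default of 12 bits the mask has 4096 possible values, which matches the 4096 directories per node stated for the :dir_bits option.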

Instance Attribute Details

#max_blob_size ⇒ Object (readonly)

Returns the value of attribute max_blob_size.



# File 'lib/perobs/BTreeDB.rb', line 45

def max_blob_size
  @max_blob_size
end

Class Method Details

.delete_db(db_name) ⇒ Object



# File 'lib/perobs/BTreeDB.rb', line 101

def BTreeDB::delete_db(db_name)
  FileUtils.rm_rf(db_name)
end

Instance Method Details

#check(id, repair) ⇒ TrueClass/FalseClass

Check if the stored object is syntactically correct.

Parameters:

  • id (Integer)

    Object ID

  • repair (TrueClass/FalseClass)

    True if a repair attempt should be made.

Returns:

  • (TrueClass/FalseClass)

    True if the object is OK, otherwise false.



# File 'lib/perobs/BTreeDB.rb', line 197

def check(id, repair)
  begin
    get_object(id)
  rescue => e
    $stderr.puts "Cannot read object with ID #{id}: #{e.message}"
    return false
  end

  true
end

#check_db(repair = false) ⇒ Object

Basic consistency check.

Parameters:

  • repair (TrueClass/FalseClass) (defaults to: false)

    True if found errors should be repaired.

Returns:

  • number of errors found



# File 'lib/perobs/BTreeDB.rb', line 186

def check_db(repair = false)
  each_blob { |blob| blob.check(repair) }
  0
end

#clear_marks ⇒ Object

This method must be called to initiate the marking process.



# File 'lib/perobs/BTreeDB.rb', line 155

def clear_marks
  each_blob { |blob| blob.clear_marks }
end

#delete_database ⇒ Object

Delete the entire database. The database is no longer usable after this method has been called.



# File 'lib/perobs/BTreeDB.rb', line 97

def delete_database
  FileUtils.rm_rf(@db_dir)
end

#delete_unmarked_objects ⇒ Array

Permanently delete all objects that have not been marked. Those are orphaned and are no longer referenced by any actively used object.

Returns:

  • (Array)

    List of IDs that have been removed from the DB.



# File 'lib/perobs/BTreeDB.rb', line 162

def delete_unmarked_objects
  deleted_ids = []
  each_blob { |blob| deleted_ids += blob.delete_unmarked_entries }
  deleted_ids
end

#get_hash(name) ⇒ Hash

Load the Hash with the given name.

Parameters:

  • name (String)

    Name of the hash.

Returns:

  • (Hash)

    A Hash that maps String objects to strings or numbers.



# File 'lib/perobs/BTreeDB.rb', line 127

def get_hash(name)
  file_name = File.join(@db_dir, name + '.json')
  return ::Hash.new unless File.exist?(file_name)

  begin
    json = File.read(file_name)
  rescue => e
    PEROBS.log.fatal "Cannot read hash file '#{file_name}': #{e.message}"
  end
  JSON.parse(json, :create_additions => true)
end

#get_object(id) ⇒ Hash

Load the given object from the filesystem.

Parameters:

  • id (Integer)

    object ID

Returns:

  • (Hash)

    Object as defined by PEROBS::ObjectBase or nil if ID does not exist



# File 'lib/perobs/BTreeDB.rb', line 149

def get_object(id)
  return nil unless (blob = find_blob(id)) && (obj = blob.read_object(id))
  deserialize(obj)
end

#include?(id) ⇒ Boolean

Returns true if the object with the given ID exists.

Parameters:

  • id (Integer)

Returns:

  • (Boolean)


# File 'lib/perobs/BTreeDB.rb', line 107

def include?(id)
  !(blob = find_blob(id)).nil? && !blob.find(id).nil?
end

#is_marked?(id, ignore_errors = false) ⇒ Boolean

Check if the object is marked.

Parameters:

  • id (Integer)

    ID of the object to check

  • ignore_errors (Boolean) (defaults to: false)

    If set to true no errors will be raised for non-existing objects.

Returns:

  • (Boolean)


# File 'lib/perobs/BTreeDB.rb', line 178

def is_marked?(id, ignore_errors = false)
  (blob = find_blob(id)) && blob.is_marked?(id, ignore_errors)
end

#mark(id) ⇒ Object

Mark an object.

Parameters:

  • id (Integer)

    ID of the object to mark



# File 'lib/perobs/BTreeDB.rb', line 170

def mark(id)
  (blob = find_blob(id)) && blob.mark(id)
end
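
Taken together, #clear_marks, #mark, #is_marked? and #delete_unmarked_objects form a mark-and-sweep cycle. The sketch below shows one plausible calling sequence against an in-memory stand-in. `ToyStore`, the `'refs'` key and the traversal loop are hypothetical and exist only for this example; how callers actually discover references is not described on this page.

```ruby
# In-memory stand-in that mimics the marking methods documented here.
class ToyStore
  def initialize
    @objects = {}
    @marks = {}
  end

  def put_object(obj, id)
    @objects[id] = obj
  end

  def get_object(id)
    @objects[id]
  end

  def include?(id)
    @objects.key?(id)
  end

  def clear_marks
    @marks.clear
  end

  def mark(id)
    @marks[id] = true if @objects.key?(id)
  end

  def is_marked?(id)
    @marks.key?(id)
  end

  # Removes every object without a mark and returns the removed IDs.
  def delete_unmarked_objects
    deleted_ids = @objects.keys.reject { |id| @marks.key?(id) }
    deleted_ids.each { |id| @objects.delete(id) }
    deleted_ids
  end
end

store = ToyStore.new
store.put_object({ 'refs' => [2] }, 1) # root object, references 2
store.put_object({ 'refs' => [] }, 2)
store.put_object({ 'refs' => [] }, 3)  # referenced by nobody

# 1. Reset all marks. 2. Mark everything reachable from the root.
# 3. Sweep whatever is left unmarked.
store.clear_marks
pending = [1]
until pending.empty?
  id = pending.pop
  next if store.is_marked?(id)

  store.mark(id)
  pending.concat(store.get_object(id)['refs'])
end
deleted = store.delete_unmarked_objects
puts deleted.inspect # [3]
```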

#put_hash(name, hash) ⇒ Object

Store a simple Hash as a JSON encoded file into the DB directory.

Parameters:

  • name (String)

    Name of the hash. Will be used as file name.

  • hash (Hash)

    A Hash that maps String objects to strings or numbers.



# File 'lib/perobs/BTreeDB.rb', line 115

def put_hash(name, hash)
  file_name = File.join(@db_dir, name + '.json')
  begin
    RobustFile.write(file_name, hash.to_json)
  rescue IOError => e
    PEROBS.log.fatal "Cannot write hash file '#{file_name}': #{e.message}"
  end
end
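
The #put_hash and #get_hash pair amounts to a JSON round trip through a file named after the hash. A minimal standalone sketch follows, assuming plain `File.write` in place of the PEROBS-internal `RobustFile.write` and a temporary directory in place of the DB directory. The helper names and the sample values are hypothetical.

```ruby
require 'json'
require 'tmpdir'

# Stand-in for #put_hash: File.write replaces RobustFile.write here.
def put_hash_sketch(db_dir, name, hash)
  File.write(File.join(db_dir, name + '.json'), hash.to_json)
end

# Stand-in for #get_hash: a missing file yields an empty Hash.
def get_hash_sketch(db_dir, name)
  file_name = File.join(db_dir, name + '.json')
  return {} unless File.exist?(file_name)

  JSON.parse(File.read(file_name), create_additions: true)
end

sample = { 'dir_bits' => 12, 'max_blob_size' => 32, 'label' => 'demo' }
restored = nil
missing = nil
Dir.mktmpdir do |db_dir|
  put_hash_sketch(db_dir, 'config', sample)
  restored = get_hash_sketch(db_dir, 'config')
  missing = get_hash_sketch(db_dir, 'no_such_hash')
end

puts restored == sample # true
puts missing.inspect    # {}
```

JSON object keys are always strings, so a Hash with Symbol keys would come back with String keys. That is consistent with the documented restriction to String keys and string or number values.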

#put_object(obj, id) ⇒ Object

Store the given object into the cluster files.

Parameters:

  • obj (Hash)

    Object as defined by PEROBS::ObjectBase

  • id (Integer)

    Object ID



# File 'lib/perobs/BTreeDB.rb', line 141

def put_object(obj, id)
  find_blob(id, true).write_object(id, serialize(obj))
end

#put_raw_object(raw, id) ⇒ Object

Store the given serialized object into the cluster files. This method is for internal use only!

Parameters:

  • raw (String)

    Serialized Object as defined by PEROBS::ObjectBase

  • id (Integer)

    Object ID



# File 'lib/perobs/BTreeDB.rb', line 212

def put_raw_object(raw, id)
  find_blob(id, true).write_object(id, raw)
end