Class: PEROBS::Store

Inherits:
Object
Defined in:
lib/perobs/Store.rb

Overview

PEROBS::Store is a persistent storage system for Ruby objects. Regular Ruby objects are transparently stored in a back-end storage and retrieved when needed. It features a garbage collector that removes all objects that are no longer in use. A built-in cache keeps access latencies to recently used objects low and lazily flushes modified objects into the persistent back-end. The default back-end is a filesystem-based database. Alternatively, an Amazon DynamoDB can be used. Adding support for other key/value stores is straightforward. See PEROBS::DynamoDB for an example.

Persistent objects must be defined by deriving your class from PEROBS::Object, PEROBS::Array or PEROBS::Hash. Only instance variables that are declared via po_attr will be persistent. It is recommended that all references to other objects point to persistent objects as well. To create a new persistent object you must call new() on your Store instance. Don’t use the constructors of persistent classes directly. The store's new() method returns a proxy or delegator object that can be used like the actual object. By using delegators we can disconnect the actual object from the delegator handle.

require 'perobs'

class Person < PEROBS::Object

  po_attr :name, :mother, :father, :kids

  def initialize(cf, name)
    super(cf)
    attr_init(:name, name)
    attr_init(:kids, @store.new(PEROBS::Array))
  end

  def to_s
    "#{@name} is the child of #{self.mother ? self.mother.name : 'unknown'} " +
    "and #{self.father ? self.father.name : 'unknown'}."
  end

end

store = PEROBS::Store.new('family')
# Make Joe and Jane root objects; the names :joe and :jane are illustrative.
store[:joe] = joe = store.new(Person, 'Joe')
store[:jane] = jane = store.new(Person, 'Jane')
jim = store.new(Person, 'Jim')
jim.father = joe
joe.kids << jim
jim.mother = jane
jane.kids << jim
store.sync
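
The delegator idea described above can be sketched without PEROBS. The following is a minimal illustration under stated assumptions: TinyStore, TinyProxy and replace are hypothetical names invented for this sketch and are not part of the PEROBS API. The proxy holds only the store and an ID and resolves the real object on every call, so the store is free to swap or reload the real object behind the handle.

```ruby
# Hypothetical sketch of an ID-based proxy; not the PEROBS implementation.
class TinyStore
  def initialize
    @objects = {} # id => real object
    @next_id = 0
  end

  # Create the real object, keep it internally and hand out only a proxy.
  def new(klass, *args)
    id = (@next_id += 1)
    @objects[id] = klass.new(*args)
    TinyProxy.new(self, id)
  end

  def object_by_id(id)
    @objects[id]
  end

  # Swap the real object behind an ID, e.g. after reloading it from disk.
  def replace(id, obj)
    @objects[id] = obj
  end
end

class TinyProxy < BasicObject
  def initialize(store, id)
    @store = store
    @id = id
  end

  def _id
    @id
  end

  # Forward every other call to whatever object currently has this ID.
  def method_missing(name, *args, &block)
    @store.object_by_id(@id).__send__(name, *args, &block)
  end
end

store = TinyStore.new
list = store.new(Array)
list << 'Jim'
# The handle stays valid even though the object behind it is exchanged.
store.replace(list._id, ['Jim', 'Joe'])
puts list.size
```

Because the caller never holds the real object, only the handle, the store alone decides how long the real object stays in memory.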

Constructor Details

#initialize(data_base, options = {}) ⇒ Store

Create a new Store.

Parameters:

  • data_base (String)

    the name of the database

  • options (Hash) (defaults to: {})

    various options to affect the operation of the database. Currently the following options are supported:

    :engine : The class that provides the back-end storage engine. By default BTreeDB is used. A user can provide their own storage engine, which must conform to the same API exposed by BTreeBlobsDB.

    :cache_bits : The number of bits used for cache indexing. The cache will hold 2 to the power of cache_bits objects. There are separate caches for reading and writing. The default value is 16. It probably makes little sense to use much larger numbers than that.

    :serializer : Selects the format used to serialize the data. There are 3 different options:

      :marshal : Native Ruby serializer. Fastest option that can handle most Ruby data types. Its big disadvantage is the stability of the format: data written with one Ruby version may not be readable with another version.

      :json : About half as fast as marshal, but the format is rock solid and portable between languages. It only supports basic Ruby data types like String, Fixnum, Float, Array, Hash. This is the default option.

      :yaml : Can also handle most Ruby data types and is portable between Ruby versions (1.9 and later). Unfortunately, it is 10x slower than marshal.
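
The type-support trade-off between the three serializer options can be seen with Ruby's standard libraries alone. This sketch does not call PEROBS and makes no claim about how PEROBS invokes the serializers internally; it only round-trips one sample Hash (an assumption chosen for illustration) through Marshal, JSON and YAML.

```ruby
require 'json'
require 'yaml'

data = { 'name' => 'Jim', 'age' => 7, 'tags' => [:son, :student] }

# Marshal is Ruby-specific and preserves Symbols.
from_marshal = Marshal.load(Marshal.dump(data))

# JSON only knows basic types: Symbols come back as Strings.
from_json = JSON.parse(JSON.generate(data))

# YAML keeps Symbols, but they must be permitted explicitly when using
# the safe loader.
from_yaml = YAML.safe_load(YAML.dump(data), permitted_classes: [Symbol])

puts from_marshal == data
puts from_json['tags'].inspect
puts from_yaml == data
```

With the :json serializer, data that relies on Symbols or other non-basic types therefore needs to be converted by the application.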
    


# File 'lib/perobs/Store.rb', line 126

def initialize(data_base, options = {})
  # Create a backing store handler
  @db = (options[:engine] || BTreeDB).new(data_base, options)
  # Create a map that can translate classes to numerical IDs and vice
  # versa.
  @class_map = ClassMap.new(@db)

  # List of PEROBS objects that are currently available as Ruby objects
  # hashed by their ID.
  @in_memory_objects = {}

  # This objects keeps some counters of interest.
  @stats = Statistics.new

  # The Cache reduces read and write latencies by keeping a subset of the
  # objects in memory.
  @cache = Cache.new(options[:cache_bits] || 16)

  # The named (global) objects IDs hashed by their name
  unless (@root_objects = object_by_id(0))
    # The root object hash always has the object ID 0.
    @root_objects = _construct_po(Hash, 0)
    # Mark the root_objects object as modified.
    @cache.cache_write(@root_objects)
  end
end

Instance Attribute Details

#cache ⇒ Object (readonly)

Returns the value of attribute cache.



# File 'lib/perobs/Store.rb', line 96

def cache
  @cache
end

#class_map ⇒ Object (readonly)

Returns the value of attribute class_map.



# File 'lib/perobs/Store.rb', line 96

def class_map
  @class_map
end

#db ⇒ Object (readonly)

Returns the value of attribute db.



# File 'lib/perobs/Store.rb', line 96

def db
  @db
end

Instance Method Details

#[](name) ⇒ Object

Return the object with the provided name.

Parameters:

  • name (Symbol)

    A Symbol that specifies the name of the object to be returned.

Returns:

  • The requested object or nil if it doesn’t exist.



# File 'lib/perobs/Store.rb', line 225

def [](name)
  # Return nil if there is no object with that name.
  return nil unless (id = @root_objects[name])

  POXReference.new(self, id)
end

#[]=(name, obj) ⇒ PEROBS::Object

Store the provided object under the given name. Use this to make the object a root or top-level object (think global variable). Each store should have at least one root object. Objects that are not directly or indirectly reachable via any of the root objects are no longer accessible and will be garbage collected.

Parameters:

  • name (Symbol)

    The name to use.

  • obj (PEROBS::Object)

    The object to store

Returns:

  • (PEROBS::Object)

    The stored object, or nil if obj was nil and the entry was deleted.



# File 'lib/perobs/Store.rb', line 197

def []=(name, obj)
  # If the passed object is nil, we delete the entry if it exists.
  if obj.nil?
    @root_objects.delete(name)
    return nil
  end

  # We only allow derivatives of PEROBS::Object to be stored in the
  # store.
  unless obj.is_a?(ObjectBase)
    raise ArgumentError, 'Object must be of class PEROBS::Object but ' +
                         "is of class #{obj.class}"
  end

  unless obj.store == self
    raise ArgumentError, 'The object does not belong to this store.'
  end

  # Store the name and mark the name list as modified.
  @root_objects[name] = obj._id

  obj
end

#_collect(id, ignore_errors = false) ⇒ Object

Remove the object from the in-memory list. This is an internal method and should never be called from user code.

Parameters:

  • id (Fixnum or Bignum)

    Object ID of object to remove from the list

  • ignore_errors (TrueClass/FalseClass) (defaults to: false)

    If true, no error is raised when the object is not in the in-memory list.



# File 'lib/perobs/Store.rb', line 384

def _collect(id, ignore_errors = false)
  unless ignore_errors || @in_memory_objects.include?(id)
    raise RuntimeError, "Object with id #{id} is currently not in memory"
  end
  @in_memory_objects.delete(id)
end

#_construct_po(klass, id, *args) ⇒ BasicObject

For library internal use only! This method will create a new PEROBS object.

Parameters:

  • klass (BasicObject)

    Class of the object to create

  • id (Fixnum, Bignum)

    Requested object ID

  • args (Array)

    Arguments to pass to the object constructor.

Returns:

  • (BasicObject)

    Newly constructed PEROBS object



# File 'lib/perobs/Store.rb', line 178

def _construct_po(klass, id, *args)
  klass.new(Handle.new(self, id), *args)
end

#_new_id ⇒ Fixnum or Bignum

Internal method. Don’t use this outside of this library! Generate a new unique ID that is not used by any other object. It uses random numbers between 0 and 2**64 - 1.

Returns:

  • (Fixnum or Bignum)


# File 'lib/perobs/Store.rb', line 359

def _new_id
  begin
    # Generate a random number. It's recommended to not store more than
    # 2**62 objects in the same store.
    id = rand(2**64)
    # Ensure that we don't have already another object with this ID.
  end while @in_memory_objects.include?(id) || @db.include?(id)

  id
end
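
The same draw-and-retry idea can be tried outside the library. In this minimal sketch, used_ids and fresh_id are hypothetical names; a Set stands in for the combined in-memory list and database lookup that the real method consults.

```ruby
require 'set'

# Hypothetical stand-in for the IDs already known to the store.
used_ids = Set.new([0])

# Draw random 64-bit IDs until one is found that is not taken yet.
def fresh_id(used_ids)
  loop do
    id = rand(2**64)
    return id unless used_ids.include?(id)
  end
end

id = fresh_id(used_ids)
used_ids << id
puts id
```

With a 64-bit ID space, a retry is extremely unlikely for any realistic number of objects, so the loop almost always finishes on the first draw.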

#_register_in_memory(obj, id) ⇒ Object

Internal method. Don’t use this outside of this library! Add the new object to the in-memory list. We only store a weak reference to the object so it can be garbage collected. When this happens the object finalizer is triggered and calls _collect() to remove the object from this hash again.

Parameters:

  • obj (BasicObject)

    Object to register

  • id (Fixnum or Bignum)

    object ID



# File 'lib/perobs/Store.rb', line 377

def _register_in_memory(obj, id)
  @in_memory_objects[id] = WeakRef.new(obj)
end
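
How a weak reference behaves can be shown with Ruby's standard weakref library. This is a sketch, not PEROBS code: in_memory and lookup are hypothetical names, and unlike the library (which reloads a collected object from the database) the lookup here simply drops the stale entry and returns nil.

```ruby
require 'weakref'

in_memory = {}

# The local variable keeps a strong reference, so the object stays alive.
obj = 'some persistent object'.dup
in_memory[42] = WeakRef.new(obj)

# Resolve an ID to the live object, or nil if there is no entry or Ruby's
# GC has already reclaimed the object.
def lookup(in_memory, id)
  ref = in_memory[id]
  return nil unless ref

  ref.__getobj__
rescue WeakRef::RefError
  in_memory.delete(id)
  nil
end

puts lookup(in_memory, 42)
puts lookup(in_memory, 7).inspect
```

The hash entry alone does not keep the object alive, which is why every access has to be prepared for WeakRef::RefError.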

#check(repair = false) ⇒ Fixnum

This method can be used to check the database and optionally repair it. The repair is purely structural. It cannot ensure that the stored data is still correct. For example, if a reference to a non-existing or unreadable object is found, the reference will simply be deleted.

Parameters:

  • repair (TrueClass/FalseClass) (defaults to: false)

    true if a repair attempt should be made.

Returns:

  • (Fixnum)

    The number of references to bad objects found.



# File 'lib/perobs/Store.rb', line 291

def check(repair = false)
  # All objects must have in-db version.
  sync
  # Run basic consistency checks first.
  @db.check_db(repair)

  # We will use the mark to mark all objects that we have checked already.
  # Before we start, we need to clear all marks.
  @db.clear_marks

  errors = 0
  @root_objects.each do |name, id|
    errors += check_object(id, repair)
  end
  @root_objects.delete_if { |name, id| !@db.check(id, false) }

  errors
end
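
What a structural repair means can be illustrated on a toy object graph. The sketch below is an assumption-laden model, not the library's check_object: db is a plain Hash mapping hypothetical object IDs to the IDs they reference, and check_refs is an invented helper that counts references to missing objects and optionally deletes them.

```ruby
# Hypothetical in-memory "database": id => list of referenced ids.
db = {
  0 => [1, 2],
  1 => [3], # 3 does not exist: a dangling reference
  2 => [1]
}

# Walk all objects reachable from root_id, count references to missing
# objects and, if repair is true, delete those references.
def check_refs(db, root_id, repair = false)
  errors = 0
  seen = {}
  stack = [root_id]
  until stack.empty?
    id = stack.pop
    next if seen[id]

    seen[id] = true
    bad = db[id].reject { |ref| db.key?(ref) }
    errors += bad.size
    db[id] -= bad if repair
    stack.concat(db[id] - bad)
  end
  errors
end

first = check_refs(db, 0)        # report only
second = check_refs(db, 0, true) # report and remove the bad reference
third = check_refs(db, 0)        # nothing left to report
puts [first, second, third].inspect
```

Note that the repair only removes the broken link; whatever object 3 once contained is not recovered.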

#delete_store ⇒ Object

Delete the entire store. The store is no longer usable after this method has been called.



# File 'lib/perobs/Store.rb', line 184

def delete_store
  @db.delete_database
  @db = @class_map = @cache = @root_objects = nil
end

#each ⇒ Object

Calls the given block once for each object, passing that object as a parameter.



# File 'lib/perobs/Store.rb', line 329

def each
  @db.clear_marks
  # Start with the object 0 and the indexes of the root objects. Push them
  # onto the work stack.
  stack = [ 0 ] + @root_objects.values
  while !stack.empty?
    # Get an object index from the stack.
    unless (obj = object_by_id(id = stack.pop))
      raise RuntimeError, "Database is corrupted. Object with ID #{id} " +
                          "not found."
    end
    # Mark the object so it will never be pushed to the stack again.
    @db.mark(id)
    yield(obj.myself) if block_given?
    # Push the IDs of all unmarked referenced objects onto the stack
    obj._referenced_object_ids.each do |r_id|
      stack << r_id unless @db.is_marked?(r_id)
    end
  end
end

#gc ⇒ Fixnum

Discard all objects that are not somehow connected to the root objects from the back-end storage. The garbage collector is not invoked automatically. Depending on your usage pattern, you need to call this method periodically.

Returns:

  • (Fixnum)

    The number of collected objects



# File 'lib/perobs/Store.rb', line 246

def gc
  if @cache.in_transaction?
    raise RuntimeError, 'You cannot call gc() during a transaction'
  end
  sync
  mark
  sweep
end
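
The mark and sweep steps called above are private to the library and not shown here, so the following is only a generic mark-and-sweep sketch under stated assumptions: db is a hypothetical Hash of object IDs to referenced IDs, object 0 plays the role of the root, and the mark and sweep functions are written for this example.

```ruby
# Hypothetical back-end: id => referenced ids. Object 0 holds the roots.
db = {
  0 => [1],
  1 => [2],
  2 => [1], # cycle between 1 and 2, still reachable from the root
  3 => [4], # 3 and 4 reference each other, but nothing reaches them
  4 => [3]
}

# Mark phase: collect the IDs of everything reachable from the root.
def mark(db, root_id = 0)
  marked = { root_id => true }
  stack = [root_id]
  until stack.empty?
    db[stack.pop].each do |ref|
      next if marked[ref]

      marked[ref] = true
      stack << ref
    end
  end
  marked
end

# Sweep phase: delete everything unmarked and return how many were removed.
def sweep(db, marked)
  before = db.size
  db.delete_if { |id, _refs| !marked[id] }
  before - db.size
end

collected = sweep(db, mark(db))
puts collected
```

Reachability, not reference counting, decides what survives: objects 3 and 4 still reference each other but are collected because no path leads to them from the root.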

#new(klass, *args) ⇒ POXReference

You need to call this method to create new PEROBS objects that belong to this Store.

Parameters:

  • klass (Class)

    The class of the object you want to create. This must be a derivative of ObjectBase.

  • args

    Optional list of other arguments that are passed to the constructor of the specified class.

Returns:

  • (POXReference)

    A reference to the newly created object.



# File 'lib/perobs/Store.rb', line 160

def new(klass, *args)
  unless klass.is_a?(BasicObject)
    raise ArgumentError, "#{klass} is not a BasicObject derivative"
  end

  obj = _construct_po(klass, _new_id, *args)
  # Mark the new object as modified so it gets pushed into the database.
  @cache.cache_write(obj)
  # Return a POXReference proxy for the newly created object.
  obj.myself
end

#object_by_id(id) ⇒ Object

Return the object with the provided ID. This method is not part of the public API and should never be called by outside users. It’s purely intended for internal use.



# File 'lib/perobs/Store.rb', line 258

def object_by_id(id)
  if (obj = @in_memory_objects[id])
    # We have the object in memory so we can just return it.
    begin
      return obj.__getobj__
    rescue WeakRef::RefError
      # Due to a race condition the object can still be in the
      # @in_memory_objects list but has been collected already by the Ruby
      # GC. In that case we need to load it again.
    end
  end

  # We don't have the object in memory. Let's find it in the storage.
  if @db.include?(id)
    # Great, object found. Read it into memory and return it.
    obj = ObjectBase::read(self, id)
    # Add the object to the in-memory storage list.
    @cache.cache_read(obj)

    return obj
  end

  # The requested object does not exist. Return nil.
  nil
end

#rename_classes(rename_map) ⇒ Object

Rename classes of objects stored in the database.

Parameters:

  • rename_map (Hash)

    Hash that maps the old name to the new name



# File 'lib/perobs/Store.rb', line 352

def rename_classes(rename_map)
  @class_map.rename(rename_map)
end

#statistics ⇒ Object

This method returns a Hash with some statistics about this store.



# File 'lib/perobs/Store.rb', line 392

def statistics
  @stats.in_memory_objects = @in_memory_objects.length
  @stats.root_objects = @root_objects.length

  @stats
end

#sync ⇒ Object

Flush out all modified objects to disk and shrink the in-memory list if needed.



# File 'lib/perobs/Store.rb', line 234

def sync
  if @cache.in_transaction?
    raise RuntimeError, 'You cannot call sync() during a transaction'
  end
  @cache.flush
end

#transaction ⇒ Object

This method will execute the provided block as an atomic transaction regarding the manipulation of all objects associated with this Store. In case the execution of the block generates an exception, the transaction is aborted and all PEROBS objects are restored to the state at the beginning of the transaction. The exception is passed on to the enclosing scope, so you probably want to handle it accordingly.



# File 'lib/perobs/Store.rb', line 316

def transaction
  @cache.begin_transaction
  begin
    yield if block_given?
  rescue => e
    @cache.abort_transaction
    raise e
  end
  @cache.end_transaction
end
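
The abort-and-re-raise behavior can be modeled in a few lines. This sketch swaps in a different technique from the library: PEROBS delegates to its cache (begin_transaction, abort_transaction, end_transaction), whereas the hypothetical TinyTxStore below simply takes a deep-copy snapshot of a Hash and restores it on failure.

```ruby
# Hypothetical stand-in for a store whose state can be rolled back.
class TinyTxStore
  attr_reader :data

  def initialize
    @data = {}
  end

  # Run the block; on any StandardError restore the snapshot and re-raise.
  def transaction
    snapshot = Marshal.load(Marshal.dump(@data))
    begin
      yield
    rescue StandardError
      @data = snapshot
      raise
    end
  end
end

store = TinyTxStore.new
store.data[:balance] = 100

begin
  store.transaction do
    store.data[:balance] -= 150
    raise ArgumentError, 'insufficient funds' if store.data[:balance] < 0
  end
rescue ArgumentError => e
  # The exception reaches the caller after the state has been restored.
  puts "rolled back: #{e.message}"
end

puts store.data[:balance]
```

As with the real method, the caller still has to rescue the exception; the transaction only guarantees that the objects are back in their previous state by then.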