Class: PEROBS::Store

Inherits:
Object
Defined in:
lib/perobs/Store.rb

Overview

PEROBS::Store is a persistent storage system for Ruby objects. Regular Ruby objects are transparently stored in a back-end storage and retrieved when needed. It features a garbage collector that removes all objects that are no longer in use. A built-in cache keeps access latencies to recently used objects low and lazily flushes modified objects into the persistent back-end. The default back-end is a filesystem based database. Alternatively, Amazon DynamoDB can be used. Adding support for other key/value stores is fairly trivial to do. See PEROBS::DynamoDB for an example.

Persistent objects must be defined by deriving your class from PEROBS::Object, PEROBS::Array or PEROBS::Hash. Only instance variables that are declared via attr_persist will be persistent. It is recommended that all references to other objects point to persistent objects as well. To create a new persistent object you must call new() on a Store instance, e. g. store.new(Person, 'Joe'). Don’t use the constructors of persistent classes directly. The call returns a proxy or delegator object that can be used like the actual object. By using delegators we can disconnect the actual object from the delegator handle.

require 'perobs'

class Person < PEROBS::Object

  attr_persist :name, :mother, :father, :kids

  # The constructor is only called for the creation of a new object. It is
  # not called when the object is restored from the database. In that case
  # only restore() is called.
  def initialize(cf, name)
    super(cf)
    self.name = name
    self.kids = @store.new(PEROBS::Array)
  end

  def restore
    # In case you need to do any checks or massaging (e. g. for additional
    # attributes) you can provide this method.
  end

  def to_s
    "#{@name} is the child of #{self.mother ? self.mother.name : 'unknown'} " +
    "and #{self.father ? self.father.name : 'unknown'}."
  end

end

store = PEROBS::Store.new('family')
# The root object names used below are illustrative.
store[:joe] = joe = store.new(Person, 'Joe')
store[:jane] = jane = store.new(Person, 'Jane')
jim = store.new(Person, 'Jim')
jim.father = joe
joe.kids << jim
jim.mother = jane
jane.kids << jim
store.exit

Constructor Details

#initialize(data_base, options = {}) ⇒ Store

Create a new Store.

The following options are supported in addition to those listed under Parameters:

  :progressmeter   : Reference to a ProgressMeter object that receives progress information during longer running tasks. It defaults to ProgressMeter, which only writes to the log. Use ConsoleProgressMeter or a derived class for fancier progress reporting.

  :no_root_objects : Create a new store without root objects. This only makes sense if you want to copy the objects of another store into this store.
Parameters:

  • data_base (String)

    the name of the database

  • options (Hash) (defaults to: {})

    various options to affect the operation of the database. Currently the following options are supported:

    :engine : The class that provides the back-end storage engine. By default FlatFileDB is used. A user can provide their own storage engine, which must conform to the same API exposed by FlatFileDB.

    :cache_bits : The number of bits used for cache indexing. The cache will hold 2 to the power of cache_bits objects. There are separate caches for reading and writing. The default value is 16. It probably makes little sense to use much larger numbers than that.

    :serializer : Selects the format used to serialize the data. There are 3 different options:

      :marshal : Native Ruby serializer. Fastest option that can handle most Ruby data types. Its big disadvantage is the stability of the format. Data written with one Ruby version may not be readable with another version.

      :json : About half as fast as marshal, but the format is rock solid and portable between languages. It only supports basic Ruby data types like String, Integer, Float, Array, Hash. This is the default option.

      :yaml : Can also handle most Ruby data types and is portable between Ruby versions (1.9 and later). Unfortunately, it is 10x slower than marshal.
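
The type-support differences between the three serializers can be seen with the Ruby standard library alone. The following is a minimal sketch that does not use PEROBS at all; the helper name round_trip and the sample data are hypothetical and only illustrate which data types survive each format.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper (not part of PEROBS): push the same data through each
# of the three formats and return what comes back.
def round_trip(data)
  {
    # Marshal preserves Ruby-specific types such as Symbol.
    marshal: Marshal.load(Marshal.dump(data)),
    # JSON only knows basic types, so Symbols come back as Strings.
    json: JSON.parse(JSON.generate(data)),
    # YAML also preserves Symbols.
    yaml: YAML.load(YAML.dump(data))
  }
end

# Sample attribute hash standing in for the state of a persistent object.
sample = { 'name' => 'Jim', 'tags' => [:child, :student] }
results = round_trip(sample)
puts results[:json]['tags'].inspect
```

With :json, values such as Symbols therefore need to be restricted to the basic types listed above or converted by the application.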
    


# File 'lib/perobs/Store.rb', line 152

def initialize(data_base, options = {})
  # Create a backing store handler
  @progressmeter = (options[:progressmeter] ||= ProgressMeter.new)
  @db = (options[:engine] || FlatFileDB).new(data_base, options)
  @db.open
  # Create a map that can translate classes to numerical IDs and vice
  # versa.
  @class_map = ClassMap.new(@db)
  @db.register_class_map(@class_map)

  # List of PEROBS objects that are currently available as Ruby objects
  # hashed by their ID.
  @in_memory_objects = {}

  # This object keeps some counters of interest.
  @stats = Statistics.new
  @stats[:created_objects] = 0
  @stats[:collected_objects] = 0

  # The Cache reduces read and write latencies by keeping a subset of the
  # objects in memory.
  @cache = Cache.new(options[:cache_bits] || 16)

  # Lock to serialize access to the Store and all stored data.
  @lock = Monitor.new

  # The named (global) objects IDs hashed by their name
  unless options[:no_root_objects]
    unless (@root_objects = object_by_id(0))
      PEROBS.log.debug "Initializing the PEROBS store"
      # The root object hash always has the object ID 0.
      @root_objects = _construct_po(Hash, 0)
      # Mark the root_objects object as modified.
      @cache.cache_write(@root_objects)
    end
    unless @root_objects.is_a?(Hash)
      PEROBS.log.fatal "Database corrupted: Root objects must be a Hash " +
        "but is a #{@root_objects.class}"
    end
  end
end
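
To make the effect of :cache_bits concrete, here is a minimal sketch of a cache that is indexed by the low bits of the object ID. It is not the PEROBS Cache class; the class name TinyCache, its methods, and the direct-mapped layout are assumptions chosen for illustration. It shows why a cache built with n bits holds 2**n entries and why two IDs that share the same low bits compete for one slot.

```ruby
# Hypothetical direct-mapped cache; an illustration, not the PEROBS Cache.
class TinyCache
  attr_reader :slots

  def initialize(bits)
    @mask = (2**bits) - 1          # e.g. bits = 16 gives 65535
    @slots = Array.new(2**bits)    # one slot per possible index
  end

  # Store the object in the slot selected by the low bits of its ID.
  # Returns the [id, object] pair that was evicted, or nil.
  def put(id, obj)
    index = id & @mask
    evicted = @slots[index]
    @slots[index] = [id, obj]
    evicted && evicted[0] != id ? evicted : nil
  end

  # Return the cached object for this ID, or nil on a cache miss.
  def get(id)
    entry = @slots[id & @mask]
    entry && entry[0] == id ? entry[1] : nil
  end
end

cache = TinyCache.new(4)           # 2**4 = 16 slots
cache.put(1, 'first')
evicted = cache.put(17, 'second')  # 17 & 15 == 1, so it replaces ID 1
puts evicted.inspect
```

In a write cache, the evicted entry would have to be written to the back-end before it is dropped.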

Instance Attribute Details

#cache ⇒ Object (readonly)

Returns the value of attribute cache.



# File 'lib/perobs/Store.rb', line 113

def cache
  @cache
end

#class_map ⇒ Object (readonly)

Returns the value of attribute class_map.



# File 'lib/perobs/Store.rb', line 113

def class_map
  @class_map
end

#db ⇒ Object (readonly)

Returns the value of attribute db.



# File 'lib/perobs/Store.rb', line 113

def db
  @db
end

#root_objects=(value) ⇒ Object (writeonly)

Sets the attribute root_objects

Parameters:

  • value

    the value to set the attribute root_objects to.



# File 'lib/perobs/Store.rb', line 114

def root_objects=(value)
  @root_objects = value
end

Instance Method Details

#[](name) ⇒ Object

Return the object with the provided name.

Parameters:

  • name (Symbol)

    A Symbol specifies the name of the object to be returned.

Returns:

  • The requested object or nil if it doesn’t exist.



# File 'lib/perobs/Store.rb', line 332

def [](name)
  @lock.synchronize do
    # Return nil if there is no object with that name.
    return nil unless (id = @root_objects[name])

    POXReference.new(self, id)
  end
end

#[]=(name, obj) ⇒ PEROBS::Object

Store the provided object under the given name. Use this to make the object a root or top-level object (think global variable). Each store should have at least one root object. Objects that are not directly or indirectly reachable via any of the root objects are no longer accessible and will be garbage collected.

Parameters:

  • name (Symbol)

    The name to use.

  • obj (PEROBS::Object)

    The object to store

Returns:

  • (PEROBS::Object)

    The stored object, or nil if the entry was deleted.



# File 'lib/perobs/Store.rb', line 302

def []=(name, obj)
  @lock.synchronize do
    # If the passed object is nil, we delete the entry if it exists.
    if obj.nil?
      @root_objects.delete(name)
      return nil
    end

    # We only allow derivatives of PEROBS::Object to be stored in the
    # store.
    unless obj.is_a?(ObjectBase)
      PEROBS.log.fatal 'Object must be of class PEROBS::Object but ' +
        "is of class #{obj.class}"
    end

    unless obj.store == self
      PEROBS.log.fatal 'The object does not belong to this store.'
    end

    # Store the name and mark the name list as modified.
    @root_objects[name] = obj._id
  end

  obj
end

#_collect(id, ruby_object_id) ⇒ Object

Remove the object from the in-memory list. This is an internal method and should never be called from user code. It will be called from a finalizer, so many restrictions apply!

Parameters:

  • id (Integer)

    Object ID of object to remove from the list



# File 'lib/perobs/Store.rb', line 560

def _collect(id, ruby_object_id)
  # This method should only be called from the Ruby garbage collector.
  # Therefore no locking is needed or even possible. The GC can kick in at
  # any time and we could be anywhere in the code. So there is a small
  # risk for a race here, but it should not have any serious consequences.
  if @in_memory_objects && @in_memory_objects[id] == ruby_object_id
    @in_memory_objects.delete(id)
    @stats[:collected_objects] += 1
  end
end

#_construct_po(klass, id, *args) ⇒ BasicObject

For library internal use only! This method will create a new PEROBS object.

Parameters:

  • klass (BasicObject)

    Class of the object to create

  • id (Integer)

    Requested object ID

  • args (Array)

    Arguments to pass to the object constructor.

Returns:

  • (BasicObject)

    Newly constructed PEROBS object



# File 'lib/perobs/Store.rb', line 279

def _construct_po(klass, id, *args)
  klass.new(Handle.new(self, id), *args)
end

#_new_id ⇒ Integer

Internal method. Don’t use this outside of this library! Generate a new unique ID that is not used by any other object. It uses random numbers between 0 and 2**64 - 1.

Returns:

  • (Integer)


# File 'lib/perobs/Store.rb', line 521

def _new_id
  @lock.synchronize do
    begin
      # Generate a random number. It's recommended to not store more than
      # 2**62 objects in the same store.
      id = rand(2**64)
      # Ensure that we don't have already another object with this ID.
    end while @in_memory_objects.include?(id) || @db.include?(id)

    id
  end
end
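
The draw-and-retry pattern used here can be tried out in isolation. The following sketch is not PEROBS code: the class IdAllocator and its id_space parameter are hypothetical, and a Set stands in for the combined check against the in-memory list and the database. A small ID space makes the retry behavior easy to observe.

```ruby
require 'set'

# Hypothetical stand-in for the store's bookkeeping of used IDs.
class IdAllocator
  def initialize(id_space: 2**64, rng: Random.new)
    @id_space = id_space
    @rng = rng
    @used = Set.new
  end

  # Draw random IDs until one is found that is not in use yet.
  # The caller must not request more IDs than the ID space holds.
  def new_id
    loop do
      id = @rng.rand(@id_space)
      # Set#add? returns nil if the ID was already present.
      return id if @used.add?(id)
    end
  end
end

allocator = IdAllocator.new(id_space: 8)
ids = Array.new(8) { allocator.new_id }
puts ids.sort.inspect
```

With a 64-bit ID space, collisions are rare enough that the loop almost always finishes on the first draw.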

#_register_in_memory(obj, id) ⇒ Object

Internal method. Don’t use this outside of this library! Add the new object to the in-memory list. We only store a weak reference to the object so it can be garbage collected. When this happens the object finalizer is triggered and calls _collect() to remove the object from this hash again.

Parameters:

  • obj (BasicObject)

    Object to register

  • id (Integer)

    object ID



# File 'lib/perobs/Store.rb', line 541

def _register_in_memory(obj, id)
  @lock.synchronize do
    unless obj.is_a?(ObjectBase)
      PEROBS.log.fatal "You can only register ObjectBase objects"
    end
    if @in_memory_objects.include?(id)
      PEROBS.log.fatal "The Store::_in_memory_objects list already " +
        "contains an object for ID #{id}"
    end

    @in_memory_objects[id] = obj.object_id
    @stats[:created_objects] += 1
  end
end
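
The interplay between registration and finalizer can be sketched with ObjectSpace.define_finalizer from the Ruby core library. This is an illustration under assumptions, not the PEROBS implementation: the class InMemoryRegistry and its method names are hypothetical. The registry stores only the Ruby object_id, so it never keeps the object itself alive.

```ruby
# Hypothetical registry that mimics the bookkeeping described above.
class InMemoryRegistry
  attr_reader :in_memory, :created, :collected

  def initialize
    @in_memory = {}   # persistent ID => Ruby object_id (no strong reference)
    @created = 0
    @collected = 0
  end

  def register(obj, id)
    raise ArgumentError, "ID #{id} is already registered" if @in_memory.key?(id)

    @in_memory[id] = obj.object_id
    @created += 1
    # Build the finalizer in a class method so the proc cannot capture obj,
    # which would prevent the object from ever being collected.
    ObjectSpace.define_finalizer(
      obj, self.class.finalizer(self, id, obj.object_id)
    )
  end

  def self.finalizer(registry, id, ruby_object_id)
    proc { registry.collect(id, ruby_object_id) }
  end

  # Invoked by the finalizer. The entry is only dropped if it still belongs
  # to the object that is being collected.
  def collect(id, ruby_object_id)
    return unless @in_memory[id] == ruby_object_id

    @in_memory.delete(id)
    @collected += 1
  end
end

registry = InMemoryRegistry.new
registry.register(Object.new, 42)
puts registry.in_memory.key?(42)
```

When the Ruby garbage collector actually runs the finalizer is not deterministic, which is why the comparison against the stored object_id matters.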

#check(repair = false) ⇒ Integer

This method can be used to check the database and optionally repair it. The repair is a pure structural repair. It cannot ensure that the stored data is still correct. E. g. if a reference to a non-existing or unreadable object is found, the reference will simply be deleted.

Parameters:

  • repair (TrueClass/FalseClass) (defaults to: false)

    true if a repair attempt should be made.

Returns:

  • (Integer)

    The number of references to bad objects found.



# File 'lib/perobs/Store.rb', line 404

def check(repair = false)
  stats = { :errors => 0, :object_cnt => 0 }

  # All objects must have an in-db version.
  sync
  # Run basic consistency checks first.
  stats[:errors] += @db.check_db(repair)

  # We will use the mark to mark all objects that we have checked already.
  # Before we start, we need to clear all marks.
  @db.clear_marks

  @progressmeter.start("Checking object link structure",
                       @db.item_counter) do
    @root_objects.each do |name, id|
      check_object(id, repair, stats)
    end
  end

  # Delete all broken root objects.
  if repair
    @root_objects.delete_if do |name, id|
      unless @db.check(id, repair)
        PEROBS.log.error "Discarding broken root object '#{name}' " +
          "with ID #{id}"
        stats[:errors] += 1
      end
    end
  end

  if stats[:errors] > 0
    if repair
      PEROBS.log.error "#{stats[:errors]} errors found in " +
        "#{stats[:object_cnt]} objects"
    else
      PEROBS.log.fatal "#{stats[:errors]} errors found in " +
        "#{stats[:object_cnt]} objects"
    end
  else
    PEROBS.log.debug "No errors found"
  end

  # Ensure that any fixes are written into the DB.
  sync if repair

  stats[:errors]
end

#copy(dir, options = {}) ⇒ Object

Copy the store content into a new Store. The arguments are identical to PEROBS::Store.new().

Parameters:

  • dir (String)

    the name of the database to copy into

  • options (Hash) (defaults to: {})

    various options to affect the operation of the database



# File 'lib/perobs/Store.rb', line 197

def copy(dir, options = {})
  # Make sure all objects are persisted.
  sync

  # Create a new store with the specified directory and options.
  new_options = options.clone
  new_options[:no_root_objects] = true
  new_db = Store.new(dir, new_options)
  # Clear the cache.
  new_db.sync
  # Copy all objects of the existing store to the new store.
  i = 0
  each do |ref_obj|
    obj = ref_obj._referenced_object
    obj._transfer(new_db)
    obj._sync
    i += 1
  end
  new_db.root_objects = new_db.object_by_id(0)
  PEROBS.log.debug "Copied #{i} objects into new database at #{dir}"
  # Flush the new store and close it.
  new_db.exit

  true
end

#delete_store ⇒ Object

Delete the entire store. The store is no longer usable after this method has been called. This is an alternative to exit() that additionally deletes the entire database.



# File 'lib/perobs/Store.rb', line 286

def delete_store
  @lock.synchronize do
    @db.delete_database
    @db = @class_map = @in_memory_objects = @stats = @cache =
      @root_objects = nil
  end
end

#each ⇒ Object

Calls the given block once for each object, passing that object as a parameter.



# File 'lib/perobs/Store.rb', line 485

def each
  @lock.synchronize do
    @db.clear_marks
    # Start with the object 0 and the indexes of the root objects. Push them
    # onto the work stack.
    stack = [ 0 ] + @root_objects.values
    while !stack.empty?
      # Get an object index from the stack.
      id = stack.pop
      next if @db.is_marked?(id)

      unless (obj = object_by_id_internal(id))
        PEROBS.log.fatal "Database is corrupted. Object with ID #{id} " +
          "not found."
      end
      # Mark the object so it will never be pushed to the stack again.
      @db.mark(id)
      yield(obj.myself) if block_given?
      # Push the IDs of all unmarked referenced objects onto the stack
      obj._referenced_object_ids.each do |r_id|
        stack << r_id unless @db.is_marked?(r_id)
      end
    end
  end
end

#exit ⇒ Object

Close the store and ensure that all in-memory objects are written out to the storage backend. The Store object is no longer usable after this method has been called.



# File 'lib/perobs/Store.rb', line 226

def exit
  if @cache && @cache.in_transaction?
    @cache.abort_transaction
    @cache.flush
    @db.close if @db
    PEROBS.log.fatal "You cannot call exit() during a transaction: #{Kernel.caller}"
  end
  @cache.flush if @cache
  @db.close if @db

  GC.start
  if @stats
    unless @stats[:created_objects] == @stats[:collected_objects] +
        @in_memory_objects.length
      PEROBS.log.fatal "Created objects count " +
        "(#{@stats[:created_objects]})" +
        " is not equal to the collected count " +
        "(#{@stats[:collected_objects]}) + in_memory_objects count " +
        "(#{@in_memory_objects.length})"
    end
  end

  @db = @class_map = @in_memory_objects = @stats = @cache =
    @root_objects = nil
end

#gc ⇒ Integer

Discard all objects that are not somehow connected to the root objects from the back-end storage. The garbage collector is not invoked automatically. Depending on your usage pattern, you need to call this method periodically.

Returns:

  • (Integer)

    The number of collected objects



# File 'lib/perobs/Store.rb', line 380

def gc
  @lock.synchronize do
    sync
    mark
    sweep
  end
end
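
The mark and sweep steps are private and not shown on this page. The following sketch illustrates the general technique on a plain Hash; it is an assumption about the approach, not the PEROBS implementation. The class MiniStore and its methods are hypothetical, with object ID 0 playing the role of the root object hash.

```ruby
require 'set'

# Hypothetical in-memory model: each object ID maps to the IDs it references.
class MiniStore
  attr_reader :objects

  def initialize
    @objects = { 0 => [] }  # ID 0 stands in for the root object hash
    @roots = {}             # root name => object ID
  end

  def add(id, refs = [])
    @objects[id] = refs
  end

  # Assigning nil removes the root entry, as described for Store#[]=.
  def []=(name, id)
    if id.nil?
      @roots.delete(name)
    else
      @roots[name] = id
    end
  end

  # Remove everything that is not reachable from a root object and return
  # the number of removed objects.
  def gc
    marked = mark
    before = @objects.size
    @objects.select! { |id, _refs| marked.include?(id) }
    before - @objects.size
  end

  private

  # Mark phase: walk the reference graph with an explicit stack.
  # Hash#fetch raises KeyError if a reference points to a missing object.
  def mark
    marked = Set.new
    stack = [0] + @roots.values
    until stack.empty?
      id = stack.pop
      next unless marked.add?(id)  # skip IDs that were visited already

      stack.concat(@objects.fetch(id))
    end
    marked
  end
end

store = MiniStore.new
store.add(1, [2])
store.add(2, [1])   # reference cycle, reachable via the root :family
store.add(3, [4])
store.add(4, [3])   # reference cycle, not reachable from any root
store[:family] = 1
puts store.gc
```

Because reachability is decided from the roots, unreachable cycles such as objects 3 and 4 are collected as well.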

#names ⇒ Array of Symbols

Return a list with all the names of the root objects.

Returns:

  • (Array of Symbols)



# File 'lib/perobs/Store.rb', line 343

def names
  @lock.synchronize do
    @root_objects.keys
  end
end

#new(klass, *args) ⇒ POXReference

You need to call this method to create new PEROBS objects that belong to this Store.

Parameters:

  • klass (Class)

    The class of the object you want to create. This must be a derivative of ObjectBase.

  • args

    Optional list of other arguments that are passed to the constructor of the specified class.

Returns:

  • (POXReference)

    A reference to the newly created object.



# File 'lib/perobs/Store.rb', line 259

def new(klass, *args)
  unless klass.is_a?(BasicObject)
    PEROBS.log.fatal "#{klass} is not a BasicObject derivative"
  end

  @lock.synchronize do
    obj = _construct_po(klass, _new_id, *args)
    # Mark the new object as modified so it gets pushed into the database.
    @cache.cache_write(obj)
    # Return a POXReference proxy for the newly created object.
    obj.myself
  end
end

#object_by_id(id) ⇒ Object

Return the object with the provided ID. This method is not part of the public API and should never be called by outside users. It’s purely intended for internal use.



# File 'lib/perobs/Store.rb', line 391

def object_by_id(id)
  @lock.synchronize do
    object_by_id_internal(id)
  end
end

#rename_classes(rename_map) ⇒ Object

Rename classes of objects stored in the data base.

Parameters:

  • rename_map (Hash)

    Hash that maps the old name to the new name



# File 'lib/perobs/Store.rb', line 513

def rename_classes(rename_map)
  @lock.synchronize { @class_map.rename(rename_map) }
end

#size ⇒ Integer

Return the number of objects stored in the store. CAVEAT: This method will only return correct values when it is separated from any mutating call by a call to sync().

Returns:

  • (Integer)

    Number of persistently stored objects in the Store.



# File 'lib/perobs/Store.rb', line 367

def size
  # We don't include the Hash that stores the root objects into the object
  # count.
  @lock.synchronize do
    @db.item_counter - 1
  end
end

#statistics ⇒ Object

This method returns a Hash with some statistics about this store.



# File 'lib/perobs/Store.rb', line 572

def statistics
  @lock.synchronize do
    @stats.in_memory_objects = @in_memory_objects.length
    @stats.root_objects = @root_objects.length
  end

  @stats
end

#sync ⇒ Object

Flush out all modified objects to disk and shrink the in-memory list if needed.



# File 'lib/perobs/Store.rb', line 351

def sync
  @lock.synchronize do
    if @cache.in_transaction?
      @cache.abort_transaction
      @cache.flush
      PEROBS.log.fatal "You cannot call sync() during a transaction: \n" +
        Kernel.caller.join("\n")
    end
    @cache.flush
  end
end

#transaction ⇒ Object

This method will execute the provided block as an atomic transaction regarding the manipulation of all objects associated with this Store. In case the execution of the block generates an exception, the transaction is aborted and all PEROBS objects are restored to the state at the beginning of the transaction. The exception is passed on to the enclosing scope, so you probably want to handle it accordingly.



# File 'lib/perobs/Store.rb', line 458

def transaction
  transaction_not_started = true
  while transaction_not_started do
    begin
      @lock.synchronize do
        @cache.begin_transaction
        # If we get to this point, the transaction was successfully
        # started. We can exit the loop.
        transaction_not_started = false
      end
    rescue TransactionInOtherThread
      # sleep up to 50ms
      sleep(rand(50) / 1000.0)
    end
  end

  begin
    yield if block_given?
  rescue => e
    @lock.synchronize { @cache.abort_transaction }
    raise e
  end
  @lock.synchronize { @cache.end_transaction }
end
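
The abort-and-re-raise behavior can be illustrated with a small stand-alone model. This sketch is not how PEROBS implements transactions (the listing above delegates to the cache); it swaps in a deep-copy snapshot via Marshal, and the class MiniTxStore with its data attribute is hypothetical.

```ruby
require 'monitor'

# Hypothetical miniature of the rollback semantics; not the PEROBS Cache.
class MiniTxStore
  attr_reader :data

  def initialize
    @data = {}
    @lock = Monitor.new
  end

  # Run the block. If it raises, restore the state from before the block
  # and pass the exception on to the caller.
  def transaction
    snapshot = @lock.synchronize { Marshal.load(Marshal.dump(@data)) }
    begin
      yield
    rescue StandardError
      @lock.synchronize { @data = snapshot }
      raise
    end
  end
end

store = MiniTxStore.new
store.data[:balance] = 10
begin
  store.transaction do
    store.data[:balance] = 0
    raise 'transfer failed'
  end
rescue RuntimeError => e
  puts "rolled back after: #{e.message}"
end
puts store.data[:balance]
```

As with Store#transaction, the caller still has to rescue the exception; the rollback only guarantees that the stored state is unchanged.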