Class: Assimilate::Catalog

Inherits:
Object
Defined in:
lib/assimilate/catalog.rb

Overview

Catalog configuration:

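db               name of the mongo database
catalog          name of the catalog collection
batch            name of the batches collection (e.g. "files")
domain           key to use for specifying record domains (will be prefixed with _)
deletion_marker  key to use to mark records that have disappeared from the source file (will be prefixed with _)
insertion_marker key to use for the insertion marker (required by check_config; will be prefixed with _)
update_marker    key to use for the update marker (required by check_config; will be prefixed with _)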
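
For illustration, a configuration with these keys might be loaded as sketched below. Every value is a placeholder rather than a default of the library, and the sketch includes insertion_marker and update_marker because check_config also requires them. The key symbolization is done by hand here to keep the example free of dependencies.

```ruby
require 'yaml'

# Hypothetical configuration; every value here is a placeholder.
sample_yaml = <<~YAML
  db: assimilate_example
  catalog: people
  batch: files
  domain: domain
  deletion_marker: _dt_removed
  insertion_marker: _dt_first_seen
  update_marker: _dt_last_update
YAML

# YAML yields string keys; convert them to symbols, which is the form
# that check_config looks up.
config = YAML.safe_load(sample_yaml).each_with_object({}) do |(key, value), out|
  out[key.to_sym] = value
end
```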

Records in each catalog acquire the following internal attributes:

_id               Unique ID, assigned by mongo
[domain]          Domain key, specified with :domainkey attribute when initializing catalog
_dt_first_seen    Batch datestamp reference for when this record was first captured
_dt_last_seen     Batch datestamp reference for when this record was most recently affirmed
_dt_last_update   Batch datestamp reference for when this record was most recently altered
[deletion_marker] Batch datestamp reference for when this record was removed from input

Inbound records must not have attributes named with leading underscores.
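
A caller could check for this before handing records to a batch. The helper below is a minimal sketch and is hypothetical: reserved_keys is not part of Assimilate, and the sample records are invented.

```ruby
# Hypothetical helper, not part of Assimilate: returns the keys of an
# inbound record that fall in the reserved leading-underscore namespace.
def reserved_keys(record)
  record.keys.select { |key| key.to_s.start_with?('_') }
end

clean   = { 'name' => 'Ada', 'email' => 'ada@example.com' }
tainted = { 'name' => 'Bob', '_id' => 42 }

reserved_keys(clean)    # => []
reserved_keys(tainted)  # => ["_id"]
```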

A “domain” here is a namespace of identifiers.

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(args) ⇒ Catalog

Returns a new instance of Catalog.



# File 'lib/assimilate/catalog.rb', line 25

def initialize(args)
  @config = YAML.load(File.open(args[:config]))
  check_config

  @db = Mongo::Connection.new.db(@config[:db])
  @catalog = @db.collection(@config[:catalog])
  @batches = @db.collection(@config[:batch])
end

Instance Attribute Details

#batches ⇒ Object (readonly)

Returns the value of attribute batches.



# File 'lib/assimilate/catalog.rb', line 23

def batches
  @batches
end

#catalog ⇒ Object (readonly)

Returns the value of attribute catalog.



# File 'lib/assimilate/catalog.rb', line 23

def catalog
  @catalog
end

#config ⇒ Object (readonly)

Returns the value of attribute config.



# File 'lib/assimilate/catalog.rb', line 23

def config
  @config
end

Instance Method Details

#active_count ⇒ Object

Returns the number of catalog records whose deletion marker is unset.



# File 'lib/assimilate/catalog.rb', line 62

def active_count
  @catalog.find(config[:deletion_marker] => nil).count
end
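
In MongoDB, a query that matches a field against nil (null) selects documents where the field is either null or absent. The sketch below mimics that rule on an in-memory array instead of a collection; the marker name '_dt_removed' and the records are assumptions made for illustration.

```ruby
# In-memory stand-in for the catalog collection. '_dt_removed' is an
# assumed deletion_marker value, not one taken from the library.
marker = '_dt_removed'
records = [
  { 'name' => 'Ada' },                         # marker absent: active
  { 'name' => 'Bob', marker => nil },          # marker null: active
  { 'name' => 'Cy',  marker => '2012-05-01' }  # marker set: removed
]

# Hash#[] returns nil for a missing key, so one test covers both cases.
active = records.count { |rec| rec[marker].nil? }
active  # => 2
```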

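#check_config ⇒ Object

Symbolizes the configuration keys, raises Assimilate::InvalidConfiguration if a required parameter is missing, and prefixes the internal attribute names with an underscore where needed.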



# File 'lib/assimilate/catalog.rb', line 34

def check_config
  config.symbolize_keys!
  [:db, :catalog, :batch, :domain, :deletion_marker, :insertion_marker, :update_marker].each do |key|
    raise Assimilate::InvalidConfiguration, "missing required parameter: #{key}" unless config[key]
  end
  [:domain, :deletion_marker, :insertion_marker, :update_marker].each do |key|
    # enforce leading underscore on internal attributes
    config[key] = "_#{config[key]}" unless config[key] =~ /^_/
  end
end
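
The underscore rule can be tried in isolation. The following is a standalone sketch, not the library's code: normalize_internal_keys is a hypothetical function that works on a copy of a plain hash, and the values are placeholders.

```ruby
# Keys whose values name internal attributes, as listed in check_config.
INTERNAL_KEYS = [:domain, :deletion_marker, :insertion_marker, :update_marker]

# Hypothetical helper: returns a copy of config in which each internal
# attribute name has exactly one leading underscore guaranteed.
def normalize_internal_keys(config)
  INTERNAL_KEYS.each_with_object(config.dup) do |key, out|
    out[key] = "_#{out[key]}" unless out[key].to_s.start_with?('_')
  end
end

raw = { :db => 'example', :domain => 'domain', :deletion_marker => '_dt_removed',
        :insertion_marker => 'dt_first_seen', :update_marker => 'dt_last_update' }
normalized = normalize_internal_keys(raw)

normalized[:domain]           # => "_domain"
normalized[:deletion_marker]  # => "_dt_removed" (already prefixed, unchanged)
```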

#extend_data(args) ⇒ Object

Returns a new Assimilate::Extender bound to this catalog; args are passed through to the extender.



# File 'lib/assimilate/catalog.rb', line 49

def extend_data(args)
  Assimilate::Extender.new(args.merge(:catalog => self))
end

#start_batch(args) ⇒ Object

Returns a new Assimilate::Batch bound to this catalog; args are passed through to the batch.



# File 'lib/assimilate/catalog.rb', line 45

def start_batch(args)
  Assimilate::Batch.new(args.merge(:catalog => self))
end

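#where(params) ⇒ Object

Finds catalog records matching params. Returns the record itself when exactly one matches; otherwise returns an array of the matches, which is empty when nothing matches.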



# File 'lib/assimilate/catalog.rb', line 53

def where(params)
  records = @catalog.find(params).to_a #.map {|rec| rec.select {|k,v| k !~ /^_/}}
  if records.count == 1
    records.first
  else
    records
  end
end
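
Because the return value is a single record in one case and an array in the others, callers that iterate over results may want to normalize it first. The helper below is a hypothetical sketch, not part of Assimilate, and uses plain hashes in place of records fetched from MongoDB.

```ruby
# Hypothetical helper, not part of Assimilate: always hand back an array.
# Array(result) is avoided because it would split a single Hash into
# key/value pairs.
def as_list(result)
  result.is_a?(Array) ? result : [result]
end

one  = { 'name' => 'Ada' }                         # shape of a single match
many = [{ 'name' => 'Ada' }, { 'name' => 'Bob' }]  # shape of several matches

as_list(one).length   # => 1
as_list(many).length  # => 2
as_list([]).length    # => 0
```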