Class: LogStash::Inputs::CloudStorage::ProcessedDb

Inherits:
Object
Defined in:
lib/logstash/inputs/cloud_storage/processed_db.rb

Overview

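ProcessedDb tracks which files, and which generations of those files, have already been processed. Each blob's generation and name are joined with a "|" separator and the result is hashed with SHA-1. The directory structure is Git-like: the first 3 characters of the hex digest are used as a top-level directory, and the remaining 37 characters are used as a directory name within it. A blob counts as processed when that directory exists. Splitting the digest this way keeps the number of entries in any single directory manageable.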
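
The layout described above can be tried in isolation. The following is a minimal sketch rather than the plugin's own code path: FakeBlob, sketch_path, and the /tmp/processed_db directory are hypothetical names chosen for illustration, and the key format is assumed to match the encode_path listing further down this page.

```ruby
require 'digest'

# Hypothetical stand-in for a Cloud Storage blob; only the two
# attributes that feed the key are modelled here.
FakeBlob = Struct.new(:name, :generation)

# Illustrative helper (not part of the plugin) that mirrors the
# prefix/suffix split described in the overview.
def sketch_path(db_directory, blob)
  key = "#{blob.generation}|#{blob.name}"
  digest = Digest::SHA1.hexdigest(key) # 40 hex characters
  File.join(db_directory, digest[0, 3], digest[3..-1])
end

blob = FakeBlob.new('logs/app.log', 1_234_567_890)
path = sketch_path('/tmp/processed_db', blob)
puts path
```

Because the prefix is 3 hex characters, there can be at most 16^3 = 4096 top-level directories, however many blobs are recorded.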

Instance Method Summary

Constructor Details

#initialize(db_directory) ⇒ ProcessedDb

Returns a new instance of ProcessedDb.



# File 'lib/logstash/inputs/cloud_storage/processed_db.rb', line 28

def initialize(db_directory)
  @db_directory = db_directory
end

Instance Method Details

#already_processed?(blob) ⇒ Boolean

Checks whether the path that encode_path derives for the blob already exists on disk.

Returns:

  • (Boolean)


# File 'lib/logstash/inputs/cloud_storage/processed_db.rb', line 32

def already_processed?(blob)
  path = encode_path(blob)
  ::File.exist?(path)
end

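#encode_path(blob) ⇒ Object

Builds the on-disk path for a blob from the SHA-1 hex digest of its generation and name, split into a 3-character directory and a 37-character subdirectory under the database directory.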



# File 'lib/logstash/inputs/cloud_storage/processed_db.rb', line 42

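def encode_path(blob)
  # The generation is part of the key, so a new generation of the same
  # file name maps to a different path.
  key = "#{blob.generation}|#{blob.name}"
  encoded = Digest::SHA1.hexdigest(key)
  # First 3 hex characters: top-level directory; remaining 37: entry name.
  prefix = encoded.slice(0, 3)
  suffix = encoded.slice(3..-1)

  ::File.join(@db_directory, prefix, suffix)
end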

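#mark_processed(blob) ⇒ Object

Records the blob as processed by creating a directory at its encoded path, along with any missing parent directories.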



# File 'lib/logstash/inputs/cloud_storage/processed_db.rb', line 37

def mark_processed(blob)
  path = encode_path(blob)
  FileUtils.mkdir_p(path)
end
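
To show how the three methods fit together, here is a hedged round-trip sketch. ProcessedDbSketch is a simplified, un-namespaced copy of the methods listed above so that it runs without Logstash installed, and DemoBlob is a hypothetical stand-in for the real blob object; neither name exists in the plugin.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Simplified copy of the methods shown above (assumption: the real
# class behaves the same once its file's requires are loaded).
class ProcessedDbSketch
  def initialize(db_directory)
    @db_directory = db_directory
  end

  def already_processed?(blob)
    ::File.exist?(encode_path(blob))
  end

  def mark_processed(blob)
    FileUtils.mkdir_p(encode_path(blob))
  end

  def encode_path(blob)
    encoded = Digest::SHA1.hexdigest("#{blob.generation}|#{blob.name}")
    ::File.join(@db_directory, encoded.slice(0, 3), encoded.slice(3..-1))
  end
end

# Hypothetical blob type exposing the two attributes the class reads.
DemoBlob = Struct.new(:name, :generation)

# Use a throwaway directory so the sketch leaves nothing behind.
results = Dir.mktmpdir do |dir|
  db = ProcessedDbSketch.new(dir)
  v1 = DemoBlob.new('logs/app.log', 1)
  v2 = DemoBlob.new('logs/app.log', 2)

  before = db.already_processed?(v1)
  db.mark_processed(v1)
  db.mark_processed(v1) # mkdir_p does not fail if the directory exists
  [before, db.already_processed?(v1), db.already_processed?(v2)]
end

p results
```

In this sketch the second generation of the same file name is still reported as unprocessed, because the generation is part of the hashed key.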