Class: LogStash::Inputs::CloudStorage::ProcessedDb

Inherits:
Object
Defined in:
lib/logstash/inputs/cloud_storage/processed_db.rb

Overview

ProcessedDb tracks the files and generations that have already been processed. Each blob's generation and name are joined with a "|" separator and the result is SHA1 hashed. The directory structure is git-like: the first 3 characters of the hex digest are used as a top-level directory, and the remaining 37 characters are used as a directory name within it. This caps the top level at 4,096 (16^3) directories, which keeps the directory count manageable.
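The layout described above can be sketched in a few lines. This is only an illustration: the generation, blob name, and database directory below are hypothetical values, not ones taken from the plugin.

```ruby
require 'digest/sha1'

# Hypothetical inputs, chosen only for illustration.
generation = 1
name = 'logs/app.log'
db_directory = '/tmp/processed_db'

# Generation and name are joined with "|" and hashed.
key = "#{generation}|#{name}"
encoded = Digest::SHA1.hexdigest(key) # 40 hex characters

# Git-like split: 3 characters for the top-level directory, 37 for the entry.
prefix = encoded[0, 3]
suffix = encoded[3..-1]

path = File.join(db_directory, prefix, suffix)
puts path
```

Because the prefix is three hex characters, there can be at most 16^3 = 4,096 top-level directories regardless of how many blobs are tracked.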

Instance Method Summary

Constructor Details

#initialize(db_directory) ⇒ ProcessedDb

Returns a new instance of ProcessedDb.



# File 'lib/logstash/inputs/cloud_storage/processed_db.rb', line 15

def initialize(db_directory)
  @db_directory = db_directory
end

Instance Method Details

#already_processed?(blob) ⇒ Boolean

Returns:

  • (Boolean)


# File 'lib/logstash/inputs/cloud_storage/processed_db.rb', line 19

def already_processed?(blob)
  path = encode_path(blob)
  ::File.exist?(path)
end

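#encode_path(blob) ⇒ Object

Builds the marker path for a blob from the SHA1 hex digest of its generation and name. The value returned is the String produced by File.join.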



# File 'lib/logstash/inputs/cloud_storage/processed_db.rb', line 29

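def encode_path(blob)
  # The generation is part of the key, so each generation of a file is tracked separately.
  key = "#{blob.generation}|#{blob.name}"
  encoded = Digest::SHA1.hexdigest(key)
  # Git-like layout: first 3 hex characters, then the remaining 37.
  prefix = encoded.slice(0, 3)
  suffix = encoded.slice(3..-1)

  ::File.join(@db_directory, prefix, suffix)
end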

#mark_processed(blob) ⇒ Object

Records a blob as processed by creating its marker directory.



# File 'lib/logstash/inputs/cloud_storage/processed_db.rb', line 24

def mark_processed(blob)
  path = encode_path(blob)
  # Creates the marker directory and any missing parents; does nothing if it already exists.
  FileUtils.mkdir_p(path)
end
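Usage

A minimal sketch of how the three methods fit together. To keep it runnable on its own, the class is repeated here without the LogStash namespace. FakeBlob is a hypothetical stand-in for whatever blob object the plugin passes in; the only assumption is that it responds to name and generation, as encode_path requires.

```ruby
require 'digest/sha1'
require 'fileutils'
require 'tmpdir'

# Standalone copy of the class documented above (namespace omitted for the sketch).
class ProcessedDb
  def initialize(db_directory)
    @db_directory = db_directory
  end

  def already_processed?(blob)
    ::File.exist?(encode_path(blob))
  end

  def mark_processed(blob)
    FileUtils.mkdir_p(encode_path(blob))
  end

  def encode_path(blob)
    key = "#{blob.generation}|#{blob.name}"
    encoded = Digest::SHA1.hexdigest(key)
    ::File.join(@db_directory, encoded.slice(0, 3), encoded.slice(3..-1))
  end
end

# Hypothetical blob stand-in: anything responding to #name and #generation works.
FakeBlob = Struct.new(:name, :generation)

results = Dir.mktmpdir do |dir|
  db = ProcessedDb.new(dir)
  blob = FakeBlob.new('logs/app.log', 1)
  newer = FakeBlob.new('logs/app.log', 2) # same file name, later generation

  before = db.already_processed?(blob)
  db.mark_processed(blob)

  [before, db.already_processed?(blob), db.already_processed?(newer)]
end

p results
```

A later generation of the same file hashes to a different path, so it is treated as unprocessed until it is marked in turn.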