Class: Remi::Extractor::LocalFile

Inherits:
FileSystem
Defined in:
lib/remi/data_subjects/local_file.rb

Overview

Local file extractor. Used to "extract" a file from a local filesystem. Even though the file is local, the path is still given with the remote_path parameter, which keeps this class consistent with Remi::FileSystem.

Examples:


class MyJob < Remi::Job
  source :some_file do
    extractor Remi::Extractor::LocalFile.new(
      remote_path: 'some_file.csv'
    )
    parser Remi::Parser::CsvFile.new(
      csv_options: {
        headers: true,
        col_sep: '|'
      }
    )
  end
end

job = MyJob.new
job.some_file.df
# => #<Daru::DataFrame:70153153438500 @name = 4c59cfdd-7de7-4264-8666-83153f46a9e4 @size = 3>
#                    id       name
#          0          1     Albert
#          1          2      Betsy
#          2          3       Camu

Instance Attribute Summary

Attributes inherited from FileSystem

#created_within, #group_by, #local_path, #most_recent_by, #most_recent_only, #pattern, #remote_path

Attributes inherited from Remi::Extractor

#logger

Instance Method Summary

Methods inherited from FileSystem

#entries, #get_created_within, #matching_entries, #most_recent_matching_entry, #most_recent_matching_entry_in_group

Constructor Details

#initialize(*args, **kargs) ⇒ LocalFile

Returns a new instance of LocalFile.



# File 'lib/remi/data_subjects/local_file.rb', line 32

def initialize(*args, **kargs)
  super
  init_local_file(*args, **kargs)
end

Instance Method Details

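#all_entries ⇒ Array<Extractor::FileSystemEntry>

Returns the list of objects in the remote path, memoized after the first call.

Returns:

  • (Array<Extractor::FileSystemEntry>)

    List of objects in the remote path.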



# File 'lib/remi/data_subjects/local_file.rb', line 44

def all_entries
  @all_entries ||= all_entries!
end
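
The pairing of #all_entries with #all_entries! can be sketched in isolation. The class below is a hypothetical stand-in, not part of Remi: CachedListing and its scans counter exist only to show how `||=` caches the first result. As in the method bodies listed on this page, the bang method returns a fresh listing but does not overwrite the cached one.

```ruby
# Hypothetical class (not part of Remi) illustrating the
# all_entries / all_entries! pairing.
class CachedListing
  # Number of times the underlying listing was rebuilt.
  attr_reader :scans

  def initialize
    @scans = 0
  end

  # Memoized: the listing is built once and then reused.
  def all_entries
    @all_entries ||= all_entries!
  end

  # Unmemoized: rebuilds the listing on every call.
  # It does not assign @all_entries, so the cache is left alone.
  def all_entries!
    @scans += 1
    ["entry-#{@scans}"]
  end
end

listing = CachedListing.new
listing.all_entries   # builds the listing
listing.all_entries   # served from the cache
puts listing.scans    # prints 1
```

Under this pattern, a caller that needs an up-to-date listing calls the bang method directly each time.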

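#all_entries! ⇒ Array<Extractor::FileSystemEntry>

Returns the list of objects in the remote path, rebuilt from the filesystem on every call. If remote_path is a directory, the regular files directly inside it are listed; otherwise remote_path is passed to Dir[] as a path or glob pattern. Entries that are not regular files are skipped.

Returns:

  • (Array<Extractor::FileSystemEntry>)

    List of objects in the remote path.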



# File 'lib/remi/data_subjects/local_file.rb', line 49

def all_entries!
  dir = @remote_path.directory? ? @remote_path + '*' : @remote_path
  Dir[dir].map do |entry|
    path = Pathname.new(entry)
    if path.file?
      Extractor::FileSystemEntry.new(
        pathname: path.realpath.to_s,
        create_time: path.ctime,
        modified_time: path.mtime
      )
    end
  end.compact
end
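
The listing logic above can be tried outside Remi with a minimal sketch that uses only the Ruby standard library. Entry and list_entries are hypothetical names introduced for illustration; Entry stands in for Extractor::FileSystemEntry and carries the same three fields used in the method body.

```ruby
require 'pathname'
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for Extractor::FileSystemEntry (not the Remi class).
Entry = Struct.new(:pathname, :create_time, :modified_time, keyword_init: true)

# Mirrors the logic of all_entries! for an arbitrary path or glob pattern.
def list_entries(remote_path)
  remote_path = Pathname.new(remote_path)
  # A directory is expanded to "<dir>/*"; anything else is globbed as given.
  glob = remote_path.directory? ? remote_path + '*' : remote_path
  Dir[glob.to_s].map do |entry|
    path = Pathname.new(entry)
    next unless path.file? # skip subdirectories and other non-files
    Entry.new(
      pathname: path.realpath.to_s,
      create_time: path.ctime,
      modified_time: path.mtime
    )
  end.compact
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, 'a.csv'))
  FileUtils.touch(File.join(dir, 'b.csv'))
  Dir.mkdir(File.join(dir, 'subdir'))

  # Only the two regular files are listed; the subdirectory is dropped.
  puts list_entries(dir).map { |e| File.basename(e.pathname) }.sort.inspect
end
```

Because the glob is `*` rather than `**/*`, files inside subdirectories are not picked up.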

#extract ⇒ Array<String>

Called to extract files from the source filesystem.

Returns:

  • (Array<String>)

    An array of paths to local copies of the extracted files



# File 'lib/remi/data_subjects/local_file.rb', line 39

def extract
  entries.map(&:pathname)
end
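
As a rough illustration of what #extract returns, the sketch below uses hypothetical stand-ins (StubEntry and StubExtractor are not Remi classes). It assumes #entries yields objects that respond to pathname; any filtering done by the inherited FileSystem#entries is not shown on this page and is not modeled here.

```ruby
# Hypothetical stand-in for Extractor::FileSystemEntry; only pathname matters here.
StubEntry = Struct.new(:pathname)

# Hypothetical stand-in for an extractor whose entries are already known.
class StubExtractor
  attr_reader :entries

  def initialize(entries)
    @entries = entries
  end

  # Same body as LocalFile#extract: collect each entry's pathname string.
  def extract
    entries.map(&:pathname)
  end
end

extractor = StubExtractor.new(
  [StubEntry.new('/data/a.csv'), StubEntry.new('/data/b.csv')]
)
puts extractor.extract.inspect # prints ["/data/a.csv", "/data/b.csv"]
```

Since the method body only maps over entries, no file is copied: the returned strings are the pathname values stored on each entry.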