Class: Remi::DataSource::Postgres

Inherits:
Remi::DataSource
Includes:
Remi::DataSubject::Postgres
Defined in:
lib/remi/data_subject/postgres.rb

Instance Attribute Summary

Attributes inherited from Remi::DataSubject

#fields

Instance Method Summary

Methods included from Remi::DataSubject::Postgres

#connection

Methods inherited from Remi::DataSource

#df, #extract

Methods included from Testing::DataStub

#empty_stub_df, #stub_boolean, #stub_date, #stub_datetime, #stub_decimal, #stub_df, #stub_float, #stub_integer, #stub_json, #stub_row_array, #stub_string, #stub_values

Methods inherited from Remi::DataSubject

#df, #df=, #enforce_types, #field_symbolizer

Constructor Details

#initialize(*args, **kargs, &block) ⇒ Postgres

Returns a new instance of Postgres.



# File 'lib/remi/data_subject/postgres.rb', line 20

def initialize(*args, **kargs, &block)
  super
  init_postgres(*args, **kargs, &block)
end
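The constructor pattern above (a bare `super` followed by a mixin initializer) can be sketched in isolation. This is a minimal illustration, not Remi's implementation: `BaseSource`, `PostgresSetup`, and `PostgresSource` are hypothetical stand-ins, and the `query:` keyword is an assumption about what `init_postgres` might accept.

```ruby
# Hypothetical stand-in for the parent class (Remi::DataSource in the real code).
class BaseSource
  attr_reader :base_args

  def initialize(*args, **kargs, &block)
    # Record what the parent received so the forwarding can be inspected.
    @base_args = [args, kargs]
  end
end

# Hypothetical stand-in for the included module (Remi::DataSubject::Postgres).
module PostgresSetup
  attr_reader :query

  # Assumption: the initializer picks a :query option out of the keywords.
  def init_postgres(*args, query: nil, **kargs, &block)
    @query = query
  end
end

class PostgresSource < BaseSource
  include PostgresSetup

  def initialize(*args, **kargs, &block)
    super # bare super re-sends args, kargs, and the block unchanged
    init_postgres(*args, **kargs, &block)
  end
end

source = PostgresSource.new(:orders, query: 'SELECT 1')
```

Because `super` is written without parentheses, the parent sees exactly the same arguments as the subclass, so both the parent and the mixin initializer can pick out the options they care about.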

Instance Method Details

#extract! ⇒ Object

Public: Called to extract data from the source by executing the configured query over the Postgres connection.

Returns data in a format that can be used to create a dataframe.



# File 'lib/remi/data_subject/postgres.rb', line 28

def extract!
  @logger.info "Executing query #{@query}"
  @extract = connection.exec @query
end

#to_dataframe ⇒ Object

Public: Converts extracted data to a dataframe. Currently only supports Daru DataFrames.

Returns a Remi::DataFrame.



# File 'lib/remi/data_subject/postgres.rb', line 37

def to_dataframe
  # Performance for larger sets could be improved by using bulk query (via COPY)
  @logger.info "Converting query to a dataframe"

  hash_array = {}
  extract.each do |row|
    row.each do |field, value|
      (hash_array[field_symbolizer.call(field)] ||= []) << value
    end
  end

  # After converting to DF, clear the PG results to save memory.
  extract.clear

  Remi::DataFrame.create(@remi_df_type, hash_array, order: hash_array.keys)
end
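The row-to-column pivot that `to_dataframe` performs can be tried without a database or Remi itself. The sketch below assumes each row arrives as a hash keyed by field name with string values, and it uses a simple `to_sym` lambda as a stand-in for `field_symbolizer`. `rows_to_columns` is a hypothetical helper name, not part of Remi.

```ruby
# Sample rows shaped like a query result: one hash per row, string keys.
rows = [
  { 'id' => '1', 'name' => 'alice' },
  { 'id' => '2', 'name' => 'bob' }
]

# Assumption: the symbolizer maps a field name to a symbol.
field_symbolizer = ->(field) { field.to_sym }

# Hypothetical helper: collect each field's values into one array per column.
def rows_to_columns(rows, symbolizer)
  columns = {}
  rows.each do |row|
    row.each do |field, value|
      # Create the column array on first sight of the field, then append.
      (columns[symbolizer.call(field)] ||= []) << value
    end
  end
  columns
end

columns = rows_to_columns(rows, field_symbolizer)
# columns is { id: ['1', '2'], name: ['alice', 'bob'] }
```

Ruby hashes preserve insertion order, so `columns.keys` lists the fields in the order they first appear in the rows. That is the order passed as `order:` when the dataframe is created.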