Class: Remi::DataSubject
- Inherits:
- Object
- Defined in:
- lib/remi/data_subject.rb
Overview
The DataSubject is the parent class for DataSource and DataTarget. It is not intended to be used as a standalone class.
A DataSubject is either a source or a target. It is largely used to associate a dataframe with a set of "fields" containing metadata describing how the vectors of the dataframe are meant to be interpreted. For example, one of the fields might represent a date with MM-DD-YYYY format.
DataSubjects can be defined either using the standard DataSubject.new(<args>)
convention or through a DSL, which is convenient for data subjects defined
as part of a job class definition.
Direct Known Subclasses
DataSource, DataTarget
Defined Under Namespace
Modules: CsvFile, Postgres, Salesforce
Instance Attribute Summary
-
#context ⇒ Object
Returns the value of attribute context.
-
#name ⇒ Object
Returns the value of attribute name.
Instance Method Summary
-
#df ⇒ Remi::DataFrame
The dataframe associated with this DataSubject.
-
#df=(new_dataframe) ⇒ Remi::DataFrame
Reassigns the dataframe associated with this DataSubject.
-
#df_type(arg = nil) ⇒ Symbol
The type of dataframe (defaults to :daru if not explicitly set).
-
#dsl_eval ⇒ self
Defines the subject using the DSL in the block provided.
-
#dsl_eval! ⇒ Object
-
#enforce_types(*types) ⇒ self
Enforces the types defined in the field metadata.
-
#field_symbolizer(arg = nil) ⇒ Proc
Field symbolizer used to convert field names into symbols.
-
#fields(arg = nil) ⇒ Remi::Fields
The field metadata for this data subject.
-
#fields=(arg) ⇒ Remi::Fields
Sets the field metadata for this data subject.
-
#initialize(context = nil, name: 'NOT DEFINED', **kargs, &block) ⇒ DataSubject
constructor
A new instance of DataSubject.
Constructor Details
#initialize(context = nil, name: 'NOT DEFINED', **kargs, &block) ⇒ DataSubject
Returns a new instance of DataSubject.
# File 'lib/remi/data_subject.rb', line 19
def initialize(context=nil, name: 'NOT DEFINED', **kargs, &block)
  @context = context
  @name = name
  @block = block
  @df_type = :daru
  @fields = Remi::Fields.new
  @field_symbolizer = Remi::FieldSymbolizers[:standard]
end
Instance Attribute Details
#context ⇒ Object
Returns the value of attribute context.
# File 'lib/remi/data_subject.rb', line 28
def context
  @context
end
#name ⇒ Object
Returns the value of attribute name.
# File 'lib/remi/data_subject.rb', line 28
def name
  @name
end
Instance Method Details
#df ⇒ Remi::DataFrame
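Returns the dataframe associated with this DataSubject. If none has been assigned, an empty dataframe of type df_type is created on first access, with its vectors ordered by the field names.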
# File 'lib/remi/data_subject.rb', line 66
def df
  @dataframe ||= Remi::DataFrame.create(df_type, [], order: fields.keys)
end
#df=(new_dataframe) ⇒ Remi::DataFrame
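Reassigns the dataframe associated with this DataSubject. An object that responds to df_type is stored as is; anything else is first converted with Remi::DataFrame.create using the current df_type.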
# File 'lib/remi/data_subject.rb', line 73
def df=(new_dataframe)
  if new_dataframe.respond_to? :df_type
    @dataframe = new_dataframe
  else
    @dataframe = Remi::DataFrame.create(df_type, new_dataframe)
  end
end
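The lazy default in #df and the wrap-on-assignment check in #df= can be tried out in isolation. The following is a minimal sketch, not Remi's implementation: SketchFrame and LazyFrameSubject are hypothetical stand-ins, and SketchFrame.new stands in for Remi::DataFrame.create, whose behavior is not shown on this page.

```ruby
# Hypothetical stand-in for a dataframe wrapper; not Remi::DataFrame.
class SketchFrame
  attr_reader :rows

  def initialize(rows = [])
    @rows = rows
  end

  # Marker method, mirroring the respond_to?(:df_type) check in #df=.
  def df_type
    :sketch
  end
end

# Hypothetical stand-in for a data subject.
class LazyFrameSubject
  # Builds an empty frame on first access and reuses it afterwards.
  def df
    @dataframe ||= SketchFrame.new
  end

  # Stores frame-like objects as is and wraps anything else.
  def df=(new_dataframe)
    @dataframe =
      if new_dataframe.respond_to?(:df_type)
        new_dataframe
      else
        SketchFrame.new(new_dataframe)
      end
  end
end

subject = LazyFrameSubject.new
subject.df                            # empty frame, created on first access
subject.df = [{ id: 1 }, { id: 2 }]   # raw rows are wrapped in a SketchFrame
subject.df = SketchFrame.new([])      # an existing frame is stored unchanged
```

The duck-typed respond_to? check lets callers assign either a ready-made frame or raw data without the subject needing to know the concrete frame class.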
#df_type(arg = nil) ⇒ Symbol
Returns the type of dataframe (defaults to :daru if not explicitly set). When called with an argument, sets the type instead.
# File 'lib/remi/data_subject.rb', line 33
def df_type(arg = nil)
  return get_df_type unless arg
  set_df_type arg
end
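The combined getter/setter shape used by #df_type (and by #fields) reads naturally inside a DSL block, where a plain assignment would create a local variable instead. The following is a minimal sketch of the idiom only: TypedSubject is a hypothetical class, and it reads and writes the instance variable directly because the get_df_type and set_df_type helpers are not shown on this page.

```ruby
# Hypothetical stand-in illustrating the combined getter/setter idiom.
class TypedSubject
  def initialize
    @df_type = :daru
  end

  # With no argument, reads the current type; with one, sets it.
  def df_type(arg = nil)
    return @df_type unless arg
    @df_type = arg
  end
end

subject = TypedSubject.new
subject.df_type            # => :daru
subject.df_type :custom    # sets the type
subject.df_type            # => :custom
```

One consequence of the `unless arg` guard is that nil and false cannot be set through this form, since both are treated as "no argument given".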
#dsl_eval ⇒ self
Defines the subject using the DSL in the block provided. The block is evaluated at most once; later calls return self without evaluating it again.
# File 'lib/remi/data_subject.rb', line 103
def dsl_eval
  dsl_eval! unless @dsl_evaluated
  @dsl_evaluated = true
  self
end
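#dsl_eval! ⇒ Object
Evaluates the DSL block unconditionally, even if it has already been evaluated. Returns self if no block was given.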
# File 'lib/remi/data_subject.rb', line 109
def dsl_eval!
  return self unless @block
  Dsl.dsl_eval(self, @context, &@block)
end
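The difference between #dsl_eval and #dsl_eval! is easiest to see by counting evaluations. The following is a minimal sketch under stated assumptions: OnceEvalSubject, its label method, and eval_count are hypothetical, and instance_exec stands in for Dsl.dsl_eval, whose implementation is not shown on this page. Unlike the original, the sketch's dsl_eval! always returns self.

```ruby
# Hypothetical stand-in; counts how often the stored block runs.
class OnceEvalSubject
  attr_reader :eval_count

  def initialize(&block)
    @block = block
    @eval_count = 0
  end

  # Evaluates the stored block on the first call only.
  def dsl_eval
    dsl_eval! unless @dsl_evaluated
    @dsl_evaluated = true
    self
  end

  # Evaluates the stored block every time it is called.
  def dsl_eval!
    return self unless @block
    @eval_count += 1
    instance_exec(&@block) # assumption: stands in for Dsl.dsl_eval
    self
  end

  # Example DSL method (hypothetical) that the block can call.
  def label(arg = nil)
    return @label unless arg
    @label = arg
  end
end

subject = OnceEvalSubject.new { label 'customers' }
subject.dsl_eval.dsl_eval   # block runs once; the second call is a no-op
subject.eval_count          # => 1
subject.dsl_eval!           # forces another evaluation
subject.eval_count          # => 2
```

Deferring evaluation this way lets a subject be declared early, for example in a class body, while the block itself runs only when the subject is first needed.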
#enforce_types(*types) ⇒ self
Enforces the types defined in the field metadata and raises an error if a data element does not conform to its type. For example, if a field has metadata with type: :date, the type enforcer converts the data in that field into dates and raises an error if it is unable to parse any of the values. If types are given, only fields whose type is in that list are processed; fields with no matching vector in the dataframe are skipped.
# File 'lib/remi/data_subject.rb', line 90
def enforce_types(*types)
  sttm = SourceToTargetMap.new(df, source_metadata: fields)
  fields.keys.each do |field|
    next unless (types.size == 0 || types.include?(fields[field][:type])) && df.vectors.include?(field)
    sttm.source(field).target(field).transform(Remi::Transform::EnforceType.new).execute
  end

  self
end
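The selection logic above (optional type filter, skip fields missing from the frame) can be illustrated without Remi. The following is a minimal sketch, not the library's type enforcer: the frame is a plain hash of arrays, sketch_enforce_types and SKETCH_FIELDS are hypothetical names, and the :in_format metadata key is an assumption made for this example.

```ruby
require 'date'

# Hypothetical field metadata; :in_format is an assumed key name.
SKETCH_FIELDS = {
  id: { type: :integer },
  birth_date: { type: :date, in_format: '%m-%d-%Y' },
  note: {}
}.freeze

# Converts the columns of a hash-of-arrays "frame" in place.
# Raises ArgumentError (or a subclass) if a value cannot be converted.
def sketch_enforce_types(frame, fields, *types)
  fields.each do |name, meta|
    type = meta[:type]
    # Same filter shape as above: optional type list, field must exist.
    next unless (types.empty? || types.include?(type)) && frame.key?(name)

    frame[name] = frame[name].map do |value|
      case type
      when :date    then Date.strptime(value, meta.fetch(:in_format))
      when :integer then Integer(value)
      else value
      end
    end
  end
  frame
end

frame = { id: %w[1 2], birth_date: %w[01-31-1990 12-25-2001] }
sketch_enforce_types(frame, SKETCH_FIELDS, :date)
frame[:birth_date].first   # => a Date for 1990-01-31
frame[:id]                 # => ["1", "2"], untouched by the :date filter
```

Passing no types converts every field that has a matching column, which mirrors the `types.size == 0` branch in the listing above.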
#field_symbolizer(arg = nil) ⇒ Proc
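Field symbolizer used to convert field names into symbols. With no argument, returns the current symbolizer. With an argument, sets the symbolizer for the data subject and also for any associated parsers and encoders: a Symbol is looked up in Remi::FieldSymbolizers, and any other value is used as the symbolizer directly.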
# File 'lib/remi/data_subject.rb', line 56
def field_symbolizer(arg = nil)
  return @field_symbolizer unless arg
  @field_symbolizer = if arg.is_a? Symbol
    Remi::FieldSymbolizers[arg]
  else
    arg
  end
end
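The symbol-or-callable lookup can be sketched as follows. This is an illustration only: SKETCH_SYMBOLIZERS and SymbolizerSubject are hypothetical, and the :standard lambda here is an assumed normalization, not the behavior of Remi::FieldSymbolizers[:standard].

```ruby
# Hypothetical registry; Remi's own symbolizers may normalize differently.
SKETCH_SYMBOLIZERS = {
  standard: ->(name) { name.to_s.strip.downcase.gsub(/\W+/, '_').to_sym }
}.freeze

# Hypothetical stand-in for a data subject.
class SymbolizerSubject
  def initialize
    @field_symbolizer = SKETCH_SYMBOLIZERS[:standard]
  end

  # A Symbol selects a registered symbolizer; anything else is used directly.
  def field_symbolizer(arg = nil)
    return @field_symbolizer unless arg
    @field_symbolizer = arg.is_a?(Symbol) ? SKETCH_SYMBOLIZERS.fetch(arg) : arg
  end
end

subject = SymbolizerSubject.new
subject.field_symbolizer.call(' First Name ')   # => :first_name

# Swap in a custom callable instead of a registered name.
subject.field_symbolizer(->(name) { name.upcase.to_sym })
subject.field_symbolizer.call('id')             # => :ID
```

The sketch uses fetch so that an unknown symbol fails loudly; the listing above uses plain [] lookup, which would return whatever the registry gives for a missing key.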
#fields(arg = nil) ⇒ Remi::Fields
Returns the field metadata for this data subject. When called with an argument, sets the field metadata instead.
# File 'lib/remi/data_subject.rb', line 40
def fields(arg = nil)
  return get_fields unless arg
  set_fields arg
end
#fields=(arg) ⇒ Remi::Fields
Sets the field metadata for this data subject, wrapping arg in a new Remi::Fields.
# File 'lib/remi/data_subject.rb', line 47
def fields=(arg)
  @fields = Remi::Fields.new(arg)
end