Module: Chicago::ETL
Defined in:
lib/chicago/etl.rb,
lib/chicago/etl/sink.rb,
lib/chicago/etl/batch.rb,
lib/chicago/etl/stage.rb,
lib/chicago/etl/tasks.rb,
lib/chicago/etl/errors.rb,
lib/chicago/etl/filter.rb,
lib/chicago/etl/counter.rb,
lib/chicago/etl/pipeline.rb,
lib/chicago/etl/null_sink.rb,
lib/chicago/etl/array_sink.rb,
lib/chicago/etl/stage_name.rb,
lib/chicago/etl/key_builder.rb,
lib/chicago/etl/array_source.rb,
lib/chicago/etl/stage_builder.rb,
lib/chicago/etl/table_builder.rb,
lib/chicago/etl/dataset_source.rb,
lib/chicago/etl/transformation.rb,
lib/chicago/etl/dataset_builder.rb,
lib/chicago/etl/mysql_file_sink.rb,
lib/chicago/etl/task_invocation.rb,
lib/chicago/etl/transformations.rb,
lib/chicago/etl/pipeline_endpoint.rb,
lib/chicago/etl/load_dataset_builder.rb,
lib/chicago/etl/transformation_chain.rb,
lib/chicago/etl/mysql_file_serializer.rb,
lib/chicago/etl/screens/column_screen.rb,
lib/chicago/etl/screens/missing_value.rb,
lib/chicago/etl/screens/out_of_bounds.rb,
lib/chicago/etl/screens/invalid_element.rb,
lib/chicago/etl/sequel/dependant_tables.rb,
lib/chicago/etl/row_transformation_stage.rb,
lib/chicago/etl/schema_table_sink_factory.rb,
lib/chicago/etl/schema_table_stage_builder.rb,
lib/chicago/etl/sequel/filter_to_etl_batch.rb,
lib/chicago/etl/transformations/uk_post_code.rb,
lib/chicago/etl/transformations/deduplicate_rows.rb,
lib/chicago/etl/transformations/uk_post_code_field.rb,
lib/chicago/etl/schema_sinks_and_transformations_builder.rb
Overview
Contains classes related to ETL processing.
Defined Under Namespace
Modules: Screens, SequelExtensions, Transformations
Classes: ArraySink, ArraySource, Batch, Counter, DatasetBuilder, DatasetSource, DeduplicateRows, Error, ExistingHashColumnKeyBuilder, FactKeyBuilder, Filter, HashingKeyBuilder, IdentifiableDimensionKeyBuilder, KeyBuilder, LoadDatasetBuilder, LoadDimensionStageBuilder, LoadFactStageBuilder, MysqlFileSerializer, MysqlFileSink, NullSink, Pipeline, PipelineEndpoint, RaisingErrorHandler, RakeTasks, RowTransformationStage, SchemaSinksAndTransformationsBuilder, SchemaTableSinkFactory, SchemaTableStageBuilder, Sink, Stage, StageBuilder, StageName, TableBuilder, TaskInvocation, Transformation, TransformationChain
Constant Summary
- STREAM = :_stream
  The key used to store the stream in the row.
  This constant is part of a private API. Avoid using it if possible, as it may be removed or changed in the future.
Class Method Summary
- .execute(stage, etl_batch, logger) ⇒ Object
  Executes a pipeline stage in the context of an ETL Batch.
Class Method Details
.execute(stage, etl_batch, logger) ⇒ Object
Executes a pipeline stage in the context of an ETL Batch.
Task execution status is stored in a database table of ETL task invocations, which ensures that a task isn't run more than once within a batch.
    # File 'lib/chicago/etl.rb', line 63
    def self.execute(stage, etl_batch, logger)
      etl_batch.perform_task(:load, stage.name) do
        if stage.executable?
          logger.debug "Starting executing stage: #{stage.name}"
          stage.execute etl_batch
          logger.info "Finished executing stage: #{stage.name}"
        else
          logger.info "Skipping stage #{stage.name}"
        end
      end
    end
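The control flow of .execute can be tried out without the gem or a database. The following is a minimal sketch under stated assumptions: SketchETL is a stand-in module holding a copy of the method body listed above, while FakeStage and FakeBatch are hypothetical test doubles that are not part of Chicago::ETL. In particular, FakeBatch#perform_task only imitates the documented "run at most once per batch" behaviour with an in-memory set; the real Batch records invocations in a database table and may behave differently in detail.

```ruby
require "logger"
require "set"
require "stringio"

# Stand-in module (assumption): a copy of the documented method body, so the
# sketch runs without the chicago-etl gem installed.
module SketchETL
  def self.execute(stage, etl_batch, logger)
    etl_batch.perform_task(:load, stage.name) do
      if stage.executable?
        logger.debug "Starting executing stage: #{stage.name}"
        stage.execute etl_batch
        logger.info "Finished executing stage: #{stage.name}"
      else
        logger.info "Skipping stage #{stage.name}"
      end
    end
  end
end

# Hypothetical stage: responds only to the three messages execute sends
# (name, executable?, execute) and counts how often it was run.
class FakeStage
  attr_reader :name, :runs

  def initialize(name, executable = true)
    @name = name
    @executable = executable
    @runs = 0
  end

  def executable?
    @executable
  end

  def execute(_etl_batch)
    @runs += 1
  end
end

# Hypothetical batch: remembers finished tasks in memory, standing in for
# the task invocations table used by the real Batch.
class FakeBatch
  def initialize
    @finished = Set.new
  end

  def perform_task(stage, name)
    key = [stage, name]
    return if @finished.include?(key) # already done within this batch

    yield
    @finished << key
  end
end

log_output = StringIO.new
logger = Logger.new(log_output)
batch = FakeBatch.new
stage = FakeStage.new(:load_customers)

SketchETL.execute(stage, batch, logger)
SketchETL.execute(stage, batch, logger) # no-op: task already recorded
puts stage.runs # => 1
```

Passing a stage whose executable? returns false takes the other branch: the stage is not run and only the "Skipping stage" message is logged.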