Class: Elasticrawl::JobStep
- Inherits: ActiveRecord::Base
  - Object
  - ActiveRecord::Base
  - Elasticrawl::JobStep
- Defined in: lib/elasticrawl/job_step.rb
Overview
Represents an Elastic MapReduce job flow step. For a parse job, a step processes a single Common Crawl segment. For a combine job, a single step aggregates the results of multiple parse jobs.
Instance Method Summary
- #job_flow_step(job_config) ⇒ Object
Returns a custom jar step that is configured with the jar location, class name and input and output paths.
Instance Method Details
#job_flow_step(job_config) ⇒ Object
Returns a custom jar step that is configured with the jar location, class name and input and output paths.
For parse jobs, optionally specifies the maximum number of Common Crawl data files to process before the job exits.
```ruby
# File 'lib/elasticrawl/job_step.rb', line 14

def job_flow_step(job_config)
  jar = job_config['jar']
  max_files = self.job.max_files

  step_args = []
  step_args[0] = job_config['class']
  step_args[1] = self.input_paths
  step_args[2] = self.output_path
  # All arguments must be strings.
  step_args[3] = max_files.to_s if max_files.present?

  step = Elasticity::CustomJarStep.new(jar)
  step.name = set_step_name
  step.arguments = step_args

  step
end
```
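The argument assembly in this method can be tried in isolation. The following is a minimal sketch, not part of Elasticrawl: `build_step_args` is a hypothetical helper, and the bucket paths, jar name and class name are placeholder values. It runs in plain Ruby without ActiveRecord or Elasticity, so it checks for `nil` where the original uses ActiveSupport's `present?`.

```ruby
# Hypothetical helper (not part of Elasticrawl) that mirrors how
# job_flow_step builds the step arguments: class name, input paths,
# output path, and an optional max_files value.
def build_step_args(job_config, input_paths, output_path, max_files = nil)
  step_args = [job_config['class'], input_paths, output_path]
  # All arguments must be strings, so max_files is converted when set.
  # Assumption: a nil check stands in for ActiveSupport's present?.
  step_args << max_files.to_s unless max_files.nil?
  step_args
end

# Placeholder configuration for illustration only.
job_config = {
  'jar'   => 's3://example-bucket/jar/example.jar',
  'class' => 'com.example.ParserDriver'
}

# A parse step that limits the number of data files processed.
parse_args = build_step_args(job_config,
                             's3://example-bucket/input/*',
                             's3://example-bucket/output/',
                             5)

# A step with no file limit, so no fourth argument is added.
combine_args = build_step_args(job_config,
                               's3://example-bucket/input/*',
                               's3://example-bucket/output/')
```

When `max_files` is unset the fourth argument is omitted entirely rather than passed as an empty string, so the jar receives either three or four arguments.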