Class: Sycsvpro::Join
Overview
Join joins two files based on a join key value. Example File 1 (infile)
|Name |ID |
|-----|---|
|Hank |123|
|Frank|234|
|Mia |345|
|Moira|234|
File 2 (source)
|Company|Phone|ID |
|-------|-----|---|
|Siem |4848 |123|
|Helo |993 |345|
|Wara |3333 |234|
File 3 (outfile)
|Name |ID |Company|Phone|
|-----|---|-------|-----|
|Hank |123|Siem |4848 |
|Frank|234|Wara |3333 |
|Mia |345|Helo |993 |
|Moira|234|Wara |3333 |
Constant Summary
Constants included from Dsl
Instance Attribute Summary collapse
-
#header ⇒ Object
readonly
header of the outfile.
-
#headerless ⇒ Object
readonly
indicates whether the infile is headerless.
-
#infile ⇒ Object
readonly
infile contains the data that is operated on.
-
#outfile ⇒ Object
readonly
outfile is the file where the result is written to.
-
#positions ⇒ Object
readonly
posititon where to insert the columns into the infile.
-
#row_filter ⇒ Object
readonly
filter that is used for rows.
-
#source ⇒ Object
readonly
source file from where columns are inserted into infile.
Instance Method Summary collapse
-
#execute ⇒ Object
Executes the join.
-
#initialize(options = {}) ⇒ Join
constructor
Creates a Join which can be invoked as follows :call-seq: Sycsvpro::Join.new(infile: “in.csv”, outfile: “out.csv”, source: “source.csv”, rows: “1-eof”, cols: “0,1”, pos: “2,3”, joins: “2=1”, headerless: true, header: “*”, insert_header: “Company,Phone”).execute.
Methods included from Dsl
#clean_up, #params, #rows, #split_by_comma_regex, #str2utf8, #unstring, #write_to
Constructor Details
#initialize(options = {}) ⇒ Join
Creates a Join which can be invoked as follows :call-seq:
Sycsvpro::Join.new(infile: "in.csv",
outfile: "out.csv",
source: "source.csv",
rows: "1-eof",
cols: "0,1",
pos: "2,3",
joins: "2=1",
headerless: true,
header: "*",
insert_header: "Company,Phone").execute
- infile
-
csv file to operate on
- outfile
-
csv file with the result
- source
-
csv file that contains the values to join to infile
- rows
-
rows to consider for operation. Rows that don’t match the pattern will be skipped for operation
- cols
-
columns to insert from the source to the infile
- pos
-
column positions where to insert the values and the insert_header columns
- joins
-
columns that match in infile and source. source_column=infile_column
- headerless
-
indicates whether the infile has a header (default true)
- header
-
Header of the csv file
- insert_header
-
column names of the to be inserted values
80 81 82 83 84 85 86 87 88 89 90 91 92 93 |
# File 'lib/sycsvpro/join.rb', line 80 def initialize( = {}) @infile = [:infile] @outfile = [:outfile] @source = [:source] @row_filter = RowFilter.new([:rows], df: [:df]) @positions = create_joiners([:joins], [:cols], [:pos]) @headerless = [:headerless].nil? ? false : [:headerless] @header = Header.new([:header] || '*', pos: @positions, insert: [:insert_header]) create_lookup_table end |
Instance Attribute Details
#header ⇒ Object (readonly)
header of the outfile
50 51 52 |
# File 'lib/sycsvpro/join.rb', line 50 def header @header end |
#headerless ⇒ Object (readonly)
indicates whether the infile is headerless
52 53 54 |
# File 'lib/sycsvpro/join.rb', line 52 def headerless @headerless end |
#infile ⇒ Object (readonly)
infile contains the data that is operated on
40 41 42 |
# File 'lib/sycsvpro/join.rb', line 40 def infile @infile end |
#outfile ⇒ Object (readonly)
outfile is the file where the result is written to
42 43 44 |
# File 'lib/sycsvpro/join.rb', line 42 def outfile @outfile end |
#positions ⇒ Object (readonly)
posititon where to insert the columns into the infile
48 49 50 |
# File 'lib/sycsvpro/join.rb', line 48 def positions @positions end |
#row_filter ⇒ Object (readonly)
filter that is used for rows
46 47 48 |
# File 'lib/sycsvpro/join.rb', line 46 def row_filter @row_filter end |
#source ⇒ Object (readonly)
source file from where columns are inserted into infile
44 45 46 |
# File 'lib/sycsvpro/join.rb', line 44 def source @source end |
Instance Method Details
#execute ⇒ Object
Executes the join
96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 |
# File 'lib/sycsvpro/join.rb', line 96 def execute processed_header = headerless ? true : false File.open(outfile, 'w') do |out| File.open(infile).each_with_index do |line, index| line = line.chomp next if line.empty? line = unstring(line).chomp unless processed_header header_line = header.process(line) out.puts header unless header_line.empty? processed_header = true next end next if row_filter.process(line, row: index).nil? values = line.split(';') target = values.dup @positions.sort.each { |p| target.insert(p, "") } @joiners.each do |joiner| key = values[joiner.join[1]] row = joiner.lookup[:rows][key] || [] joiner.pos.each_with_index { |p,i| target[p] = row[i] } end out.puts target.join(';') end end end |