Module: Dsl
- Included in:
- Sycsvpro::Aggregator, Sycsvpro::Calculator, Sycsvpro::Counter, Sycsvpro::Header, Sycsvpro::Join, Sycsvpro::Mapper, Sycsvpro::Merger, Sycsvpro::Profiler, Sycsvpro::RowFilter, Sycsvpro::Sorter, Sycsvpro::Table, Sycsvpro::Transposer, Sycsvpro::Unique
- Defined in:
- lib/sycsvpro/dsl.rb
Overview
Methods to be used in customer-specific script files.
Constant Summary
- COMMA_SPLITTER_REGEX =
Splits comma-separated strings whose values may themselves contain commas. Such values have to be enclosed between BEGIN and END. Example:
Year,c1+c2,c1=~/[A-Z]{1,2}/,Month
/(?<=,|^)(BEGIN.*?END|\/.*?\/|.*?)(?=,|$)/i
Instance Method Summary
- #clean_up(files) ⇒ Object
  Deletes obsolete files. Call sequence: clean_up(%w{ file1 file2 }).
- #params ⇒ Object
  Reads the arguments provided at invocation. Call sequence: params => infile, Result, other_params.
- #rows(options = {}) ⇒ Object
  Retrieves rows and columns from the file and yields them to the block provided by the caller.
- #split_by_comma_regex(values) ⇒ Object
  Retrieves the values scanned by COMMA_SPLITTER_REGEX.
- #str2utf8(str) ⇒ Object
  Removes characters that cannot be converted to UTF-8 from the string.
- #unstring(line) ⇒ Object
  Removes leading and trailing " and spaces from CSV values and collapses runs of two or more spaces between words into a single space.
- #write_to(file) ⇒ Object
  Writes values provided by a block to the given file.
Instance Method Details
#clean_up(files) ⇒ Object
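Deletes obsolete files.
Call sequence:
clean_up(%w{ file1 file2 }) -> files
The method's last expression is files.each, so the return value is the array that was passed in.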
# File 'lib/sycsvpro/dsl.rb', line 49
def clean_up(files)
  puts; print "Cleaning up directory..."
  files.each { |file| File.delete(file) }
end
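A minimal usage sketch, assuming the method body from the listing is copied into a standalone script (in Sycsvpro it becomes available by including Dsl). The temporary directory and the file names are invented for illustration only.

```ruby
require 'tmpdir'

# Copy of Dsl#clean_up from the listing, so the sketch runs without the gem
def clean_up(files)
  puts; print "Cleaning up directory..."
  files.each { |file| File.delete(file) }
end

# Hypothetical scratch files, created only for this demonstration
dir = Dir.mktmpdir
files = %w{ file1 file2 }.map { |name| File.join(dir, name) }
files.each { |file| File.write(file, "obsolete") }

returned = clean_up(files)

# Collect any file that survived the clean-up (expected: none)
remaining = files.select { |file| File.exist?(file) }
Dir.rmdir(dir)
```

File.delete raises Errno::ENOENT for a path that does not exist, so it is safest to pass only files that are known to have been created.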
#params ⇒ Object
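Reads the arguments provided at invocation. The first two command-line arguments are taken as the script and method names, the third as the input file; the process exits with an error message if the input file is missing or does not exist.
Call sequence:
params => infile, Result, other_params
Result methods are #cols, #col_count, #row_count, #sample_row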
# File 'lib/sycsvpro/dsl.rb', line 17
def params
  script = ARGV.shift
  method = ARGV.shift
  infile = ARGV.shift
  if infile.nil?
    STDERR.puts "You must provide an input file"
    exit -1
  # File.exist? is used because File.exists? was removed in Ruby 3.2
  elsif !File.exist? infile
    STDERR.puts "#{infile} does not exist. You must provide a valid input file"
    exit -1
  end
  if ARGV.empty?
    print "#{method}(#{infile})"
  else
    print "#{method}(#{infile}, #{ARGV.join(', ')})"
  end
  puts; print "Analyzing #{infile}..."
  result = Sycsvpro::Analyzer.new(infile).result
  puts; print "> #{result.col_count} cols | #{result.row_count} rows"
  [infile, result, ARGV].flatten
end
#rows(options = {}) ⇒ Object
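Retrieves rows and columns from the file and yields them to the block provided by the caller. The :infile option names the input file, the optional :row_filter option restricts the rows that are processed, and every option whose key ends in column or columns selects values to yield. Lines are split on ; and empty lines are skipped.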
# File 'lib/sycsvpro/dsl.rb', line 56
def rows(options={})
  # The method name after "File." is missing from the extracted listing;
  # expand_path is an assumption
  infile = File.expand_path(options[:infile])
  row_filter = Sycsvpro::RowFilter.new(options[:row_filter]) if options[:row_filter]
  File.new(infile).each_with_index do |line, index|
    next if line.chomp.empty?
    next if !row_filter.nil? and row_filter.process(line.chomp, row: index).nil?
    values = line.chomp.split(';')
    params = []
    options.each { |k,v| params << extract_values(values, k, v) if k =~ /column$|columns$/ }
    yield *params
  end
end
#split_by_comma_regex(values) ⇒ Object
Retrieves the values scanned by COMMA_SPLITTER_REGEX and strips the BEGIN and END markers from them.
# File 'lib/sycsvpro/dsl.rb', line 96
def split_by_comma_regex(values)
  values.scan(COMMA_SPLITTER_REGEX).flatten.each.
    collect { |h| h.gsub(/BEGIN|END/, "") }
end
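The following sketch illustrates the splitting behaviour. It assumes the constant and the method are copied verbatim from this page into a standalone script; the input strings are made up for the example.

```ruby
# Copies of the constant and the method as documented on this page
COMMA_SPLITTER_REGEX = /(?<=,|^)(BEGIN.*?END|\/.*?\/|.*?)(?=,|$)/i

def split_by_comma_regex(values)
  values.scan(COMMA_SPLITTER_REGEX).flatten.each.
    collect { |h| h.gsub(/BEGIN|END/, "") }
end

# A value wrapped in BEGIN ... END keeps its inner comma; the markers are stripped
wrapped = split_by_comma_regex("Year,BEGINc1=~/[A-Z]{1,2}/END,Month")
# => ["Year", "c1=~/[A-Z]{1,2}/", "Month"]

# A value that is entirely delimited by slashes also keeps its inner comma
slashed = split_by_comma_regex("Name,/a,b/,Month")
# => ["Name", "/a,b/", "Month"]
```

Note that the markers are removed with a plain gsub, so the literal text BEGIN or END anywhere inside a value would be stripped as well.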
#str2utf8(str) ⇒ Object
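Removes characters that cannot be converted to UTF-8 from the string. The input is transcoded from binary, so every byte outside the 7-bit ASCII range is dropped.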
# File 'lib/sycsvpro/dsl.rb', line 91
def str2utf8(str)
  str.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end
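A short sketch of the effect, assuming the method is copied from the listing into a standalone script; the sample strings are illustrative. Because the source encoding is given as binary, valid multi-byte UTF-8 characters are removed along with invalid bytes.

```ruby
# Copy of Dsl#str2utf8 from the listing
def str2utf8(str)
  str.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end

# A Latin-1 encoded e-acute (byte 0xE9) is dropped
latin1 = str2utf8("caf\xE9")      # => "caf"

# A valid UTF-8 u-umlaut (bytes 0xC3 0xBC) is dropped as well
umlaut = str2utf8("M\u00FCller")  # => "Mller"
```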
#unstring(line) ⇒ Object
Removes leading and trailing " and spaces from CSV values and collapses runs of two or more spaces between words into a single space. Replaces ; with , inside quoted values, because ; is used as the value separator.
# File 'lib/sycsvpro/dsl.rb', line 82
def unstring(line)
  line = str2utf8(line)
  line.scan(/(?<=^"|;")[^"]+(?=;)+[^"]*|;+[^"](?=";|"$)/).each do |value|
    line = line.gsub(value, value.gsub(';', ','))
  end
  line.gsub(/(?<=^|;)\s*"?\s*|\s*"?\s*(?=;|$)/, "").gsub(/\s{2,}/, " ") unless line.nil?
end
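As a rough illustration, the sketch below copies unstring and its helper str2utf8 from this page into a standalone script and applies them to two made-up CSV lines: one with surplus quotes and spaces, one with a semicolon inside a quoted value.

```ruby
# Copies of Dsl#str2utf8 and Dsl#unstring as documented on this page
def str2utf8(str)
  str.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end

def unstring(line)
  line = str2utf8(line)
  line.scan(/(?<=^"|;")[^"]+(?=;)+[^"]*|;+[^"](?=";|"$)/).each do |value|
    line = line.gsub(value, value.gsub(';', ','))
  end
  line.gsub(/(?<=^|;)\s*"?\s*|\s*"?\s*(?=;|$)/, "").gsub(/\s{2,}/, " ") unless line.nil?
end

# Quotes and surrounding spaces are removed, inner runs of spaces collapse
trimmed = unstring('"Name";"  Hello   World ";42')
# => "Name;Hello World;42"

# A semicolon inside a quoted value becomes a comma
embedded = unstring('"a;b";"c"')
# => "a,b;c"
```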
#write_to(file) ⇒ Object
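Writes values provided by a block to the given file. The file is opened in write mode ('w'), so existing content is overwritten, and the open file handle is yielded to the block.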
# File 'lib/sycsvpro/dsl.rb', line 73
def write_to(file)
  File.open(file, 'w') do |out|
    yield out
  end
end
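A minimal usage sketch, assuming the method is copied from the listing into a standalone script. The output path and the rows written are hypothetical.

```ruby
require 'tmpdir'

# Copy of Dsl#write_to from the listing
def write_to(file)
  File.open(file, 'w') do |out|
    yield out
  end
end

dir = Dir.mktmpdir
path = File.join(dir, "result.csv")  # hypothetical output file

# The block receives the open file handle and writes one row per call
write_to(path) do |out|
  out.puts "Year;Month"
  out.puts "2014;5"
end

content = File.read(path)  # => "Year;Month\n2014;5\n"

# Remove the scratch file and directory again
File.delete(path)
Dir.rmdir(dir)
```

Because File.open is called with a block, the file is closed automatically when the block returns, even if the block raises.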