Class: Sycsvpro::Sorter

Inherits:
Object
Includes:
Dsl
Defined in:
lib/sycsvpro/sorter.rb

Overview

Sorts an input file based on a column sort filter

Constant Summary

Constants included from Dsl

Dsl::COMMA_SPLITTER_REGEX

Instance Attribute Summary

Instance Method Summary

Methods included from Dsl

#clean_up, #params, #rows, #split_by_comma_regex, #str2utf8, #unstring, #write_to

Constructor Details

#initialize(options = {}) ⇒ Sorter

Creates a Sorter. Takes as options infile, outfile, rows, cols (including column types) and, optionally, a date format (df) for the date columns to sort.

Sycsvpro::Sorter.new(infile:     "infile.csv",
                     outfile:    "outfile.csv",
                     rows:       "1,2-5,12-30",
                     cols:       "n:1,s:3",
                     headerless: true,
                     df:         "%d.%m.%Y",
                     start:      "2").execute

The sorted infile will be saved to outfile.
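The df option matters because dates written as day.month.year do not sort chronologically when compared as plain strings. The following is a minimal sketch using only Ruby's standard Date.strptime. It does not call Sycsvpro, and the sample values are made up for illustration.

```ruby
require 'date'

# Illustrative values in the "%d.%m.%Y" format used in the call above
dates = ["15.01.2014", "02.03.2015", "10.12.2013"]

# Plain string comparison looks at the day first, which is not chronological
by_string = dates.sort

# Parsing with the date format yields Date objects that compare chronologically
by_date = dates.sort_by { |d| Date.strptime(d, "%d.%m.%Y") }

puts by_string.inspect
puts by_date.inspect
```

The two results differ, which is why a date column presumably needs its format to be sorted correctly.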



# File 'lib/sycsvpro/sorter.rb', line 45

def initialize(options={})
  @infile          = options[:infile]
  @outfile         = options[:outfile]
  @headerless      = options[:headerless] || false
  @start           = options[:start]
  @desc            = options[:desc] || false
  @row_filter      = RowFilter.new(options[:rows], df: options[:df])
  @col_type_filter = ColumnTypeFilter.new(options[:cols], df: options[:df])
  @sorted_rows     = []
end

Instance Attribute Details

#col_type_filter ⇒ Object (readonly)

column type filter



# File 'lib/sycsvpro/sorter.rb', line 20

def col_type_filter
  @col_type_filter
end

#desc ⇒ Object (readonly)

Sort order: true for descending, false (the default) for ascending.



# File 'lib/sycsvpro/sorter.rb', line 32

def desc
  @desc
end

#headerless ⇒ Object (readonly)

The file does not contain a header. If not headerless, empty rows at the beginning of the file are discarded and the first non-empty row is treated as the header. Subsequent rows are sorted and written to the resulting file after the header.



# File 'lib/sycsvpro/sorter.rb', line 27

def headerless
  @headerless
end
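The header handling described for headerless can be sketched as follows. This is an illustration only: split_header is a hypothetical helper, not part of Sycsvpro, and unlike the library's execute method it stops quietly when the input contains nothing but empty rows.

```ruby
# Hypothetical helper, not part of Sycsvpro: separates the header from the data rows.
# Returns [header, remaining_rows]; header is nil when every row is empty.
def split_header(lines)
  rows = lines.dup
  # Discard empty rows at the beginning of the file
  rows.shift while rows.first && rows.first.strip.empty?
  # The first non-empty row is the header, the rest is the data to sort
  [rows.shift, rows]
end

header, data = split_header(["\n", "  \n", "name;amount\n", "b;2\n", "a;1\n"])
puts header.inspect
puts data.inspect
```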

#infile ⇒ Object (readonly)

file of the data to sort



# File 'lib/sycsvpro/sorter.rb', line 14

def infile
  @infile
end

#outfile ⇒ Object (readonly)

file to write the sorted data to



# File 'lib/sycsvpro/sorter.rb', line 16

def outfile
  @outfile
end

#row_filter ⇒ Object (readonly)

row filter



# File 'lib/sycsvpro/sorter.rb', line 18

def row_filter
  @row_filter
end

#sorted_rows ⇒ Object (readonly)

sorted rows



# File 'lib/sycsvpro/sorter.rb', line 22

def sorted_rows
  @sorted_rows
end

#start ⇒ Object (readonly)

First row to sort. Rows 0 to start - 1 are skipped and written unsorted to the top of the output file (after the header, if there is one). Rows from start on are sorted.



# File 'lib/sycsvpro/sorter.rb', line 30

def start
  @start
end
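The effect of start can be illustrated with a small sketch. split_at_start is a hypothetical helper, not part of Sycsvpro. It assumes, as in the constructor example, that start may be given as a string such as "2" or left out entirely.

```ruby
# Hypothetical helper, not part of Sycsvpro: splits rows into the part kept
# unsorted on top and the part to sort. `start` may be nil or a String like "2".
def split_at_start(rows, start)
  count = start.to_i # nil.to_i is 0, so no rows are skipped without a start value
  [rows.take(count), rows.drop(count)]
end

top, to_sort = split_at_start(%w[r0 r1 r2 r3], "2")
puts top.inspect
puts to_sort.inspect
```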

Instance Method Details

#execute ⇒ Object

Sorts the data of the infile and writes the result to the outfile.



# File 'lib/sycsvpro/sorter.rb', line 57

def execute
  rows = File.readlines(infile)

  skipped_rows = []

  unless headerless
    skipped_rows[0] = ""
    skipped_rows[0] = rows.shift while skipped_rows[0].chomp.strip.empty?
  end

  if start
    (0...start.to_i).each { |row| skipped_rows << rows.shift }  
  end

  rows.each_with_index do |line, index|
    filtered = col_type_filter.process(row_filter.process(line, row: index))
    next if filtered.nil?
    sorted_rows << (filtered << index)
  end

  File.open(outfile, 'w') do |out|
    skipped_rows.each { |row| out.puts unstring(row) }

    if desc
      sorted_rows.compact.sort.reverse.each do |row|
        out.puts unstring(rows[row.last])
      end
    else
      sorted_rows.compact.sort.each do |row|
        out.puts unstring(rows[row.last])
      end
    end
  end
end
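The execute method appends each row's index to its filtered values, sorts the resulting arrays, and then uses the last element of each array to look up the original line. A minimal sketch of that idea, with made-up data and a plain split(";") standing in for the row and column type filters (an assumption for illustration, not how the filters work internally):

```ruby
# Illustrative data: each line holds a name and a number separated by ";"
rows = ["b;2", "a;10", "c;1"]

# Decorate: build [sort keys..., original index] for every row
keyed = rows.each_with_index.map do |line, index|
  name, number = line.split(";")
  [number.to_i, name, index]
end

# Arrays compare element by element, so this sorts by number, then by name.
# The trailing index points back to the untouched original line.
ascending  = keyed.sort.map { |key| rows[key.last] }
descending = keyed.sort.reverse.map { |key| rows[key.last] }

puts ascending.inspect
puts descending.inspect
```

Keeping the index as the last array element means the original line is written unchanged, whatever conversions were applied to the sort keys.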