Class: BioDSL::CSV

Inherits:
Object
Defined in:
lib/BioDSL/csv.rb

Overview

Class for manipulating CSV or table files. Allows reading and writing of gzip and bzip2 data. Data types are auto-converted. Rows are returned as lines, arrays, or hashes.

Defined Under Namespace

Classes: IO

Constructor Details

#initialize(io) ⇒ CSV

Constructor method for CSV.



# File 'lib/BioDSL/csv.rb', line 109

def initialize(io)
  @io        = io
  @delimiter = "\s"
  @header    = nil
  @fields    = nil
  @types     = nil
end

Class Method Details

.open(*args) ⇒ Object



# File 'lib/BioDSL/csv.rb', line 56

def self.open(*args)
  io = IO.open(*args)

  if block_given?
    yield new(io)
  else
    return new(io)
  end
end
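
The block-or-return pattern used by `.open` can be sketched in isolation. This is a minimal stand-in, not BioDSL code: `MiniReader` and its `lines` method are hypothetical, and `StringIO` takes the place of the real `IO.open` call.

```ruby
require 'stringio'

# Hypothetical stand-in for a reader class; not part of BioDSL.
class MiniReader
  def self.open(text)
    io = StringIO.new(text)

    if block_given?
      yield new(io) # the block's value becomes the return value of open
    else
      new(io)       # without a block, the caller receives the reader itself
    end
  end

  def initialize(io)
    @io = io
  end

  # Return all lines without their trailing newlines.
  def lines
    @io.each_line.map(&:chomp)
  end
end

with_block    = MiniReader.open("a\nb\n") { |reader| reader.lines }
without_block = MiniReader.open("a\nb\n")
```

With a block, the caller gets back whatever the block returns; without one, the caller gets the reader object and decides when to consume it.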

.read_array(file, options = {}) ⇒ Object

Reads all CSV data from a file into an array of arrays (one array per row), which is returned. By default all columns are read. The select option restricts the columns to those given in an Array or, if a header line is present, a Hash. The reject option does the opposite. Header lines are prefixed with ‘#’ and are returned if the include_header option is given.

Options:

* include_header.
* delimiter.
* select.
* reject.


# File 'lib/BioDSL/csv.rb', line 78

def self.read_array(file, options = {})
  data = []

  open(file) do |ios|
    ios.each_array(options) { |row| data << row }
  end

  data
end
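
Index-based column selection, as performed with `values_at` in `#each_array`, can be sketched as follows. The helper `rows_from` and the sample text are hypothetical and not part of BioDSL; the sketch skips header lines and does no type conversion.

```ruby
# Hypothetical helper, not BioDSL code: split each line and keep chosen columns.
def rows_from(text, delimiter: ' ', select: nil)
  rows = []

  text.each_line do |line|
    line = line.chomp
    next if line.empty? || line.start_with?('#') # skip blank and header lines

    fields = line.split(delimiter)
    fields = fields.values_at(*select) if select # keep only the given indices
    rows << fields
  end

  rows
end

text       = "#name,count\nfoo,1\n\nbar,2\n"
all_rows   = rows_from(text, delimiter: ',')
first_only = rows_from(text, delimiter: ',', select: [0])
```

Here `all_rows` holds both columns of each data line, while `first_only` holds only the column at index 0.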

.read_hash(file, options = {}) ⇒ Object

Reads all CSV data from a file into an array of hashes (one hash per row), which is returned. By default all columns are read. The select option restricts the columns to those given in an Array or, if a header line is present, a Hash. The reject option does the opposite. Header lines are prefixed with ‘#’.

Options:

* delimiter.
* select.
* reject.


# File 'lib/BioDSL/csv.rb', line 98

def self.read_hash(file, options = {})
  data = []

  open(file) do |ios|
    ios.each_hash(options) { |row| data << row }
  end

  data
end
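
Each hash row pairs a header entry with the field at the same position, as in `#each_hash`. A minimal sketch with made-up data; representing header names as symbols is an assumption, not something this page confirms.

```ruby
# Hypothetical data; header names as symbols is an assumption.
header = %i[name count]
fields = ['foo', 1]

# Pair each field with the header entry at the same index.
hash = {}
fields.each_with_index { |field, i| hash[header[i]] = field }

# Array#zip followed by to_h builds the same hash more compactly.
zipped = header.zip(fields).to_h
```

Both forms produce one key per column, so a row with fewer fields than header entries would need separate handling.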

Instance Method Details

#each_array(options = {}) ⇒ Object

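Iterates over a CSV IO object, yielding each row as an array. Returns an Enumerator if no block is given.

csv.each_array(options = {}) { |row| block } -> self
csv.each_array(options = {})                 -> Enumerator

Options:

* :include_header - also yield the header row.
* :delimiter      - field delimiter; defaults to the instance delimiter.
* :select         - columns to keep.
* :reject         - columns to drop.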


# File 'lib/BioDSL/csv.rb', line 135

def each_array(options = {})
  return to_enum :each_array unless block_given?

  delimiter = options[:delimiter] || @delimiter

  @io.each do |line|
    line.chomp!
    next if line.empty?

    fields = line.split(delimiter)

    if line[0] == '#'
      get_header(fields, options) unless @header
      get_fields(fields, options) unless @fields

      yield @header.map(&:to_s) if options[:include_header]
    else
      get_header(fields, options) unless @header
      get_fields(fields, options) unless @fields

      fields = fields.values_at(*@fields) if @fields

      determine_types(fields) unless @types

      yield fields.convert_types(@types)
    end
  end

  self
end
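
The `to_enum` guard that opens `#each_array` is a common Ruby idiom for methods that work with or without a block. A minimal sketch with a hypothetical `Splitter` class, not part of BioDSL. Unlike the listing above, this sketch forwards the options hash to `to_enum` so that the enumerator uses the same delimiter; that is a choice made for the example.

```ruby
# Hypothetical class, not part of BioDSL.
class Splitter
  def initialize(lines)
    @lines = lines
  end

  def each_array(options = {})
    # Forward the options so the enumerator runs with the same settings.
    return to_enum(:each_array, options) unless block_given?

    delimiter = options[:delimiter] || ' '

    @lines.each do |line|
      next if line.empty? # skip blank lines, as the listing above does

      yield line.split(delimiter)
    end

    self
  end
end

splitter = Splitter.new(['a,b', '', 'c,d'])
enum     = splitter.each_array(delimiter: ',')
rows     = enum.to_a
```

Called with a block, the method returns the receiver, which allows chaining; called without one, it returns an Enumerator that can be consumed lazily or collected with `to_a`.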

#each_hash(options = {}) ⇒ Object

Iterates over a CSV IO object, yielding each row as a hash keyed by the header fields.

csv.each_hash(options = {}) { |row| block } -> self

Options:

* :delimiter      - field delimiter; defaults to the instance delimiter.
* :select         - columns to keep.
* :reject         - columns to drop.


# File 'lib/BioDSL/csv.rb', line 174

def each_hash(options = {})
  each_array(options) do |array|
    hash = {}

    array.convert_types(@types).each_with_index do |field, i|
      hash[@header[i]] = field
    end

    yield hash
  end

  self
end

#skip(num) ⇒ Object

Skips a given number of non-empty lines.



# File 'lib/BioDSL/csv.rb', line 118

def skip(num)
  while num != 0 && (line = @io.gets)
    line.chomp!

    num -= 1 unless line.empty?
  end
end
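
The effect of the loop above can be checked on an in-memory stream. This sketch repeats the same loop as a standalone method; `skip_lines` is a hypothetical name, and `StringIO` stands in for the file handle.

```ruby
require 'stringio'

# Standalone version of the loop above; takes the IO as an argument.
def skip_lines(io, num)
  while num != 0 && (line = io.gets)
    line.chomp!

    num -= 1 unless line.empty? # blank lines are consumed but not counted
  end
end

io = StringIO.new("a\n\nb\nc\n")
skip_lines(io, 2)
remaining = io.read
```

The blank line between `a` and `b` is read but does not count toward the total, so only the line `c` is left in the stream. Because the loop tests `num != 0`, a negative count would read to the end of the input.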