Module: DaruLite::DataFrame::Iterable

Included in:
DaruLite::DataFrame
Defined in:
lib/daru_lite/data_frame/iterable.rb

Instance Method Details

#apply_method(method, keys: nil, by_position: true) ⇒ Object Also known as: apply_method_on_sub_df



# File 'lib/daru_lite/data_frame/iterable.rb', line 263

def apply_method(method, keys: nil, by_position: true)
  df = keys ? get_sub_dataframe(keys, by_position: by_position) : self

  case method
  when Symbol then df.send(method)
  when Proc   then method.call(df)
  when Array
    method.map(&:to_proc).map { |proc| proc.call(df) } # works with Array of both Symbol and/or Proc
  else raise
  end
end
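
The dispatch on `method` can be illustrated without a DataFrame. The following is a minimal sketch in plain Ruby, assuming a stand-in helper named `apply_to` (hypothetical, not part of DaruLite) and an ordinary Array in place of the (sub-)DataFrame. Unlike the source, which uses a bare `raise`, the sketch raises an `ArgumentError` for unsupported input.

```ruby
# Hypothetical stand-in for the dispatch in #apply_method; `target` plays
# the role of the (sub-)DataFrame.
def apply_to(target, method)
  case method
  when Symbol then target.send(method)
  when Proc   then method.call(target)
  when Array
    # Symbols and Procs can be mixed: both respond to #to_proc.
    method.map(&:to_proc).map { |callable| callable.call(target) }
  else raise ArgumentError, "unsupported method: #{method.inspect}"
  end
end

data = [3, 1, 2]
by_symbol = apply_to(data, :sum)                      # a single Symbol
by_proc   = apply_to(data, ->(d) { d.max })           # a single Proc
by_array  = apply_to(data, [:min, ->(d) { d.size }])  # one result per entry
```

Passing an Array yields one result per entry, in the same order as the entries.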

#collect(axis = :vector) ⇒ Object

Iterate over each row or vector and return the results in a DaruLite::Vector. Specify the axis with :vector or :row. Defaults to :vector.

Description

The #collect iterator works similarly to #map; the only difference is that it returns a DaruLite::Vector comprising the results of each block run. The resulting Vector has the same index as the axis over which #collect iterated. It also accepts the optional axis argument.

Arguments

  • axis - The axis to iterate over. Can be :vector (or :column) or :row. Defaults to :vector.
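
The difference between an Array result and an index-labelled result can be sketched in plain Ruby. This is only an illustration under stated assumptions: a Hash of Arrays stands in for the DataFrame, and a Hash keyed by the axis labels stands in for the returned DaruLite::Vector. None of the variable names below belong to DaruLite.

```ruby
# Stand-in data: two vectors (:a, :b) over a three-row index (0, 1, 2).
columns   = { a: [1, 2, 3], b: [4, 5, 6] }
row_index = [0, 1, 2]

# #map-like: a plain Array, labels are lost.
map_like = columns.values.map(&:sum)

# #collect-like over :vector: results keep the vector names as their index.
collect_vectors_like = columns.transform_values(&:sum)

# #collect-like over :row: results keep the row index.
row_sums = columns.values.transpose.map(&:sum)
collect_rows_like = row_index.zip(row_sums).to_h
```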



# File 'lib/daru_lite/data_frame/iterable.rb', line 90

def collect(axis = :vector, &)
  dispatch_to_axis_pl(axis, :collect, &)
end

#collect_matrix ⇒ ::Matrix

Generate a matrix, based on vector names of the DataFrame.

:nocov: FIXME: Even not trying to cover this: I can’t get, how it is expected to work.… – zverok

Returns:

  • (::Matrix)



# File 'lib/daru_lite/data_frame/iterable.rb', line 310

def collect_matrix
  return to_enum(:collect_matrix) unless block_given?

  vecs = vectors.to_a
  rows = vecs.collect do |row|
    vecs.collect do |col|
      yield row, col
    end
  end

  Matrix.rows(rows)
end
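
As the listing shows, the block receives every ordered pair of vector names, and each block result becomes one cell. A minimal sketch of that nested iteration follows, assuming a hypothetical helper `pairwise_rows` and a Hash of Arrays in place of the DataFrame. It returns nested Arrays rather than a ::Matrix, so it needs no extra library.

```ruby
# Hypothetical helper mirroring the nested iteration in #collect_matrix.
# Yields (row_name, col_name) for every ordered pair of names.
def pairwise_rows(names)
  names.map do |row|
    names.map { |col| yield row, col }
  end
end

columns = { a: [1, 2], b: [3, 4] }

# Each cell is the dot product of the two named vectors.
rows = pairwise_rows(columns.keys) do |row, col|
  columns[row].zip(columns[col]).sum { |x, y| x * y }
end
```

In the real method these rows are wrapped with `Matrix.rows(rows)`.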

#collect_row_with_index(&block) ⇒ Object



# File 'lib/daru_lite/data_frame/iterable.rb', line 284

def collect_row_with_index(&block)
  return to_enum(:collect_row_with_index) unless block

  DaruLite::Vector.new(each_row_with_index.map(&block), index: @index)
end

#collect_rows(&block) ⇒ Object

Retrieves a DaruLite::Vector, based on the result of calculation performed on each row.



# File 'lib/daru_lite/data_frame/iterable.rb', line 278

def collect_rows(&block)
  return to_enum(:collect_rows) unless block

  DaruLite::Vector.new(each_row.map(&block), index: @index)
end

#collect_vector_with_index(&block) ⇒ Object



# File 'lib/daru_lite/data_frame/iterable.rb', line 298

def collect_vector_with_index(&block)
  return to_enum(:collect_vector_with_index) unless block

  DaruLite::Vector.new(each_vector_with_index.map(&block), index: @vectors)
end

#collect_vectors(&block) ⇒ Object

Retrieves a DaruLite::Vector, based on the result of calculation performed on each vector.



# File 'lib/daru_lite/data_frame/iterable.rb', line 292

def collect_vectors(&block)
  return to_enum(:collect_vectors) unless block

  DaruLite::Vector.new(each_vector.map(&block), index: @vectors)
end

#each(axis = :vector) ⇒ Object

Iterate over each row or vector of the DataFrame. Specify the axis by passing :vector or :row as the argument. Defaults to :vector.

Description

`#each` works exactly like Array#each. The default mode for `each` is to iterate over the columns of the DataFrame. To iterate over rows you must pass the axis, i.e. `:row`, as an argument.

Arguments

  • axis - The axis to iterate over. Can be :vector (or :column) or :row. Defaults to :vector.



# File 'lib/daru_lite/data_frame/iterable.rb', line 71

def each(axis = :vector, &)
  dispatch_to_axis(axis, :each, &)
end

#each_index(&block) ⇒ Object

Iterate over each index of the DataFrame.



# File 'lib/daru_lite/data_frame/iterable.rb', line 5

def each_index(&block)
  return to_enum(:each_index) unless block

  @index.each(&block)

  self
end
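
The `return to_enum(...) unless block` guard used here, and in most methods of this module, means the method returns an Enumerator when called without a block and `self` when called with one. A minimal sketch of the idiom follows, assuming a hypothetical `Labels` class that holds only an index. Only the guard and the `self` return mirror the source.

```ruby
# Hypothetical minimal class; not part of DaruLite.
class Labels
  def initialize(index)
    @index = index
  end

  def each_index(&block)
    # Without a block, hand back an Enumerator over this same method.
    return to_enum(:each_index) unless block

    @index.each(&block)
    self # returning self allows chaining
  end
end

labels = Labels.new(%i[x y z])

seen = []
returned = labels.each_index { |i| seen << i }          # block form
upcased  = labels.each_index.map { |i| i.to_s.upcase }  # Enumerator form
```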

#each_row ⇒ Object

Iterate over each row



# File 'lib/daru_lite/data_frame/iterable.rb', line 38

def each_row
  return to_enum(:each_row) unless block_given?

  @index.size.times do |pos|
    yield row_at(pos)
  end

  self
end

#each_row_with_index ⇒ Object



# File 'lib/daru_lite/data_frame/iterable.rb', line 48

def each_row_with_index
  return to_enum(:each_row_with_index) unless block_given?

  @index.each do |index|
    yield access_row(index), index
  end

  self
end

#each_vector(&block) ⇒ Object Also known as: each_column

Iterate over each vector



# File 'lib/daru_lite/data_frame/iterable.rb', line 14

def each_vector(&block)
  return to_enum(:each_vector) unless block

  @data.each(&block)

  self
end

#each_vector_with_index ⇒ Object Also known as: each_column_with_index

Iterate over each vector along with the name of the vector



# File 'lib/daru_lite/data_frame/iterable.rb', line 25

def each_vector_with_index
  return to_enum(:each_vector_with_index) unless block_given?

  @vectors.each do |vector|
    yield @data[@vectors[vector]], vector
  end

  self
end

#map(axis = :vector) ⇒ Object

Map over each vector or row of the data frame according to the argument specified. Will return an Array of the resulting elements. To map over each row/vector and get a DataFrame, see #recode.

Description

The #map iterator works like Array#map. The value returned by each run of the block is added to an Array and the Array is returned. This method also accepts an axis argument, like #each. The default is :vector.

Arguments

  • axis - The axis to map over. Can be :vector (or :column) or :row. Defaults to :vector.



# File 'lib/daru_lite/data_frame/iterable.rb', line 110

def map(axis = :vector, &)
  dispatch_to_axis_pl(axis, :map, &)
end

#map!(axis = :vector) ⇒ Object

Destructive map. Modifies the DataFrame in place. Each run of the block must return a DaruLite::Vector. You can specify the axis to map over as the argument. Defaults to :vector.

Arguments

  • axis - The axis to map over. Can be :vector (or :column) or :row. Defaults to :vector.



# File 'lib/daru_lite/data_frame/iterable.rb', line 122

def map!(axis = :vector, &)
  if %i[vector column].include?(axis)
    map_vectors!(&)
  elsif axis == :row
    map_rows!(&)
  end
end

#map_rows(&block) ⇒ Object

Map each row and return an Array.



# File 'lib/daru_lite/data_frame/iterable.rb', line 241

def map_rows(&block)
  return to_enum(:map_rows) unless block

  each_row.map(&block)
end

#map_rows! ⇒ Object



# File 'lib/daru_lite/data_frame/iterable.rb', line 253

def map_rows!
  return to_enum(:map_rows!) unless block_given?

  index.dup.each do |i|
    row[i] = should_be_vector!(yield(row[i]))
  end

  self
end

#map_rows_with_index(&block) ⇒ Object



# File 'lib/daru_lite/data_frame/iterable.rb', line 247

def map_rows_with_index(&block)
  return to_enum(:map_rows_with_index) unless block

  each_row_with_index.map(&block)
end

#map_vectors(&block) ⇒ Object

Map each vector and return an Array.



# File 'lib/daru_lite/data_frame/iterable.rb', line 216

def map_vectors(&block)
  return to_enum(:map_vectors) unless block

  @data.map(&block)
end

#map_vectors! ⇒ Object

Destructive form of #map_vectors



# File 'lib/daru_lite/data_frame/iterable.rb', line 223

def map_vectors!
  return to_enum(:map_vectors!) unless block_given?

  vectors.dup.each do |n|
    self[n] = should_be_vector!(yield(self[n]))
  end

  self
end

#map_vectors_with_index(&block) ⇒ Object

Map vectors along with their index.



# File 'lib/daru_lite/data_frame/iterable.rb', line 234

def map_vectors_with_index(&block)
  return to_enum(:map_vectors_with_index) unless block

  each_vector_with_index.map(&block)
end

#recode(axis = :vector) ⇒ Object

Maps over the DataFrame and returns a DataFrame. Each run of the block must return a DaruLite::Vector object. You can specify the axis to map over. Defaults to :vector.

Description

Recode works similarly to #map, but an important difference between the two is that recode returns a modified DaruLite::DataFrame instead of an Array. For this reason, #recode expects every run of the block to return a DaruLite::Vector.

Just like map and each, recode also accepts an optional axis argument.

Arguments

  • axis - The axis to map over. Can be :vector (or :column) or :row. Defaults to :vector.
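
The copy-then-assign pattern behind #recode can be sketched in plain Ruby. This is a rough illustration under stated assumptions: `recode_like` is a hypothetical helper, a Hash of Arrays stands in for the DataFrame, and an `is_a?(Array)` check stands in for the requirement that the block return a DaruLite::Vector. The error class is an assumption as well.

```ruby
# Hypothetical helper: returns a modified copy and leaves the receiver alone.
def recode_like(columns)
  columns.dup.tap do |copy|
    columns.each do |name, col|
      result = yield(col)
      # Stand-in for the "block must return a Vector" requirement.
      raise TypeError, 'block must return an Array' unless result.is_a?(Array)

      copy[name] = result
    end
  end
end

original = { a: [1, 2, 3], b: [4, 5, 6] }
doubled  = recode_like(original) { |col| col.map { |x| x * 2 } }
```

The result keeps the same keys (vector names) as the input, while `original` is left unchanged.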



# File 'lib/daru_lite/data_frame/iterable.rb', line 147

def recode(axis = :vector, &)
  dispatch_to_axis_pl(axis, :recode, &)
end

#recode_rows ⇒ Object



# File 'lib/daru_lite/data_frame/iterable.rb', line 205

def recode_rows
  block_given? or return to_enum(:recode_rows)

  dup.tap do |df|
    df.each_row_with_index do |r, i|
      df.row[i] = should_be_vector!(yield(r))
    end
  end
end

#recode_vectors ⇒ Object



# File 'lib/daru_lite/data_frame/iterable.rb', line 195

def recode_vectors
  block_given? or return to_enum(:recode_vectors)

  dup.tap do |df|
    df.each_vector_with_index do |v, i|
      df[*i] = should_be_vector!(yield(v))
    end
  end
end

#replace_values(old_values, new_value) ⇒ DaruLite::DataFrame

Replace specified values with given value

Examples:

df = DaruLite::DataFrame.new({
  a: [1,    2,          3,   nil,        Float::NAN, nil, 1,   7],
  b: [:a,  :b,          nil, Float::NAN, nil,        3,   5,   8],
  c: ['a',  Float::NAN, 3,   4,          3,          5,   nil, 7]
}, index: 11..18)
df.replace_values nil, Float::NAN
# => #<DaruLite::DataFrame(8x3)>
#       a   b   c
#   11   1   a   a
#   12   2   b NaN
#   13   3 NaN   3
#   14 NaN NaN   4
#   15 NaN NaN   3
#   16 NaN   3   5
#   17   1   5 NaN
#   18   7   8   7

Parameters:

  • old_values (Array)

    values to replace with new value

  • new_value (object)

    new value to replace with

Returns:

  • (DaruLite::DataFrame)



# File 'lib/daru_lite/data_frame/iterable.rb', line 173

def replace_values(old_values, new_value)
  @data.each { |vec| vec.replace_values old_values, new_value }
  self
end

#verify(*tests) ⇒ Object

Test each row with one or more tests. The function returns an array with all errors.

FIXME: description here is too sparse. As far as I can get, it should tell something about that each test is [descr, fields, block], and that first value may be column name to output. - zverok, 2016-05-18

Parameters:

  • tests (Proc)

    Each test is a Proc of the form Proc.new { |row| row > 0 }



# File 'lib/daru_lite/data_frame/iterable.rb', line 186

def verify(*tests)
  id = tests.first.is_a?(Symbol) ? tests.shift : @vectors.first

  each_row_with_index.map do |row, i|
    tests.reject { |*_, block| block.call(row) }
         .map { |test| verify_error_message row, test, id, i }
  end.flatten
end
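
The `reject { |*_, block| ... }` step treats the last element of each test as the callable, which fits the `[descr, fields, block]` shape mentioned in the FIXME note. The sketch below shows that filtering in plain Ruby. It assumes that three-element shape, uses an Array of Hashes in place of the DataFrame rows, and invents a message format, since the output of `verify_error_message` is not shown here.

```ruby
# Stand-in rows; in DaruLite each row would be a DaruLite::Vector.
rows = [{ id: 1, age: 30 }, { id: 2, age: -5 }, { id: 3, age: 200 }]

# Assumed test shape: [description, field, block].
tests = [
  ['age is non-negative', :age, ->(row) { row[:age] >= 0 }],
  ['age is below 150',    :age, ->(row) { row[:age] < 150 }]
]

failures = rows.each_with_index.map do |row, i|
  # Keep only the tests whose block returns a falsy value for this row.
  tests.reject { |*_, block| block.call(row) }
       .map { |description, *| "row #{i}: #{description}" } # hypothetical format
end.flatten
```

A row that passes every test contributes nothing, so the flattened result lists only the failures.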