Module: DaruLite::DataFrame::Fetchable

Included in:
DaruLite::DataFrame
Defined in:
lib/daru_lite/data_frame/fetchable.rb

Instance Method Summary

Instance Method Details

#[](*names) ⇒ Object

Accesses a row or vector. Pass the name of the row or vector, optionally followed by the axis (:row or :vector); the axis defaults to :vector. Using this method to access rows is not recommended; use df.row[:a] to access the row with index ‘:a’.



# File 'lib/daru_lite/data_frame/fetchable.rb', line 7

def [](*names)
  axis = extract_axis(names, :vector)
  dispatch_to_axis axis, :access, *names
end
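
The trailing-axis convention described above can be sketched in plain Ruby. `pick_axis` and `KNOWN_AXES` below are hypothetical stand-ins: the body of the private `extract_axis` helper is not shown on this page, so the sketch assumes the axis is recognized only when it is the last argument, the same pattern `#at` uses with `AXES`.

```ruby
# Hypothetical stand-in for the private extract_axis helper.
# Assumption: the axis counts only when it is the last argument.
KNOWN_AXES = %i[row vector].freeze

def pick_axis(names, default = :vector)
  # pop removes the axis so that only the row/vector names remain
  KNOWN_AXES.include?(names.last) ? names.pop : default
end

args = [:a, :b, :row]
p pick_axis(args) # => :row
p args            # => [:a, :b]

args = [:a]
p pick_axis(args) # => :vector
p args            # => [:a]
```

Because the axis is popped off the argument list, a vector that is itself named :row or :vector would be ambiguous under this scheme.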

#access_row_tuples_by_indexs(*indexes) ⇒ Array

Returns an array of row tuples at the given index(es).

Examples:

Using DaruLite::Index

df = DaruLite::DataFrame.new({
  a: [1, 2, 3],
  b: ['a', 'a', 'b']
})

df.access_row_tuples_by_indexs(1,2)
# => [[2, "a"], [3, "b"]]

df.index = DaruLite::Index.new([:one,:two,:three])
df.access_row_tuples_by_indexs(:one,:three)
# => [[1, "a"], [3, "b"]]

Using DaruLite::MultiIndex

mi_idx = DaruLite::MultiIndex.from_tuples [
  [:a,:one,:bar],
  [:a,:one,:baz],
  [:b,:two,:bar],
  [:a,:two,:baz],
]
df_mi = DaruLite::DataFrame.new({
  a: 1..4,
  b: 'a'..'d'
}, index: mi_idx )

df_mi.access_row_tuples_by_indexs(:b, :two, :bar)
# => [[3, "c"]]
df_mi.access_row_tuples_by_indexs(:a)
# => [[1, "a"], [2, "b"], [4, "d"]]

Parameters:

  • indexes (Array)

    index(es) at which row tuples are retrieved

Returns:

  • (Array)

    array of row tuples at the given index(es)



# File 'lib/daru_lite/data_frame/fetchable.rb', line 144

def access_row_tuples_by_indexs(*indexes)
  return get_sub_dataframe(indexes, by_position: false).map_rows(&:to_a) if
    @index.is_a?(DaruLite::MultiIndex)

  positions = @index.pos(*indexes)
  if positions.is_a? Numeric
    row = get_rows_for([positions])
    row.first.is_a?(Array) ? row : [row]
  else
    new_rows = get_rows_for(indexes, by_position: false)
    indexes.map { |index| new_rows.map { |r| r[index] } }
  end
end
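
As a rough illustration of the single-level index case, the label-to-tuple lookup can be mimicked in plain Ruby without DaruLite. The `row_tuples` helper and its Hash-of-arrays layout are assumptions made for this sketch; it does not cover `DaruLite::MultiIndex`.

```ruby
# Plain-Ruby sketch (no DaruLite): columns are a Hash of arrays and the
# index is an Array of row labels. Both are illustrative assumptions.
def row_tuples(columns, index, *labels)
  labels.map do |label|
    pos = index.index(label) # translate the label into a position
    raise IndexError, "no row labelled #{label.inspect}" if pos.nil?

    columns.values.map { |col| col[pos] } # one value per column
  end
end

columns = { a: [1, 2, 3], b: %w[a a b] }

p row_tuples(columns, [0, 1, 2], 1, 2)
# => [[2, "a"], [3, "b"]]
p row_tuples(columns, %i[one two three], :one, :three)
# => [[1, "a"], [3, "b"]]
```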

#at(*positions) ⇒ DaruLite::Vector, DaruLite::DataFrame

Retrieve vectors by positions

Examples:

df = DaruLite::DataFrame.new({
  a: [1, 2, 3],
  b: ['a', 'b', 'c']
})
df.at 0
# => #<DaruLite::Vector(3)>
#       a
#   0   1
#   1   2
#   2   3

Parameters:

  • positions (Array<Integer>)

    positions of the vectors to retrieve

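Returns:

  • (DaruLite::Vector, DaruLite::DataFrame)

    a vector for a single position, a dataframe for several positions
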

# File 'lib/daru_lite/data_frame/fetchable.rb', line 58

def at(*positions)
  if AXES.include? positions.last
    axis = positions.pop
    return row_at(*positions) if axis == :row
  end

  original_positions = positions
  positions = coerce_positions(*positions, ncols)
  validate_positions(*positions, ncols)

  if positions.is_a? Integer
    @data[positions].dup
  else
    DaruLite::DataFrame.new positions.map { |pos| @data[pos].dup },
                            index: @index,
                            order: @vectors.at(*original_positions),
                            name: @name
  end
end
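
Both branches of `#at` call `dup` on the stored vectors, so the caller receives copies rather than the frame's own objects. The plain-Ruby analogue below uses a Hash of arrays instead of DaruLite, and `column_at` is a hypothetical name. How deeply `DaruLite::Vector#dup` copies is not shown on this page, so treat this only as an illustration of the idea.

```ruby
# Plain-Ruby sketch: columns as a Hash of arrays (illustrative assumption).
def column_at(columns, position)
  # fetch raises IndexError for an out-of-range position
  columns.values.fetch(position).dup
end

columns = { a: [1, 2, 3], b: %w[a b c] }
copy = column_at(columns, 0)
copy << 4 # mutate the copy only

p copy        # => [1, 2, 3, 4]
p columns[:a] # => [1, 2, 3]
```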

#get_sub_dataframe(keys, by_position: true) ⇒ DaruLite::DataFrame

Extract a dataframe given row indexes or positions

Parameters:

  • keys (Array)

    can be positions (if by_position is true) or indexes (if by_position is false)

Returns:

  • (DaruLite::DataFrame)


# File 'lib/daru_lite/data_frame/fetchable.rb', line 98

def get_sub_dataframe(keys, by_position: true)
  return DaruLite::DataFrame.new({}) if keys == []

  keys = @index.pos(*keys) unless by_position

  sub_df = row_at(*keys)
  sub_df = sub_df.to_df.transpose if sub_df.is_a?(DaruLite::Vector)

  sub_df
end

#get_vector_anyways(v) ⇒ Object



# File 'lib/daru_lite/data_frame/fetchable.rb', line 109

def get_vector_anyways(v)
  @vectors.include?(v) ? self[v].to_a : Array.new(size)
end
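
This method carries no description. Going by the source above, it returns the named vector's values as an array when the vector exists, and otherwise an array of `size` nils. A plain-Ruby sketch of that fallback, where `column_or_nils` is a hypothetical name and `size` is assumed to be the row count:

```ruby
# Plain-Ruby sketch: columns as a Hash of arrays (illustrative assumption).
def column_or_nils(columns, name, size)
  columns.key?(name) ? columns[name].to_a : Array.new(size)
end

columns = { a: [1, 2, 3] }

p column_or_nils(columns, :a, 3) # => [1, 2, 3]
p column_or_nils(columns, :z, 3) # => [nil, nil, nil]
```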

#head(quantity = 10) ⇒ Object Also known as: first

Returns the first quantity rows of the DataFrame (ten by default).

Parameters:

  • quantity (Integer) (defaults to: 10)

    The number of rows to return from the top.



# File 'lib/daru_lite/data_frame/fetchable.rb', line 81

def head(quantity = 10)
  row.at 0..(quantity - 1)
end
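
The range `0..(quantity - 1)` can be tried on a plain Array to see how it behaves at the edges. `first_rows` is a hypothetical helper, and whether `row.at` treats every range exactly as Array does is not confirmed on this page.

```ruby
# Plain-Ruby sketch of the same range arithmetic on an Array of rows.
rows = [[1, 'a'], [2, 'b'], [3, 'c']]

def first_rows(rows, quantity = 10)
  rows[0..(quantity - 1)]
end

p first_rows(rows, 2) # => [[1, "a"], [2, "b"]]
p first_rows(rows)    # => [[1, "a"], [2, "b"], [3, "c"]]
p first_rows(rows, 0) # => [[1, "a"], [2, "b"], [3, "c"]]
```

On an Array, a quantity of 0 produces the range `0..-1`, which spans the whole collection rather than nothing.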

#numeric_vector_names ⇒ Object



# File 'lib/daru_lite/data_frame/fetchable.rb', line 197

def numeric_vector_names
  @vectors.select { |v| self[v].numeric? }
end

#numeric_vectors ⇒ Object

Returns the indexes of all the numeric vectors. Vectors that contain nils along with numbers are included.



# File 'lib/daru_lite/data_frame/fetchable.rb', line 190

def numeric_vectors
  # FIXME: Why _with_index ?..
  each_vector_with_index
    .select { |vec, _i| vec.numeric? }
    .map(&:last)
end

#only_numerics(opts = {}) ⇒ Object

Returns a DataFrame containing only the numeric vectors. If clone: false is passed as an option, only a view of the vectors is returned. Defaults to clone: true.



# File 'lib/daru_lite/data_frame/fetchable.rb', line 204

def only_numerics(opts = {})
  cln = opts[:clone] != false
  arry = numeric_vectors.map { |v| self[v] }

  order = Index.new(numeric_vectors)
  DaruLite::DataFrame.new(arry, clone: cln, order: order, index: @index)
end
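
The selection performed by `numeric_vectors` and `only_numerics` can be approximated in plain Ruby. The `numeric_column?` predicate below is an assumption about what "numeric" means here (every non-nil value is a Numeric); the actual `DaruLite::Vector#numeric?` is not shown on this page and may differ, for example on all-nil vectors.

```ruby
# Plain-Ruby sketch: columns as a Hash of arrays (illustrative assumption).
# A column counts as numeric when it has at least one non-nil value and
# every non-nil value is a Numeric.
def numeric_column?(values)
  present = values.compact
  !present.empty? && present.all? { |v| v.is_a?(Numeric) }
end

columns = { a: [1, nil, 3], b: %w[x y z], c: [1.5, 2.5, 3.5] }
numeric = columns.select { |_name, values| numeric_column?(values) }

p numeric.keys # => [:a, :c]
```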

#row_at(*positions) ⇒ DaruLite::Vector, DaruLite::DataFrame

Retrieve rows by positions

Examples:

df = DaruLite::DataFrame.new({
  a: [1, 2, 3],
  b: ['a', 'b', 'c']
})
df.row_at 1, 2
# => #<DaruLite::DataFrame(2x2)>
#       a   b
#   1   2   b
#   2   3   c

Parameters:

  • positions (Array<Integer>)

    positions of the rows to retrieve

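Returns:

  • (DaruLite::Vector, DaruLite::DataFrame)

    a vector for a single position, a dataframe for several positions
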

# File 'lib/daru_lite/data_frame/fetchable.rb', line 25

def row_at(*positions)
  original_positions = positions
  positions = coerce_positions(*positions, nrows)
  validate_positions(*positions, nrows)

  if positions.is_a? Integer
    row = get_rows_for([positions])
    DaruLite::Vector.new(row, index: @vectors, name: @index.at(positions))
  else
    new_rows = get_rows_for(original_positions)
    DaruLite::DataFrame.new(
      new_rows,
      index: @index.at(*original_positions),
      order: @vectors,
      name: @name
    )
  end
end
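
The two return shapes can be illustrated without DaruLite. In the sketch below, `rows_at` is a hypothetical helper over a Hash of arrays: a single position yields one row paired with the column names (much as the Vector above is indexed by `@vectors`), while several positions yield a list of rows. Ranges, which the real method may also accept, are not handled.

```ruby
# Plain-Ruby sketch: columns as a Hash of arrays (illustrative assumption).
def rows_at(columns, *positions)
  rows = positions.map do |pos|
    columns.values.map { |col| col.fetch(pos) } # IndexError if out of range
  end
  # One position: pair each value with its column name.
  positions.size == 1 ? columns.keys.zip(rows.first) : rows
end

columns = { a: [1, 2, 3], b: %w[a b c] }

p rows_at(columns, 1)    # => [[:a, 2], [:b, "b"]]
p rows_at(columns, 1, 2) # => [[2, "b"], [3, "c"]]
```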

#split_by_category(cat_name) ⇒ Array

Splits the dataframe into several dataframes, one per category of a category vector

Examples:

df = DaruLite::DataFrame.new({
  a: [1, 2, 3],
  b: ['a', 'a', 'b']
})
df.to_category :b
df.split_by_category :b
# => [#<DaruLite::DataFrame: a (2x1)>
#       a
#   0   1
#   1   2,
# #<DaruLite::DataFrame: b (1x1)>
#       a
#   2   3]

Parameters:

  • cat_name (object)

    name of the category vector used to split the dataframe

Returns:

  • (Array)

    array of dataframes split by category; the category vector used for the split is not included

Raises:

  • (ArgumentError)

    if cat_name does not name a category vector


# File 'lib/daru_lite/data_frame/fetchable.rb', line 176

def split_by_category(cat_name)
  cat_dv = self[cat_name]
  raise ArgumentError, "#{cat_name} is not a category vector" unless
    cat_dv.category?

  cat_dv.categories.map do |cat|
    where(cat_dv.eq cat)
      .rename(cat)
      .delete_vector cat_name
  end
end
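
The grouping idea behind this method can be sketched in plain Ruby. `split_rows_by` is a hypothetical helper over a Hash of arrays; unlike the DaruLite version it returns a Hash keyed by category and does not preserve row labels.

```ruby
# Plain-Ruby sketch: columns as a Hash of arrays (illustrative assumption).
def split_rows_by(columns, cat_name)
  cats = columns.fetch(cat_name)
  # Drop the category column itself, as the real method does.
  others = columns.reject { |name, _values| name == cat_name }

  cats.uniq.to_h do |cat|
    keep = cats.each_index.select { |pos| cats[pos] == cat }
    [cat, others.transform_values { |values| values.values_at(*keep) }]
  end
end

columns = { a: [1, 2, 3], b: %w[a a b] }
parts = split_rows_by(columns, :b)

p parts.keys     # => ["a", "b"]
p parts['a'][:a] # => [1, 2]
p parts['b'][:a] # => [3]
```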

#tail(quantity = 10) ⇒ Object Also known as: last

Returns the last quantity rows of the DataFrame (ten by default).

Parameters:

  • quantity (Integer) (defaults to: 10)

    The number of rows to return from the bottom.



# File 'lib/daru_lite/data_frame/fetchable.rb', line 89

def tail(quantity = 10)
  start = [-quantity, -size].max
  row.at start..-1
end
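
The `[-quantity, -size].max` clamp keeps the start of the range inside the frame when more rows are requested than exist. The same arithmetic on a plain Array shows why it is needed; `last_rows` is a hypothetical helper for this sketch.

```ruby
# Plain-Ruby sketch of the same clamp on an Array of rows.
def last_rows(rows, quantity = 10)
  start = [-quantity, -rows.size].max # never reach before the first row
  rows[start..-1]
end

rows = [[1, 'a'], [2, 'b'], [3, 'c']]

p last_rows(rows, 2) # => [[2, "b"], [3, "c"]]
p last_rows(rows)    # => [[1, "a"], [2, "b"], [3, "c"]]
p rows[-10..-1]      # => nil
```

Without the clamp, an Array returns nil for a start that lies before its first element, as the last line shows.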