Class: Polars::ListExpr

Inherits:
Object
Defined in:
lib/polars/list_expr.rb

Overview

Namespace for list related expressions.

Instance Method Details

#[](item) ⇒ Expr

Get the value by index in the sublists.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 274

def [](item)
  get(item)
end

#arg_max ⇒ Expr

Retrieve the index of the maximum value in every sublist.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2], [2, 1]]
  }
)
df.select(Polars.col("a").arr.arg_max)
# =>
# shape: (2, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 1   │
# ├╌╌╌╌╌┤
# │ 0   │
# └─────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 428

def arg_max
  Utils.wrap_expr(_rbexpr.lst_arg_max)
end

#arg_min ⇒ Expr

Retrieve the index of the minimum value in every sublist.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2], [2, 1]]
  }
)
df.select(Polars.col("a").arr.arg_min)
# =>
# shape: (2, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 0   │
# ├╌╌╌╌╌┤
# │ 1   │
# └─────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 402

def arg_min
  Utils.wrap_expr(_rbexpr.lst_arg_min)
end
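
The semantics of arg_min and arg_max can be modeled in plain Ruby, without the Polars gem. This is only an illustrative sketch, not the library's implementation: the helper names `arg_min_per_list` and `arg_max_per_list` are hypothetical, and resolving ties to the first position is an assumption.

```ruby
# Plain-Ruby model of arg_min / arg_max on a column of sublists.
# Hypothetical helpers for illustration only; they do not call Polars.
# Assumption: on ties, the first matching position is reported.
def arg_min_per_list(lists)
  lists.map { |list| list.index(list.min) }
end

def arg_max_per_list(lists)
  lists.map { |list| list.index(list.max) }
end

lists = [[1, 2], [2, 1]]
p arg_min_per_list(lists) # prints [0, 1]
p arg_max_per_list(lists) # prints [1, 0]
```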

#concat(other) ⇒ Expr

Concatenate the arrays in a Series of dtype List in linear time.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [["a"], ["x"]],
    "b" => [["b", "c"], ["y", "z"]]
  }
)
df.select(Polars.col("a").arr.concat("b"))
# =>
# shape: (2, 1)
# ┌─────────────────┐
# │ a               │
# │ ---             │
# │ list[str]       │
# ╞═════════════════╡
# │ ["a", "b", "c"] │
# ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
# │ ["x", "y", "z"] │
# └─────────────────┘

Parameters:

  • other (Object)

    Columns to concatenate into a List Series.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 224

def concat(other)
  if other.is_a?(Array) && ![Expr, String, Series].any? { |c| other[0].is_a?(c) }
    return concat(Series.new([other]))
  end

  if !other.is_a?(Array)
    other_list = [other]
  else
    other_list = other.dup
  end

  other_list.insert(0, Utils.wrap_expr(_rbexpr))
  Polars.concat_list(other_list)
end
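
Row by row, concat appends the sublists of the other columns to the sublist of the receiver. The plain-Ruby sketch below models that result on ordinary arrays. It is an illustration under that reading, not the library's implementation, and `concat_rows` is a hypothetical helper name.

```ruby
# Plain-Ruby model of row-wise list concatenation.
# Hypothetical helper for illustration only; it does not call Polars.
def concat_rows(first, *others)
  first.each_with_index.map do |list, row|
    # Append the sublist found at the same row in every other column.
    others.reduce(list) { |acc, column| acc + column[row] }
  end
end

a = [["a"], ["x"]]
b = [["b", "c"], ["y", "z"]]
p concat_rows(a, b) # prints [["a", "b", "c"], ["x", "y", "z"]]
```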

#contains(item) ⇒ Expr

Check if sublists contain the given item.

Examples:

df = Polars::DataFrame.new({"foo" => [[3, 2, 1], [], [1, 2]]})
df.select(Polars.col("foo").arr.contains(1))
# =>
# shape: (3, 1)
# ┌───────┐
# │ foo   │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ true  │
# ├╌╌╌╌╌╌╌┤
# │ false │
# ├╌╌╌╌╌╌╌┤
# │ true  │
# └───────┘

Parameters:

  • item (Object)

    Item that will be checked for membership

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 349

def contains(item)
  Utils.wrap_expr(_rbexpr.arr_contains(Utils.expr_to_lit_or_expr(item)._rbexpr))
end

#diff(n: 1, null_behavior: "ignore") ⇒ Expr

Calculate the n-th discrete difference of every sublist.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.arr.diff
# =>
# shape: (2,)
# Series: 'a' [list]
# [
#         [null, 1, ... 1]
#         [null, -8, -1]
# ]

Parameters:

  • n (Integer) (defaults to: 1)

    Number of slots to shift.

  • null_behavior ("ignore", "drop") (defaults to: "ignore")

    How to handle null values.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 451

def diff(n: 1, null_behavior: "ignore")
  Utils.wrap_expr(_rbexpr.lst_diff(n, null_behavior))
end
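
As a rough model of what diff computes for each sublist, the plain-Ruby sketch below subtracts the value n positions earlier and leaves null (nil) where no earlier value exists, which corresponds to the default null_behavior of "ignore". The helper name `diff_per_list` is hypothetical and the code does not call Polars.

```ruby
# Plain-Ruby model of the n-th discrete difference per sublist.
# Hypothetical helper for illustration only; models null_behavior "ignore",
# where the first n positions have no predecessor and stay nil.
def diff_per_list(lists, n: 1)
  lists.map do |list|
    list.each_index.map { |i| i < n ? nil : list[i] - list[i - n] }
  end
end

p diff_per_list([[1, 2, 3, 4], [10, 2, 1]]) # prints [[nil, 1, 1, 1], [nil, -8, -1]]
```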

#eval(expr, parallel: false) ⇒ Expr

Run any polars expression against the lists' elements.

Examples:

df = Polars::DataFrame.new({"a" => [1, 8, 3], "b" => [4, 5, 2]})
df.with_column(
  Polars.concat_list(["a", "b"]).arr.eval(Polars.element.rank).alias("rank")
)
# =>
# shape: (3, 3)
# ┌─────┬─────┬────────────┐
# │ a   ┆ b   ┆ rank       │
# │ --- ┆ --- ┆ ---        │
# │ i64 ┆ i64 ┆ list[f32]  │
# ╞═════╪═════╪════════════╡
# │ 1   ┆ 4   ┆ [1.0, 2.0] │
# ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
# │ 8   ┆ 5   ┆ [2.0, 1.0] │
# ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌┤
# │ 3   ┆ 2   ┆ [2.0, 1.0] │
# └─────┴─────┴────────────┘

Parameters:

  • expr (Expr)

    Expression to run. Note that you can select an element with Polars.first or Polars.col.

  • parallel (Boolean) (defaults to: false)

    Run all expressions in parallel. Don't activate this blindly: parallelism is only worth it if there is enough work to do per thread.

    This likely should not be used in the groupby context, because execution is already parallelized per group.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 606

def eval(expr, parallel: false)
  Utils.wrap_expr(_rbexpr.lst_eval(expr._rbexpr, parallel))
end
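
Conceptually, eval applies the given expression to each sublist independently. The plain-Ruby sketch below mirrors the rank example above on ordinary arrays. The helpers `eval_per_list` and `average_rank` are hypothetical, and averaging the ranks of tied values is an assumption about the rank method used.

```ruby
# Plain-Ruby model of running an expression against each sublist.
# Hypothetical helpers for illustration only; they do not call Polars.

# 1-based rank of every value; tied values share the average rank (assumption).
def average_rank(list)
  sorted = list.sort
  list.map { |v| (sorted.index(v) + sorted.rindex(v)) / 2.0 + 1 }
end

# Apply the block to each sublist, one row at a time.
def eval_per_list(lists)
  lists.map { |list| yield list }
end

rows = [1, 8, 3].zip([4, 5, 2]) # same pairing as concat_list(["a", "b"])
p eval_per_list(rows) { |list| average_rank(list) }
# prints [[1.0, 2.0], [2.0, 1.0], [2.0, 1.0]]
```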

#first ⇒ Expr

Get the first value of the sublists.

Examples:

df = Polars::DataFrame.new({"foo" => [[3, 2, 1], [], [1, 2]]})
df.select(Polars.col("foo").arr.first)
# =>
# shape: (3, 1)
# ┌──────┐
# │ foo  │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ 3    │
# ├╌╌╌╌╌╌┤
# │ null │
# ├╌╌╌╌╌╌┤
# │ 1    │
# └──────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 298

def first
  get(0)
end

#get(index) ⇒ Expr

Get the value by index in the sublists.

Index 0 returns the first item of every sublist and index -1 returns the last item of every sublist. If an index is out of bounds, the result is null.

Examples:

df = Polars::DataFrame.new({"foo" => [[3, 2, 1], [], [1, 2]]})
df.select(Polars.col("foo").arr.get(0))
# =>
# shape: (3, 1)
# ┌──────┐
# │ foo  │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ 3    │
# ├╌╌╌╌╌╌┤
# │ null │
# ├╌╌╌╌╌╌┤
# │ 1    │
# └──────┘

Parameters:

  • index (Integer)

    Index to return per sublist

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 266

def get(index)
  index = Utils.expr_to_lit_or_expr(index, str_to_lit: false)._rbexpr
  Utils.wrap_expr(_rbexpr.lst_get(index))
end
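
The indexing rules described for get (negative indexes count from the end, out-of-bounds indexes give null) happen to match Ruby's own `Array#[]`, so they can be modeled in plain Ruby. This is a sketch of the semantics only; `get_per_list` is a hypothetical helper and does not call Polars.

```ruby
# Plain-Ruby model of get(index) on a column of sublists.
# Hypothetical helper for illustration only.
def get_per_list(lists, index)
  # Array#[] returns nil for out-of-bounds indexes and accepts negative ones.
  lists.map { |list| list[index] }
end

foo = [[3, 2, 1], [], [1, 2]]
p get_per_list(foo, 0)  # prints [3, nil, 1]
p get_per_list(foo, -1) # prints [1, nil, 2]
p get_per_list(foo, 2)  # prints [1, nil, nil]
```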

#head(n = 5) ⇒ Expr

Slice the first n values of every sublist.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.arr.head(2)
# =>
# shape: (2,)
# Series: 'a' [list]
# [
#         [1, 2]
#         [10, 2]
# ]

Parameters:

  • n (Integer) (defaults to: 5)

    Number of values to return for each sublist.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 519

def head(n = 5)
  slice(0, n)
end

#join(separator) ⇒ Expr

Join all string items in a sublist and place a separator between them.

This raises an error if the inner type of the list is not :str.

Examples:

df = Polars::DataFrame.new({"s" => [["a", "b", "c"], ["x", "y"]]})
df.select(Polars.col("s").arr.join(" "))
# =>
# shape: (2, 1)
# ┌───────┐
# │ s     │
# │ ---   │
# │ str   │
# ╞═══════╡
# │ a b c │
# ├╌╌╌╌╌╌╌┤
# │ x y   │
# └───────┘

Parameters:

  • separator (String)

    String to separate the items with.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 376

def join(separator)
  Utils.wrap_expr(_rbexpr.lst_join(separator))
end
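
The behavior of join, including the restriction to string sublists, can be modeled in plain Ruby. The sketch below is illustrative only: `join_per_list` is a hypothetical helper, and the use of `TypeError` is an assumption, since the documentation does not say which error Polars raises.

```ruby
# Plain-Ruby model of join(separator) on a column of string sublists.
# Hypothetical helper for illustration only; it does not call Polars.
def join_per_list(lists, separator)
  lists.map do |list|
    # Mirror the documented restriction: only string items can be joined.
    raise TypeError, "inner type of list must be string" unless list.all?(String)

    list.join(separator)
  end
end

p join_per_list([["a", "b", "c"], ["x", "y"]], " ") # prints ["a b c", "x y"]
```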

#last ⇒ Expr

Get the last value of the sublists.

Examples:

df = Polars::DataFrame.new({"foo" => [[3, 2, 1], [], [1, 2]]})
df.select(Polars.col("foo").arr.last)
# =>
# shape: (3, 1)
# ┌──────┐
# │ foo  │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ 1    │
# ├╌╌╌╌╌╌┤
# │ null │
# ├╌╌╌╌╌╌┤
# │ 2    │
# └──────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 322

def last
  get(-1)
end

#lengths ⇒ Expr

Get the length of the arrays as :u32.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2], "bar" => [["a", "b"], ["c"]]})
df.select(Polars.col("bar").arr.lengths)
# =>
# shape: (2, 1)
# ┌─────┐
# │ bar │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 2   │
# ├╌╌╌╌╌┤
# │ 1   │
# └─────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 30

def lengths
  Utils.wrap_expr(_rbexpr.arr_lengths)
end

#max ⇒ Expr

Compute the max value of the lists in the array.

Examples:

df = Polars::DataFrame.new({"values" => [[1], [2, 3]]})
df.select(Polars.col("values").arr.max)
# =>
# shape: (2, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 1      │
# ├╌╌╌╌╌╌╌╌┤
# │ 3      │
# └────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 74

def max
  Utils.wrap_expr(_rbexpr.lst_max)
end

#mean ⇒ Expr

Compute the mean value of the lists in the array.

Examples:

df = Polars::DataFrame.new({"values" => [[1], [2, 3]]})
df.select(Polars.col("values").arr.mean)
# =>
# shape: (2, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ f64    │
# ╞════════╡
# │ 1.0    │
# ├╌╌╌╌╌╌╌╌┤
# │ 2.5    │
# └────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 118

def mean
  Utils.wrap_expr(_rbexpr.lst_mean)
end

#min ⇒ Expr

Compute the min value of the lists in the array.

Examples:

df = Polars::DataFrame.new({"values" => [[1], [2, 3]]})
df.select(Polars.col("values").arr.min)
# =>
# shape: (2, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 1      │
# ├╌╌╌╌╌╌╌╌┤
# │ 2      │
# └────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 96

def min
  Utils.wrap_expr(_rbexpr.lst_min)
end
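
The aggregations max, mean, and min (and likewise sum) each reduce every sublist to a single value, so the result has one row per sublist. A plain-Ruby model of that reduction, using the hypothetical helper `aggregate_per_list` and no Polars calls, might look like this:

```ruby
# Plain-Ruby model of per-sublist aggregation.
# Hypothetical helper for illustration only; assumes non-empty sublists.
def aggregate_per_list(lists)
  {
    max: lists.map(&:max),
    mean: lists.map { |list| list.sum.to_f / list.length },
    min: lists.map(&:min),
    sum: lists.map(&:sum)
  }
end

result = aggregate_per_list([[1], [2, 3]])
p result[:max]  # prints [1, 3]
p result[:mean] # prints [1.0, 2.5]
p result[:min]  # prints [1, 2]
p result[:sum]  # prints [1, 5]
```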

#reverse ⇒ Expr

Reverse the arrays in the list.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[3, 2, 1], [9, 1, 2]]
  }
)
df.select(Polars.col("a").arr.reverse)
# =>
# shape: (2, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2, 3] │
# ├╌╌╌╌╌╌╌╌╌╌╌┤
# │ [2, 1, 9] │
# └───────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 170

def reverse
  Utils.wrap_expr(_rbexpr.lst_reverse)
end

#shift(periods = 1) ⇒ Expr

Shift values by the given period.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.arr.shift
# =>
# shape: (2,)
# Series: 'a' [list]
# [
#         [null, 1, ... 3]
#         [null, 10, 2]
# ]

Parameters:

  • periods (Integer) (defaults to: 1)

    Number of places to shift (may be negative).

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 472

def shift(periods = 1)
  Utils.wrap_expr(_rbexpr.lst_shift(periods))
end
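
To illustrate how the sign of periods affects the result, the plain-Ruby sketch below models shift on ordinary arrays: positive values move items toward the end and negative values move them toward the start, with vacated positions filled by null (nil). `shift_per_list` is a hypothetical helper and does not call Polars.

```ruby
# Plain-Ruby model of shift(periods) on a column of sublists.
# Hypothetical helper for illustration only.
def shift_per_list(lists, periods = 1)
  lists.map do |list|
    n = list.length
    k = periods.clamp(-n, n) # shifting further than the length empties the list
    if k >= 0
      Array.new(k) + list.first(n - k) # k nils in front, tail dropped
    else
      list.drop(-k) + Array.new(-k)    # head dropped, nils appended
    end
  end
end

lists = [[1, 2, 3, 4], [10, 2, 1]]
p shift_per_list(lists)     # prints [[nil, 1, 2, 3], [nil, 10, 2]]
p shift_per_list(lists, -1) # prints [[2, 3, 4, nil], [2, 1, nil]]
```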

#slice(offset, length = nil) ⇒ Expr

Slice every sublist.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.arr.slice(1, 2)
# =>
# shape: (2,)
# Series: 'a' [list]
# [
#         [2, 3]
#         [2, 1]
# ]

Parameters:

  • offset (Integer)

    Start index. Negative indexing is supported.

  • length (Integer) (defaults to: nil)

    Length of the slice. If set to nil (default), the slice is taken to the end of the list.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 496

def slice(offset, length = nil)
  offset = Utils.expr_to_lit_or_expr(offset, str_to_lit: false)._rbexpr
  length = Utils.expr_to_lit_or_expr(length, str_to_lit: false)._rbexpr
  Utils.wrap_expr(_rbexpr.lst_slice(offset, length))
end
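
According to their sources, head and tail are thin wrappers around slice: head(n) calls slice(0, n) and tail(n) calls slice(-n, n). The plain-Ruby sketch below models that relationship on ordinary arrays. The helper names are hypothetical, and only offsets that fall inside every sublist are modeled; how Polars treats out-of-range offsets is not covered here.

```ruby
# Plain-Ruby model of slice, head, and tail on a column of sublists.
# Hypothetical helpers for illustration only; they do not call Polars.
# Assumption: offset lies within every sublist (out-of-range is not modeled).
def slice_per_list(lists, offset, length = nil)
  lists.map do |list|
    # A nil length means "take everything up to the end of the sublist".
    length.nil? ? list[offset..-1] : list[offset, length]
  end
end

def head_per_list(lists, n = 5)
  slice_per_list(lists, 0, n)
end

def tail_per_list(lists, n = 5)
  slice_per_list(lists, -n, n)
end

lists = [[1, 2, 3, 4], [10, 2, 1]]
p slice_per_list(lists, 1, 2) # prints [[2, 3], [2, 1]]
p head_per_list(lists, 2)     # prints [[1, 2], [10, 2]]
p tail_per_list(lists, 2)     # prints [[3, 4], [2, 1]]
```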

#sort(reverse: false) ⇒ Expr

Sort the arrays in the list.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[3, 2, 1], [9, 1, 2]]
  }
)
df.select(Polars.col("a").arr.sort)
# =>
# shape: (2, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2, 3] │
# ├╌╌╌╌╌╌╌╌╌╌╌┤
# │ [1, 2, 9] │
# └───────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 144

def sort(reverse: false)
  Utils.wrap_expr(_rbexpr.lst_sort(reverse))
end
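
The reverse keyword is not shown in the example above. As a plain-Ruby model, assuming reverse: true sorts each sublist in descending order, the behavior can be sketched as follows; `sort_per_list` is a hypothetical helper and does not call Polars.

```ruby
# Plain-Ruby model of sort(reverse:) on a column of sublists.
# Hypothetical helper for illustration only.
# Assumption: reverse: true means descending order within each sublist.
def sort_per_list(lists, reverse: false)
  lists.map { |list| reverse ? list.sort.reverse : list.sort }
end

a = [[3, 2, 1], [9, 1, 2]]
p sort_per_list(a)                # prints [[1, 2, 3], [1, 2, 9]]
p sort_per_list(a, reverse: true) # prints [[3, 2, 1], [9, 2, 1]]
```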

#sum ⇒ Expr

Sum all the lists in the array.

Examples:

df = Polars::DataFrame.new({"values" => [[1], [2, 3]]})
df.select(Polars.col("values").arr.sum)
# =>
# shape: (2, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 1      │
# ├╌╌╌╌╌╌╌╌┤
# │ 5      │
# └────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 52

def sum
  Utils.wrap_expr(_rbexpr.lst_sum)
end

#tail(n = 5) ⇒ Expr

Slice the last n values of every sublist.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.arr.tail(2)
# =>
# shape: (2,)
# Series: 'a' [list]
# [
#         [3, 4]
#         [2, 1]
# ]

Parameters:

  • n (Integer) (defaults to: 5)

    Number of values to return for each sublist.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 540

def tail(n = 5)
  offset = -Utils.expr_to_lit_or_expr(n, str_to_lit: false)
  slice(offset, n)
end

#to_struct(n_field_strategy: "first_non_null", name_generator: nil) ⇒ Expr

Convert the series of type List to a series of type Struct.

Examples:

df = Polars::DataFrame.new({"a" => [[1, 2, 3], [1, 2]]})
df.select([Polars.col("a").arr.to_struct])
# =>
# shape: (2, 1)
# ┌────────────┐
# │ a          │
# │ ---        │
# │ struct[3]  │
# ╞════════════╡
# │ {1,2,3}    │
# ├╌╌╌╌╌╌╌╌╌╌╌╌┤
# │ {1,2,null} │
# └────────────┘

Parameters:

  • n_field_strategy ("first_non_null", "max_width") (defaults to: "first_non_null")

    Strategy to determine the number of fields of the struct.

  • name_generator (Object) (defaults to: nil)

    A custom function that can be used to generate the field names. Default field names are field_0, field_1 .. field_n. Passing a generator currently raises Todo.

Returns:

  • (Expr)

Raises:

  • (Todo)

# File 'lib/polars/list_expr.rb', line 569

def to_struct(n_field_strategy: "first_non_null", name_generator: nil)
  raise Todo if name_generator
  Utils.wrap_expr(_rbexpr.lst_to_struct(n_field_strategy, name_generator, 0))
end
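
The two n_field_strategy values differ in how many struct fields are created. The plain-Ruby sketch below models one plausible reading, which is an assumption rather than something stated on this page: "first_non_null" takes the width from the first non-null, non-empty sublist, while "max_width" takes the longest sublist. Shorter sublists are padded with null (nil). `to_struct_rows` is a hypothetical helper and does not call Polars.

```ruby
# Plain-Ruby model of to_struct field-count strategies.
# Hypothetical helper for illustration only; the strategy semantics are assumptions.
def to_struct_rows(lists, n_field_strategy: "first_non_null")
  width =
    case n_field_strategy
    when "first_non_null"
      first = lists.find { |list| list && !list.empty? }
      first ? first.length : 0
    when "max_width"
      lists.compact.map(&:length).max || 0
    else
      raise ArgumentError, "unknown strategy: #{n_field_strategy}"
    end

  # Default field names follow the documented field_0, field_1, ... pattern.
  lists.map do |list|
    (0...width).to_h { |i| ["field_#{i}", (list || [])[i]] }
  end
end

rows = to_struct_rows([[1, 2, 3], [1, 2]])
p rows.map(&:values) # prints [[1, 2, 3], [1, 2, nil]]
```

Under these assumptions, a column such as [[1, 2], [1, 2, 3]] would get two fields with "first_non_null" and three with "max_width".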

#unique ⇒ Expr

Get the unique/distinct values in the list.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 1, 2]]
  }
)
df.select(Polars.col("a").arr.unique)
# =>
# shape: (1, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2]    │
# └───────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 194

def unique
  Utils.wrap_expr(_rbexpr.lst_unique)
end