Class: Polars::ListExpr

Inherits:
Object
Defined in:
lib/polars/list_expr.rb

Overview

Namespace for list-related expressions.

Instance Method Details

#[](item) ⇒ Expr

Get the value by index in the sublists.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 500

def [](item)
  get(item)
end

#all ⇒ Expr

Evaluate whether all boolean values in a list are true.

Examples:

df = Polars::DataFrame.new(
  {"a" => [[true, true], [false, true], [false, false], [nil], [], nil]}
)
df.with_columns(all: Polars.col("a").list.all)
# =>
# shape: (6, 2)
# ┌────────────────┬───────┐
# │ a              ┆ all   │
# │ ---            ┆ ---   │
# │ list[bool]     ┆ bool  │
# ╞════════════════╪═══════╡
# │ [true, true]   ┆ true  │
# │ [false, true]  ┆ false │
# │ [false, false] ┆ false │
# │ [null]         ┆ true  │
# │ []             ┆ true  │
# │ null           ┆ null  │
# └────────────────┴───────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 35

def all
  Utils.wrap_expr(_rbexpr.list_all)
end

#any ⇒ Expr

Evaluate whether any boolean value in a list is true.

Examples:

df = Polars::DataFrame.new(
  {"a" => [[true, true], [false, true], [false, false], [nil], [], nil]}
)
df.with_columns(any: Polars.col("a").list.any)
# =>
# shape: (6, 2)
# ┌────────────────┬───────┐
# │ a              ┆ any   │
# │ ---            ┆ ---   │
# │ list[bool]     ┆ bool  │
# ╞════════════════╪═══════╡
# │ [true, true]   ┆ true  │
# │ [false, true]  ┆ true  │
# │ [false, false] ┆ false │
# │ [null]         ┆ false │
# │ []             ┆ false │
# │ null           ┆ null  │
# └────────────────┴───────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 62

def any
  Utils.wrap_expr(_rbexpr.list_any)
end
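
The two example tables above show how null elements and empty sublists are treated by all and any. As a rough illustration only, the same outcomes can be reproduced in plain Ruby; list_all and list_any below are hypothetical helper names, not part of Polars, and the sketch assumes that null elements are simply skipped.

```ruby
# Plain-Ruby model of the null handling shown in the example tables.
# list_all and list_any are hypothetical helpers, not Polars methods.

# A nil sublist yields nil; nil elements inside a sublist are skipped.
def list_all(sublist)
  return nil if sublist.nil?

  sublist.compact.all?
end

def list_any(sublist)
  return nil if sublist.nil?

  sublist.compact.any?
end

rows = [[true, true], [false, true], [false, false], [nil], [], nil]
rows.each do |row|
  puts "#{row.inspect}: all=#{list_all(row).inspect} any=#{list_any(row).inspect}"
end
```

Under this model an empty sublist is vacuously true for all and false for any, which matches the rows for [null] and [] in both tables.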

#arg_max ⇒ Expr

Retrieve the index of the maximum value in every sublist.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2], [2, 1]]
  }
)
df.select(Polars.col("a").list.arg_max)
# =>
# shape: (2, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 1   │
# │ 0   │
# └─────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 727

def arg_max
  Utils.wrap_expr(_rbexpr.list_arg_max)
end

#arg_min ⇒ Expr

Retrieve the index of the minimum value in every sublist.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2], [2, 1]]
  }
)
df.select(Polars.col("a").list.arg_min)
# =>
# shape: (2, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 0   │
# │ 1   │
# └─────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 702

def arg_min
  Utils.wrap_expr(_rbexpr.list_arg_min)
end

#concat(other) ⇒ Expr

Concatenate the sublists of a List column with other columns, in linear time.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [["a"], ["x"]],
    "b" => [["b", "c"], ["y", "z"]]
  }
)
df.select(Polars.col("a").list.concat("b"))
# =>
# shape: (2, 1)
# ┌─────────────────┐
# │ a               │
# │ ---             │
# │ list[str]       │
# ╞═════════════════╡
# │ ["a", "b", "c"] │
# │ ["x", "y", "z"] │
# └─────────────────┘

Parameters:

  • other (Object)

    Columns to concatenate into a List Series.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 448

def concat(other)
  if other.is_a?(::Array) && ![Expr, String, Series].any? { |c| other[0].is_a?(c) }
    return concat(Series.new([other]))
  end

  if !other.is_a?(::Array)
    other_list = [other]
  else
    other_list = other.dup
  end

  other_list.insert(0, Utils.wrap_expr(_rbexpr))
  Polars.concat_list(other_list)
end

#contains(item, nulls_equal: true) ⇒ Expr

Check if sublists contain the given item.

Examples:

df = Polars::DataFrame.new({"foo" => [[3, 2, 1], [], [1, 2]]})
df.select(Polars.col("foo").list.contains(1))
# =>
# shape: (3, 1)
# ┌───────┐
# │ foo   │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ true  │
# │ false │
# │ true  │
# └───────┘

Parameters:

  • item (Object)

    Item that will be checked for membership

  • nulls_equal (Boolean) (defaults to: true)

    If true, treat null as a distinct value. Null values will not propagate.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 648

def contains(item, nulls_equal: true)
  Utils.wrap_expr(_rbexpr.list_contains(Utils.parse_into_expression(item), nulls_equal))
end

#count_matches(element) ⇒ Expr Also known as: count_match

Count how often the value produced by element occurs.

Examples:

df = Polars::DataFrame.new({"listcol" => [[0], [1], [1, 2, 3, 2], [1, 2, 1], [4, 4]]})
df.select(Polars.col("listcol").list.count_match(2).alias("number_of_twos"))
# =>
# shape: (5, 1)
# ┌────────────────┐
# │ number_of_twos │
# │ ---            │
# │ u32            │
# ╞════════════════╡
# │ 0              │
# │ 0              │
# │ 2              │
# │ 1              │
# │ 0              │
# └────────────────┘

Parameters:

  • element (Expr)

    An expression that produces a single value

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 893

def count_matches(element)
  Utils.wrap_expr(_rbexpr.list_count_matches(Utils.parse_into_expression(element)))
end

#diff(n: 1, null_behavior: "ignore") ⇒ Expr

Calculate the n-th discrete difference of every sublist.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.list.diff
# =>
# shape: (2,)
# Series: 'a' [list[i64]]
# [
#         [null, 1, … 1]
#         [null, -8, -1]
# ]

Parameters:

  • n (Integer) (defaults to: 1)

    Number of slots to shift.

  • null_behavior ("ignore", "drop") (defaults to: "ignore")

    How to handle null values.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 750

def diff(n: 1, null_behavior: "ignore")
  Utils.wrap_expr(_rbexpr.list_diff(n, null_behavior))
end
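
To make the role of n and null_behavior concrete, here is a minimal plain-Ruby sketch of the computation. list_diff is a hypothetical helper, not the Polars implementation. It assumes n is non-negative, that sublists contain no nulls, and that "drop" removes the leading n null slots.

```ruby
# Plain-Ruby model of a per-sublist diff; list_diff is a hypothetical helper.
# Each element is compared with the element n slots before it.
def list_diff(sublist, n: 1, null_behavior: "ignore")
  diffs = sublist.each_index.map do |i|
    # The first n slots have no earlier partner, so they become nil.
    i < n ? nil : sublist[i] - sublist[i - n]
  end
  # Assumption: "drop" discards the leading nil slots instead of keeping them.
  null_behavior == "drop" ? diffs.drop(n) : diffs
end

p list_diff([1, 2, 3, 4])
p list_diff([10, 2, 1])
p list_diff([1, 2, 3, 4], n: 2)
p list_diff([1, 2, 3, 4], null_behavior: "drop")
```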

#drop_nulls ⇒ Expr

Drop all null values in the list.

The original order of the remaining elements is preserved.

Examples:

df = Polars::DataFrame.new({"values" => [[nil, 1, nil, 2], [nil], [3, 4]]})
df.with_columns(drop_nulls: Polars.col("values").list.drop_nulls)
# =>
# shape: (3, 2)
# ┌────────────────┬────────────┐
# │ values         ┆ drop_nulls │
# │ ---            ┆ ---        │
# │ list[i64]      ┆ list[i64]  │
# ╞════════════════╪════════════╡
# │ [null, 1, … 2] ┆ [1, 2]     │
# │ [null]         ┆ []         │
# │ [3, 4]         ┆ [3, 4]     │
# └────────────────┴────────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 108

def drop_nulls
  Utils.wrap_expr(_rbexpr.list_drop_nulls)
end

#eval(expr) ⇒ Expr

Run any polars expression against the lists' elements.

Examples:

df = Polars::DataFrame.new({"a" => [1, 8, 3], "b" => [4, 5, 2]})
df.with_columns(
  Polars.concat_list(["a", "b"]).list.eval(Polars.element.rank).alias("rank")
)
# =>
# shape: (3, 3)
# ┌─────┬─────┬────────────┐
# │ a   ┆ b   ┆ rank       │
# │ --- ┆ --- ┆ ---        │
# │ i64 ┆ i64 ┆ list[f64]  │
# ╞═════╪═════╪════════════╡
# │ 1   ┆ 4   ┆ [1.0, 2.0] │
# │ 8   ┆ 5   ┆ [2.0, 1.0] │
# │ 3   ┆ 2   ┆ [2.0, 1.0] │
# └─────┴─────┴────────────┘

Parameters:

  • expr (Expr)

    Expression to run. You can refer to the current element with Polars.element (as in the example above), Polars.first, or Polars.col.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 988

def eval(expr)
  Utils.wrap_expr(_rbexpr.list_eval(expr._rbexpr))
end

#explode ⇒ Expr

Returns a column with a separate row for every list element.

Examples:

df = Polars::DataFrame.new({"a" => [[1, 2, 3], [4, 5, 6]]})
df.select(Polars.col("a").list.explode)
# =>
# shape: (6, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# │ 2   │
# │ 3   │
# │ 4   │
# │ 5   │
# │ 6   │
# └─────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 866

def explode
  Utils.wrap_expr(_rbexpr.explode)
end

#filter(predicate) ⇒ Expr

Filter elements in each list by a boolean expression.

Examples:

df = Polars::DataFrame.new({"a" => [1, 8, 3], "b" => [4, 5, 2]})
df.with_columns(
  evens: Polars.concat_list("a", "b").list.filter(Polars.element % 2 == 0)
)
# =>
# shape: (3, 3)
# ┌─────┬─────┬───────────┐
# │ a   ┆ b   ┆ evens     │
# │ --- ┆ --- ┆ ---       │
# │ i64 ┆ i64 ┆ list[i64] │
# ╞═════╪═════╪═══════════╡
# │ 1   ┆ 4   ┆ [4]       │
# │ 8   ┆ 5   ┆ [8]       │
# │ 3   ┆ 2   ┆ [2]       │
# └─────┴─────┴───────────┘

Parameters:

  • predicate (Object)

    A boolean expression that is evaluated per list element. You can refer to the current element with Polars.element.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 1016

def filter(predicate)
  Utils.wrap_expr(_rbexpr.list_filter(predicate._rbexpr))
end

#first ⇒ Expr

Get the first value of the sublists.

Examples:

df = Polars::DataFrame.new({"foo" => [[3, 2, 1], [], [1, 2]]})
df.select(Polars.col("foo").list.first)
# =>
# shape: (3, 1)
# ┌──────┐
# │ foo  │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ 3    │
# │ null │
# │ 1    │
# └──────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 599

def first
  get(0)
end

#gather(indices, null_on_oob: false) ⇒ Expr Also known as: take

Take sublists by multiple indices.

The indices may be defined in a single column, or by sublists in another column of dtype List.

Examples:

df = Polars::DataFrame.new({"a" => [[3, 2, 1], [], [1, 2, 3, 4, 5]]})
df.with_columns(gather: Polars.col("a").list.gather([0, 4], null_on_oob: true))
# =>
# shape: (3, 2)
# ┌─────────────┬──────────────┐
# │ a           ┆ gather       │
# │ ---         ┆ ---          │
# │ list[i64]   ┆ list[i64]    │
# ╞═════════════╪══════════════╡
# │ [3, 2, 1]   ┆ [3, null]    │
# │ []          ┆ [null, null] │
# │ [1, 2, … 5] ┆ [1, 5]       │
# └─────────────┴──────────────┘

Parameters:

  • indices (Object)

    Indices to return per sublist

  • null_on_oob (Boolean) (defaults to: false)

    Behavior if an index is out of bounds: true -> set as null; false -> raise an error. Note that defaulting to raising an error is much cheaper.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 533

def gather(indices, null_on_oob: false)
  indices = Utils.parse_into_expression(indices)
  Utils.wrap_expr(_rbexpr.list_gather(indices, null_on_oob))
end

#gather_every(n, offset = 0) ⇒ Expr

Take every n-th value, starting from offset, in each sublist.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2, 3, 4, 5], [6, 7, 8], [9, 10, 11, 12]],
    "n" => [2, 1, 3],
    "offset" => [0, 1, 0]
  }
)
df.with_columns(
  gather_every: Polars.col("a").list.gather_every(
    Polars.col("n"), Polars.col("offset")
  )
)
# =>
# shape: (3, 4)
# ┌───────────────┬─────┬────────┬──────────────┐
# │ a             ┆ n   ┆ offset ┆ gather_every │
# │ ---           ┆ --- ┆ ---    ┆ ---          │
# │ list[i64]     ┆ i64 ┆ i64    ┆ list[i64]    │
# ╞═══════════════╪═════╪════════╪══════════════╡
# │ [1, 2, … 5]   ┆ 2   ┆ 0      ┆ [1, 3, 5]    │
# │ [6, 7, 8]     ┆ 1   ┆ 1      ┆ [7, 8]       │
# │ [9, 10, … 12] ┆ 3   ┆ 0      ┆ [9, 12]      │
# └───────────────┴─────┴────────┴──────────────┘

Parameters:

  • n (Integer)

    Gather every n-th element.

  • offset (Integer) (defaults to: 0)

    Starting index.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 572

def gather_every(
  n,
  offset = 0
)
  n = Utils.parse_into_expression(n)
  offset = Utils.parse_into_expression(offset)
  Utils.wrap_expr(_rbexpr.list_gather_every(n, offset))
end
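
The interaction between n and offset can be sketched in plain Ruby. list_gather_every is a hypothetical helper that models the selection for a single sublist; it assumes n is at least 1 and offset is non-negative, and it is not how Polars implements the operation.

```ruby
# Plain-Ruby model of gather_every for one sublist (hypothetical helper).
# Skip `offset` elements, then keep the first element of every group of n.
def list_gather_every(sublist, n, offset = 0)
  sublist.drop(offset).each_slice(n).map(&:first)
end

p list_gather_every([1, 2, 3, 4, 5], 2, 0)
p list_gather_every([6, 7, 8], 1, 1)
p list_gather_every([9, 10, 11, 12], 3, 0)
```

The three calls mirror the rows of the example table, where n and offset come from per-row columns.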

#get(index, null_on_oob: true) ⇒ Expr

Get the value by index in the sublists.

Index 0 returns the first item of every sublist and index -1 returns the last item. If an index is out of bounds, the result is null (when null_on_oob is true, the default).

Examples:

df = Polars::DataFrame.new({"foo" => [[3, 2, 1], [], [1, 2]]})
df.select(Polars.col("foo").list.get(0))
# =>
# shape: (3, 1)
# ┌──────┐
# │ foo  │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ 3    │
# │ null │
# │ 1    │
# └──────┘

Parameters:

  • index (Integer)

    Index to return per sublist

  • null_on_oob (Boolean) (defaults to: true)

    Behavior if an index is out of bounds: true -> set as null; false -> raise an error.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 492

def get(index, null_on_oob: true)
  index = Utils.parse_into_expression(index)
  Utils.wrap_expr(_rbexpr.list_get(index, null_on_oob))
end
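
The indexing rules described for get (negative indices count from the end, out-of-bounds indices give null or raise) can be modelled for a single sublist in plain Ruby. list_get is a hypothetical helper, and IndexError merely stands in for whatever error Polars actually raises.

```ruby
# Plain-Ruby model of get for one sublist (hypothetical helper).
def list_get(sublist, index, null_on_oob: true)
  return nil if sublist.nil?

  # Valid indices run from -length (first item) to length - 1 (last item).
  in_bounds = index >= -sublist.length && index < sublist.length
  return sublist[index] if in_bounds

  # IndexError is a stand-in; the real error class is not specified here.
  raise IndexError, "index #{index} is out of bounds" unless null_on_oob

  nil
end

p list_get([3, 2, 1], 0)
p list_get([], 0)
p list_get([1, 2], -1)
```

In this model first and last correspond to list_get(sublist, 0) and list_get(sublist, -1), which is how the two methods are defined in the source.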

#head(n = 5) ⇒ Expr

Slice the first n values of every sublist.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.list.head(2)
# =>
# shape: (2,)
# Series: 'a' [list[i64]]
# [
#         [1, 2]
#         [10, 2]
# ]

Parameters:

  • n (Integer) (defaults to: 5)

    Number of values to return for each sublist.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 819

def head(n = 5)
  slice(0, n)
end

#join(separator, ignore_nulls: true) ⇒ Expr

Join all string items in a sublist and place a separator between them.

This raises an error if the inner type of the list is not :str.

Examples:

df = Polars::DataFrame.new({"s" => [["a", "b", "c"], ["x", "y"]]})
df.select(Polars.col("s").list.join(" "))
# =>
# shape: (2, 1)
# ┌───────┐
# │ s     │
# │ ---   │
# │ str   │
# ╞═══════╡
# │ a b c │
# │ x y   │
# └───────┘

Parameters:

  • separator (String)

    String to separate the items with.

  • ignore_nulls (Boolean) (defaults to: true)

    Ignore null values (default).

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 676

def join(separator, ignore_nulls: true)
  separator = Utils.parse_into_expression(separator, str_as_lit: true)
  Utils.wrap_expr(_rbexpr.list_join(separator, ignore_nulls))
end
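
The effect of ignore_nulls can be illustrated with a plain-Ruby sketch. list_join is a hypothetical helper. The branch for ignore_nulls: false assumes that a null inside the sublist makes the whole result null, which should be checked against the library before relying on it.

```ruby
# Plain-Ruby model of join for one sublist (hypothetical helper).
def list_join(sublist, separator, ignore_nulls: true)
  return nil if sublist.nil?
  # With ignore_nulls, nil items are skipped before joining.
  return sublist.compact.join(separator) if ignore_nulls

  # Assumption: without ignore_nulls, any nil item makes the result nil.
  sublist.include?(nil) ? nil : sublist.join(separator)
end

p list_join(["a", "b", "c"], " ")
p list_join(["a", nil, "c"], " ")
p list_join(["a", nil, "c"], " ", ignore_nulls: false)
```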

#last ⇒ Expr

Get the last value of the sublists.

Examples:

df = Polars::DataFrame.new({"foo" => [[3, 2, 1], [], [1, 2]]})
df.select(Polars.col("foo").list.last)
# =>
# shape: (3, 1)
# ┌──────┐
# │ foo  │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ 1    │
# │ null │
# │ 2    │
# └──────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 621

def last
  get(-1)
end

#len ⇒ Expr Also known as: lengths

Get the length of each sublist as :u32.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2], "bar" => [["a", "b"], ["c"]]})
df.select(Polars.col("bar").list.lengths)
# =>
# shape: (2, 1)
# ┌─────┐
# │ bar │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 2   │
# │ 1   │
# └─────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 83

def len
  Utils.wrap_expr(_rbexpr.list_len)
end

#max ⇒ Expr

Compute the maximum value of each sublist.

Examples:

df = Polars::DataFrame.new({"values" => [[1], [2, 3]]})
df.select(Polars.col("values").list.max)
# =>
# shape: (2, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 1      │
# │ 3      │
# └────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 200

def max
  Utils.wrap_expr(_rbexpr.list_max)
end

#mean ⇒ Expr

Compute the mean value of each sublist.

Examples:

df = Polars::DataFrame.new({"values" => [[1], [2, 3]]})
df.select(Polars.col("values").list.mean)
# =>
# shape: (2, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ f64    │
# ╞════════╡
# │ 1.0    │
# │ 2.5    │
# └────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 242

def mean
  Utils.wrap_expr(_rbexpr.list_mean)
end

#median ⇒ Expr

Compute the median value of each sublist.

Examples:

df = Polars::DataFrame.new({"values" => [[-1, 0, 1], [1, 10]]})
df.with_columns(Polars.col("values").list.median.alias("median"))
# =>
# shape: (2, 2)
# ┌────────────┬────────┐
# │ values     ┆ median │
# │ ---        ┆ ---    │
# │ list[i64]  ┆ f64    │
# ╞════════════╪════════╡
# │ [-1, 0, 1] ┆ 0.0    │
# │ [1, 10]    ┆ 5.5    │
# └────────────┴────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 263

def median
  Utils.wrap_expr(_rbexpr.list_median)
end

#min ⇒ Expr

Compute the minimum value of each sublist.

Examples:

df = Polars::DataFrame.new({"values" => [[1], [2, 3]]})
df.select(Polars.col("values").list.min)
# =>
# shape: (2, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 1      │
# │ 2      │
# └────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 221

def min
  Utils.wrap_expr(_rbexpr.list_min)
end

#n_unique ⇒ Expr

Count the number of unique values in every sublist.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 1, 2], [2, 3, 4]]
  }
)
df.with_columns(n_unique: Polars.col("a").list.n_unique)
# =>
# shape: (2, 2)
# ┌───────────┬──────────┐
# │ a         ┆ n_unique │
# │ ---       ┆ ---      │
# │ list[i64] ┆ u32      │
# ╞═══════════╪══════════╡
# │ [1, 1, 2] ┆ 2        │
# │ [2, 3, 4] ┆ 3        │
# └───────────┴──────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 419

def n_unique
  Utils.wrap_expr(_rbexpr.list_n_unique)
end

#reverse ⇒ Expr

Reverse the order of the values in each sublist.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[3, 2, 1], [9, 1, 2]]
  }
)
df.select(Polars.col("a").list.reverse)
# =>
# shape: (2, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2, 3] │
# │ [2, 1, 9] │
# └───────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 370

def reverse
  Utils.wrap_expr(_rbexpr.list_reverse)
end

#sample(n: nil, fraction: nil, with_replacement: false, shuffle: false, seed: nil) ⇒ Expr

Sample from this list.

Examples:

df = Polars::DataFrame.new({"values" => [[1, 2, 3], [4, 5]], "n" => [2, 1]})
df.with_columns(sample: Polars.col("values").list.sample(n: Polars.col("n"), seed: 1))
# =>
# shape: (2, 3)
# ┌───────────┬─────┬───────────┐
# │ values    ┆ n   ┆ sample    │
# │ ---       ┆ --- ┆ ---       │
# │ list[i64] ┆ i64 ┆ list[i64] │
# ╞═══════════╪═════╪═══════════╡
# │ [1, 2, 3] ┆ 2   ┆ [2, 3]    │
# │ [4, 5]    ┆ 1   ┆ [5]       │
# └───────────┴─────┴───────────┘

Parameters:

  • n (Integer) (defaults to: nil)

    Number of items to return. Cannot be used with fraction. Defaults to 1 if fraction is nil.

  • fraction (Float) (defaults to: nil)

    Fraction of items to return. Cannot be used with n.

  • with_replacement (Boolean) (defaults to: false)

    Allow values to be sampled more than once.

  • shuffle (Boolean) (defaults to: false)

    Shuffle the order of sampled data points.

  • seed (Integer) (defaults to: nil)

    Seed for the random number generator. If set to nil (default), a random seed is generated for each sample operation.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 142

def sample(n: nil, fraction: nil, with_replacement: false, shuffle: false, seed: nil)
  if !n.nil? && !fraction.nil?
    msg = "cannot specify both `n` and `fraction`"
    raise ArgumentError, msg
  end

  if !fraction.nil?
    fraction = Utils.parse_into_expression(fraction)
    return Utils.wrap_expr(
      _rbexpr.list_sample_fraction(
        fraction, with_replacement, shuffle, seed
      )
    )
  end

  n = 1 if n.nil?
  n = Utils.parse_into_expression(n)
  Utils.wrap_expr(_rbexpr.list_sample_n(n, with_replacement, shuffle, seed))
end

#set_difference(other) ⇒ Expr

Compute the SET DIFFERENCE between the elements in this list and the elements of other.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2, 3], [], [nil, 3], [5, 6, 7]],
    "b" => [[2, 3, 4], [3], [3, 4, nil], [6, 8]]
  }
)
df.with_columns(difference: Polars.col("a").list.set_difference("b"))
# =>
# shape: (4, 3)
# ┌───────────┬──────────────┬────────────┐
# │ a         ┆ b            ┆ difference │
# │ ---       ┆ ---          ┆ ---        │
# │ list[i64] ┆ list[i64]    ┆ list[i64]  │
# ╞═══════════╪══════════════╪════════════╡
# │ [1, 2, 3] ┆ [2, 3, 4]    ┆ [1]        │
# │ []        ┆ [3]          ┆ []         │
# │ [null, 3] ┆ [3, 4, null] ┆ []         │
# │ [5, 6, 7] ┆ [6, 8]       ┆ [5, 7]     │
# └───────────┴──────────────┴────────────┘

Parameters:

  • other (Object)

    Right hand side of the set operation.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 1088

def set_difference(other)
  if other.respond_to?(:each)
    if !other.is_a?(::Array) && !other.is_a?(Series) && !other.is_a?(DataFrame)
      other = other.to_a
    end
    other = F.lit(other)._rbexpr
  else
    other = Utils.parse_into_expression(other)
  end
  Utils.wrap_expr(_rbexpr.list_set_operation(other, "difference"))
end

#set_intersection(other) ⇒ Expr

Compute the SET INTERSECTION between the elements in this list and the elements of other.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2, 3], [], [nil, 3], [5, 6, 7]],
    "b" => [[2, 3, 4], [3], [3, 4, nil], [6, 8]]
  }
)
df.with_columns(intersection: Polars.col("a").list.set_intersection("b"))
# =>
# shape: (4, 3)
# ┌───────────┬──────────────┬──────────────┐
# │ a         ┆ b            ┆ intersection │
# │ ---       ┆ ---          ┆ ---          │
# │ list[i64] ┆ list[i64]    ┆ list[i64]    │
# ╞═══════════╪══════════════╪══════════════╡
# │ [1, 2, 3] ┆ [2, 3, 4]    ┆ [2, 3]       │
# │ []        ┆ [3]          ┆ []           │
# │ [null, 3] ┆ [3, 4, null] ┆ [null, 3]    │
# │ [5, 6, 7] ┆ [6, 8]       ┆ [6]          │
# └───────────┴──────────────┴──────────────┘

Parameters:

  • other (Object)

    Right hand side of the set operation.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 1127

def set_intersection(other)
  if other.respond_to?(:each)
    if !other.is_a?(::Array) && !other.is_a?(Series) && !other.is_a?(DataFrame)
      other = other.to_a
    end
    other = F.lit(other)._rbexpr
  else
    other = Utils.parse_into_expression(other)
  end
  Utils.wrap_expr(_rbexpr.list_set_operation(other, "intersection"))
end

#set_symmetric_difference(other) ⇒ Expr

Compute the SET SYMMETRIC DIFFERENCE between the elements in this list and the elements of other.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2, 3], [], [nil, 3], [5, 6, 7]],
    "b" => [[2, 3, 4], [3], [3, 4, nil], [6, 8]]
  }
)
df.with_columns(sdiff: Polars.col("b").list.set_symmetric_difference("a"))
# =>
# shape: (4, 3)
# ┌───────────┬──────────────┬───────────┐
# │ a         ┆ b            ┆ sdiff     │
# │ ---       ┆ ---          ┆ ---       │
# │ list[i64] ┆ list[i64]    ┆ list[i64] │
# ╞═══════════╪══════════════╪═══════════╡
# │ [1, 2, 3] ┆ [2, 3, 4]    ┆ [4, 1]    │
# │ []        ┆ [3]          ┆ [3]       │
# │ [null, 3] ┆ [3, 4, null] ┆ [4]       │
# │ [5, 6, 7] ┆ [6, 8]       ┆ [8, 5, 7] │
# └───────────┴──────────────┴───────────┘

Parameters:

  • other (Object)

    Right hand side of the set operation.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 1166

def set_symmetric_difference(other)
  if other.respond_to?(:each)
    if !other.is_a?(::Array) && !other.is_a?(Series) && !other.is_a?(DataFrame)
      other = other.to_a
    end
    other = F.lit(other)._rbexpr
  else
    other = Utils.parse_into_expression(other)
  end
  Utils.wrap_expr(_rbexpr.list_set_operation(other, "symmetric_difference"))
end

#set_union(other) ⇒ Expr

Compute the SET UNION between the elements in this list and the elements of other.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 2, 3], [], [nil, 3], [5, 6, 7]],
    "b" => [[2, 3, 4], [3], [3, 4, nil], [6, 8]]
  }
)
df.with_columns(
  union: Polars.col("a").list.set_union("b")
)
# =>
# shape: (4, 3)
# ┌───────────┬──────────────┬──────────────┐
# │ a         ┆ b            ┆ union        │
# │ ---       ┆ ---          ┆ ---          │
# │ list[i64] ┆ list[i64]    ┆ list[i64]    │
# ╞═══════════╪══════════════╪══════════════╡
# │ [1, 2, 3] ┆ [2, 3, 4]    ┆ [1, 2, … 4]  │
# │ []        ┆ [3]          ┆ [3]          │
# │ [null, 3] ┆ [3, 4, null] ┆ [null, 3, 4] │
# │ [5, 6, 7] ┆ [6, 8]       ┆ [5, 6, … 8]  │
# └───────────┴──────────────┴──────────────┘

Parameters:

  • other (Object)

    Right hand side of the set operation.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 1049

def set_union(other)
  if other.respond_to?(:each)
    if !other.is_a?(::Array) && !other.is_a?(Series) && !other.is_a?(DataFrame)
      other = other.to_a
    end
    other = F.lit(other)._rbexpr
  else
    other = Utils.parse_into_expression(other)
  end
  Utils.wrap_expr(_rbexpr.list_set_operation(other, "union"))
end
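
For readers who think in terms of Ruby arrays, the four set_* methods above behave much like Ruby's own array set operators applied row by row. The sketch below uses hypothetical helper names and plain arrays. It treats nil as an ordinary value, as the example tables do, and the element order it produces is a property of Ruby, not something this page states about Polars.

```ruby
# Plain-Ruby analogues of the four set operations (hypothetical helpers).
def list_set_union(a, b)
  a | b
end

def list_set_difference(a, b)
  # uniq first, because Array#- alone would keep duplicates from a.
  a.uniq - b
end

def list_set_intersection(a, b)
  a & b
end

def list_set_symmetric_difference(a, b)
  # Values that appear in exactly one of the two lists.
  (a | b) - (a & b)
end

a = [nil, 3]
b = [3, 4, nil]
p list_set_union(a, b)
p list_set_difference(a, b)
p list_set_intersection(a, b)
p list_set_symmetric_difference(b, a)
```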

#shift(n = 1) ⇒ Expr

Shift values by the given period.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.list.shift
# =>
# shape: (2,)
# Series: 'a' [list[i64]]
# [
#         [null, 1, … 3]
#         [null, 10, 2]
# ]

Parameters:

  • n (Integer) (defaults to: 1)

    Number of places to shift (may be negative).

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 771

def shift(n = 1)
  n = Utils.parse_into_expression(n)
  Utils.wrap_expr(_rbexpr.list_shift(n))
end
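
Because n may be negative, it can help to see both directions side by side. The following plain-Ruby sketch models shift for one sublist; list_shift is a hypothetical helper, and the handling of shifts larger than the sublist (all slots become null) is an assumption.

```ruby
# Plain-Ruby model of shift for one sublist (hypothetical helper).
# The length is preserved; vacated slots are filled with nil.
def list_shift(sublist, n = 1)
  len = sublist.length
  # Assumption: shifting by at least the full length leaves only nils.
  return Array.new(len) if n.abs >= len

  if n >= 0
    # Positive n moves values toward the end.
    Array.new(n) + sublist.first(len - n)
  else
    # Negative n moves values toward the front.
    sublist.last(len + n) + Array.new(-n)
  end
end

p list_shift([1, 2, 3, 4])
p list_shift([10, 2, 1])
p list_shift([1, 2, 3, 4], -1)
```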

#slice(offset, length = nil) ⇒ Expr

Slice every sublist.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.list.slice(1, 2)
# =>
# shape: (2,)
# Series: 'a' [list[i64]]
# [
#         [2, 3]
#         [2, 1]
# ]

Parameters:

  • offset (Integer)

    Start index. Negative indexing is supported.

  • length (Integer) (defaults to: nil)

    Length of the slice. If set to nil (default), the slice is taken to the end of the list.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 796

def slice(offset, length = nil)
  offset = Utils.parse_into_expression(offset, str_as_lit: false)
  length = Utils.parse_into_expression(length, str_as_lit: false)
  Utils.wrap_expr(_rbexpr.list_slice(offset, length))
end

#sort(reverse: false, nulls_last: false) ⇒ Expr

Sort the values in each sublist.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[3, 2, 1], [9, 1, 2]]
  }
)
df.select(Polars.col("a").list.sort)
# =>
# shape: (2, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2, 3] │
# │ [1, 2, 9] │
# └───────────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    Sort in descending order.

  • nulls_last (Boolean) (defaults to: false)

    Place null values last.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 345

def sort(reverse: false, nulls_last: false)
  Utils.wrap_expr(_rbexpr.list_sort(reverse, nulls_last))
end

#std(ddof: 1) ⇒ Expr

Compute the standard deviation of each sublist.

Examples:

df = Polars::DataFrame.new({"values" => [[-1, 0, 1], [1, 10]]})
df.with_columns(Polars.col("values").list.std.alias("std"))
# =>
# shape: (2, 2)
# ┌────────────┬──────────┐
# │ values     ┆ std      │
# │ ---        ┆ ---      │
# │ list[i64]  ┆ f64      │
# ╞════════════╪══════════╡
# │ [-1, 0, 1] ┆ 1.0      │
# │ [1, 10]    ┆ 6.363961 │
# └────────────┴──────────┘

Parameters:

  • ddof (Integer) (defaults to: 1)

    “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 289

def std(ddof: 1)
  Utils.wrap_expr(_rbexpr.list_std(ddof))
end
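
The ddof parameter changes the divisor from N to N - ddof. A small plain-Ruby sketch makes the arithmetic explicit; list_var and list_std are hypothetical helpers, and returning nil when a sublist has too few elements is an assumption of this sketch.

```ruby
# Plain-Ruby model of variance and standard deviation with ddof
# (hypothetical helpers, not the Polars implementation).
def list_var(sublist, ddof: 1)
  n = sublist.length
  # Assumption: with n <= ddof the divisor is not positive, so return nil.
  return nil if n <= ddof

  mean = sublist.sum.to_f / n
  # Sum of squared deviations divided by N - ddof.
  sublist.sum { |v| (v - mean)**2 } / (n - ddof)
end

def list_std(sublist, ddof: 1)
  variance = list_var(sublist, ddof: ddof)
  variance && Math.sqrt(variance)
end

p list_var([1, 10])
p list_std([1, 10]).round(6)
p list_std([1, 10], ddof: 0)
```

For [1, 10] the mean is 5.5 and the squared deviations sum to 40.5. Dividing by N - 1 = 1 gives the variance 40.5, and its square root is about 6.363961, the value shown in the example table. With ddof: 0 the divisor is 2, so the standard deviation is 4.5.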

#sum ⇒ Expr

Sum the values of each sublist.

Examples:

df = Polars::DataFrame.new({"values" => [[1], [2, 3]]})
df.select(Polars.col("values").list.sum)
# =>
# shape: (2, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 1      │
# │ 5      │
# └────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 179

def sum
  Utils.wrap_expr(_rbexpr.list_sum)
end

#tail(n = 5) ⇒ Expr

Slice the last n values of every sublist.

Examples:

s = Polars::Series.new("a", [[1, 2, 3, 4], [10, 2, 1]])
s.list.tail(2)
# =>
# shape: (2,)
# Series: 'a' [list[i64]]
# [
#         [3, 4]
#         [2, 1]
# ]

Parameters:

  • n (Integer) (defaults to: 5)

    Number of values to return for each sublist.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 840

def tail(n = 5)
  n = Utils.parse_into_expression(n)
  Utils.wrap_expr(_rbexpr.list_tail(n))
end

#to_array(width) ⇒ Expr

Convert a List column into an Array column with the same inner data type.

Examples:

df = Polars::DataFrame.new(
  {"a" => [[1, 2], [3, 4]]},
  schema: {"a" => Polars::List.new(Polars::Int8)}
)
df.with_columns(array: Polars.col("a").list.to_array(2))
# =>
# shape: (2, 2)
# ┌──────────┬──────────────┐
# │ a        ┆ array        │
# │ ---      ┆ ---          │
# │ list[i8] ┆ array[i8, 2] │
# ╞══════════╪══════════════╡
# │ [1, 2]   ┆ [1, 2]       │
# │ [3, 4]   ┆ [3, 4]       │
# └──────────┴──────────────┘

Parameters:

  • width (Integer)

    Width of the resulting Array column.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 921

def to_array(width)
  Utils.wrap_expr(_rbexpr.list_to_array(width))
end

#to_struct(n_field_strategy: "first_non_null", fields: nil, upper_bound: nil) ⇒ Expr

Convert the series of type List to a series of type Struct.

Examples:

df = Polars::DataFrame.new({"a" => [[1, 2, 3], [1, 2]]})
df.select([Polars.col("a").list.to_struct])
# =>
# shape: (2, 1)
# ┌────────────┐
# │ a          │
# │ ---        │
# │ struct[3]  │
# ╞════════════╡
# │ {1,2,3}    │
# │ {1,2,null} │
# └────────────┘

Parameters:

  • n_field_strategy ("first_non_null", "max_width") (defaults to: "first_non_null")

    Strategy to determine the number of fields of the struct.

  • fields (Array) (defaults to: nil)

    If the names and number of the desired fields are known in advance, a list of field names can be given, which will be assigned by index. Otherwise, to dynamically assign field names, a custom function can be used; if neither is set, fields will be field_0, field_1 .. field_n.

  • upper_bound (Object) (defaults to: nil)

    A polars LazyFrame needs to know the schema at all times, so the caller must provide an upper bound of the number of struct fields that will be created; if set incorrectly, subsequent operations may fail. (For example, an all.sum expression will look in the current schema to determine which columns to select).

    When operating on a DataFrame, the schema does not need to be tracked or pre-determined, as the result will be eagerly evaluated, so you can leave this parameter unset.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 960

def to_struct(n_field_strategy: "first_non_null", fields: nil, upper_bound: nil)
  Utils.wrap_expr(_rbexpr.list_to_struct(n_field_strategy, fields, nil))
end

#unique(maintain_order: false) ⇒ Expr

Get the unique/distinct values in the list.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [[1, 1, 2]]
  }
)
df.select(Polars.col("a").list.unique)
# =>
# shape: (1, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ list[i64] │
# ╞═══════════╡
# │ [1, 2]    │
# └───────────┘

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 394

def unique(maintain_order: false)
  Utils.wrap_expr(_rbexpr.list_unique(maintain_order))
end

#var(ddof: 1) ⇒ Expr

Compute the variance of each sublist.

Examples:

df = Polars::DataFrame.new({"values" => [[-1, 0, 1], [1, 10]]})
df.with_columns(Polars.col("values").list.var.alias("var"))
# =>
# shape: (2, 2)
# ┌────────────┬──────┐
# │ values     ┆ var  │
# │ ---        ┆ ---  │
# │ list[i64]  ┆ f64  │
# ╞════════════╪══════╡
# │ [-1, 0, 1] ┆ 1.0  │
# │ [1, 10]    ┆ 40.5 │
# └────────────┴──────┘

Parameters:

  • ddof (Integer) (defaults to: 1)

    “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.

Returns:

  • (Expr)

# File 'lib/polars/list_expr.rb', line 315

def var(ddof: 1)
  Utils.wrap_expr(_rbexpr.list_var(ddof))
end