Class: Polars::Expr

Inherits:
Object
Defined in:
lib/polars/expr.rb

Overview

Expressions that can be used in various contexts.

Direct Known Subclasses

Selector

Class Method Summary

Instance Method Summary

Class Method Details

.deserialize(source) ⇒ Expr

Note:

This function uses marshaling if the logical plan contains Ruby UDFs, and as such inherits the security implications. Deserializing can execute arbitrary code, so it should only be attempted on trusted data.

Note:

Serialization is not stable across Polars versions: an expression serialized in one Polars version may not be deserializable in another Polars version.

Read a serialized expression from a file.

Examples:

expr = Polars.col("foo").sum.over("bar")
bytes = expr.meta.serialize
Polars::Expr.deserialize(StringIO.new(bytes))
# => col("foo").sum().over([col("bar")])

Parameters:

  • source (Object)

    Path to a file or a file-like object (by file-like object, we refer to objects that have a read method, such as a file handler or StringIO).

Returns:

Raises:

  • (Todo)


# File 'lib/polars/expr.rb', line 171

def self.deserialize(source)
  raise Todo unless RbExpr.respond_to?(:deserialize_binary)

  if Utils.pathlike?(source)
    source = Utils.normalize_filepath(source)
  end

  deserializer = RbExpr.method(:deserialize_binary)

  _from_rbexpr(deserializer.(source))
end

Instance Method Details

#! ⇒ Expr Also known as: ~

Performs boolean not.

Returns:



# File 'lib/polars/expr.rb', line 137

def !
  is_not
end

#!=(other) ⇒ Expr

Not equal.

Returns:



# File 'lib/polars/expr.rb', line 116

def !=(other)
  wrap_expr(_rbexpr.neq(_to_expr(other)._rbexpr))
end

#%(other) ⇒ Expr

Returns the modulo.

Returns:



# File 'lib/polars/expr.rb', line 80

def %(other)
  wrap_expr(_rbexpr % _to_rbexpr(other))
end

#&(other) ⇒ Expr

Bitwise AND.

Returns:



# File 'lib/polars/expr.rb', line 36

def &(other)
  other = Utils.parse_into_expression(other)
  wrap_expr(_rbexpr.and_(other))
end

#*(other) ⇒ Expr

Performs multiplication.

Returns:



# File 'lib/polars/expr.rb', line 66

def *(other)
  wrap_expr(_rbexpr * _to_rbexpr(other))
end

#**(power) ⇒ Expr

Raises to the power of exponent.

Returns:



# File 'lib/polars/expr.rb', line 87

def **(power)
  exponent = Utils.parse_into_expression(power)
  wrap_expr(_rbexpr.pow(exponent))
end

#+(other) ⇒ Expr

Performs addition.

Returns:



# File 'lib/polars/expr.rb', line 52

def +(other)
  wrap_expr(_rbexpr + _to_rbexpr(other))
end

#-(other) ⇒ Expr

Performs subtraction.

Returns:



# File 'lib/polars/expr.rb', line 59

def -(other)
  wrap_expr(_rbexpr - _to_rbexpr(other))
end

#-@ ⇒ Expr

Performs negation.

Returns:



# File 'lib/polars/expr.rb', line 145

def -@
  wrap_expr(_rbexpr.neg)
end

#/(other) ⇒ Expr

Performs division.

Returns:



# File 'lib/polars/expr.rb', line 73

def /(other)
  wrap_expr(_rbexpr / _to_rbexpr(other))
end

#<(other) ⇒ Expr

Less than.

Returns:



# File 'lib/polars/expr.rb', line 123

def <(other)
  wrap_expr(_rbexpr.lt(_to_expr(other)._rbexpr))
end

#<=(other) ⇒ Expr

Less than or equal.

Returns:



# File 'lib/polars/expr.rb', line 102

def <=(other)
  wrap_expr(_rbexpr.lt_eq(_to_expr(other)._rbexpr))
end

#==(other) ⇒ Expr

Equal.

Returns:



# File 'lib/polars/expr.rb', line 109

def ==(other)
  wrap_expr(_rbexpr.eq(_to_expr(other)._rbexpr))
end

#>(other) ⇒ Expr

Greater than.

Returns:



# File 'lib/polars/expr.rb', line 130

def >(other)
  wrap_expr(_rbexpr.gt(_to_expr(other)._rbexpr))
end

#>=(other) ⇒ Expr

Greater than or equal.

Returns:



# File 'lib/polars/expr.rb', line 95

def >=(other)
  wrap_expr(_rbexpr.gt_eq(_to_expr(other)._rbexpr))
end

#^(other) ⇒ Expr

Bitwise XOR.

Returns:



# File 'lib/polars/expr.rb', line 28

def ^(other)
  other = Utils.parse_into_expression(other)
  wrap_expr(_rbexpr.xor_(other))
end

#_hash(seed = 0, seed_1 = nil, seed_2 = nil, seed_3 = nil) ⇒ Expr

Hash the elements in the selection.

The hash value is of type :u64.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, nil],
    "b" => ["x", nil, "z"]
  }
)
df.with_column(Polars.all._hash(10, 20, 30, 40))
# =>
# shape: (3, 2)
# ┌──────────────────────┬──────────────────────┐
# │ a                    ┆ b                    │
# │ ---                  ┆ ---                  │
# │ u64                  ┆ u64                  │
# ╞══════════════════════╪══════════════════════╡
# │ 4629889412789719550  ┆ 6959506404929392568  │
# │ 16386608652769605760 ┆ 11638928888656214026 │
# │ 11638928888656214026 ┆ 11040941213715918520 │
# └──────────────────────┴──────────────────────┘

Parameters:

  • seed (Integer) (defaults to: 0)

    Random seed parameter. Defaults to 0.

  • seed_1 (Integer) (defaults to: nil)

    Random seed parameter. Defaults to seed if not set.

  • seed_2 (Integer) (defaults to: nil)

    Random seed parameter. Defaults to seed if not set.

  • seed_3 (Integer) (defaults to: nil)

    Random seed parameter. Defaults to seed if not set.

Returns:



# File 'lib/polars/expr.rb', line 4516

def _hash(seed = 0, seed_1 = nil, seed_2 = nil, seed_3 = nil)
  k0 = seed
  k1 = seed_1.nil? ? seed : seed_1
  k2 = seed_2.nil? ? seed : seed_2
  k3 = seed_3.nil? ? seed : seed_3
  wrap_expr(_rbexpr._hash(k0, k1, k2, k3))
end

#abs ⇒ Expr

Compute absolute values.

Examples:

df = Polars::DataFrame.new(
  {
    "A" => [-1.0, 0.0, 1.0, 2.0]
  }
)
df.select(Polars.col("A").abs)
# =>
# shape: (4, 1)
# ┌─────┐
# │ A   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.0 │
# │ 0.0 │
# │ 1.0 │
# │ 2.0 │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 6453

def abs
  wrap_expr(_rbexpr.abs)
end

#add(other) ⇒ Expr

Method equivalent of addition operator expr + other.

Examples:

df = Polars::DataFrame.new({"x" => [1, 2, 3, 4, 5]})
df.with_columns(
  Polars.col("x").add(2).alias("x+int"),
  Polars.col("x").add(Polars.col("x").cum_prod).alias("x+expr")
)
# =>
# shape: (5, 3)
# ┌─────┬───────┬────────┐
# │ x   ┆ x+int ┆ x+expr │
# │ --- ┆ ---   ┆ ---    │
# │ i64 ┆ i64   ┆ i64    │
# ╞═════╪═══════╪════════╡
# │ 1   ┆ 3     ┆ 2      │
# │ 2   ┆ 4     ┆ 4      │
# │ 3   ┆ 5     ┆ 9      │
# │ 4   ┆ 6     ┆ 28     │
# │ 5   ┆ 7     ┆ 125    │
# └─────┴───────┴────────┘
df = Polars::DataFrame.new(
  {"x" => ["a", "d", "g"], "y" => ["b", "e", "h"], "z" => ["c", "f", "i"]}
)
df.with_columns(Polars.col("x").add(Polars.col("y")).add(Polars.col("z")).alias("xyz"))
# =>
# shape: (3, 4)
# ┌─────┬─────┬─────┬─────┐
# │ x   ┆ y   ┆ z   ┆ xyz │
# │ --- ┆ --- ┆ --- ┆ --- │
# │ str ┆ str ┆ str ┆ str │
# ╞═════╪═════╪═════╪═════╡
# │ a   ┆ b   ┆ c   ┆ abc │
# │ d   ┆ e   ┆ f   ┆ def │
# │ g   ┆ h   ┆ i   ┆ ghi │
# └─────┴─────┴─────┴─────┘

Parameters:

  • other (Object)

    numeric or string value; accepts expression input.

Returns:



# File 'lib/polars/expr.rb', line 4074

def add(other)
  self + other
end

#agg_groups ⇒ Expr

Get the group indexes of the group by operation.

Should be used in aggregation context only.

Examples:

df = Polars::DataFrame.new(
  {
    "group" => [
      "one",
      "one",
      "one",
      "two",
      "two",
      "two"
    ],
    "value" => [94, 95, 96, 97, 97, 99]
  }
)
df.group_by("group", maintain_order: true).agg(Polars.col("value").agg_groups)
# =>
# shape: (2, 2)
# ┌───────┬───────────┐
# │ group ┆ value     │
# │ ---   ┆ ---       │
# │ str   ┆ list[u32] │
# ╞═══════╪═══════════╡
# │ one   ┆ [0, 1, 2] │
# │ two   ┆ [3, 4, 5] │
# └───────┴───────────┘

Returns:



# File 'lib/polars/expr.rb', line 808

def agg_groups
  wrap_expr(_rbexpr.agg_groups)
end

#alias(name) ⇒ Expr

Rename the output of an expression.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, 3],
    "b" => ["a", "b", nil]
  }
)
df.select(
  [
    Polars.col("a").alias("bar"),
    Polars.col("b").alias("foo")
  ]
)
# =>
# shape: (3, 2)
# ┌─────┬──────┐
# │ bar ┆ foo  │
# │ --- ┆ ---  │
# │ i64 ┆ str  │
# ╞═════╪══════╡
# │ 1   ┆ a    │
# │ 2   ┆ b    │
# │ 3   ┆ null │
# └─────┴──────┘

Parameters:

Returns:



# File 'lib/polars/expr.rb', line 410

def alias(name)
  wrap_expr(_rbexpr._alias(name))
end

#all(drop_nulls: true) ⇒ Expr

Check if all boolean values in a Boolean column are true.

This method is an expression, not to be confused with Polars.all, which is a function to select all columns.

Examples:

df = Polars::DataFrame.new(
  {"TT" => [true, true], "TF" => [true, false], "FF" => [false, false]}
)
df.select(Polars.col("*").all)
# =>
# shape: (1, 3)
# ┌──────┬───────┬───────┐
# │ TT   ┆ TF    ┆ FF    │
# │ ---  ┆ ---   ┆ ---   │
# │ bool ┆ bool  ┆ bool  │
# ╞══════╪═══════╪═══════╡
# │ true ┆ false ┆ false │
# └──────┴───────┴───────┘

Returns:



# File 'lib/polars/expr.rb', line 261

def all(drop_nulls: true)
  wrap_expr(_rbexpr.all(drop_nulls))
end

#and_(*others) ⇒ Expr

Method equivalent of bitwise "and" operator expr & other & ....

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [5, 6, 7, 4, 8],
    "y" => [1.5, 2.5, 1.0, 4.0, -5.75],
    "z" => [-9, 2, -1, 4, 8]
  }
)
df.select(
  (Polars.col("x") >= Polars.col("z"))
  .and_(
    Polars.col("y") >= Polars.col("z"),
    Polars.col("y") == Polars.col("y"),
    Polars.col("z") <= Polars.col("x"),
    Polars.col("y") != Polars.col("x"),
  )
  .alias("all")
)
# =>
# shape: (5, 1)
# ┌───────┐
# │ all   │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ true  │
# │ true  │
# │ true  │
# │ false │
# │ false │
# └───────┘

Parameters:

  • others (Array)

    One or more integer or boolean expressions to evaluate/combine.

Returns:



# File 'lib/polars/expr.rb', line 3711

def and_(*others)
  ([self] + others).reduce(:&)
end

#any(drop_nulls: true) ⇒ Expr

Check if any boolean value in a Boolean column is true.

Examples:

df = Polars::DataFrame.new({"TF" => [true, false], "FF" => [false, false]})
df.select(Polars.all.any)
# =>
# shape: (1, 2)
# ┌──────┬───────┐
# │ TF   ┆ FF    │
# │ ---  ┆ ---   │
# │ bool ┆ bool  │
# ╞══════╪═══════╡
# │ true ┆ false │
# └──────┴───────┘

Returns:



# File 'lib/polars/expr.rb', line 236

def any(drop_nulls: true)
  wrap_expr(_rbexpr.any(drop_nulls))
end

#append(other, upcast: true) ⇒ Expr

Append expressions.

This is done by adding the chunks of other to this Series.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [8, 9, 10],
    "b" => [nil, 4, 4]
  }
)
df.select(Polars.all.head(1).append(Polars.all.tail(1)))
# =>
# shape: (2, 2)
# ┌─────┬──────┐
# │ a   ┆ b    │
# │ --- ┆ ---  │
# │ i64 ┆ i64  │
# ╞═════╪══════╡
# │ 8   ┆ null │
# │ 10  ┆ 4    │
# └─────┴──────┘

Parameters:

  • other (Expr)

    Expression to append.

  • upcast (Boolean) (defaults to: true)

    Cast both Series to the same supertype.

Returns:



# File 'lib/polars/expr.rb', line 920

def append(other, upcast: true)
  other = Utils.parse_into_expression(other)
  wrap_expr(_rbexpr.append(other, upcast))
end

#approx_n_unique ⇒ Expr Also known as: approx_unique

Approx count unique values.

This is done using the HyperLogLog++ algorithm for cardinality estimation.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2]})
df.select(Polars.col("a").approx_n_unique)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 2   │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 2544

def approx_n_unique
  wrap_expr(_rbexpr.approx_n_unique)
end

#arccos ⇒ Expr

Compute the element-wise value for the inverse cosine.

Examples:

df = Polars::DataFrame.new({"a" => [0.0]})
df.select(Polars.col("a").arccos)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.570796 │
# └──────────┘

Returns:



# File 'lib/polars/expr.rb', line 6969

def arccos
  wrap_expr(_rbexpr.arccos)
end

#arccosh ⇒ Expr

Compute the element-wise value for the inverse hyperbolic cosine.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").arccosh)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 0.0 │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 7089

def arccosh
  wrap_expr(_rbexpr.arccosh)
end

#arcsin ⇒ Expr

Compute the element-wise value for the inverse sine.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").arcsin)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.570796 │
# └──────────┘

Returns:



# File 'lib/polars/expr.rb', line 6949

def arcsin
  wrap_expr(_rbexpr.arcsin)
end

#arcsinh ⇒ Expr

Compute the element-wise value for the inverse hyperbolic sine.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").arcsinh)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 0.881374 │
# └──────────┘

Returns:



# File 'lib/polars/expr.rb', line 7069

def arcsinh
  wrap_expr(_rbexpr.arcsinh)
end

#arctan ⇒ Expr

Compute the element-wise value for the inverse tangent.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").arctan)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 0.785398 │
# └──────────┘

Returns:



# File 'lib/polars/expr.rb', line 6989

def arctan
  wrap_expr(_rbexpr.arctan)
end

#arctanh ⇒ Expr

Compute the element-wise value for the inverse hyperbolic tangent.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").arctanh)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ inf │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 7109

def arctanh
  wrap_expr(_rbexpr.arctanh)
end

#arg_max ⇒ Expr

Get the index of the maximal value.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [20, 10, 30]
  }
)
df.select(Polars.col("a").arg_max)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 2   │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 1794

def arg_max
  wrap_expr(_rbexpr.arg_max)
end

#arg_min ⇒ Expr

Get the index of the minimal value.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [20, 10, 30]
  }
)
df.select(Polars.col("a").arg_min)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 1   │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 1818

def arg_min
  wrap_expr(_rbexpr.arg_min)
end

#arg_sort(reverse: false, nulls_last: false) ⇒ Expr

Get the index values that would sort this column.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [20, 10, 30]
  }
)
df.select(Polars.col("a").arg_sort)
# =>
# shape: (3, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 1   │
# │ 0   │
# │ 2   │
# └─────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    Sort in reverse (descending) order.

  • nulls_last (Boolean) (defaults to: false)

    Place null values last instead of first.

Returns:



# File 'lib/polars/expr.rb', line 1770

def arg_sort(reverse: false, nulls_last: false)
  wrap_expr(_rbexpr.arg_sort(reverse, nulls_last))
end

#arg_true ⇒ Expr

Note:

This changes the number of rows returned, so it will fail in combination with other expressions. Use it as the only expression in select / with_columns.

Return indices where expression evaluates true.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2, 1]})
df.select((Polars.col("a") == 1).arg_true)
# =>
# shape: (3, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 0   │
# │ 1   │
# │ 3   │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 287

def arg_true
  wrap_expr(Plr.arg_where(_rbexpr))
end

#arg_unique ⇒ Expr

Get index of first unique value.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [8, 9, 10],
    "b" => [nil, 4, 4]
  }
)
df.select(Polars.col("a").arg_unique)
# =>
# shape: (3, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 0   │
# │ 1   │
# │ 2   │
# └─────┘
df.select(Polars.col("b").arg_unique)
# =>
# shape: (2, 1)
# ┌─────┐
# │ b   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 0   │
# │ 1   │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 2636

def arg_unique
  wrap_expr(_rbexpr.arg_unique)
end

#argsort(reverse: false, nulls_last: false) ⇒ Expr

Get the index values that would sort this column.

Alias for #arg_sort.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [20, 10, 30]
  }
)
df.select(Polars.col("a").argsort)
# =>
# shape: (3, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 1   │
# │ 0   │
# │ 2   │
# └─────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    Sort in reverse (descending) order.

  • nulls_last (Boolean) (defaults to: false)

    Place null values last instead of first.

Returns:

  • (Expr)


# File 'lib/polars/expr.rb', line 6486

def argsort(reverse: false, nulls_last: false)
  arg_sort(reverse: reverse, nulls_last: nulls_last)
end

#arr ⇒ ArrayExpr

Create an object namespace of all array related methods.

Returns:



# File 'lib/polars/expr.rb', line 8322

def arr
  ArrayExpr.new(self)
end

#backward_fill(limit: nil) ⇒ Expr

Fill missing values with the next to be seen values.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, nil],
    "b" => [4, nil, 6]
  }
)
df.select(Polars.all.backward_fill)
# =>
# shape: (3, 2)
# ┌──────┬─────┐
# │ a    ┆ b   │
# │ ---  ┆ --- │
# │ i64  ┆ i64 │
# ╞══════╪═════╡
# │ 1    ┆ 4   │
# │ 2    ┆ 6   │
# │ null ┆ 6   │
# └──────┴─────┘

Parameters:

  • limit (Integer) (defaults to: nil)

    The number of consecutive null values to backward fill.

Returns:



# File 'lib/polars/expr.rb', line 2256

def backward_fill(limit: nil)
  fill_null(strategy: "backward", limit: limit)
end

#bin ⇒ BinaryExpr

Create an object namespace of all binary related methods.

Returns:



# File 'lib/polars/expr.rb', line 8329

def bin
  BinaryExpr.new(self)
end

#bitwise_and ⇒ Expr

Perform an aggregation of bitwise ANDs.

Examples:

df = Polars::DataFrame.new({"n" => [-1, 0, 1]})
df.select(Polars.col("n").bitwise_and)
# =>
# shape: (1, 1)
# ┌─────┐
# │ n   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 0   │
# └─────┘
df = Polars::DataFrame.new(
  {"grouper" => ["a", "a", "a", "b", "b"], "n" => [-1, 0, 1, -1, 1]}
)
df.group_by("grouper", maintain_order: true).agg(Polars.col("n").bitwise_and)
# =>
# shape: (2, 2)
# ┌─────────┬─────┐
# │ grouper ┆ n   │
# │ ---     ┆ --- │
# │ str     ┆ i64 │
# ╞═════════╪═════╡
# │ a       ┆ 0   │
# │ b       ┆ 1   │
# └─────────┴─────┘

Returns:



# File 'lib/polars/expr.rb', line 8236

def bitwise_and
  wrap_expr(_rbexpr.bitwise_and)
end

#bitwise_count_ones ⇒ Expr

Evaluate the number of set bits.

Returns:



# File 'lib/polars/expr.rb', line 8165

def bitwise_count_ones
  wrap_expr(_rbexpr.bitwise_count_ones)
end

#bitwise_count_zeros ⇒ Expr

Evaluate the number of unset bits.

Returns:



# File 'lib/polars/expr.rb', line 8172

def bitwise_count_zeros
  wrap_expr(_rbexpr.bitwise_count_zeros)
end

#bitwise_leading_ones ⇒ Expr

Evaluate the number of most-significant set bits before seeing an unset bit.

Returns:



# File 'lib/polars/expr.rb', line 8179

def bitwise_leading_ones
  wrap_expr(_rbexpr.bitwise_leading_ones)
end

#bitwise_leading_zeros ⇒ Expr

Evaluate the number of most-significant unset bits before seeing a set bit.

Returns:



# File 'lib/polars/expr.rb', line 8186

def bitwise_leading_zeros
  wrap_expr(_rbexpr.bitwise_leading_zeros)
end

#bitwise_or ⇒ Expr

Perform an aggregation of bitwise ORs.

Examples:

df = Polars::DataFrame.new({"n" => [-1, 0, 1]})
df.select(Polars.col("n").bitwise_or)
# =>
# shape: (1, 1)
# ┌─────┐
# │ n   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ -1  │
# └─────┘
df = Polars::DataFrame.new(
  {"grouper" => ["a", "a", "a", "b", "b"], "n" => [-1, 0, 1, -1, 1]}
)
df.group_by("grouper", maintain_order: true).agg(Polars.col("n").bitwise_or)
# =>
# shape: (2, 2)
# ┌─────────┬─────┐
# │ grouper ┆ n   │
# │ ---     ┆ --- │
# │ str     ┆ i64 │
# ╞═════════╪═════╡
# │ a       ┆ -1  │
# │ b       ┆ -1  │
# └─────────┴─────┘

Returns:



# File 'lib/polars/expr.rb', line 8272

def bitwise_or
  wrap_expr(_rbexpr.bitwise_or)
end

#bitwise_trailing_ones ⇒ Expr

Evaluate the number of least-significant set bits before seeing an unset bit.

Returns:



# File 'lib/polars/expr.rb', line 8193

def bitwise_trailing_ones
  wrap_expr(_rbexpr.bitwise_trailing_ones)
end

#bitwise_trailing_zeros ⇒ Expr

Evaluate the number of least-significant unset bits before seeing a set bit.

Returns:



# File 'lib/polars/expr.rb', line 8200

def bitwise_trailing_zeros
  wrap_expr(_rbexpr.bitwise_trailing_zeros)
end

#bitwise_xor ⇒ Expr

Perform an aggregation of bitwise XORs.

Examples:

df = Polars::DataFrame.new({"n" => [-1, 0, 1]})
df.select(Polars.col("n").bitwise_xor)
# =>
# shape: (1, 1)
# ┌─────┐
# │ n   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ -2  │
# └─────┘
df = Polars::DataFrame.new(
  {"grouper" => ["a", "a", "a", "b", "b"], "n" => [-1, 0, 1, -1, 1]}
)
df.group_by("grouper", maintain_order: true).agg(Polars.col("n").bitwise_xor)
# =>
# shape: (2, 2)
# ┌─────────┬─────┐
# │ grouper ┆ n   │
# │ ---     ┆ --- │
# │ str     ┆ i64 │
# ╞═════════╪═════╡
# │ a       ┆ -2  │
# │ b       ┆ -2  │
# └─────────┴─────┘

Returns:



# File 'lib/polars/expr.rb', line 8308

def bitwise_xor
  wrap_expr(_rbexpr.bitwise_xor)
end

#bottom_k(k: 5) ⇒ Expr

Return the k smallest elements.


Examples:

df = Polars::DataFrame.new(
  {
    "value" => [1, 98, 2, 3, 99, 4]
  }
)
df.select(
  [
    Polars.col("value").top_k.alias("top_k"),
    Polars.col("value").bottom_k.alias("bottom_k")
  ]
)
# =>
# shape: (5, 2)
# ┌───────┬──────────┐
# │ top_k ┆ bottom_k │
# │ ---   ┆ ---      │
# │ i64   ┆ i64      │
# ╞═══════╪══════════╡
# │ 99    ┆ 1        │
# │ 98    ┆ 2        │
# │ 4     ┆ 3        │
# │ 3     ┆ 4        │
# │ 2     ┆ 98       │
# └───────┴──────────┘

Parameters:

  • k (Integer) (defaults to: 5)

    Number of elements to return.

Returns:



# File 'lib/polars/expr.rb', line 1632

def bottom_k(k: 5)
  k = Utils.parse_into_expression(k)
  wrap_expr(_rbexpr.bottom_k(k))
end

#bottom_k_by(by, k: 5, reverse: false) ⇒ Expr

Return the elements corresponding to the k smallest elements of the by column(s).

Non-null elements are always preferred over null elements, regardless of the value of reverse. The output is not guaranteed to be in any particular order; call #sort after this function if you wish the output to be sorted.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, 3, 4, 5, 6],
    "b" => [6, 5, 4, 3, 2, 1],
    "c" => ["Apple", "Orange", "Apple", "Apple", "Banana", "Banana"],
  }
)
# =>
# shape: (6, 3)
# ┌─────┬─────┬────────┐
# │ a   ┆ b   ┆ c      │
# │ --- ┆ --- ┆ ---    │
# │ i64 ┆ i64 ┆ str    │
# ╞═════╪═════╪════════╡
# │ 1   ┆ 6   ┆ Apple  │
# │ 2   ┆ 5   ┆ Orange │
# │ 3   ┆ 4   ┆ Apple  │
# │ 4   ┆ 3   ┆ Apple  │
# │ 5   ┆ 2   ┆ Banana │
# │ 6   ┆ 1   ┆ Banana │
# └─────┴─────┴────────┘

Get the bottom 2 rows by column a or b.

df.select(
  Polars.all.bottom_k_by("a", k: 2).name.suffix("_btm_by_a"),
  Polars.all.bottom_k_by("b", k: 2).name.suffix("_btm_by_b")
)
# =>
# shape: (2, 6)
# ┌────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐
# │ a_btm_by_a ┆ b_btm_by_a ┆ c_btm_by_a ┆ a_btm_by_b ┆ b_btm_by_b ┆ c_btm_by_b │
# │ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        │
# │ i64        ┆ i64        ┆ str        ┆ i64        ┆ i64        ┆ str        │
# ╞════════════╪════════════╪════════════╪════════════╪════════════╪════════════╡
# │ 1          ┆ 6          ┆ Apple      ┆ 6          ┆ 1          ┆ Banana     │
# │ 2          ┆ 5          ┆ Orange     ┆ 5          ┆ 2          ┆ Banana     │
# └────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘

Get the bottom 2 rows by multiple columns with given order.

df.select(
  Polars.all
  .bottom_k_by(["c", "a"], k: 2, reverse: [false, true])
  .name.suffix("_by_ca"),
  Polars.all
  .bottom_k_by(["c", "b"], k: 2, reverse: [false, true])
  .name.suffix("_by_cb"),
)
# =>
# shape: (2, 6)
# ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
# │ a_by_ca ┆ b_by_ca ┆ c_by_ca ┆ a_by_cb ┆ b_by_cb ┆ c_by_cb │
# │ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     │
# │ i64     ┆ i64     ┆ str     ┆ i64     ┆ i64     ┆ str     │
# ╞═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
# │ 4       ┆ 3       ┆ Apple   ┆ 1       ┆ 6       ┆ Apple   │
# │ 3       ┆ 4       ┆ Apple   ┆ 3       ┆ 4       ┆ Apple   │
# └─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘

Get the bottom 2 rows by column a in each group.

df.group_by("c", maintain_order: true)
  .agg(Polars.all.bottom_k_by("a", k: 2))
  .explode(Polars.all.exclude("c"))
# =>
# shape: (5, 3)
# ┌────────┬─────┬─────┐
# │ c      ┆ a   ┆ b   │
# │ ---    ┆ --- ┆ --- │
# │ str    ┆ i64 ┆ i64 │
# ╞════════╪═════╪═════╡
# │ Apple  ┆ 1   ┆ 6   │
# │ Apple  ┆ 3   ┆ 4   │
# │ Orange ┆ 2   ┆ 5   │
# │ Banana ┆ 5   ┆ 2   │
# │ Banana ┆ 6   ┆ 1   │
# └────────┴─────┴─────┘

Parameters:

  • by (Object)

    Column(s) used to determine the smallest elements. Accepts expression input. Strings are parsed as column names.

  • k (Integer) (defaults to: 5)

    Number of elements to return.

  • reverse (Object) (defaults to: false)

    Consider the k largest elements of the by column(s) (instead of the k smallest). This can be specified per column by passing a sequence of booleans.

Returns:



# File 'lib/polars/expr.rb', line 1732

def bottom_k_by(
  by,
  k: 5,
  reverse: false
)
  k = Utils.parse_into_expression(k)
  by = Utils.parse_into_list_of_expressions(by)
  reverse = Utils.extend_bool(reverse, by.length, "reverse", "by")
  wrap_expr(_rbexpr.bottom_k_by(by, k, reverse))
end

#cast(dtype, strict: true) ⇒ Expr

Cast between data types.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, 3],
    "b" => ["4", "5", "6"]
  }
)
df.with_columns(
  [
    Polars.col("a").cast(:f64),
    Polars.col("b").cast(:i32)
  ]
)
# =>
# shape: (3, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ f64 ┆ i32 │
# ╞═════╪═════╡
# │ 1.0 ┆ 4   │
# │ 2.0 ┆ 5   │
# │ 3.0 ┆ 6   │
# └─────┴─────┘

Parameters:

  • dtype (Symbol)

    DataType to cast to.

  • strict (Boolean) (defaults to: true)

    Throw an error if a cast could not be done. For instance, due to an overflow.

Returns:



# File 'lib/polars/expr.rb', line 1373

def cast(dtype, strict: true)
  dtype = Utils.rb_type_to_dtype(dtype)
  wrap_expr(_rbexpr.cast(dtype, strict))
end

#cat ⇒ CatExpr

Create an object namespace of all categorical related methods.

Returns:



# File 'lib/polars/expr.rb', line 8336

def cat
  CatExpr.new(self)
end

#cbrt ⇒ Expr

Compute the cube root of the elements.

Examples:

df = Polars::DataFrame.new({"values" => [1.0, 2.0, 4.0]})
df.select(Polars.col("values").cbrt)
# =>
# shape: (3, 1)
# ┌──────────┐
# │ values   │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.0      │
# │ 1.259921 │
# │ 1.587401 │
# └──────────┘

Returns:



# File 'lib/polars/expr.rb', line 331

def cbrt
  wrap_expr(_rbexpr.cbrt)
end

#ceil ⇒ Expr

Rounds up to the nearest integer value.

Only works on floating point Series.

Examples:

df = Polars::DataFrame.new({"a" => [0.3, 0.5, 1.0, 1.1]})
df.select(Polars.col("a").ceil)
# =>
# shape: (4, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.0 │
# │ 1.0 │
# │ 1.0 │
# │ 2.0 │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 1220

def ceil
  wrap_expr(_rbexpr.ceil)
end

#clip(lower_bound = nil, upper_bound = nil) ⇒ Expr

Set values outside the given boundaries to the boundary value.

Only works for numeric and temporal columns. If you want to clip other data types, consider writing a when-then-otherwise expression.

Examples:

df = Polars::DataFrame.new({"foo" => [-50, 5, nil, 50]})
df.with_column(Polars.col("foo").clip(1, 10).alias("foo_clipped"))
# =>
# shape: (4, 2)
# ┌──────┬─────────────┐
# │ foo  ┆ foo_clipped │
# │ ---  ┆ ---         │
# │ i64  ┆ i64         │
# ╞══════╪═════════════╡
# │ -50  ┆ 1           │
# │ 5    ┆ 5           │
# │ null ┆ null        │
# │ 50   ┆ 10          │
# └──────┴─────────────┘

Parameters:

  • lower_bound (Numeric) (defaults to: nil)

    Minimum value.

  • upper_bound (Numeric) (defaults to: nil)

    Maximum value.

Returns:



# File 'lib/polars/expr.rb', line 6711

def clip(lower_bound = nil, upper_bound = nil)
  if !lower_bound.nil?
    lower_bound = Utils.parse_into_expression(lower_bound)
  end
  if !upper_bound.nil?
    upper_bound = Utils.parse_into_expression(upper_bound)
  end
  wrap_expr(_rbexpr.clip(lower_bound, upper_bound))
end

#clip_max(upper_bound) ⇒ Expr

Clip (limit) the values in an array to a max boundary.

Only works for numerical types.

If you want to clip other dtypes, consider writing a "when, then, otherwise" expression. See when for more information.

Examples:

df = Polars::DataFrame.new({"foo" => [-50, 5, nil, 50]})
df.with_column(Polars.col("foo").clip_max(0).alias("foo_clipped"))
# =>
# shape: (4, 2)
# ┌──────┬─────────────┐
# │ foo  ┆ foo_clipped │
# │ ---  ┆ ---         │
# │ i64  ┆ i64         │
# ╞══════╪═════════════╡
# │ -50  ┆ -50         │
# │ 5    ┆ 0           │
# │ null ┆ null        │
# │ 50   ┆ 0           │
# └──────┴─────────────┘

Parameters:

  • upper_bound (Numeric)

    Maximum value.

Returns:



# File 'lib/polars/expr.rb', line 6779

def clip_max(upper_bound)
  clip(nil, upper_bound)
end

#clip_min(lower_bound) ⇒ Expr

Clip (limit) the values in an array to a min boundary.

Only works for numerical types.

If you want to clip other dtypes, consider writing a "when, then, otherwise" expression. See when for more information.

Examples:

df = Polars::DataFrame.new({"foo" => [-50, 5, nil, 50]})
df.with_column(Polars.col("foo").clip_min(0).alias("foo_clipped"))
# =>
# shape: (4, 2)
# ┌──────┬─────────────┐
# │ foo  ┆ foo_clipped │
# │ ---  ┆ ---         │
# │ i64  ┆ i64         │
# ╞══════╪═════════════╡
# │ -50  ┆ 0           │
# │ 5    ┆ 5           │
# │ null ┆ null        │
# │ 50   ┆ 50          │
# └──────┴─────────────┘

Parameters:

  • lower_bound (Numeric)

    Minimum value.

Returns:



# File 'lib/polars/expr.rb', line 6748

def clip_min(lower_bound)
  clip(lower_bound, nil)
end

#cos ⇒ Expr

Compute the element-wise value for the cosine.

Examples:

df = Polars::DataFrame.new({"a" => [0.0]})
df.select(Polars.col("a").cos)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.0 │
# └─────┘

Returns:



# File 'lib/polars/expr.rb', line 6889

def cos
  wrap_expr(_rbexpr.cos)
end

#cosh ⇒ Expr

Compute the element-wise value for the hyperbolic cosine.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").cosh)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.543081 │
# └──────────┘

Returns:



# File 'lib/polars/expr.rb', line 7029

def cosh
  wrap_expr(_rbexpr.cosh)
end

#cot ⇒ Expr

Compute the element-wise value for the cotangent.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").cot.round(2))
# =>
# shape: (1, 1)
# ┌──────┐
# │ a    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ 0.64 │
# └──────┘

Returns:



6929
6930
6931
# File 'lib/polars/expr.rb', line 6929

def cot
  wrap_expr(_rbexpr.cot)
end

#countExpr

Count the number of non-null values in this expression.

Examples:

df = Polars::DataFrame.new({"a" => [8, 9, 10], "b" => [nil, 4, 4]})
df.select(Polars.all.count)
# =>
# shape: (1, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ u32 ┆ u32 │
# ╞═════╪═════╡
# │ 3   ┆ 2   │
# └─────┴─────┘

Returns:



828
829
830
# File 'lib/polars/expr.rb', line 828

def count
  wrap_expr(_rbexpr.count)
end

#cum_count(reverse: false) ⇒ Expr Also known as: cumcount

Get an array with the cumulative count computed at every element.

The count is computed over non-null values only, running from 0 to the length of the column.

Examples:

df = Polars::DataFrame.new({"a" => ["x", "k", nil, "d"]})
df.with_columns(
  [
    Polars.col("a").cum_count.alias("cum_count"),
    Polars.col("a").cum_count(reverse: true).alias("cum_count_reverse")
  ]
)
# =>
# shape: (4, 3)
# ┌──────┬───────────┬───────────────────┐
# │ a    ┆ cum_count ┆ cum_count_reverse │
# │ ---  ┆ ---       ┆ ---               │
# │ str  ┆ u32       ┆ u32               │
# ╞══════╪═══════════╪═══════════════════╡
# │ x    ┆ 1         ┆ 3                 │
# │ k    ┆ 2         ┆ 2                 │
# │ null ┆ 2         ┆ 1                 │
# │ d    ┆ 3         ┆ 1                 │
# └──────┴───────────┴───────────────────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    Reverse the operation.

Returns:



1169
1170
1171
# File 'lib/polars/expr.rb', line 1169

def cum_count(reverse: false)
  wrap_expr(_rbexpr.cum_count(reverse))
end
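
The cumulative counts shown above can be reproduced in plain Ruby by counting only non-null entries; a sketch of the semantics, not how Polars computes it internally:

```ruby
a = ["x", "k", nil, "d"]

# Running count of non-null values seen so far.
count = 0
cum_count = a.map { |v| count += 1 unless v.nil?; count }
# => [1, 2, 2, 3]

# Reverse variant: count non-null values from the end.
count = 0
cum_count_reverse = a.reverse.map { |v| count += 1 unless v.nil?; count }.reverse
# => [3, 2, 1, 1]
```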

#cum_max(reverse: false) ⇒ Expr Also known as: cummax

Get an array with the cumulative max computed at every element.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 4]})
df.select(
  [
    Polars.col("a").cum_max,
    Polars.col("a").cum_max(reverse: true).alias("a_reverse")
  ]
)
# =>
# shape: (4, 2)
# ┌─────┬───────────┐
# │ a   ┆ a_reverse │
# │ --- ┆ ---       │
# │ i64 ┆ i64       │
# ╞═════╪═══════════╡
# │ 1   ┆ 4         │
# │ 2   ┆ 4         │
# │ 3   ┆ 4         │
# │ 4   ┆ 4         │
# └─────┴───────────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    Reverse the operation.

Returns:



1135
1136
1137
# File 'lib/polars/expr.rb', line 1135

def cum_max(reverse: false)
  wrap_expr(_rbexpr.cum_max(reverse))
end

#cum_min(reverse: false) ⇒ Expr Also known as: cummin

Get an array with the cumulative min computed at every element.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 4]})
df.select(
  [
    Polars.col("a").cum_min,
    Polars.col("a").cum_min(reverse: true).alias("a_reverse")
  ]
)
# =>
# shape: (4, 2)
# ┌─────┬───────────┐
# │ a   ┆ a_reverse │
# │ --- ┆ ---       │
# │ i64 ┆ i64       │
# ╞═════╪═══════════╡
# │ 1   ┆ 1         │
# │ 1   ┆ 2         │
# │ 1   ┆ 3         │
# │ 1   ┆ 4         │
# └─────┴───────────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    Reverse the operation.

Returns:



1103
1104
1105
# File 'lib/polars/expr.rb', line 1103

def cum_min(reverse: false)
  wrap_expr(_rbexpr.cum_min(reverse))
end

#cum_prod(reverse: false) ⇒ Expr Also known as: cumprod

Note:

Dtypes in :i8, :u8, :i16, and :u16 are cast to :i64 before multiplying to prevent overflow issues.

Get an array with the cumulative product computed at every element.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 4]})
df.select(
  [
    Polars.col("a").cum_prod,
    Polars.col("a").cum_prod(reverse: true).alias("a_reverse")
  ]
)
# =>
# shape: (4, 2)
# ┌─────┬───────────┐
# │ a   ┆ a_reverse │
# │ --- ┆ ---       │
# │ i64 ┆ i64       │
# ╞═════╪═══════════╡
# │ 1   ┆ 24        │
# │ 2   ┆ 24        │
# │ 6   ┆ 12        │
# │ 24  ┆ 4         │
# └─────┴───────────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    Reverse the operation.

Returns:



1071
1072
1073
# File 'lib/polars/expr.rb', line 1071

def cum_prod(reverse: false)
  wrap_expr(_rbexpr.cum_prod(reverse))
end

#cum_sum(reverse: false) ⇒ Expr Also known as: cumsum

Note:

Dtypes in :i8, :u8, :i16, and :u16 are cast to :i64 before summing to prevent overflow issues.

Get an array with the cumulative sum computed at every element.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 4]})
df.select(
  [
    Polars.col("a").cum_sum,
    Polars.col("a").cum_sum(reverse: true).alias("a_reverse")
  ]
)
# =>
# shape: (4, 2)
# ┌─────┬───────────┐
# │ a   ┆ a_reverse │
# │ --- ┆ ---       │
# │ i64 ┆ i64       │
# ╞═════╪═══════════╡
# │ 1   ┆ 10        │
# │ 3   ┆ 9         │
# │ 6   ┆ 7         │
# │ 10  ┆ 4         │
# └─────┴───────────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    Reverse the operation.

Returns:



1035
1036
1037
# File 'lib/polars/expr.rb', line 1035

def cum_sum(reverse: false)
  wrap_expr(_rbexpr.cum_sum(reverse))
end

#cumulative_eval(expr, min_periods: 1) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

This can be really slow as it can have O(n^2) complexity. Don't use this for operations that visit all elements.

Run an expression over a sliding window that increases by 1 slot every iteration.

Examples:

df = Polars::DataFrame.new({"values" => [1, 2, 3, 4, 5]})
df.select(
  [
    Polars.col("values").cumulative_eval(
      Polars.element.first - Polars.element.last ** 2
    )
  ]
)
# =>
# shape: (5, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 0      │
# │ -3     │
# │ -8     │
# │ -15    │
# │ -24    │
# └────────┘

Parameters:

  • expr (Expr)

    Expression to evaluate.

  • min_periods (Integer) (defaults to: 1)

    Number of valid values there should be in the window before the expression is evaluated (valid values = length - null_count).

Returns:



7713
7714
7715
7716
7717
# File 'lib/polars/expr.rb', line 7713

def cumulative_eval(expr, min_periods: 1)
  wrap_expr(
    _rbexpr.cumulative_eval(expr._rbexpr, min_periods)
  )
end
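
The example above can be checked in plain Ruby by evaluating first - last ** 2 over each growing window; a sketch of the semantics, with the O(n^2) cost noted above:

```ruby
values = [1, 2, 3, 4, 5]

# For each position, evaluate the expression over the window values[0..i].
result = values.each_index.map do |i|
  window = values[0..i]
  window.first - window.last**2
end
# => [0, -3, -8, -15, -24]
```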

#cut(breaks, labels: nil, left_closed: false, include_breaks: false) ⇒ Expr

Bin continuous values into discrete categories.

Examples:

Divide a column into three categories.

df = Polars::DataFrame.new({"foo" => [-2, -1, 0, 1, 2]})
df.with_columns(
  Polars.col("foo").cut([-1, 1], labels: ["a", "b", "c"]).alias("cut")
)
# =>
# shape: (5, 2)
# ┌─────┬─────┐
# │ foo ┆ cut │
# │ --- ┆ --- │
# │ i64 ┆ cat │
# ╞═════╪═════╡
# │ -2  ┆ a   │
# │ -1  ┆ a   │
# │ 0   ┆ b   │
# │ 1   ┆ b   │
# │ 2   ┆ c   │
# └─────┴─────┘

Add both the category and the breakpoint.

df.with_columns(
  Polars.col("foo").cut([-1, 1], include_breaks: true).alias("cut")
).unnest("cut")
# =>
# shape: (5, 3)
# ┌─────┬────────────┬────────────┐
# │ foo ┆ breakpoint ┆ category   │
# │ --- ┆ ---        ┆ ---        │
# │ i64 ┆ f64        ┆ cat        │
# ╞═════╪════════════╪════════════╡
# │ -2  ┆ -1.0       ┆ (-inf, -1] │
# │ -1  ┆ -1.0       ┆ (-inf, -1] │
# │ 0   ┆ 1.0        ┆ (-1, 1]    │
# │ 1   ┆ 1.0        ┆ (-1, 1]    │
# │ 2   ┆ inf        ┆ (1, inf]   │
# └─────┴────────────┴────────────┘

Parameters:

  • breaks (Array)

    List of unique cut points.

  • labels (Array) (defaults to: nil)

    Names of the categories. The number of labels must be equal to the number of cut points plus one.

  • left_closed (Boolean) (defaults to: false)

    Set the intervals to be left-closed instead of right-closed.

  • include_breaks (Boolean) (defaults to: false)

    Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct.

Returns:



3146
3147
3148
# File 'lib/polars/expr.rb', line 3146

def cut(breaks, labels: nil, left_closed: false, include_breaks: false)
  wrap_expr(_rbexpr.cut(breaks, labels, left_closed, include_breaks))
end
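
With right-closed intervals (the default), each value is assigned to the first break it does not exceed; values beyond the last break fall in the final category. A plain-Ruby sketch of the first example's labeling:

```ruby
breaks = [-1, 1]
labels = ["a", "b", "c"]

cut = [-2, -1, 0, 1, 2].map do |v|
  # Index of the first break with v <= break; past the last break otherwise.
  idx = breaks.index { |b| v <= b } || breaks.length
  labels[idx]
end
# => ["a", "a", "b", "b", "c"]
```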

#degreesExpr

Convert from radians to degrees.

Examples:

df = Polars::DataFrame.new({"a" => (-4...5).map { |x| x * Math::PI }})
df.select(Polars.col("a").degrees)
# =>
# shape: (9, 1)
# ┌────────┐
# │ a      │
# │ ---    │
# │ f64    │
# ╞════════╡
# │ -720.0 │
# │ -540.0 │
# │ -360.0 │
# │ -180.0 │
# │ 0.0    │
# │ 180.0  │
# │ 360.0  │
# │ 540.0  │
# │ 720.0  │
# └────────┘

Returns:



7137
7138
7139
# File 'lib/polars/expr.rb', line 7137

def degrees
  wrap_expr(_rbexpr.degrees)
end

#diff(n: 1, null_behavior: "ignore") ⇒ Expr

Calculate the n-th discrete difference.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [20, 10, 30]
  }
)
df.select(Polars.col("a").diff)
# =>
# shape: (3, 1)
# ┌──────┐
# │ a    │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ null │
# │ -10  │
# │ 20   │
# └──────┘

Parameters:

  • n (Integer) (defaults to: 1)

    Number of slots to shift.

  • null_behavior ("ignore", "drop") (defaults to: "ignore")

    How to handle null values.

Returns:



6581
6582
6583
6584
# File 'lib/polars/expr.rb', line 6581

def diff(n: 1, null_behavior: "ignore")
  n = Utils.parse_into_expression(n)
  wrap_expr(_rbexpr.diff(n, null_behavior))
end
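
For n: 1 with the default "ignore" null behavior, the result is a leading null followed by pairwise differences; in plain Ruby:

```ruby
a = [20, 10, 30]

# First slot has no predecessor, so it is nil; the rest are y - x pairs.
diff = [nil] + a.each_cons(2).map { |x, y| y - x }
# => [nil, -10, 20]
```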

#dot(other) ⇒ Expr

Compute the dot/inner product between two Expressions.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 3, 5],
    "b" => [2, 4, 6]
  }
)
df.select(Polars.col("a").dot(Polars.col("b")))
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 44  │
# └─────┘

Parameters:

  • other (Expr)

    Expression to compute dot product with.

Returns:



1306
1307
1308
1309
# File 'lib/polars/expr.rb', line 1306

def dot(other)
  other = Utils.parse_into_expression(other, str_as_lit: false)
  wrap_expr(_rbexpr.dot(other))
end
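
The dot product above is the sum of element-wise products; in plain Ruby:

```ruby
a = [1, 3, 5]
b = [2, 4, 6]

dot = a.zip(b).sum { |x, y| x * y }
# => 44  (1*2 + 3*4 + 5*6)
```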

#drop_nansExpr

Drop floating point NaN values.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [8, 9, 10, 11],
    "b" => [nil, 4.0, 4.0, Float::NAN]
  }
)
df.select(Polars.col("b").drop_nans)
# =>
# shape: (3, 1)
# ┌──────┐
# │ b    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ null │
# │ 4.0  │
# │ 4.0  │
# └──────┘

Returns:



1000
1001
1002
# File 'lib/polars/expr.rb', line 1000

def drop_nans
  wrap_expr(_rbexpr.drop_nans)
end

#drop_nullsExpr

Drop null values.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [8, 9, 10, 11],
    "b" => [nil, 4.0, 4.0, Float::NAN]
  }
)
df.select(Polars.col("b").drop_nulls)
# =>
# shape: (3, 1)
# ┌─────┐
# │ b   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 4.0 │
# │ 4.0 │
# │ NaN │
# └─────┘

Returns:



973
974
975
# File 'lib/polars/expr.rb', line 973

def drop_nulls
  wrap_expr(_rbexpr.drop_nulls)
end

#dtDateTimeExpr

Create an object namespace of all datetime-related methods.

Returns:



8343
8344
8345
# File 'lib/polars/expr.rb', line 8343

def dt
  DateTimeExpr.new(self)
end

#entropy(base: 2, normalize: true) ⇒ Expr

Computes the entropy.

Uses the formula -sum(pk * log(pk)) where pk are the discrete probabilities.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").entropy(base: 2))
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.459148 │
# └──────────┘
df.select(Polars.col("a").entropy(base: 2, normalize: false))
# =>
# shape: (1, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ f64       │
# ╞═══════════╡
# │ -6.754888 │
# └───────────┘

Parameters:

  • base (Float) (defaults to: 2)

    The base of the logarithm.

  • normalize (Boolean) (defaults to: true)

    Normalize pk if it doesn't sum to 1.

Returns:



7669
7670
7671
# File 'lib/polars/expr.rb', line 7669

def entropy(base: 2, normalize: true)
  wrap_expr(_rbexpr.entropy(base, normalize))
end
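
Both outputs above can be verified in plain Ruby. With normalize: true the values are first divided by their sum so they form probabilities; with normalize: false the raw values are plugged into -sum(pk * log(pk)) directly. A sketch, using base-2 logs as in the example:

```ruby
a = [1, 2, 3]

# normalize: true - scale to probabilities first.
pk = a.map { |v| v.to_f / a.sum }
h_normalized = -pk.sum { |p| p * Math.log2(p) }
# ≈ 1.459148

# normalize: false - use the raw values as-is.
h_raw = -a.sum { |p| p * Math.log2(p) }
# ≈ -6.754888
```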

#eq(other) ⇒ Expr

Method equivalent of equality operator expr == other.

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [1.0, 2.0, Float::NAN, 4.0],
    "y" => [2.0, 2.0, Float::NAN, 4.0]
  }
)
df.with_columns(
  Polars.col("x").eq(Polars.col("y")).alias("x == y")
)
# =>
# shape: (4, 3)
# ┌─────┬─────┬────────┐
# │ x   ┆ y   ┆ x == y │
# │ --- ┆ --- ┆ ---    │
# │ f64 ┆ f64 ┆ bool   │
# ╞═════╪═════╪════════╡
# │ 1.0 ┆ 2.0 ┆ false  │
# │ 2.0 ┆ 2.0 ┆ true   │
# │ NaN ┆ NaN ┆ true   │
# │ 4.0 ┆ 4.0 ┆ true   │
# └─────┴─────┴────────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

Returns:



3784
3785
3786
# File 'lib/polars/expr.rb', line 3784

def eq(other)
  self == other
end

#eq_missing(other) ⇒ Expr

Method equivalent of equality operator expr == other where nil == nil.

This differs from the default eq, where null values are propagated.

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [1.0, 2.0, Float::NAN, 4.0, nil, nil],
    "y" => [2.0, 2.0, Float::NAN, 4.0, 5.0, nil]
  }
)
df.with_columns(
  Polars.col("x").eq(Polars.col("y")).alias("x eq y"),
  Polars.col("x").eq_missing(Polars.col("y")).alias("x eq_missing y")
)
# =>
# shape: (6, 4)
# ┌──────┬──────┬────────┬────────────────┐
# │ x    ┆ y    ┆ x eq y ┆ x eq_missing y │
# │ ---  ┆ ---  ┆ ---    ┆ ---            │
# │ f64  ┆ f64  ┆ bool   ┆ bool           │
# ╞══════╪══════╪════════╪════════════════╡
# │ 1.0  ┆ 2.0  ┆ false  ┆ false          │
# │ 2.0  ┆ 2.0  ┆ true   ┆ true           │
# │ NaN  ┆ NaN  ┆ true   ┆ true           │
# │ 4.0  ┆ 4.0  ┆ true   ┆ true           │
# │ null ┆ 5.0  ┆ null   ┆ false          │
# │ null ┆ null ┆ null   ┆ true           │
# └──────┴──────┴────────┴────────────────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

Returns:



3822
3823
3824
3825
# File 'lib/polars/expr.rb', line 3822

def eq_missing(other)
  other = Utils.parse_into_expression(other, str_as_lit: true)
  wrap_expr(_rbexpr.eq_missing(other))
end

#ewm_mean(com: nil, span: nil, half_life: nil, alpha: nil, adjust: true, min_periods: 1, ignore_nulls: true) ⇒ Expr

Exponentially-weighted moving average.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").ewm_mean(com: 1))
# =>
# shape: (3, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.0      │
# │ 1.666667 │
# │ 2.428571 │
# └──────────┘

Returns:



7317
7318
7319
7320
7321
7322
7323
7324
7325
7326
7327
7328
# File 'lib/polars/expr.rb', line 7317

def ewm_mean(
  com: nil,
  span: nil,
  half_life: nil,
  alpha: nil,
  adjust: true,
  min_periods: 1,
  ignore_nulls: true
)
  alpha = _prepare_alpha(com, span, half_life, alpha)
  wrap_expr(_rbexpr.ewm_mean(alpha, adjust, min_periods, ignore_nulls))
end
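
The com: 1 output above can be reproduced with the standard adjusted EWM formula, where alpha = 1 / (1 + com) and each result is a weighted mean with weights (1 - alpha)^i over the preceding values. A plain-Ruby sketch (the helper name is illustrative, not part of the API):

```ruby
def ewm_mean_adjusted(values, com:)
  alpha = 1.0 / (1.0 + com)
  values.each_index.map do |t|
    num = 0.0
    den = 0.0
    (0..t).each do |i|
      w = (1.0 - alpha)**i   # weight for the value i steps back
      num += w * values[t - i]
      den += w
    end
    num / den
  end
end

ewm_mean_adjusted([1, 2, 3], com: 1)
# ≈ [1.0, 1.666667, 2.428571]
```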

#ewm_mean_by(by, half_life:) ⇒ Expr

Compute time-based exponentially weighted moving average.

Examples:

df = Polars::DataFrame.new(
  {
    "values": [0, 1, 2, nil, 4],
    "times": [
        Date.new(2020, 1, 1),
        Date.new(2020, 1, 3),
        Date.new(2020, 1, 10),
        Date.new(2020, 1, 15),
        Date.new(2020, 1, 17)
    ]
  }
).sort("times")
df.with_columns(
  result: Polars.col("values").ewm_mean_by("times", half_life: "4d")
)
# =>
# shape: (5, 3)
# ┌────────┬────────────┬──────────┐
# │ values ┆ times      ┆ result   │
# │ ---    ┆ ---        ┆ ---      │
# │ i64    ┆ date       ┆ f64      │
# ╞════════╪════════════╪══════════╡
# │ 0      ┆ 2020-01-01 ┆ 0.0      │
# │ 1      ┆ 2020-01-03 ┆ 0.292893 │
# │ 2      ┆ 2020-01-10 ┆ 1.492474 │
# │ null   ┆ 2020-01-15 ┆ null     │
# │ 4      ┆ 2020-01-17 ┆ 3.254508 │
# └────────┴────────────┴──────────┘

Parameters:

  • by (Object)

    Times to calculate average by. Should be DateTime, Date, UInt64, UInt32, Int64, or Int32 data type.

  • half_life (Object)

    Unit over which observation decays to half its value.

    Can be created either from a timedelta, or by using the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1i (1 index count)

    Or combine them: "3d12h4m25s" # 3 days, 12 hours, 4 minutes, and 25 seconds

    Note that half_life is treated as a constant duration - calendar durations such as months (or even days in the time-zone-aware case) are not supported, please express your duration in an approximately equivalent number of hours (e.g. '370h' instead of '1mo').

Returns:



7390
7391
7392
7393
7394
7395
7396
7397
# File 'lib/polars/expr.rb', line 7390

def ewm_mean_by(
  by,
  half_life:
)
  by = Utils.parse_into_expression(by)
  half_life = Utils.parse_as_duration_string(half_life)
  wrap_expr(_rbexpr.ewm_mean_by(by, half_life))
end

#ewm_std(com: nil, span: nil, half_life: nil, alpha: nil, adjust: true, bias: false, min_periods: 1, ignore_nulls: true) ⇒ Expr

Exponentially-weighted moving standard deviation.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").ewm_std(com: 1))
# =>
# shape: (3, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 0.0      │
# │ 0.707107 │
# │ 0.963624 │
# └──────────┘

Returns:



7417
7418
7419
7420
7421
7422
7423
7424
7425
7426
7427
7428
7429
# File 'lib/polars/expr.rb', line 7417

def ewm_std(
  com: nil,
  span: nil,
  half_life: nil,
  alpha: nil,
  adjust: true,
  bias: false,
  min_periods: 1,
  ignore_nulls: true
)
  alpha = _prepare_alpha(com, span, half_life, alpha)
  wrap_expr(_rbexpr.ewm_std(alpha, adjust, bias, min_periods, ignore_nulls))
end

#ewm_var(com: nil, span: nil, half_life: nil, alpha: nil, adjust: true, bias: false, min_periods: 1, ignore_nulls: true) ⇒ Expr

Exponentially-weighted moving variance.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").ewm_var(com: 1))
# =>
# shape: (3, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 0.0      │
# │ 0.5      │
# │ 0.928571 │
# └──────────┘

Returns:



7449
7450
7451
7452
7453
7454
7455
7456
7457
7458
7459
7460
7461
# File 'lib/polars/expr.rb', line 7449

def ewm_var(
  com: nil,
  span: nil,
  half_life: nil,
  alpha: nil,
  adjust: true,
  bias: false,
  min_periods: 1,
  ignore_nulls: true
)
  alpha = _prepare_alpha(com, span, half_life, alpha)
  wrap_expr(_rbexpr.ewm_var(alpha, adjust, bias, min_periods, ignore_nulls))
end

#exclude(columns, *more_columns) ⇒ Expr

Exclude certain columns from a wildcard/regex selection.

You may also use regexes in the exclude list. They must start with ^ and end with $.

Examples:

df = Polars::DataFrame.new(
  {
    "aa" => [1, 2, 3],
    "ba" => ["a", "b", nil],
    "cc" => [nil, 2.5, 1.5]
  }
)
df.select(Polars.all.exclude("ba"))
# =>
# shape: (3, 2)
# ┌─────┬──────┐
# │ aa  ┆ cc   │
# │ --- ┆ ---  │
# │ i64 ┆ f64  │
# ╞═════╪══════╡
# │ 1   ┆ null │
# │ 2   ┆ 2.5  │
# │ 3   ┆ 1.5  │
# └─────┴──────┘

Parameters:

  • columns (Object)

    The name or datatype of the column(s) to exclude. Accepts regular expression input. Regular expressions should start with ^ and end with $.

  • more_columns (Array)

    Additional names or datatypes of columns to exclude, specified as positional arguments.

Returns:



448
449
450
# File 'lib/polars/expr.rb', line 448

def exclude(columns, *more_columns)
  meta.as_selector.exclude(columns, *more_columns).as_expr
end

#expExpr

Compute the exponential, element-wise.

Examples:

df = Polars::DataFrame.new({"values" => [1.0, 2.0, 4.0]})
df.select(Polars.col("values").exp)
# =>
# shape: (3, 1)
# ┌──────────┐
# │ values   │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 2.718282 │
# │ 7.389056 │
# │ 54.59815 │
# └──────────┘

Returns:



375
376
377
# File 'lib/polars/expr.rb', line 375

def exp
  wrap_expr(_rbexpr.exp)
end

#explodeExpr

Explode a list or utf8 Series.

This means that every item is expanded to a new row.

Examples:

df = Polars::DataFrame.new({"b" => [[1, 2, 3], [4, 5, 6]]})
df.select(Polars.col("b").explode)
# =>
# shape: (6, 1)
# ┌─────┐
# │ b   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# │ 2   │
# │ 3   │
# │ 4   │
# │ 5   │
# │ 6   │
# └─────┘

Returns:



3569
3570
3571
# File 'lib/polars/expr.rb', line 3569

def explode
  wrap_expr(_rbexpr.explode)
end

#extend_constant(value, n) ⇒ Expr

Extend the Series with given number of values.

Examples:

df = Polars::DataFrame.new({"values" => [1, 2, 3]})
df.select(Polars.col("values").extend_constant(99, 2))
# =>
# shape: (5, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 1      │
# │ 2      │
# │ 3      │
# │ 99     │
# │ 99     │
# └────────┘

Parameters:

  • value (Object)

    The value to extend the Series with. This value may be nil to fill with nulls.

  • n (Integer)

    The number of values to extend.

Returns:



7489
7490
7491
7492
7493
# File 'lib/polars/expr.rb', line 7489

def extend_constant(value, n)
  value = Utils.parse_into_expression(value, str_as_lit: true)
  n = Utils.parse_into_expression(n)
  wrap_expr(_rbexpr.extend_constant(value, n))
end

#fill_nan(fill_value) ⇒ Expr

Fill floating point NaN value with a fill value.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1.0, nil, Float::NAN],
    "b" => [4.0, Float::NAN, 6]
  }
)
df.fill_nan("zero")
# =>
# shape: (3, 2)
# ┌──────┬──────┐
# │ a    ┆ b    │
# │ ---  ┆ ---  │
# │ str  ┆ str  │
# ╞══════╪══════╡
# │ 1.0  ┆ 4.0  │
# │ null ┆ zero │
# │ zero ┆ 6.0  │
# └──────┴──────┘

Returns:



2195
2196
2197
2198
# File 'lib/polars/expr.rb', line 2195

def fill_nan(fill_value)
  fill_value = Utils.parse_into_expression(fill_value, str_as_lit: true)
  wrap_expr(_rbexpr.fill_nan(fill_value))
end

#fill_null(value = nil, strategy: nil, limit: nil) ⇒ Expr

Fill null values using the specified value or strategy.

To interpolate over null values see interpolate.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, nil],
    "b" => [4, nil, 6]
  }
)
df.fill_null(strategy: "zero")
# =>
# shape: (3, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 1   ┆ 4   │
# │ 2   ┆ 0   │
# │ 0   ┆ 6   │
# └─────┴─────┘
df.fill_null(99)
# =>
# shape: (3, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 1   ┆ 4   │
# │ 2   ┆ 99  │
# │ 99  ┆ 6   │
# └─────┴─────┘
df.fill_null(strategy: "forward")
# =>
# shape: (3, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 1   ┆ 4   │
# │ 2   ┆ 4   │
# │ 2   ┆ 6   │
# └─────┴─────┘

Parameters:

  • value (Object) (defaults to: nil)

    Value used to fill null values.

  • strategy (nil, "forward", "backward", "min", "max", "mean", "zero", "one") (defaults to: nil)

    Strategy used to fill null values.

  • limit (Integer) (defaults to: nil)

    Number of consecutive null values to fill when using the 'forward' or 'backward' strategy.

Returns:



2155
2156
2157
2158
2159
2160
2161
2162
2163
2164
2165
2166
2167
2168
2169
2170
# File 'lib/polars/expr.rb', line 2155

def fill_null(value = nil, strategy: nil, limit: nil)
  if !value.nil? && !strategy.nil?
    raise ArgumentError, "cannot specify both 'value' and 'strategy'."
  elsif value.nil? && strategy.nil?
    raise ArgumentError, "must specify either a fill 'value' or 'strategy'"
  elsif ["forward", "backward"].include?(strategy) && !limit.nil?
    raise ArgumentError, "can only specify 'limit' when strategy is set to 'backward' or 'forward'"
  end

  if !value.nil?
    value = Utils.parse_into_expression(value, str_as_lit: true)
    wrap_expr(_rbexpr.fill_null(value))
  else
    wrap_expr(_rbexpr.fill_null_with_strategy(strategy, limit))
  end
end
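
The "forward" strategy carries the last seen non-null value forward, optionally capped by limit; a plain-Ruby sketch of that behavior (the helper name is illustrative):

```ruby
def forward_fill(values, limit: nil)
  last = nil
  run = 0
  values.map do |v|
    if v.nil?
      run += 1
      # Stop filling once the run of nulls exceeds the limit.
      (limit.nil? || run <= limit) ? last : nil
    else
      last = v
      run = 0
      v
    end
  end
end

forward_fill([1, 2, nil])              # => [1, 2, 2]
forward_fill([1, nil, nil], limit: 1)  # => [1, 1, nil]
```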

#filter(predicate) ⇒ Expr

Filter a single column.

Mostly useful in an aggregation context. If you want to filter on a DataFrame level, use LazyFrame#filter.

Examples:

df = Polars::DataFrame.new(
  {
    "group_col" => ["g1", "g1", "g2"],
    "b" => [1, 2, 3]
  }
)
(
  df.group_by("group_col").agg(
    [
      Polars.col("b").filter(Polars.col("b") < 2).sum.alias("lt"),
      Polars.col("b").filter(Polars.col("b") >= 2).sum.alias("gte")
    ]
  )
).sort("group_col")
# =>
# shape: (2, 3)
# ┌───────────┬─────┬─────┐
# │ group_col ┆ lt  ┆ gte │
# │ ---       ┆ --- ┆ --- │
# │ str       ┆ i64 ┆ i64 │
# ╞═══════════╪═════╪═════╡
# │ g1        ┆ 1   ┆ 2   │
# │ g2        ┆ 0   ┆ 3   │
# └───────────┴─────┴─────┘

Parameters:

  • predicate (Expr)

    Boolean expression.

Returns:



3329
3330
3331
# File 'lib/polars/expr.rb', line 3329

def filter(predicate)
  wrap_expr(_rbexpr.filter(predicate._rbexpr))
end

#firstExpr

Get the first value.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2]})
df.select(Polars.col("a").first)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# └─────┘

Returns:



2684
2685
2686
# File 'lib/polars/expr.rb', line 2684

def first
  wrap_expr(_rbexpr.first)
end

#flattenExpr

Explode a list or utf8 Series. This means that every item is expanded to a new row.

Alias for #explode.

Examples:

df = Polars::DataFrame.new(
  {
    "group" => ["a", "b", "b"],
    "values" => [[1, 2], [2, 3], [4]]
  }
)
df.group_by("group").agg(Polars.col("values").flatten)
# =>
# shape: (2, 2)
# ┌───────┬───────────┐
# │ group ┆ values    │
# │ ---   ┆ ---       │
# │ str   ┆ list[i64] │
# ╞═══════╪═══════════╡
# │ a     ┆ [1, 2]    │
# │ b     ┆ [2, 3, 4] │
# └───────┴───────────┘

Returns:



3542
3543
3544
# File 'lib/polars/expr.rb', line 3542

def flatten
  wrap_expr(_rbexpr.explode)
end

#floorExpr

Rounds down to the nearest integer value.

Only works on floating point Series.

Examples:

df = Polars::DataFrame.new({"a" => [0.3, 0.5, 1.0, 1.1]})
df.select(Polars.col("a").floor)
# =>
# shape: (4, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 0.0 │
# │ 0.0 │
# │ 1.0 │
# │ 1.0 │
# └─────┘

Returns:



1195
1196
1197
# File 'lib/polars/expr.rb', line 1195

def floor
  wrap_expr(_rbexpr.floor)
end

#floordiv(other) ⇒ Expr

Method equivalent of integer division operator expr // other.

Examples:

df = Polars::DataFrame.new({"x" => [1, 2, 3, 4, 5]})
df.with_columns(
  Polars.col("x").truediv(2).alias("x/2"),
  Polars.col("x").floordiv(2).alias("x//2")
)
# =>
# shape: (5, 3)
# ┌─────┬─────┬──────┐
# │ x   ┆ x/2 ┆ x//2 │
# │ --- ┆ --- ┆ ---  │
# │ i64 ┆ f64 ┆ i64  │
# ╞═════╪═════╪══════╡
# │ 1   ┆ 0.5 ┆ 0    │
# │ 2   ┆ 1.0 ┆ 1    │
# │ 3   ┆ 1.5 ┆ 1    │
# │ 4   ┆ 2.0 ┆ 2    │
# │ 5   ┆ 2.5 ┆ 2    │
# └─────┴─────┴──────┘

Parameters:

  • other (Object)

    Numeric literal or expression value.

Returns:



4104
4105
4106
# File 'lib/polars/expr.rb', line 4104

def floordiv(other)
  wrap_expr(_rbexpr.floordiv(_to_rbexpr(other)))
end

#forward_fill(limit: nil) ⇒ Expr

Fill missing values with the latest seen values.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, nil],
    "b" => [4, nil, 6]
  }
)
df.select(Polars.all.forward_fill)
# =>
# shape: (3, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 1   ┆ 4   │
# │ 2   ┆ 4   │
# │ 2   ┆ 6   │
# └─────┴─────┘

Parameters:

  • limit (Integer) (defaults to: nil)

    The number of consecutive null values to forward fill.

Returns:



2226
2227
2228
# File 'lib/polars/expr.rb', line 2226

def forward_fill(limit: nil)
  fill_null(strategy: "forward", limit: limit)
end

#gather(indices) ⇒ Expr Also known as: take

Take values by index.

Examples:

df = Polars::DataFrame.new(
  {
    "group" => [
      "one",
      "one",
      "one",
      "two",
      "two",
      "two"
    ],
    "value" => [1, 98, 2, 3, 99, 4]
  }
)
df.group_by("group", maintain_order: true).agg(Polars.col("value").take([2, 1]))
# =>
# shape: (2, 2)
# ┌───────┬───────────┐
# │ group ┆ value     │
# │ ---   ┆ ---       │
# │ str   ┆ list[i64] │
# ╞═══════╪═══════════╡
# │ one   ┆ [2, 98]   │
# │ two   ┆ [4, 99]   │
# └───────┴───────────┘

Parameters:

  • indices (Expr)

    An expression that leads to a :u32 dtyped Series.

Returns:



1987
1988
1989
1990
1991
1992
1993
1994
# File 'lib/polars/expr.rb', line 1987

def gather(indices)
  if indices.is_a?(::Array)
    indices_lit = Polars.lit(Series.new("", indices, dtype: :u32))._rbexpr
  else
    indices_lit = Utils.parse_into_expression(indices, str_as_lit: false)
  end
  wrap_expr(_rbexpr.gather(indices_lit))
end

#gather_every(n, offset = 0) ⇒ Expr Also known as: take_every

Take every nth value in the Series and return as a new Series.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2, 3, 4, 5, 6, 7, 8, 9]})
df.select(Polars.col("foo").gather_every(3))
# =>
# shape: (3, 1)
# ┌─────┐
# │ foo │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# │ 4   │
# │ 7   │
# └─────┘

Returns:



3591
3592
3593
# File 'lib/polars/expr.rb', line 3591

def gather_every(n, offset = 0)
  wrap_expr(_rbexpr.gather_every(n, offset))
end
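
Taking every nth value starting from offset can be written in plain Ruby with a stepped index range:

```ruby
foo = [1, 2, 3, 4, 5, 6, 7, 8, 9]
n = 3
offset = 0

taken = (offset...foo.length).step(n).map { |i| foo[i] }
# => [1, 4, 7]
```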

#ge(other) ⇒ Expr

Method equivalent of "greater than or equal" operator expr >= other.

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [5.0, 4.0, Float::NAN, 2.0],
    "y" => [5.0, 3.0, Float::NAN, 1.0]
  }
)
df.with_columns(
  Polars.col("x").ge(Polars.col("y")).alias("x >= y")
)
# =>
# shape: (4, 3)
# ┌─────┬─────┬────────┐
# │ x   ┆ y   ┆ x >= y │
# │ --- ┆ --- ┆ ---    │
# │ f64 ┆ f64 ┆ bool   │
# ╞═════╪═════╪════════╡
# │ 5.0 ┆ 5.0 ┆ true   │
# │ 4.0 ┆ 3.0 ┆ true   │
# │ NaN ┆ NaN ┆ true   │
# │ 2.0 ┆ 1.0 ┆ true   │
# └─────┴─────┴────────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

Returns:



3856
3857
3858
# File 'lib/polars/expr.rb', line 3856

def ge(other)
  self >= other
end

#get(index) ⇒ Expr

Return a single value by index.

Examples:

df = Polars::DataFrame.new(
  {
    "group" => [
      "one",
      "one",
      "one",
      "two",
      "two",
      "two"
    ],
    "value" => [1, 98, 2, 3, 99, 4]
  }
)
df.group_by("group", maintain_order: true).agg(Polars.col("value").get(1))
# =>
# shape: (2, 2)
# ┌───────┬───────┐
# │ group ┆ value │
# │ ---   ┆ ---   │
# │ str   ┆ i64   │
# ╞═══════╪═══════╡
# │ one   ┆ 98    │
# │ two   ┆ 99    │
# └───────┴───────┘

Parameters:

  • index (Object)

    An expression that leads to a UInt32 index.

Returns:



2029
2030
2031
2032
# File 'lib/polars/expr.rb', line 2029

def get(index)
  index_lit = Utils.parse_into_expression(index)
  wrap_expr(_rbexpr.get(index_lit))
end

#gt(other) ⇒ Expr

Method equivalent of "greater than" operator expr > other.

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [5.0, 4.0, Float::NAN, 2.0],
    "y" => [5.0, 3.0, Float::NAN, 1.0]
  }
)
df.with_columns(
    Polars.col("x").gt(Polars.col("y")).alias("x > y")
)
# =>
# shape: (4, 3)
# ┌─────┬─────┬───────┐
# │ x   ┆ y   ┆ x > y │
# │ --- ┆ --- ┆ ---   │
# │ f64 ┆ f64 ┆ bool  │
# ╞═════╪═════╪═══════╡
# │ 5.0 ┆ 5.0 ┆ false │
# │ 4.0 ┆ 3.0 ┆ true  │
# │ NaN ┆ NaN ┆ false │
# │ 2.0 ┆ 1.0 ┆ true  │
# └─────┴─────┴───────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

Returns:



3889
3890
3891
# File 'lib/polars/expr.rb', line 3889

def gt(other)
  self > other
end

#has_nullsExpr

Check whether the expression contains one or more null values.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [nil, 1, nil],
    "b" => [10, nil, 300],
    "c" => [350, 650, 850]
  }
)
df.select(Polars.all.has_nulls)
# =>
# shape: (1, 3)
# ┌──────┬──────┬───────┐
# │ a    ┆ b    ┆ c     │
# │ ---  ┆ ---  ┆ ---   │
# │ bool ┆ bool ┆ bool  │
# ╞══════╪══════╪═══════╡
# │ true ┆ true ┆ false │
# └──────┴──────┴───────┘

Returns:



2596
2597
2598
# File 'lib/polars/expr.rb', line 2596

def has_nulls
  null_count > 0
end

#head(n = 10) ⇒ Expr

Get the first n rows.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2, 3, 4, 5, 6, 7]})
df.head(3)
# =>
# shape: (3, 1)
# ┌─────┐
# │ foo │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# │ 2   │
# │ 3   │
# └─────┘

Parameters:

  • n (Integer) (defaults to: 10)

    Number of rows to return.

Returns:



3617
3618
3619
# File 'lib/polars/expr.rb', line 3617

def head(n = 10)
  wrap_expr(_rbexpr.head(n))
end

#hist(bins: nil, bin_count: nil, include_category: false, include_breakpoint: false) ⇒ Expr

Note:

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Bin values into buckets and count their occurrences.

Examples:

df = Polars::DataFrame.new({"a" => [1, 3, 8, 8, 2, 1, 3]})
df.select(Polars.col("a").hist(bins: [1, 2, 3]))
# =>
# shape: (2, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 3   │
# │ 2   │
# └─────┘
df.select(
  Polars.col("a").hist(
    bins: [1, 2, 3], include_breakpoint: true, include_category: true
  )
)
# =>
# shape: (2, 1)
# ┌──────────────────────┐
# │ a                    │
# │ ---                  │
# │ struct[3]            │
# ╞══════════════════════╡
# │ {2.0,"[1.0, 2.0]",3} │
# │ {3.0,"(2.0, 3.0]",2} │
# └──────────────────────┘

Parameters:

  • bins (Object) (defaults to: nil)

    Bin edges. If nil, the edges are determined from the data.

  • bin_count (Integer) (defaults to: nil)

    If bins is not provided, bin_count uniform bins are created that fully encompass the data.

  • include_category (Boolean) (defaults to: false)

    Include a column that shows the intervals as categories.

  • include_breakpoint (Boolean) (defaults to: false)

    Include a column that indicates the upper breakpoint.

Returns:



7832
7833
7834
7835
7836
7837
7838
7839
7840
7841
7842
7843
7844
7845
7846
7847
# File 'lib/polars/expr.rb', line 7832

def hist(
  bins: nil,
  bin_count: nil,
  include_category: false,
  include_breakpoint: false
)
  if !bins.nil?
    if bins.is_a?(::Array)
      bins = Polars::Series.new(bins)
    end
    bins = Utils.parse_into_expression(bins)
  end
  wrap_expr(
    _rbexpr.hist(bins, bin_count, include_category, include_breakpoint)
  )
end
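The bin boundaries follow from the categories shown above: the first bin is closed on both ends ("[1.0, 2.0]"), and every later bin is left-open/right-closed ("(2.0, 3.0]"). A plain-Ruby sketch of that counting rule (the hist_counts helper is illustrative, not the engine implementation):

```ruby
# Count values per bin: the first bin is closed on both ends, the rest
# are left-open/right-closed, matching the categories shown above.
def hist_counts(values, bins)
  bins.each_cons(2).with_index.map do |(lo, hi), i|
    values.count { |v| (i.zero? ? v >= lo : v > lo) && v <= hi }
  end
end

hist_counts([1, 3, 8, 8, 2, 1, 3], [1, 2, 3])  # => [3, 2]
```

Values outside the outermost edges (here the two 8s) are not counted.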

#implode ⇒ Expr

Aggregate to list.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, 3],
    "b" => [4, 5, 6]
  }
)
df.select(Polars.all.implode)
# =>
# shape: (1, 2)
# ┌───────────┬───────────┐
# │ a         ┆ b         │
# │ ---       ┆ ---       │
# │ list[i64] ┆ list[i64] │
# ╞═══════════╪═══════════╡
# │ [1, 2, 3] ┆ [4, 5, 6] │
# └───────────┴───────────┘

Returns:



7769
7770
7771
# File 'lib/polars/expr.rb', line 7769

def implode
  wrap_expr(_rbexpr.implode)
end

#index_of(element) ⇒ Expr

Get the index of the first occurrence of a value, or nil if it's not found.

Examples:

df = Polars::DataFrame.new({"a" => [1, nil, 17]})
df.select(
  [
    Polars.col("a").index_of(17).alias("seventeen"),
    Polars.col("a").index_of(nil).alias("null"),
    Polars.col("a").index_of(55).alias("fiftyfive")
  ]
)
# =>
# shape: (1, 3)
# ┌───────────┬──────┬───────────┐
# │ seventeen ┆ null ┆ fiftyfive │
# │ ---       ┆ ---  ┆ ---       │
# │ u32       ┆ u32  ┆ u32       │
# ╞═══════════╪══════╪═══════════╡
# │ 2         ┆ 1    ┆ null      │
# └───────────┴──────┴───────────┘

Parameters:

  • element (Object)

    Value to find.

Returns:



1847
1848
1849
1850
# File 'lib/polars/expr.rb', line 1847

def index_of(element)
  element = Utils.parse_into_expression(element, str_as_lit: true)
  wrap_expr(_rbexpr.index_of(element))
end

#interpolate(method: "linear") ⇒ Expr

Fill null values using interpolation (linear by default).

Can also be used to regrid data to a new grid - see examples below.

Examples:

Fill nulls with linear interpolation

df = Polars::DataFrame.new(
  {
    "a" => [1, nil, 3],
    "b" => [1.0, Float::NAN, 3.0]
  }
)
df.select(Polars.all.interpolate)
# =>
# shape: (3, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ f64 ┆ f64 │
# ╞═════╪═════╡
# │ 1.0 ┆ 1.0 │
# │ 2.0 ┆ NaN │
# │ 3.0 ┆ 3.0 │
# └─────┴─────┘

Returns:



4617
4618
4619
# File 'lib/polars/expr.rb', line 4617

def interpolate(method: "linear")
  wrap_expr(_rbexpr.interpolate(method))
end

#interpolate_by(by) ⇒ Expr

Fill null values using interpolation based on another column.

Examples:

Fill null values using linear interpolation.

df = Polars::DataFrame.new(
  {
    "a" => [1, nil, nil, 3],
    "b" => [1, 2, 7, 8]
  }
)
df.with_columns(a_interpolated: Polars.col("a").interpolate_by("b"))
# =>
# shape: (4, 3)
# ┌──────┬─────┬────────────────┐
# │ a    ┆ b   ┆ a_interpolated │
# │ ---  ┆ --- ┆ ---            │
# │ i64  ┆ i64 ┆ f64            │
# ╞══════╪═════╪════════════════╡
# │ 1    ┆ 1   ┆ 1.0            │
# │ null ┆ 2   ┆ 1.285714       │
# │ null ┆ 7   ┆ 2.714286       │
# │ 3    ┆ 8   ┆ 3.0            │
# └──────┴─────┴────────────────┘

Parameters:

  • by (Expr)

    Column to interpolate values based on.

Returns:



4647
4648
4649
4650
# File 'lib/polars/expr.rb', line 4647

def interpolate_by(by)
  by = Utils.parse_into_expression(by)
  wrap_expr(_rbexpr.interpolate_by(by))
end
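The values above follow from standard linear interpolation weighted by the by column: a null at position i between known points (b_lo, y0) and (b_hi, y1) is filled with y0 + (y1 - y0) * (b_i - b_lo) / (b_hi - b_lo). A standalone sketch of that formula (the interpolate_by helper here is illustrative):

```ruby
# Linear interpolation of nil entries in `values`, weighted by `by`.
def interpolate_by(values, by)
  values.each_with_index.map do |v, i|
    next v.to_f unless v.nil?
    lo = (i - 1).downto(0).find { |j| values[j] }          # nearest known left
    hi = (i + 1).upto(values.size - 1).find { |j| values[j] }  # nearest known right
    next nil unless lo && hi                               # leading/trailing nulls stay nil
    y0, y1 = values[lo].to_f, values[hi].to_f
    y0 + (y1 - y0) * (by[i] - by[lo]) / (by[hi] - by[lo]).to_f
  end
end

interpolate_by([1, nil, nil, 3], [1, 2, 7, 8]).map { |v| v.round(6) }
# => [1.0, 1.285714, 2.714286, 3.0]
```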

#is_between(lower_bound, upper_bound, closed: "both") ⇒ Expr

Check if this expression is between start and end.

Examples:

df = Polars::DataFrame.new({"num" => [1, 2, 3, 4, 5]})
df.with_columns(Polars.col("num").is_between(2, 4).alias("is_between"))
# =>
# shape: (5, 2)
# ┌─────┬────────────┐
# │ num ┆ is_between │
# │ --- ┆ ---        │
# │ i64 ┆ bool       │
# ╞═════╪════════════╡
# │ 1   ┆ false      │
# │ 2   ┆ true       │
# │ 3   ┆ true       │
# │ 4   ┆ true       │
# │ 5   ┆ false      │
# └─────┴────────────┘

Use the closed argument to include or exclude the values at the bounds:

df.with_columns(
  Polars.col("num").is_between(2, 4, closed: "left").alias("is_between")
)
# =>
# shape: (5, 2)
# ┌─────┬────────────┐
# │ num ┆ is_between │
# │ --- ┆ ---        │
# │ i64 ┆ bool       │
# ╞═════╪════════════╡
# │ 1   ┆ false      │
# │ 2   ┆ true       │
# │ 3   ┆ true       │
# │ 4   ┆ false      │
# │ 5   ┆ false      │
# └─────┴────────────┘

You can also use strings as well as numeric/temporal values:

df = Polars::DataFrame.new({"a" => ["a", "b", "c", "d", "e"]})
df.with_columns(
  Polars.col("a")
    .is_between(Polars.lit("a"), Polars.lit("c"), closed: "both")
    .alias("is_between")
)
# =>
# shape: (5, 2)
# ┌─────┬────────────┐
# │ a   ┆ is_between │
# │ --- ┆ ---        │
# │ str ┆ bool       │
# ╞═════╪════════════╡
# │ a   ┆ true       │
# │ b   ┆ true       │
# │ c   ┆ true       │
# │ d   ┆ false      │
# │ e   ┆ false      │
# └─────┴────────────┘

Parameters:

  • lower_bound (Object)

    Lower bound as primitive type or datetime.

  • upper_bound (Object)

    Upper bound as primitive type or datetime.

  • closed ("both", "left", "right", "none") (defaults to: "both")

    Define which sides of the interval are closed (inclusive).

Returns:



4436
4437
4438
4439
4440
4441
4442
4443
# File 'lib/polars/expr.rb', line 4436

def is_between(lower_bound, upper_bound, closed: "both")
  lower_bound = Utils.parse_into_expression(lower_bound)
  upper_bound = Utils.parse_into_expression(upper_bound)

  wrap_expr(
    _rbexpr.is_between(lower_bound, upper_bound, closed)
  )
end

#is_close(other, abs_tol: 0.0, rel_tol: 1e-09, nans_equal: false) ⇒ Expr

Check if this expression is close, i.e. almost equal, to the other expression.

Examples:

df = Polars::DataFrame.new({"a" => [1.5, 2.0, 2.5], "b" => [1.55, 2.2, 3.0]})
df.with_columns(Polars.col("a").is_close("b", abs_tol: 0.1).alias("is_close"))
# =>
# shape: (3, 3)
# ┌─────┬──────┬──────────┐
# │ a   ┆ b    ┆ is_close │
# │ --- ┆ ---  ┆ ---      │
# │ f64 ┆ f64  ┆ bool     │
# ╞═════╪══════╪══════════╡
# │ 1.5 ┆ 1.55 ┆ true     │
# │ 2.0 ┆ 2.2  ┆ false    │
# │ 2.5 ┆ 3.0  ┆ false    │
# └─────┴──────┴──────────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

  • abs_tol (Float) (defaults to: 0.0)

    Absolute tolerance. This is the maximum allowed absolute difference between two values. Must be non-negative.

  • rel_tol (Float) (defaults to: 1e-09)

    Relative tolerance. This is the maximum allowed difference between two values, relative to the larger absolute value. Must be in the range [0, 1).

  • nans_equal (Boolean) (defaults to: false)

    Whether NaN values should be considered equal.

Returns:



4472
4473
4474
4475
4476
4477
4478
4479
4480
# File 'lib/polars/expr.rb', line 4472

def is_close(
  other,
  abs_tol: 0.0,
  rel_tol: 1e-09,
  nans_equal: false
)
  other = Utils.parse_into_expression(other)
  wrap_expr(_rbexpr.is_close(other, abs_tol, rel_tol, nans_equal))
end
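The two tolerances combine as |a - b| <= max(rel_tol * max(|a|, |b|), abs_tol), the rule the Polars documentation gives for closeness. A plain-Ruby sketch of that check (the close? helper is illustrative, not the engine implementation):

```ruby
# Approximate equality combining absolute and relative tolerance.
def close?(a, b, abs_tol: 0.0, rel_tol: 1e-09, nans_equal: false)
  # NaN never compares equal unless nans_equal is set.
  return nans_equal if a.respond_to?(:nan?) && a.nan? && b.respond_to?(:nan?) && b.nan?
  (a - b).abs <= [rel_tol * [a.abs, b.abs].max, abs_tol].max
end

close?(1.5, 1.55, abs_tol: 0.1)  # => true, matching the first row above
```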

#is_duplicated ⇒ Expr

Get mask of duplicated values.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2]})
df.select(Polars.col("a").is_duplicated)
# =>
# shape: (3, 1)
# ┌───────┐
# │ a     │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ true  │
# │ true  │
# │ false │
# └───────┘

Returns:



2968
2969
2970
# File 'lib/polars/expr.rb', line 2968

def is_duplicated
  wrap_expr(_rbexpr.is_duplicated)
end

#is_finite ⇒ Expr

Returns a boolean Series indicating which values are finite.

Examples:

df = Polars::DataFrame.new(
  {
    "A" => [1.0, 2],
    "B" => [3.0, Float::INFINITY]
  }
)
df.select(Polars.all.is_finite)
# =>
# shape: (2, 2)
# ┌──────┬───────┐
# │ A    ┆ B     │
# │ ---  ┆ ---   │
# │ bool ┆ bool  │
# ╞══════╪═══════╡
# │ true ┆ true  │
# │ true ┆ false │
# └──────┴───────┘

Returns:



681
682
683
# File 'lib/polars/expr.rb', line 681

def is_finite
  wrap_expr(_rbexpr.is_finite)
end

#is_first_distinct ⇒ Expr Also known as: is_first

Return a boolean mask indicating the first occurrence of each distinct value.

Examples:

df = Polars::DataFrame.new(
  {
    "num" => [1, 2, 3, 1, 5]
  }
)
df.with_column(Polars.col("num").is_first.alias("is_first"))
# =>
# shape: (5, 2)
# ┌─────┬──────────┐
# │ num ┆ is_first │
# │ --- ┆ ---      │
# │ i64 ┆ bool     │
# ╞═════╪══════════╡
# │ 1   ┆ true     │
# │ 2   ┆ true     │
# │ 3   ┆ true     │
# │ 1   ┆ false    │
# │ 5   ┆ true     │
# └─────┴──────────┘

Returns:



2921
2922
2923
# File 'lib/polars/expr.rb', line 2921

def is_first_distinct
  wrap_expr(_rbexpr.is_first_distinct)
end
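The mask semantics can be sketched in plain Ruby with a seen-set (illustrative, not the engine implementation):

```ruby
# true for the first occurrence of each distinct value, false afterwards.
def is_first_distinct(xs)
  seen = {}
  xs.map { |x| seen.key?(x) ? false : (seen[x] = true) }
end

is_first_distinct([1, 2, 3, 1, 5])  # => [true, true, true, false, true]
```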

#is_in(other, nulls_equal: false) ⇒ Expr Also known as: in?

Check if elements of this expression are present in the other Series.

Examples:

df = Polars::DataFrame.new(
  {"sets" => [[1, 2, 3], [1, 2], [9, 10]], "optional_members" => [1, 2, 3]}
)
df.with_columns(contains: Polars.col("optional_members").is_in("sets"))
# =>
# shape: (3, 3)
# ┌───────────┬──────────────────┬──────────┐
# │ sets      ┆ optional_members ┆ contains │
# │ ---       ┆ ---              ┆ ---      │
# │ list[i64] ┆ i64              ┆ bool     │
# ╞═══════════╪══════════════════╪══════════╡
# │ [1, 2, 3] ┆ 1                ┆ true     │
# │ [1, 2]    ┆ 2                ┆ true     │
# │ [9, 10]   ┆ 3                ┆ false    │
# └───────────┴──────────────────┴──────────┘

Parameters:

  • other (Object)

    Series or sequence of primitive type.

  • nulls_equal (Boolean) (defaults to: false)

    If true, treat null as a distinct value. Null values will not propagate.

Returns:



4329
4330
4331
4332
# File 'lib/polars/expr.rb', line 4329

def is_in(other, nulls_equal: false)
  other = Utils.parse_into_expression(other)
  wrap_expr(_rbexpr.is_in(other, nulls_equal))
end

#is_infinite ⇒ Expr

Returns a boolean Series indicating which values are infinite.

Examples:

df = Polars::DataFrame.new(
  {
    "A" => [1.0, 2],
    "B" => [3.0, Float::INFINITY]
  }
)
df.select(Polars.all.is_infinite)
# =>
# shape: (2, 2)
# ┌───────┬───────┐
# │ A     ┆ B     │
# │ ---   ┆ ---   │
# │ bool  ┆ bool  │
# ╞═══════╪═══════╡
# │ false ┆ false │
# │ false ┆ true  │
# └───────┴───────┘

Returns:



707
708
709
# File 'lib/polars/expr.rb', line 707

def is_infinite
  wrap_expr(_rbexpr.is_infinite)
end

#is_last_distinct ⇒ Expr

Return a boolean mask indicating the last occurrence of each distinct value.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2, 3, 2]})
df.with_columns(Polars.col("a").is_last_distinct.alias("last"))
# =>
# shape: (5, 2)
# ┌─────┬───────┐
# │ a   ┆ last  │
# │ --- ┆ ---   │
# │ i64 ┆ bool  │
# ╞═════╪═══════╡
# │ 1   ┆ false │
# │ 1   ┆ true  │
# │ 2   ┆ false │
# │ 3   ┆ true  │
# │ 2   ┆ true  │
# └─────┴───────┘

Returns:



2946
2947
2948
# File 'lib/polars/expr.rb', line 2946

def is_last_distinct
  wrap_expr(_rbexpr.is_last_distinct)
end

#is_nan ⇒ Expr

Note:

Floating point NaN (Not A Number) should not be confused with missing data represented as nil.

Returns a boolean Series indicating which values are NaN.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, nil, 1, 5],
    "b" => [1.0, 2.0, Float::NAN, 1.0, 5.0]
  }
)
df.with_column(Polars.col(Polars::Float64).is_nan.suffix("_isnan"))
# =>
# shape: (5, 3)
# ┌──────┬─────┬─────────┐
# │ a    ┆ b   ┆ b_isnan │
# │ ---  ┆ --- ┆ ---     │
# │ i64  ┆ f64 ┆ bool    │
# ╞══════╪═════╪═════════╡
# │ 1    ┆ 1.0 ┆ false   │
# │ 2    ┆ 2.0 ┆ false   │
# │ null ┆ NaN ┆ true    │
# │ 1    ┆ 1.0 ┆ false   │
# │ 5    ┆ 5.0 ┆ false   │
# └──────┴─────┴─────────┘

Returns:



740
741
742
# File 'lib/polars/expr.rb', line 740

def is_nan
  wrap_expr(_rbexpr.is_nan)
end

#is_not ⇒ Expr Also known as: not_

Negate a boolean expression.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [true, false, false],
    "b" => ["a", "b", nil]
  }
)
# =>
# shape: (3, 2)
# ┌───────┬──────┐
# │ a     ┆ b    │
# │ ---   ┆ ---  │
# │ bool  ┆ str  │
# ╞═══════╪══════╡
# │ true  ┆ a    │
# │ false ┆ b    │
# │ false ┆ null │
# └───────┴──────┘
df.select(Polars.col("a").is_not)
# =>
# shape: (3, 1)
# ┌───────┐
# │ a     │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ false │
# │ true  │
# │ true  │
# └───────┘

Returns:



596
597
598
# File 'lib/polars/expr.rb', line 596

def is_not
  wrap_expr(_rbexpr.not_)
end

#is_not_nan ⇒ Expr

Note:

Floating point NaN (Not A Number) should not be confused with missing data represented as nil.

Returns a boolean Series indicating which values are not NaN.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, nil, 1, 5],
    "b" => [1.0, 2.0, Float::NAN, 1.0, 5.0]
  }
)
df.with_column(Polars.col(Polars::Float64).is_not_nan.suffix("_is_not_nan"))
# =>
# shape: (5, 3)
# ┌──────┬─────┬──────────────┐
# │ a    ┆ b   ┆ b_is_not_nan │
# │ ---  ┆ --- ┆ ---          │
# │ i64  ┆ f64 ┆ bool         │
# ╞══════╪═════╪══════════════╡
# │ 1    ┆ 1.0 ┆ true         │
# │ 2    ┆ 2.0 ┆ true         │
# │ null ┆ NaN ┆ false        │
# │ 1    ┆ 1.0 ┆ true         │
# │ 5    ┆ 5.0 ┆ true         │
# └──────┴─────┴──────────────┘

Returns:



773
774
775
# File 'lib/polars/expr.rb', line 773

def is_not_nan
  wrap_expr(_rbexpr.is_not_nan)
end

#is_not_null ⇒ Expr

Returns a boolean Series indicating which values are not null.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, nil, 1, 5],
    "b" => [1.0, 2.0, Float::NAN, 1.0, 5.0]
  }
)
df.with_column(Polars.all.is_not_null.suffix("_not_null"))
# =>
# shape: (5, 4)
# ┌──────┬─────┬────────────┬────────────┐
# │ a    ┆ b   ┆ a_not_null ┆ b_not_null │
# │ ---  ┆ --- ┆ ---        ┆ ---        │
# │ i64  ┆ f64 ┆ bool       ┆ bool       │
# ╞══════╪═════╪════════════╪════════════╡
# │ 1    ┆ 1.0 ┆ true       ┆ true       │
# │ 2    ┆ 2.0 ┆ true       ┆ true       │
# │ null ┆ NaN ┆ false      ┆ true       │
# │ 1    ┆ 1.0 ┆ true       ┆ true       │
# │ 5    ┆ 5.0 ┆ true       ┆ true       │
# └──────┴─────┴────────────┴────────────┘

Returns:



655
656
657
# File 'lib/polars/expr.rb', line 655

def is_not_null
  wrap_expr(_rbexpr.is_not_null)
end

#is_null ⇒ Expr

Returns a boolean Series indicating which values are null.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, nil, 1, 5],
    "b" => [1.0, 2.0, Float::NAN, 1.0, 5.0]
  }
)
df.with_column(Polars.all.is_null.suffix("_isnull"))
# =>
# shape: (5, 4)
# ┌──────┬─────┬──────────┬──────────┐
# │ a    ┆ b   ┆ a_isnull ┆ b_isnull │
# │ ---  ┆ --- ┆ ---      ┆ ---      │
# │ i64  ┆ f64 ┆ bool     ┆ bool     │
# ╞══════╪═════╪══════════╪══════════╡
# │ 1    ┆ 1.0 ┆ false    ┆ false    │
# │ 2    ┆ 2.0 ┆ false    ┆ false    │
# │ null ┆ NaN ┆ true     ┆ false    │
# │ 1    ┆ 1.0 ┆ false    ┆ false    │
# │ 5    ┆ 5.0 ┆ false    ┆ false    │
# └──────┴─────┴──────────┴──────────┘

Returns:



626
627
628
# File 'lib/polars/expr.rb', line 626

def is_null
  wrap_expr(_rbexpr.is_null)
end

#is_unique ⇒ Expr

Get mask of unique values.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2]})
df.select(Polars.col("a").is_unique)
# =>
# shape: (3, 1)
# ┌───────┐
# │ a     │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ false │
# │ false │
# │ true  │
# └───────┘

Returns:



2893
2894
2895
# File 'lib/polars/expr.rb', line 2893

def is_unique
  wrap_expr(_rbexpr.is_unique)
end

#keep_name ⇒ Expr

Keep the original root name of the expression.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2],
    "b" => [3, 4]
  }
)
df.with_columns([(Polars.col("a") * 9).alias("c").keep_name])
# =>
# shape: (2, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 9   ┆ 3   │
# │ 18  ┆ 4   │
# └─────┴─────┘

Returns:



474
475
476
# File 'lib/polars/expr.rb', line 474

def keep_name
  name.keep
end

#kurtosis(fisher: true, bias: true) ⇒ Expr

Compute the kurtosis (Fisher or Pearson) of a dataset.

Kurtosis is the fourth central moment divided by the square of the variance. If Fisher's definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution. If bias is false, the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 2, 1]})
df.select(Polars.col("a").kurtosis)
# =>
# shape: (1, 1)
# ┌───────────┐
# │ a         │
# │ ---       │
# │ f64       │
# ╞═══════════╡
# │ -1.153061 │
# └───────────┘

Parameters:

  • fisher (Boolean) (defaults to: true)

    If true, Fisher's definition is used (normal ==> 0.0). If false, Pearson's definition is used (normal ==> 3.0).

  • bias (Boolean) (defaults to: true)

    If false, the calculations are corrected for statistical bias.

Returns:



6680
6681
6682
# File 'lib/polars/expr.rb', line 6680

def kurtosis(fisher: true, bias: true)
  wrap_expr(_rbexpr.kurtosis(fisher, bias))
end
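The definition above can be checked directly: with the default bias: true, Fisher kurtosis is m4 / m2**2 - 3, where m2 and m4 are the second and fourth central moments. A plain-Ruby sketch (the fisher_kurtosis helper is illustrative) reproduces the example value:

```ruby
# Biased (bias: true) Fisher kurtosis: fourth central moment over
# squared variance, minus 3 so a normal distribution scores 0.0.
def fisher_kurtosis(xs)
  n = xs.size.to_f
  mean = xs.sum / n
  m2 = xs.sum { |x| (x - mean)**2 } / n
  m4 = xs.sum { |x| (x - mean)**4 } / n
  m4 / m2**2 - 3.0
end

fisher_kurtosis([1, 2, 3, 2, 1]).round(6)  # => -1.153061
```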

#last ⇒ Expr

Get the last value.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2]})
df.select(Polars.col("a").last)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 2   │
# └─────┘

Returns:



2704
2705
2706
# File 'lib/polars/expr.rb', line 2704

def last
  wrap_expr(_rbexpr.last)
end

#le(other) ⇒ Expr

Method equivalent of "less than or equal" operator expr <= other.

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [5.0, 4.0, Float::NAN, 0.5],
    "y" => [5.0, 3.5, Float::NAN, 2.0]
  }
)
df.with_columns(
  Polars.col("x").le(Polars.col("y")).alias("x <= y")
)
# =>
# shape: (4, 3)
# ┌─────┬─────┬────────┐
# │ x   ┆ y   ┆ x <= y │
# │ --- ┆ --- ┆ ---    │
# │ f64 ┆ f64 ┆ bool   │
# ╞═════╪═════╪════════╡
# │ 5.0 ┆ 5.0 ┆ true   │
# │ 4.0 ┆ 3.5 ┆ false  │
# │ NaN ┆ NaN ┆ true   │
# │ 0.5 ┆ 2.0 ┆ true   │
# └─────┴─────┴────────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

Returns:



3922
3923
3924
# File 'lib/polars/expr.rb', line 3922

def le(other)
  self <= other
end

#len ⇒ Expr Also known as: length

Count the number of values in this expression.

Examples:

df = Polars::DataFrame.new({"a" => [8, 9, 10], "b" => [nil, 4, 4]})
df.select(Polars.all.len)
# =>
# shape: (1, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ u32 ┆ u32 │
# ╞═════╪═════╡
# │ 3   ┆ 3   │
# └─────┴─────┘

Returns:



848
849
850
# File 'lib/polars/expr.rb', line 848

def len
  wrap_expr(_rbexpr.len)
end

#limit(n = 10) ⇒ Expr

Get the first n rows.

Alias for #head.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2, 3, 4, 5, 6, 7]})
df.select(Polars.col("foo").limit(3))
# =>
# shape: (3, 1)
# ┌─────┐
# │ foo │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# │ 2   │
# │ 3   │
# └─────┘

Parameters:

  • n (Integer) (defaults to: 10)

    Number of rows to return.

Returns:



3669
3670
3671
# File 'lib/polars/expr.rb', line 3669

def limit(n = 10)
  head(n)
end

#list ⇒ ListExpr

Create an object namespace of all list related methods.

Returns:



8315
8316
8317
# File 'lib/polars/expr.rb', line 8315

def list
  ListExpr.new(self)
end

#log(base = Math::E) ⇒ Expr

Compute the logarithm to a given base.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").log(2))
# =>
# shape: (3, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 0.0      │
# │ 1.0      │
# │ 1.584963 │
# └──────────┘

Parameters:

  • base (Float) (defaults to: Math::E)

    Given base, defaults to e.

Returns:



7605
7606
7607
7608
# File 'lib/polars/expr.rb', line 7605

def log(base = Math::E)
  base_rbexpr = Utils.parse_into_expression(base)
  wrap_expr(_rbexpr.log(base_rbexpr))
end
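A base-b logarithm is the change-of-base rule ln(x) / ln(b); Ruby's Math.log takes an optional base argument, so the example values can be reproduced without Polars:

```ruby
# Change of base: log_b(x) = ln(x) / ln(b). Math.log accepts the base directly.
[1, 2, 3].map { |x| Math.log(x, 2).round(6) }
# => [0.0, 1.0, 1.584963]
```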

#log10 ⇒ Expr

Compute the base 10 logarithm of the input array, element-wise.

Examples:

df = Polars::DataFrame.new({"values" => [1.0, 2.0, 4.0]})
df.select(Polars.col("values").log10)
# =>
# shape: (3, 1)
# ┌─────────┐
# │ values  │
# │ ---     │
# │ f64     │
# ╞═════════╡
# │ 0.0     │
# │ 0.30103 │
# │ 0.60206 │
# └─────────┘

Returns:



353
354
355
# File 'lib/polars/expr.rb', line 353

def log10
  log(10)
end

#log1p ⇒ Expr

Compute the natural logarithm of each element plus one.

This computes log(1 + x) but is more numerically stable for x close to zero.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").log1p)
# =>
# shape: (3, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 0.693147 │
# │ 1.098612 │
# │ 1.386294 │
# └──────────┘

Returns:



7630
7631
7632
# File 'lib/polars/expr.rb', line 7630

def log1p
  wrap_expr(_rbexpr.log1p)
end

#lower_bound ⇒ Expr

Calculate the lower bound.

Returns a unit Series with the lowest value possible for the dtype of this expression.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 2, 1]})
df.select(Polars.col("a").lower_bound)
# =>
# shape: (1, 1)
# ┌──────────────────────┐
# │ a                    │
# │ ---                  │
# │ i64                  │
# ╞══════════════════════╡
# │ -9223372036854775808 │
# └──────────────────────┘

Returns:



6802
6803
6804
# File 'lib/polars/expr.rb', line 6802

def lower_bound
  wrap_expr(_rbexpr.lower_bound)
end

#lt(other) ⇒ Expr

Method equivalent of "less than" operator expr < other.

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [1.0, 2.0, Float::NAN, 3.0],
    "y" => [2.0, 2.0, Float::NAN, 4.0]
  }
)
df.with_columns(
  Polars.col("x").lt(Polars.col("y")).alias("x < y"),
)
# =>
# shape: (4, 3)
# ┌─────┬─────┬───────┐
# │ x   ┆ y   ┆ x < y │
# │ --- ┆ --- ┆ ---   │
# │ f64 ┆ f64 ┆ bool  │
# ╞═════╪═════╪═══════╡
# │ 1.0 ┆ 2.0 ┆ true  │
# │ 2.0 ┆ 2.0 ┆ false │
# │ NaN ┆ NaN ┆ false │
# │ 3.0 ┆ 4.0 ┆ true  │
# └─────┴─────┴───────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

Returns:



3955
3956
3957
# File 'lib/polars/expr.rb', line 3955

def lt(other)
  self < other
end

#map_alias(&f) ⇒ Expr

Rename the output of an expression by mapping a function over the root name.

Examples:

df = Polars::DataFrame.new(
  {
    "A" => [1, 2],
    "B" => [3, 4]
  }
)
df.select(
  Polars.all.reverse.map_alias { |colName| colName + "_reverse" }
)
# =>
# shape: (2, 2)
# ┌───────────┬───────────┐
# │ A_reverse ┆ B_reverse │
# │ ---       ┆ ---       │
# │ i64       ┆ i64       │
# ╞═══════════╪═══════════╡
# │ 2         ┆ 4         │
# │ 1         ┆ 3         │
# └───────────┴───────────┘

Returns:



556
557
558
# File 'lib/polars/expr.rb', line 556

def map_alias(&f)
  name.map(&f)
end

#max ⇒ Expr

Get maximum value.

Examples:

df = Polars::DataFrame.new({"a" => [-1.0, Float::NAN, 1.0]})
df.select(Polars.col("a").max)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.0 │
# └─────┘

Returns:



2358
2359
2360
# File 'lib/polars/expr.rb', line 2358

def max
  wrap_expr(_rbexpr.max)
end

#mean ⇒ Expr

Get mean value.

Examples:

df = Polars::DataFrame.new({"a" => [-1, 0, 1]})
df.select(Polars.col("a").mean)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 0.0 │
# └─────┘

Returns:



2462
2463
2464
# File 'lib/polars/expr.rb', line 2462

def mean
  wrap_expr(_rbexpr.mean)
end

#median ⇒ Expr

Get median value using linear interpolation.

Examples:

df = Polars::DataFrame.new({"a" => [-1, 0, 1]})
df.select(Polars.col("a").median)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 0.0 │
# └─────┘

Returns:



2482
2483
2484
# File 'lib/polars/expr.rb', line 2482

def median
  wrap_expr(_rbexpr.median)
end

#meta ⇒ MetaExpr

Create an object namespace of all meta related expression methods.

Returns:



8350
8351
8352
# File 'lib/polars/expr.rb', line 8350

def meta
  MetaExpr.new(self)
end

#min ⇒ Expr

Get minimum value.

Examples:

df = Polars::DataFrame.new({"a" => [-1.0, Float::NAN, 1.0]})
df.select(Polars.col("a").min)
# =>
# shape: (1, 1)
# ┌──────┐
# │ a    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ -1.0 │
# └──────┘

Returns:



2378
2379
2380
# File 'lib/polars/expr.rb', line 2378

def min
  wrap_expr(_rbexpr.min)
end

#mod(other) ⇒ Expr

Method equivalent of modulus operator expr % other.

Examples:

df = Polars::DataFrame.new({"x" => [0, 1, 2, 3, 4]})
df.with_columns(Polars.col("x").mod(2).alias("x%2"))
# =>
# shape: (5, 2)
# ┌─────┬─────┐
# │ x   ┆ x%2 │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 0   ┆ 0   │
# │ 1   ┆ 1   │
# │ 2   ┆ 0   │
# │ 3   ┆ 1   │
# │ 4   ┆ 0   │
# └─────┴─────┘

Parameters:

  • other (Object)

    Numeric literal or expression value.

Returns:



4131
4132
4133
# File 'lib/polars/expr.rb', line 4131

def mod(other)
  self % other
end

#mode ⇒ Expr

Compute the most occurring value(s).

Can return multiple values.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 1, 2, 3],
    "b" => [1, 1, 2, 2]
  }
)
df.select(Polars.all.mode.first)
# =>
# shape: (2, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 1   ┆ 1   │
# │ 1   ┆ 2   │
# └─────┴─────┘

Returns:



1335
1336
1337
# File 'lib/polars/expr.rb', line 1335

def mode
  wrap_expr(_rbexpr.mode)
end

#mul(other) ⇒ Expr

Method equivalent of multiplication operator expr * other.

Examples:

df = Polars::DataFrame.new({"x" => [1, 2, 4, 8, 16]})
df.with_columns(
  Polars.col("x").mul(2).alias("x*2"),
  Polars.col("x").mul(Polars.col("x").log(2)).alias("x * xlog2"),
)
# =>
# shape: (5, 3)
# ┌─────┬─────┬───────────┐
# │ x   ┆ x*2 ┆ x * xlog2 │
# │ --- ┆ --- ┆ ---       │
# │ i64 ┆ i64 ┆ f64       │
# ╞═════╪═════╪═══════════╡
# │ 1   ┆ 2   ┆ 0.0       │
# │ 2   ┆ 4   ┆ 2.0       │
# │ 4   ┆ 8   ┆ 8.0       │
# │ 8   ┆ 16  ┆ 24.0      │
# │ 16  ┆ 32  ┆ 64.0      │
# └─────┴─────┴───────────┘

Parameters:

  • other (Object)

    Numeric literal or expression value.

Returns:



4161
4162
4163
# File 'lib/polars/expr.rb', line 4161

def mul(other)
  self * other
end

#n_unique ⇒ Expr

Count unique values.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2]})
df.select(Polars.col("a").n_unique)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 2   │
# └─────┘

Returns:



2522
2523
2524
# File 'lib/polars/expr.rb', line 2522

def n_unique
  wrap_expr(_rbexpr.n_unique)
end

#name ⇒ NameExpr

Create an object namespace of all expressions that modify expression names.

Returns:



8357
8358
8359
# File 'lib/polars/expr.rb', line 8357

def name
  NameExpr.new(self)
end

#nan_max ⇒ Expr

Get maximum value, but propagate/poison encountered NaN values.

Examples:

df = Polars::DataFrame.new({"a" => [0.0, Float::NAN]})
df.select(Polars.col("a").nan_max)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ NaN │
# └─────┘

Returns:



2398
2399
2400
# File 'lib/polars/expr.rb', line 2398

def nan_max
  wrap_expr(_rbexpr.nan_max)
end

#nan_min ⇒ Expr

Get minimum value, but propagate/poison encountered NaN values.

Examples:

df = Polars::DataFrame.new({"a" => [0.0, Float::NAN]})
df.select(Polars.col("a").nan_min)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ NaN │
# └─────┘

Returns:



2418
2419
2420
# File 'lib/polars/expr.rb', line 2418

def nan_min
  wrap_expr(_rbexpr.nan_min)
end

#ne(other) ⇒ Expr

Method equivalent of inequality operator expr != other.

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [1.0, 2.0, Float::NAN, 4.0],
    "y" => [2.0, 2.0, Float::NAN, 4.0]
  }
)
df.with_columns(
  Polars.col("x").ne(Polars.col("y")).alias("x != y"),
)
# =>
# shape: (4, 3)
# ┌─────┬─────┬────────┐
# │ x   ┆ y   ┆ x != y │
# │ --- ┆ --- ┆ ---    │
# │ f64 ┆ f64 ┆ bool   │
# ╞═════╪═════╪════════╡
# │ 1.0 ┆ 2.0 ┆ true   │
# │ 2.0 ┆ 2.0 ┆ false  │
# │ NaN ┆ NaN ┆ false  │
# │ 4.0 ┆ 4.0 ┆ false  │
# └─────┴─────┴────────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

Returns:



3988
3989
3990
# File 'lib/polars/expr.rb', line 3988

def ne(other)
  self != other
end

#ne_missing(other) ⇒ Expr

Method equivalent of inequality operator expr != other where nil == nil.

This differs from the default ne, where null values are propagated.

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [1.0, 2.0, Float::NAN, 4.0, nil, nil],
    "y" => [2.0, 2.0, Float::NAN, 4.0, 5.0, nil]
  }
)
df.with_columns(
  Polars.col("x").ne(Polars.col("y")).alias("x ne y"),
  Polars.col("x").ne_missing(Polars.col("y")).alias("x ne_missing y")
)
# =>
# shape: (6, 4)
# ┌──────┬──────┬────────┬────────────────┐
# │ x    ┆ y    ┆ x ne y ┆ x ne_missing y │
# │ ---  ┆ ---  ┆ ---    ┆ ---            │
# │ f64  ┆ f64  ┆ bool   ┆ bool           │
# ╞══════╪══════╪════════╪════════════════╡
# │ 1.0  ┆ 2.0  ┆ true   ┆ true           │
# │ 2.0  ┆ 2.0  ┆ false  ┆ false          │
# │ NaN  ┆ NaN  ┆ false  ┆ false          │
# │ 4.0  ┆ 4.0  ┆ false  ┆ false          │
# │ null ┆ 5.0  ┆ null   ┆ true           │
# │ null ┆ null ┆ null   ┆ false          │
# └──────┴──────┴────────┴────────────────┘

Parameters:

  • other (Object)

    A literal or expression value to compare with.

Returns:



4026
4027
4028
4029
# File 'lib/polars/expr.rb', line 4026

def ne_missing(other)
  other = Utils.parse_into_expression(other, str_as_lit: true)
  wrap_expr(_rbexpr.neq_missing(other))
end

#neg ⇒ Expr

Method equivalent of unary minus operator -expr.

Examples:

df = Polars::DataFrame.new({"a" => [-1, 0, 2, nil]})
df.with_columns(Polars.col("a").neg)
# =>
# shape: (4, 1)
# ┌──────┐
# │ a    │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ 1    │
# │ 0    │
# │ -2   │
# │ null │
# └──────┘

Returns:



4214
4215
4216
# File 'lib/polars/expr.rb', line 4214

def neg
  -self
end

#null_count ⇒ Expr

Count null values.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [nil, 1, nil],
    "b" => [1, 2, 3]
  }
)
df.select(Polars.all.null_count)
# =>
# shape: (1, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ u32 ┆ u32 │
# ╞═════╪═════╡
# │ 2   ┆ 0   │
# └─────┴─────┘

Returns:



2570
2571
2572
# File 'lib/polars/expr.rb', line 2570

def null_count
  wrap_expr(_rbexpr.null_count)
end

#or_(*others) ⇒ Expr

Method equivalent of bitwise "or" operator expr | other | ....

Examples:

df = Polars::DataFrame.new(
  {
    "x" => [5, 6, 7, 4, 8],
    "y" => [1.5, 2.5, 1.0, 4.0, -5.75],
    "z" => [-9, 2, -1, 4, 8]
  }
)
df.select(
  (Polars.col("x") == Polars.col("y"))
  .or_(
    Polars.col("x") == Polars.col("y"),
    Polars.col("y") == Polars.col("z"),
    Polars.col("y").cast(Integer) == Polars.col("z"),
  )
  .alias("any")
)
# =>
# shape: (5, 1)
# ┌───────┐
# │ any   │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ false │
# │ true  │
# │ false │
# │ true  │
# │ false │
# └───────┘

Parameters:

  • others (Array)

    One or more integer or boolean expressions to evaluate/combine.

Returns:



3752
3753
3754
# File 'lib/polars/expr.rb', line 3752

def or_(*others)
  ([self] + others).reduce(:|)
end

#over(expr) ⇒ Expr

Apply window function over a subgroup.

This is similar to a group by + aggregation + self join, or to window functions in PostgreSQL.

Examples:

df = Polars::DataFrame.new(
  {
    "groups" => ["g1", "g1", "g2"],
    "values" => [1, 2, 3]
  }
)
df.with_column(
  Polars.col("values").max.over("groups").alias("max_by_group")
)
# =>
# shape: (3, 3)
# ┌────────┬────────┬──────────────┐
# │ groups ┆ values ┆ max_by_group │
# │ ---    ┆ ---    ┆ ---          │
# │ str    ┆ i64    ┆ i64          │
# ╞════════╪════════╪══════════════╡
# │ g1     ┆ 1      ┆ 2            │
# │ g1     ┆ 2      ┆ 2            │
# │ g2     ┆ 3      ┆ 3            │
# └────────┴────────┴──────────────┘
df = Polars::DataFrame.new(
  {
    "groups" => [1, 1, 2, 2, 1, 2, 3, 3, 1],
    "values" => [1, 2, 3, 4, 5, 6, 7, 8, 8]
  }
)
df.lazy
  .select([Polars.col("groups").sum.over("groups")])
  .collect
# =>
# shape: (9, 1)
# ┌────────┐
# │ groups │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 4      │
# │ 4      │
# │ 6      │
# │ 6      │
# │ 4      │
# │ 6      │
# │ 6      │
# │ 6      │
# │ 4      │
# └────────┘

Parameters:

  • expr (Object)

    Column(s) to group by.

Returns:



2767
2768
2769
2770
# File 'lib/polars/expr.rb', line 2767

def over(expr)
  rbexprs = Utils.parse_into_list_of_expressions(expr)
  wrap_expr(_rbexpr.over(rbexprs))
end
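The "group by + aggregation + self join" analogy can be sketched in plain Ruby for the max example above (this illustrates only the semantics, not how Polars executes it):

```ruby
# Aggregate per group, then broadcast the result back onto every row of
# its group: the same shape-preserving behavior as a window function.
groups = ["g1", "g1", "g2"]
values = [1, 2, 3]
max_by_group = groups.zip(values)
                     .group_by(&:first)
                     .transform_values { |pairs| pairs.map(&:last).max }
broadcast = groups.map { |g| max_by_group[g] }
broadcast # => [2, 2, 3]
```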

#pct_change(n: 1) ⇒ Expr

Computes percentage change between values.

Percentage change (as a fraction) between the current element and the most recent non-null element at least n period(s) before it.

Computes the change from the previous row by default.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [10, 11, 12, nil, 12]
  }
)
df.with_column(Polars.col("a").pct_change.alias("pct_change"))
# =>
# shape: (5, 2)
# ┌──────┬────────────┐
# │ a    ┆ pct_change │
# │ ---  ┆ ---        │
# │ i64  ┆ f64        │
# ╞══════╪════════════╡
# │ 10   ┆ null       │
# │ 11   ┆ 0.1        │
# │ 12   ┆ 0.090909   │
# │ null ┆ 0.0        │
# │ 12   ┆ 0.0        │
# └──────┴────────────┘

Parameters:

  • n (Integer) (defaults to: 1)

    Periods to shift for forming percent change.

Returns:



6618
6619
6620
6621
# File 'lib/polars/expr.rb', line 6618

def pct_change(n: 1)
  n = Utils.parse_into_expression(n)
  wrap_expr(_rbexpr.pct_change(n))
end
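The example output (including the `0.0` on the null row) can be reproduced by a plain-Ruby sketch that forward-fills nulls before differencing; the forward-fill step is an assumption made to match the example, not a description of the Polars internals:

```ruby
# Forward-fill nulls, then compute (x[i] - x[i - n]) / x[i - n].
def pct_change_sketch(values, n = 1)
  filled = []
  values.each { |v| filled << (v.nil? ? filled.last : v) }
  filled.each_index.map do |i|
    prev = i >= n ? filled[i - n] : nil
    prev.nil? ? nil : (filled[i] - prev).fdiv(prev)
  end
end

pct_change_sketch([10, 11, 12, nil, 12])
# => [nil, 0.1, 0.09090909090909091, 0.0, 0.0]
```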

#peak_maxExpr

Get a boolean mask of the local maximum peaks.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 4, 5]})
df.select(Polars.col("a").peak_max)
# =>
# shape: (5, 1)
# ┌───────┐
# │ a     │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ false │
# │ false │
# │ false │
# │ false │
# │ true  │
# └───────┘

Returns:



2992
2993
2994
# File 'lib/polars/expr.rb', line 2992

def peak_max
  wrap_expr(_rbexpr.peak_max)
end

#peak_minExpr

Get a boolean mask of the local minimum peaks.

Examples:

df = Polars::DataFrame.new({"a" => [4, 1, 3, 2, 5]})
df.select(Polars.col("a").peak_min)
# =>
# shape: (5, 1)
# ┌───────┐
# │ a     │
# │ ---   │
# │ bool  │
# ╞═══════╡
# │ false │
# │ true  │
# │ false │
# │ true  │
# │ false │
# └───────┘

Returns:



3016
3017
3018
# File 'lib/polars/expr.rb', line 3016

def peak_min
  wrap_expr(_rbexpr.peak_min)
end
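Both peak masks can be sketched in plain Ruby by comparing each element with its neighbors. The boundary rule used here (an endpoint counts as a peak when it beats its single neighbor) is an assumption chosen to reproduce the example outputs:

```ruby
# Compare each element against both neighbors with the given operator
# (:> for peak_max, :< for peak_min).
def peak_mask_sketch(values, cmp)
  last = values.size - 1
  values.each_index.map do |i|
    left = i.zero? || values[i].public_send(cmp, values[i - 1])
    right = i == last || values[i].public_send(cmp, values[i + 1])
    left && right
  end
end

peak_mask_sketch([1, 2, 3, 4, 5], :>) # => [false, false, false, false, true]
peak_mask_sketch([4, 1, 3, 2, 5], :<) # => [false, true, false, true, false]
```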

#pow(exponent) ⇒ Expr

Raise expression to the power of exponent.

Examples:

df = Polars::DataFrame.new({"x" => [1, 2, 4, 8]})
df.with_columns(
  Polars.col("x").pow(3).alias("cube"),
  Polars.col("x").pow(Polars.col("x").log(2)).alias("x ** xlog2")
)
# =>
# shape: (4, 3)
# ┌─────┬──────┬────────────┐
# │ x   ┆ cube ┆ x ** xlog2 │
# │ --- ┆ ---  ┆ ---        │
# │ i64 ┆ i64  ┆ f64        │
# ╞═════╪══════╪════════════╡
# │ 1   ┆ 1    ┆ 1.0        │
# │ 2   ┆ 8    ┆ 2.0        │
# │ 4   ┆ 64   ┆ 16.0       │
# │ 8   ┆ 512  ┆ 512.0      │
# └─────┴──────┴────────────┘

Returns:



4272
4273
4274
# File 'lib/polars/expr.rb', line 4272

def pow(exponent)
  self**exponent
end

#prefix(prefix) ⇒ Expr

Add a prefix to the root column name of the expression.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, 3],
    "b" => ["x", "y", "z"]
  }
)
df.with_columns(Polars.all.reverse.name.prefix("reverse_"))
# =>
# shape: (3, 4)
# ┌─────┬─────┬───────────┬───────────┐
# │ a   ┆ b   ┆ reverse_a ┆ reverse_b │
# │ --- ┆ --- ┆ ---       ┆ ---       │
# │ i64 ┆ str ┆ i64       ┆ str       │
# ╞═════╪═════╪═══════════╪═══════════╡
# │ 1   ┆ x   ┆ 3         ┆ z         │
# │ 2   ┆ y   ┆ 2         ┆ y         │
# │ 3   ┆ z   ┆ 1         ┆ x         │
# └─────┴─────┴───────────┴───────────┘

Returns:



501
502
503
# File 'lib/polars/expr.rb', line 501

def prefix(prefix)
  name.prefix(prefix)
end

#productExpr

Compute the product of an expression.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").product)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 6   │
# └─────┘

Returns:



2502
2503
2504
# File 'lib/polars/expr.rb', line 2502

def product
  wrap_expr(_rbexpr.product)
end

#qcut(quantiles, labels: nil, left_closed: false, allow_duplicates: false, include_breaks: false) ⇒ Expr

Bin continuous values into discrete categories based on their quantiles.

Examples:

Divide a column into three categories according to pre-defined quantile probabilities.

df = Polars::DataFrame.new({"foo" => [-2, -1, 0, 1, 2]})
df.with_columns(
  Polars.col("foo").qcut([0.25, 0.75], labels: ["a", "b", "c"]).alias("qcut")
)
# =>
# shape: (5, 2)
# ┌─────┬──────┐
# │ foo ┆ qcut │
# │ --- ┆ ---  │
# │ i64 ┆ cat  │
# ╞═════╪══════╡
# │ -2  ┆ a    │
# │ -1  ┆ a    │
# │ 0   ┆ b    │
# │ 1   ┆ b    │
# │ 2   ┆ c    │
# └─────┴──────┘

Divide a column into two categories using uniform quantile probabilities.

df.with_columns(
  Polars.col("foo")
    .qcut(2, labels: ["low", "high"], left_closed: true)
    .alias("qcut")
)
# =>
# shape: (5, 2)
# ┌─────┬──────┐
# │ foo ┆ qcut │
# │ --- ┆ ---  │
# │ i64 ┆ cat  │
# ╞═════╪══════╡
# │ -2  ┆ low  │
# │ -1  ┆ low  │
# │ 0   ┆ high │
# │ 1   ┆ high │
# │ 2   ┆ high │
# └─────┴──────┘

Add both the category and the breakpoint.

df.with_columns(
  Polars.col("foo").qcut([0.25, 0.75], include_breaks: true).alias("qcut")
).unnest("qcut")
# =>
# shape: (5, 3)
# ┌─────┬────────────┬────────────┐
# │ foo ┆ breakpoint ┆ category   │
# │ --- ┆ ---        ┆ ---        │
# │ i64 ┆ f64        ┆ cat        │
# ╞═════╪════════════╪════════════╡
# │ -2  ┆ -1.0       ┆ (-inf, -1] │
# │ -1  ┆ -1.0       ┆ (-inf, -1] │
# │ 0   ┆ 1.0        ┆ (-1, 1]    │
# │ 1   ┆ 1.0        ┆ (-1, 1]    │
# │ 2   ┆ inf        ┆ (1, inf]   │
# └─────┴────────────┴────────────┘

Parameters:

  • quantiles (Array)

    Either a list of quantile probabilities between 0 and 1 or a positive integer determining the number of bins with uniform probability.

  • labels (Array) (defaults to: nil)

    Names of the categories. The number of labels must be equal to the number of categories.

  • left_closed (Boolean) (defaults to: false)

    Set the intervals to be left-closed instead of right-closed.

  • allow_duplicates (Boolean) (defaults to: false)

    If set to true, duplicates in the resulting quantiles are dropped, rather than raising a DuplicateError. This can happen even with unique probabilities, depending on the data.

  • include_breaks (Boolean) (defaults to: false)

    Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct.

Returns:



3227
3228
3229
3230
3231
3232
3233
3234
3235
3236
3237
3238
3239
# File 'lib/polars/expr.rb', line 3227

def qcut(quantiles, labels: nil, left_closed: false, allow_duplicates: false, include_breaks: false)
  if quantiles.is_a?(Integer)
    rbexpr = _rbexpr.qcut_uniform(
      quantiles, labels, left_closed, allow_duplicates, include_breaks
    )
  else
    rbexpr = _rbexpr.qcut(
      quantiles, labels, left_closed, allow_duplicates, include_breaks
    )
  end

  wrap_expr(rbexpr)
end
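The right-closed binning in the first example can be sketched in plain Ruby: compute the breakpoints with a linear quantile, then count how many breakpoints each value strictly exceeds. This is an illustration of the semantics only:

```ruby
# Breakpoints via the linear quantile, then right-closed bin assignment.
def qcut_sketch(values, probs, labels)
  sorted = values.sort
  breaks = probs.map do |q|
    h = q * (sorted.size - 1)
    sorted[h.floor] + (h - h.floor) * (sorted[h.ceil] - sorted[h.floor])
  end
  values.map { |v| labels[breaks.count { |b| v > b }] }
end

qcut_sketch([-2, -1, 0, 1, 2], [0.25, 0.75], ["a", "b", "c"])
# => ["a", "a", "b", "b", "c"]
```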

#quantile(quantile, interpolation: "nearest") ⇒ Expr

Get quantile value.

Examples:

df = Polars::DataFrame.new({"a" => [0, 1, 2, 3, 4, 5]})
df.select(Polars.col("a").quantile(0.3))
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 2.0 │
# └─────┘
df.select(Polars.col("a").quantile(0.3, interpolation: "higher"))
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 2.0 │
# └─────┘
df.select(Polars.col("a").quantile(0.3, interpolation: "lower"))
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.0 │
# └─────┘
df.select(Polars.col("a").quantile(0.3, interpolation: "midpoint"))
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.5 │
# └─────┘
df.select(Polars.col("a").quantile(0.3, interpolation: "linear"))
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.5 │
# └─────┘

Parameters:

  • quantile (Float)

    Quantile between 0.0 and 1.0.

  • interpolation ("nearest", "higher", "lower", "midpoint", "linear") (defaults to: "nearest")

    Interpolation method.

Returns:



3089
3090
3091
3092
# File 'lib/polars/expr.rb', line 3089

def quantile(quantile, interpolation: "nearest")
  quantile = Utils.parse_into_expression(quantile, str_as_lit: false)
  wrap_expr(_rbexpr.quantile(quantile, interpolation))
end
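The five interpolation strategies can be reproduced in plain Ruby on already-sorted data, using the index position `h = quantile * (n - 1)` (a sketch of the semantics that matches the example outputs above):

```ruby
# Pick or blend the two neighboring order statistics around h.
def quantile_sketch(sorted, q, interpolation)
  h = q * (sorted.size - 1)
  lo, hi = sorted[h.floor], sorted[h.ceil]
  case interpolation
  when "lower"    then lo.to_f
  when "higher"   then hi.to_f
  when "midpoint" then (lo + hi) / 2.0
  when "linear"   then lo + (h - h.floor) * (hi - lo).to_f
  when "nearest"  then sorted[h.round].to_f
  end
end

a = [0, 1, 2, 3, 4, 5]
quantile_sketch(a, 0.3, "linear") # => 1.5
```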

#radiansExpr

Convert from degrees to radians.

Examples:

df = Polars::DataFrame.new({"a" => [-720, -540, -360, -180, 0, 180, 360, 540, 720]})
df.select(Polars.col("a").radians)
# =>
# shape: (9, 1)
# ┌────────────┐
# │ a          │
# │ ---        │
# │ f64        │
# ╞════════════╡
# │ -12.566371 │
# │ -9.424778  │
# │ -6.283185  │
# │ -3.141593  │
# │ 0.0        │
# │ 3.141593   │
# │ 6.283185   │
# │ 9.424778   │
# │ 12.566371  │
# └────────────┘

Returns:



7165
7166
7167
# File 'lib/polars/expr.rb', line 7165

def radians
  wrap_expr(_rbexpr.radians)
end

#rank(method: "average", reverse: false, seed: nil) ⇒ Expr

Assign ranks to data, dealing with ties appropriately.

Examples:

The 'average' method:

df = Polars::DataFrame.new({"a" => [3, 6, 1, 1, 6]})
df.select(Polars.col("a").rank)
# =>
# shape: (5, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 3.0 │
# │ 4.5 │
# │ 1.5 │
# │ 1.5 │
# │ 4.5 │
# └─────┘

The 'ordinal' method:

df = Polars::DataFrame.new({"a" => [3, 6, 1, 1, 6]})
df.select(Polars.col("a").rank(method: "ordinal"))
# =>
# shape: (5, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 3   │
# │ 4   │
# │ 1   │
# │ 2   │
# │ 5   │
# └─────┘

Parameters:

  • method ("average", "min", "max", "dense", "ordinal", "random") (defaults to: "average")

    The method used to assign ranks to tied elements. The following methods are available:

    • 'average' : The average of the ranks that would have been assigned to all the tied values is assigned to each value.
    • 'min' : The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as "competition" ranking.)
    • 'max' : The maximum of the ranks that would have been assigned to all the tied values is assigned to each value.
    • 'dense' : Like 'min', but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements.
    • 'ordinal' : All values are given a distinct rank, corresponding to the order that the values occur in the Series.
    • 'random' : Like 'ordinal', but the rank for ties is not dependent on the order that the values occur in the Series.
  • reverse (Boolean) (defaults to: false)

    Reverse the operation.

  • seed (Integer) (defaults to: nil)

    If method: "random", use this as seed.

Returns:



6550
6551
6552
# File 'lib/polars/expr.rb', line 6550

def rank(method: "average", reverse: false, seed: nil)
  wrap_expr(_rbexpr.rank(method, reverse, seed))
end
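The deterministic tie-breaking strategies can be sketched in plain Ruby over the same input as the examples ('ordinal' and 'random' are omitted since they depend on occurrence order or a seed):

```ruby
# Derive min/max/dense/average ranks from positions in the sorted array.
def ranks_sketch(values, method)
  sorted = values.sort
  values.map do |v|
    first = sorted.index(v) + 1  # rank of the first tied occurrence
    last = sorted.rindex(v) + 1  # rank of the last tied occurrence
    case method
    when "min"     then first
    when "max"     then last
    when "dense"   then sorted.uniq.index(v) + 1
    when "average" then (first + last) / 2.0
    end
  end
end

ranks_sketch([3, 6, 1, 1, 6], "average") # => [3.0, 4.5, 1.5, 1.5, 4.5]
ranks_sketch([3, 6, 1, 1, 6], "dense")   # => [2, 3, 1, 1, 3]
```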

#rechunkExpr

Create a single chunk of memory for this Series.

Examples:

Create a Series with 3 nulls, append column a, then rechunk.

df = Polars::DataFrame.new({"a" => [1, 1, 2]})
df.select(Polars.repeat(nil, 3).append(Polars.col("a")).rechunk)
# =>
# shape: (6, 1)
# ┌────────┐
# │ repeat │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ null   │
# │ null   │
# │ null   │
# │ 1      │
# │ 1      │
# │ 2      │
# └────────┘

Returns:



946
947
948
# File 'lib/polars/expr.rb', line 946

def rechunk
  wrap_expr(_rbexpr.rechunk)
end

#reinterpret(signed: false) ⇒ Expr

Reinterpret the underlying bits as a signed/unsigned integer.

This operation is only allowed for 64-bit integers. For integer types with fewer bits, you can safely use the cast operation instead.

Examples:

s = Polars::Series.new("a", [1, 1, 2], dtype: :u64)
df = Polars::DataFrame.new([s])
df.select(
  [
    Polars.col("a").reinterpret(signed: true).alias("reinterpreted"),
    Polars.col("a").alias("original")
  ]
)
# =>
# shape: (3, 2)
# ┌───────────────┬──────────┐
# │ reinterpreted ┆ original │
# │ ---           ┆ ---      │
# │ i64           ┆ u64      │
# ╞═══════════════╪══════════╡
# │ 1             ┆ 1        │
# │ 1             ┆ 1        │
# │ 2             ┆ 2        │
# └───────────────┴──────────┘

Parameters:

  • signed (Boolean) (defaults to: false)

    If true, reinterpret as :i64. Otherwise, reinterpret as :u64.

Returns:



4554
4555
4556
# File 'lib/polars/expr.rb', line 4554

def reinterpret(signed: false)
  wrap_expr(_rbexpr.reinterpret(signed))
end
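The underlying bit reinterpretation can be demonstrated with Ruby's own pack/unpack: the 64 bits are kept as-is and simply re-read as signed (`"q"`) instead of unsigned (`"Q"`). This shows what "reinterpret" means at the bit level, independently of Polars:

```ruby
# Re-read the same 64 bits under a different integer interpretation.
def reinterpret_u64_as_i64(n)
  [n].pack("Q").unpack1("q")
end

reinterpret_u64_as_i64(1)         # => 1
reinterpret_u64_as_i64(2**64 - 1) # => -1 (all bits set)
```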

#repeat_by(by) ⇒ Expr

Repeat the elements in this Series as specified in the given expression.

The repeated elements are expanded into a List.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => ["x", "y", "z"],
    "n" => [1, 2, 3]
  }
)
df.select(Polars.col("a").repeat_by("n"))
# =>
# shape: (3, 1)
# ┌─────────────────┐
# │ a               │
# │ ---             │
# │ list[str]       │
# ╞═════════════════╡
# │ ["x"]           │
# │ ["y", "y"]      │
# │ ["z", "z", "z"] │
# └─────────────────┘

Parameters:

  • by (Object)

    Numeric column that determines how often the values will be repeated. The column will be coerced to UInt32; provide a UInt32 column to make the coercion a no-op.

Returns:



4365
4366
4367
4368
# File 'lib/polars/expr.rb', line 4365

def repeat_by(by)
  by = Utils.parse_into_expression(by, str_as_lit: false)
  wrap_expr(_rbexpr.repeat_by(by))
end
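The expansion into lists can be sketched in plain Ruby by pairing each element with its count:

```ruby
# Repeat each value according to the paired count, producing nested lists.
values = ["x", "y", "z"]
counts = [1, 2, 3]
repeated = values.zip(counts).map { |v, n| [v] * n }
repeated # => [["x"], ["y", "y"], ["z", "z", "z"]]
```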

#replace(old, new = NO_DEFAULT, default: NO_DEFAULT, return_dtype: nil) ⇒ Expr

Replace values by different values.

Examples:

Replace a single value by another value. Values that were not replaced remain unchanged.

df = Polars::DataFrame.new({"a" => [1, 2, 2, 3]})
df.with_columns(replaced: Polars.col("a").replace(2, 100))
# =>
# shape: (4, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ i64 ┆ i64      │
# ╞═════╪══════════╡
# │ 1   ┆ 1        │
# │ 2   ┆ 100      │
# │ 2   ┆ 100      │
# │ 3   ┆ 3        │
# └─────┴──────────┘

Replace multiple values by passing sequences to the old and new parameters.

df.with_columns(replaced: Polars.col("a").replace([2, 3], [100, 200]))
# =>
# shape: (4, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ i64 ┆ i64      │
# ╞═════╪══════════╡
# │ 1   ┆ 1        │
# │ 2   ┆ 100      │
# │ 2   ┆ 100      │
# │ 3   ┆ 200      │
# └─────┴──────────┘

Passing a mapping with replacements is also supported as syntactic sugar. Specify a default to set all values that were not matched.

mapping = {2 => 100, 3 => 200}
df.with_columns(replaced: Polars.col("a").replace(mapping, default: -1))
# =>
# shape: (4, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ i64 ┆ i64      │
# ╞═════╪══════════╡
# │ 1   ┆ -1       │
# │ 2   ┆ 100      │
# │ 2   ┆ 100      │
# │ 3   ┆ 200      │
# └─────┴──────────┘

Replacing by values of a different data type sets the return type based on a combination of the new data type and either the original data type or the default data type if it was set.

df = Polars::DataFrame.new({"a" => ["x", "y", "z"]})
mapping = {"x" => 1, "y" => 2, "z" => 3}
df.with_columns(replaced: Polars.col("a").replace(mapping))
# =>
# shape: (3, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ str ┆ str      │
# ╞═════╪══════════╡
# │ x   ┆ 1        │
# │ y   ┆ 2        │
# │ z   ┆ 3        │
# └─────┴──────────┘
df.with_columns(replaced: Polars.col("a").replace(mapping, default: nil))
# =>
# shape: (3, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ str ┆ i64      │
# ╞═════╪══════════╡
# │ x   ┆ 1        │
# │ y   ┆ 2        │
# │ z   ┆ 3        │
# └─────┴──────────┘

Set the return_dtype parameter to control the resulting data type directly.

df.with_columns(
  replaced: Polars.col("a").replace(mapping, return_dtype: Polars::UInt8)
)
# =>
# shape: (3, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ str ┆ u8       │
# ╞═════╪══════════╡
# │ x   ┆ 1        │
# │ y   ┆ 2        │
# │ z   ┆ 3        │
# └─────┴──────────┘

Expression input is supported for all parameters.

df = Polars::DataFrame.new({"a" => [1, 2, 2, 3], "b" => [1.5, 2.5, 5.0, 1.0]})
df.with_columns(
  replaced: Polars.col("a").replace(
    Polars.col("a").max,
    Polars.col("b").sum,
    default: Polars.col("b")
  )
)
# =>
# shape: (4, 3)
# ┌─────┬─────┬──────────┐
# │ a   ┆ b   ┆ replaced │
# │ --- ┆ --- ┆ ---      │
# │ i64 ┆ f64 ┆ f64      │
# ╞═════╪═════╪══════════╡
# │ 1   ┆ 1.5 ┆ 1.5      │
# │ 2   ┆ 2.5 ┆ 2.5      │
# │ 2   ┆ 5.0 ┆ 5.0      │
# │ 3   ┆ 1.0 ┆ 10.0     │
# └─────┴─────┴──────────┘

Parameters:

  • old (Object)

    Value or sequence of values to replace. Accepts expression input. Sequences are parsed as Series, other non-expression inputs are parsed as literals. Also accepts a mapping of values to their replacement.

  • new (Object) (defaults to: NO_DEFAULT)

    Value or sequence of values to replace by. Accepts expression input. Sequences are parsed as Series, other non-expression inputs are parsed as literals. Length must match the length of old or have length 1.

  • default (Object) (defaults to: NO_DEFAULT)

    Set values that were not replaced to this value. Defaults to keeping the original value. Accepts expression input. Non-expression inputs are parsed as literals.

  • return_dtype (Object) (defaults to: nil)

    The data type of the resulting expression. If set to nil (default), the data type is determined automatically based on the other inputs.

Returns:



7985
7986
7987
7988
7989
7990
7991
7992
7993
7994
7995
7996
7997
7998
7999
8000
8001
8002
8003
8004
8005
8006
8007
8008
8009
8010
8011
8012
# File 'lib/polars/expr.rb', line 7985

def replace(old, new = NO_DEFAULT, default: NO_DEFAULT, return_dtype: nil)
  if !default.eql?(NO_DEFAULT)
    return replace_strict(old, new, default: default, return_dtype: return_dtype)
  end

  if new.eql?(NO_DEFAULT) && old.is_a?(Hash)
    new = Series.new(old.values)
    old = Series.new(old.keys)
  else
    if old.is_a?(::Array)
      old = Series.new(old)
    end
    if new.is_a?(::Array)
      new = Series.new(new)
    end
  end

  old = Utils.parse_into_expression(old, str_as_lit: true)
  new = Utils.parse_into_expression(new, str_as_lit: true)

  result = wrap_expr(_rbexpr.replace(old, new))

  if !return_dtype.nil?
    result = result.cast(return_dtype)
  end

  result
end
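The mapping form can be sketched with a plain Ruby Hash lookup: unmatched values keep their original value (as `replace` does), or take the default when one is given (the `replace_strict`-with-default behavior):

```ruby
# Hash.fetch with a fallback models both the keep-original and the
# default-value behaviors.
mapping = {2 => 100, 3 => 200}
kept      = [1, 2, 2, 3].map { |v| mapping.fetch(v, v) }
defaulted = [1, 2, 2, 3].map { |v| mapping.fetch(v, -1) }
kept      # => [1, 100, 100, 200]
defaulted # => [-1, 100, 100, 200]
```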

#replace_strict(old, new = NO_DEFAULT, default: NO_DEFAULT, return_dtype: nil) ⇒ Expr

Note:

The global string cache must be enabled when replacing categorical values.

Replace all values by different values.

Examples:

Replace values by passing sequences to the old and new parameters.

df = Polars::DataFrame.new({"a" => [1, 2, 2, 3]})
df.with_columns(
  replaced: Polars.col("a").replace_strict([1, 2, 3], [100, 200, 300])
)
# =>
# shape: (4, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ i64 ┆ i64      │
# ╞═════╪══════════╡
# │ 1   ┆ 100      │
# │ 2   ┆ 200      │
# │ 2   ┆ 200      │
# │ 3   ┆ 300      │
# └─────┴──────────┘

By default, an error is raised if any non-null values were not replaced. Specify a default to set all values that were not matched.

mapping = {2 => 200, 3 => 300}
df.with_columns(replaced: Polars.col("a").replace_strict(mapping, default: -1))
# =>
# shape: (4, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ i64 ┆ i64      │
# ╞═════╪══════════╡
# │ 1   ┆ -1       │
# │ 2   ┆ 200      │
# │ 2   ┆ 200      │
# │ 3   ┆ 300      │
# └─────┴──────────┘

Replacing by values of a different data type sets the return type based on a combination of the new data type and the default data type.

df = Polars::DataFrame.new({"a" => ["x", "y", "z"]})
mapping = {"x" => 1, "y" => 2, "z" => 3}
df.with_columns(replaced: Polars.col("a").replace_strict(mapping))
# =>
# shape: (3, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ str ┆ i64      │
# ╞═════╪══════════╡
# │ x   ┆ 1        │
# │ y   ┆ 2        │
# │ z   ┆ 3        │
# └─────┴──────────┘
df.with_columns(replaced: Polars.col("a").replace_strict(mapping, default: "x"))
# =>
# shape: (3, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ str ┆ str      │
# ╞═════╪══════════╡
# │ x   ┆ 1        │
# │ y   ┆ 2        │
# │ z   ┆ 3        │
# └─────┴──────────┘

Set the return_dtype parameter to control the resulting data type directly.

df.with_columns(
  replaced: Polars.col("a").replace_strict(mapping, return_dtype: Polars::UInt8)
)
# =>
# shape: (3, 2)
# ┌─────┬──────────┐
# │ a   ┆ replaced │
# │ --- ┆ ---      │
# │ str ┆ u8       │
# ╞═════╪══════════╡
# │ x   ┆ 1        │
# │ y   ┆ 2        │
# │ z   ┆ 3        │
# └─────┴──────────┘

Expression input is supported for all parameters.

df = Polars::DataFrame.new({"a" => [1, 2, 2, 3], "b" => [1.5, 2.5, 5.0, 1.0]})
df.with_columns(
  replaced: Polars.col("a").replace_strict(
    Polars.col("a").max,
    Polars.col("b").sum,
    default: Polars.col("b")
  )
)
# =>
# shape: (4, 3)
# ┌─────┬─────┬──────────┐
# │ a   ┆ b   ┆ replaced │
# │ --- ┆ --- ┆ ---      │
# │ i64 ┆ f64 ┆ f64      │
# ╞═════╪═════╪══════════╡
# │ 1   ┆ 1.5 ┆ 1.5      │
# │ 2   ┆ 2.5 ┆ 2.5      │
# │ 2   ┆ 5.0 ┆ 5.0      │
# │ 3   ┆ 1.0 ┆ 10.0     │
# └─────┴─────┴──────────┘

Parameters:

  • old (Object)

    Value or sequence of values to replace. Accepts expression input. Sequences are parsed as Series, other non-expression inputs are parsed as literals. Also accepts a mapping of values to their replacement as syntactic sugar for replace_all(old: Series.new(mapping.keys), new: Series.new(mapping.values)).

  • new (Object) (defaults to: NO_DEFAULT)

    Value or sequence of values to replace by. Accepts expression input. Sequences are parsed as Series, other non-expression inputs are parsed as literals. Length must match the length of old or have length 1.

  • default (Object) (defaults to: NO_DEFAULT)

    Set values that were not replaced to this value. If no default is specified (the default behavior), an error is raised if any values were not replaced. Accepts expression input. Non-expression inputs are parsed as literals.

  • return_dtype (Object) (defaults to: nil)

    The data type of the resulting expression. If set to nil (default), the data type is determined automatically based on the other inputs.

Returns:



8141
8142
8143
8144
8145
8146
8147
8148
8149
8150
8151
8152
8153
8154
8155
8156
8157
8158
8159
8160
# File 'lib/polars/expr.rb', line 8141

def replace_strict(
  old,
  new = NO_DEFAULT,
  default: NO_DEFAULT,
  return_dtype: nil
)
  if new.eql?(NO_DEFAULT) && old.is_a?(Hash)
    new = Series.new(old.values)
    old = Series.new(old.keys)
  end

  old = Utils.parse_into_expression(old, str_as_lit: true, list_as_series: true)
  new = Utils.parse_into_expression(new, str_as_lit: true, list_as_series: true)

  default = default.eql?(NO_DEFAULT) ? nil : Utils.parse_into_expression(default, str_as_lit: true)

  wrap_expr(
    _rbexpr.replace_strict(old, new, default, return_dtype)
  )
end

#reshape(dims) ⇒ Expr

Reshape this Expr to a flat Series or an Array Series.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2, 3, 4, 5, 6, 7, 8, 9]})
square = df.select(Polars.col("foo").reshape([3, 3]))
# =>
# shape: (3, 1)
# ┌───────────────┐
# │ foo           │
# │ ---           │
# │ array[i64, 3] │
# ╞═══════════════╡
# │ [1, 2, 3]     │
# │ [4, 5, 6]     │
# │ [7, 8, 9]     │
# └───────────────┘
square.select(Polars.col("foo").reshape([9]))
# =>
# shape: (9, 1)
# ┌─────┐
# │ foo │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# │ 2   │
# │ 3   │
# │ 4   │
# │ 5   │
# │ 6   │
# │ 7   │
# │ 8   │
# │ 9   │
# └─────┘

Parameters:

  • dims (Array)

    Array of the dimension sizes. If a -1 is used in any of the dimensions, that dimension is inferred.

Returns:



7211
7212
7213
# File 'lib/polars/expr.rb', line 7211

def reshape(dims)
  wrap_expr(_rbexpr.reshape(dims))
end
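The two directions of the example can be mirrored in plain Ruby with `each_slice` and `flatten` (a sketch of the shape change only):

```ruby
# Group into rows of 3, then flatten back to one dimension.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
square = values.each_slice(3).to_a
flat = square.flatten
square # => [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat   # => [1, 2, 3, 4, 5, 6, 7, 8, 9]
```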

#reverseExpr

Reverse the selection.

Examples:

df = Polars::DataFrame.new(
  {
    "A" => [1, 2, 3, 4, 5],
    "fruits" => ["banana", "banana", "apple", "apple", "banana"],
    "B" => [5, 4, 3, 2, 1],
    "cars" => ["beetle", "audi", "beetle", "beetle", "beetle"]
  }
)
df.select(
  [
    Polars.all,
    Polars.all.reverse.name.suffix("_reverse")
  ]
)
# =>
# shape: (5, 8)
# ┌─────┬────────┬─────┬────────┬───────────┬────────────────┬───────────┬──────────────┐
# │ A   ┆ fruits ┆ B   ┆ cars   ┆ A_reverse ┆ fruits_reverse ┆ B_reverse ┆ cars_reverse │
# │ --- ┆ ---    ┆ --- ┆ ---    ┆ ---       ┆ ---            ┆ ---       ┆ ---          │
# │ i64 ┆ str    ┆ i64 ┆ str    ┆ i64       ┆ str            ┆ i64       ┆ str          │
# ╞═════╪════════╪═════╪════════╪═══════════╪════════════════╪═══════════╪══════════════╡
# │ 1   ┆ banana ┆ 5   ┆ beetle ┆ 5         ┆ banana         ┆ 1         ┆ beetle       │
# │ 2   ┆ banana ┆ 4   ┆ audi   ┆ 4         ┆ apple          ┆ 2         ┆ beetle       │
# │ 3   ┆ apple  ┆ 3   ┆ beetle ┆ 3         ┆ apple          ┆ 3         ┆ beetle       │
# │ 4   ┆ apple  ┆ 2   ┆ beetle ┆ 2         ┆ banana         ┆ 4         ┆ audi         │
# │ 5   ┆ banana ┆ 1   ┆ beetle ┆ 1         ┆ banana         ┆ 5         ┆ beetle       │
# └─────┴────────┴─────┴────────┴───────────┴────────────────┴───────────┴──────────────┘

Returns:



2292
2293
2294
# File 'lib/polars/expr.rb', line 2292

def reverse
  wrap_expr(_rbexpr.reverse)
end

#rleExpr

Get the lengths of runs of identical values.

Examples:

df = Polars::DataFrame.new(Polars::Series.new("s", [1, 1, 2, 1, nil, 1, 3, 3]))
df.select(Polars.col("s").rle).unnest("s")
# =>
# shape: (6, 2)
# ┌─────┬───────┐
# │ len ┆ value │
# │ --- ┆ ---   │
# │ u32 ┆ i64   │
# ╞═════╪═══════╡
# │ 2   ┆ 1     │
# │ 1   ┆ 2     │
# │ 1   ┆ 1     │
# │ 1   ┆ null  │
# │ 1   ┆ 1     │
# │ 2   ┆ 3     │
# └─────┴───────┘

Returns:



3262
3263
3264
# File 'lib/polars/expr.rb', line 3262

def rle
  wrap_expr(_rbexpr.rle)
end
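Run-length encoding itself is easy to sketch in plain Ruby with `chunk_while`, producing the same length/value pairs as the example:

```ruby
# Group consecutive equal elements, then record each run's size and value.
def rle_sketch(values)
  values.chunk_while { |a, b| a == b }.map { |run| [run.size, run.first] }
end

rle_sketch([1, 1, 2, 1, nil, 1, 3, 3])
# => [[2, 1], [1, 2], [1, 1], [1, nil], [1, 1], [2, 3]]
```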

#rle_idExpr

Map values to run IDs.

Similar to RLE, but it maps each value to an ID corresponding to the run into which it falls. This is especially useful when you want to define groups by runs of identical values rather than the values themselves.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 1, 1, 1], "b" => ["x", "x", nil, "y", "y"]})
df.with_columns([Polars.col("a").rle_id.alias("a_r"), Polars.struct(["a", "b"]).rle_id.alias("ab_r")])
# =>
# shape: (5, 4)
# ┌─────┬──────┬─────┬──────┐
# │ a   ┆ b    ┆ a_r ┆ ab_r │
# │ --- ┆ ---  ┆ --- ┆ ---  │
# │ i64 ┆ str  ┆ u32 ┆ u32  │
# ╞═════╪══════╪═════╪══════╡
# │ 1   ┆ x    ┆ 0   ┆ 0    │
# │ 2   ┆ x    ┆ 1   ┆ 1    │
# │ 1   ┆ null ┆ 2   ┆ 2    │
# │ 1   ┆ y    ┆ 2   ┆ 3    │
# │ 1   ┆ y    ┆ 2   ┆ 3    │
# └─────┴──────┴─────┴──────┘

Returns:



3290
3291
3292
# File 'lib/polars/expr.rb', line 3290

def rle_id
  wrap_expr(_rbexpr.rle_id)
end
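The run-ID assignment can be sketched in plain Ruby: start at 0 and increment whenever the value differs from the previous element, matching the `a_r` column above:

```ruby
# Assign each element the index of the run it belongs to.
def rle_id_sketch(values)
  id = 0
  values.each_with_index.map do |v, i|
    id += 1 if i > 0 && v != values[i - 1]
    id
  end
end

rle_id_sketch([1, 2, 1, 1, 1]) # => [0, 1, 2, 2, 2]
```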

#rolling(index_column:, period:, offset: nil, closed: "right") ⇒ Expr

Create rolling groups based on a temporal or integer column.

If you have a time series <t_0, t_1, ..., t_n>, then by default the windows created will be

  • (t_0 - period, t_0]
  • (t_1 - period, t_1]
  • ...
  • (t_n - period, t_n]

whereas if you pass a non-default offset, then the windows will be

  • (t_0 + offset, t_0 + offset + period]
  • (t_1 + offset, t_1 + offset + period]
  • ...
  • (t_n + offset, t_n + offset + period]

The period and offset arguments are created either from a timedelta, or by using the following string language:

  • 1ns (1 nanosecond)
  • 1us (1 microsecond)
  • 1ms (1 millisecond)
  • 1s (1 second)
  • 1m (1 minute)
  • 1h (1 hour)
  • 1d (1 calendar day)
  • 1w (1 calendar week)
  • 1mo (1 calendar month)
  • 1q (1 calendar quarter)
  • 1y (1 calendar year)
  • 1i (1 index count)

Or combine them: "3d12h4m25s" # 3 days, 12 hours, 4 minutes, and 25 seconds

By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight saving time). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

Examples:

dates = [
  "2020-01-01 13:45:48",
  "2020-01-01 16:42:13",
  "2020-01-01 16:45:09",
  "2020-01-02 18:12:48",
  "2020-01-03 19:45:32",
  "2020-01-08 23:16:43"
]
df = Polars::DataFrame.new({"dt" => dates, "a" => [3, 7, 5, 9, 2, 1]}).with_columns(
  Polars.col("dt").str.strptime(Polars::Datetime).set_sorted
)
df.with_columns(
  sum_a: Polars.sum("a").rolling(index_column: "dt", period: "2d"),
  min_a: Polars.min("a").rolling(index_column: "dt", period: "2d"),
  max_a: Polars.max("a").rolling(index_column: "dt", period: "2d")
)
# =>
# shape: (6, 5)
# ┌─────────────────────┬─────┬───────┬───────┬───────┐
# │ dt                  ┆ a   ┆ sum_a ┆ min_a ┆ max_a │
# │ ---                 ┆ --- ┆ ---   ┆ ---   ┆ ---   │
# │ datetime[μs]        ┆ i64 ┆ i64   ┆ i64   ┆ i64   │
# ╞═════════════════════╪═════╪═══════╪═══════╪═══════╡
# │ 2020-01-01 13:45:48 ┆ 3   ┆ 3     ┆ 3     ┆ 3     │
# │ 2020-01-01 16:42:13 ┆ 7   ┆ 10    ┆ 3     ┆ 7     │
# │ 2020-01-01 16:45:09 ┆ 5   ┆ 15    ┆ 3     ┆ 7     │
# │ 2020-01-02 18:12:48 ┆ 9   ┆ 24    ┆ 3     ┆ 9     │
# │ 2020-01-03 19:45:32 ┆ 2   ┆ 11    ┆ 2     ┆ 9     │
# │ 2020-01-08 23:16:43 ┆ 1   ┆ 1     ┆ 1     ┆ 1     │
# └─────────────────────┴─────┴───────┴───────┴───────┘

Parameters:

  • index_column (Object)

    Column used to group based on the time window. Often of type Date/Datetime. This column must be sorted in ascending order. In case of a rolling group by on indices, dtype needs to be one of {UInt32, UInt64, Int32, Int64}. Note that the first three get temporarily cast to Int64, so if performance matters use an Int64 column.

  • period (Object)

    Length of the window - must be non-negative.

  • offset (Object) (defaults to: nil)

    Offset of the window. Default is -period.

  • closed ('right', 'left', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive).

Returns:



2859
2860
2861
2862
2863
2864
2865
2866
2867
2868
2869
2870
2871
2872
2873
# File 'lib/polars/expr.rb', line 2859

def rolling(
  index_column:,
  period:,
  offset: nil,
  closed: "right"
)
  if offset.nil?
    offset = Utils.negate_duration_string(Utils.parse_as_duration_string(period))
  end

  period = Utils.parse_as_duration_string(period)
  offset = Utils.parse_as_duration_string(offset)

  wrap_expr(_rbexpr.rolling(index_column, period, offset, closed))
end
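Assuming an integer index column (the `1i` case) and a sum aggregation, the default windows `(t_i - period, t_i]` can be sketched in plain Ruby (semantics only; Polars does not scan the whole index per row):

```ruby
# For each row, sum the values whose index falls in (t - period, t].
def rolling_sum_sketch(index, values, period)
  index.map do |t|
    sum = 0
    index.each_with_index do |u, j|
      sum += values[j] if u > t - period && u <= t
    end
    sum
  end
end

rolling_sum_sketch([1, 2, 3, 5, 8], [1, 1, 1, 1, 1], 2)
# => [1, 2, 2, 1, 1]
```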

#rolling_kurtosis(window_size, fisher: true, bias: true, min_samples: nil, center: false) ⇒ Expr

Note:

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Compute a rolling kurtosis.

The window at a given row will include the row itself, and the window_size - 1 elements before it.

Examples:

df = Polars::DataFrame.new({"a" => [1, 4, 2, 9]})
df.select(Polars.col("a").rolling_kurtosis(3))
# =>
# shape: (4, 1)
# ┌──────┐
# │ a    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ null │
# │ null │
# │ -1.5 │
# │ -1.5 │
# └──────┘

Parameters:

  • window_size (Integer)

    Integer size of the rolling window.

  • fisher (Boolean) (defaults to: true)

    If true, Fisher's definition is used (normal ==> 0.0). If false, Pearson's definition is used (normal ==> 3.0).

  • bias (Boolean) (defaults to: true)

    If false, the calculations are corrected for statistical bias.

  • min_samples (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If set to nil (default), it will be set equal to window_size.

  • center (defaults to: false)

    Set the labels at the center of the window.

Returns:



6412
6413
6414
6415
6416
6417
6418
6419
6420
6421
6422
6423
6424
6425
6426
6427
6428
# File 'lib/polars/expr.rb', line 6412

def rolling_kurtosis(
  window_size,
  fisher: true,
  bias: true,
  min_samples: nil,
  center: false
)
  wrap_expr(
    _rbexpr.rolling_kurtosis(
      window_size,
      fisher,
      bias,
      min_samples,
      center
    )
  )
end
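The kurtosis values in the example above can be reproduced with a plain-Ruby sketch of the moment-based formula (biased moments, Fisher's definition, as per the defaults):

```ruby
# Sketch of the per-window kurtosis: fourth central moment over the
# squared second central moment, minus 3 under Fisher's definition.
def kurtosis(window, fisher: true)
  n = window.length.to_f
  mean = window.sum / n
  m2 = window.sum { |x| (x - mean)**2 } / n   # biased second moment
  m4 = window.sum { |x| (x - mean)**4 } / n   # biased fourth moment
  k = m4 / m2**2
  fisher ? k - 3.0 : k
end

kurtosis([1, 4, 2])  # ≈ -1.5, matching the third row of the example
kurtosis([4, 2, 9])  # ≈ -1.5
```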

#rolling_max(window_size, weights: nil, min_periods: nil, center: false) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using group_by_rolling - this method can cache the window size computation.

Apply a rolling max (moving max) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their max.

Examples:

df = Polars::DataFrame.new({"A" => [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
df.select(
  [
    Polars.col("A").rolling_max(2)
  ]
)
# =>
# shape: (6, 1)
# ┌──────┐
# │ A    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ null │
# │ 2.0  │
# │ 3.0  │
# │ 4.0  │
# │ 5.0  │
# │ 6.0  │
# └──────┘

Parameters:

  • window_size (Integer)

    The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1mo (1 calendar month)
    • 1y (1 calendar year)
    • 1i (1 index count)

    If a timedelta or the dynamic string language is used, the by and closed arguments must also be set.

  • weights (Array) (defaults to: nil)

    An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

  • min_periods (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If nil, it will be set equal to window_size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

Returns:



5791
5792
5793
5794
5795
5796
5797
5798
5799
5800
5801
5802
# File 'lib/polars/expr.rb', line 5791

def rolling_max(
  window_size,
  weights: nil,
  min_periods: nil,
  center: false
)
  wrap_expr(
    _rbexpr.rolling_max(
      window_size, weights, min_periods, center
    )
  )
end
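For fixed integer windows without weights, the semantics in the example can be sketched in plain Ruby (a simplified model, not the library implementation): each window holds the current row and the window_size - 1 rows before it, and a result is emitted only once min_periods values are present.

```ruby
# Sketch of the fixed-size rolling max: nil until the window holds at
# least min_periods values (defaulting to window_size), then the max.
def rolling_max(values, window_size, min_periods: nil)
  min_periods ||= window_size
  values.each_index.map do |i|
    window = values[[i - window_size + 1, 0].max..i]
    window.length >= min_periods ? window.max : nil
  end
end

rolling_max([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2)
# => [nil, 2.0, 3.0, 4.0, 5.0, 6.0]
```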

#rolling_max_by(by, window_size, min_periods: 1, closed: "right", warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling - this method can cache the window size computation.

Apply a rolling max based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
    {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling max with the temporal windows closed on the right (default)

df_temporal.with_columns(
  rolling_row_max: Polars.col("index").rolling_max_by("date", "2h")
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_max │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ u32             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0               │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 1               │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 2               │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 3               │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 4               │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 20              │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 21              │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 22              │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 23              │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 24              │
# └───────┴─────────────────────┴─────────────────┘

Compute the rolling max with the closure of windows on both sides

df_temporal.with_columns(
  rolling_row_max: Polars.col("index").rolling_max_by(
    "date", "2h", closed: "both"
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_max │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ u32             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0               │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 1               │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 2               │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 3               │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 4               │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 20              │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 21              │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 22              │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 23              │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 24              │
# └───────┴─────────────────────┴─────────────────┘

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive), defaults to 'right'.

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if data is not known to be sorted by by column.

Returns:



4870
4871
4872
4873
4874
4875
4876
4877
4878
4879
4880
4881
4882
# File 'lib/polars/expr.rb', line 4870

def rolling_max_by(
  by,
  window_size,
  min_periods: 1,
  closed: "right",
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  wrap_expr(
    _rbexpr.rolling_max_by(by, window_size, min_periods, closed)
  )
end

#rolling_mean(window_size, weights: nil, min_periods: nil, center: false) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using group_by_rolling - this method can cache the window size computation.

Apply a rolling mean (moving mean) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their mean.

Examples:

df = Polars::DataFrame.new({"A" => [1.0, 8.0, 6.0, 2.0, 16.0, 10.0]})
df.select(
  [
    Polars.col("A").rolling_mean(2)
  ]
)
# =>
# shape: (6, 1)
# ┌──────┐
# │ A    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ null │
# │ 4.5  │
# │ 7.0  │
# │ 4.0  │
# │ 9.0  │
# │ 13.0 │
# └──────┘

Parameters:

  • window_size (Integer)

    The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1mo (1 calendar month)
    • 1y (1 calendar year)
    • 1i (1 index count)

    If a timedelta or the dynamic string language is used, the by and closed arguments must also be set.

  • weights (Array) (defaults to: nil)

    An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

  • min_periods (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If nil, it will be set equal to window_size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

Returns:



5869
5870
5871
5872
5873
5874
5875
5876
5877
5878
5879
5880
# File 'lib/polars/expr.rb', line 5869

def rolling_mean(
  window_size,
  weights: nil,
  min_periods: nil,
  center: false
)
  wrap_expr(
    _rbexpr.rolling_mean(
      window_size, weights, min_periods, center
    )
  )
end
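The unweighted case in the example above reduces to a simple window average, which a plain-Ruby sketch (a simplified model, not the library implementation) reproduces exactly:

```ruby
# Sketch of the unweighted fixed-size rolling mean: nil until the
# window is full, then the average of the last window_size values.
def rolling_mean(values, window_size)
  values.each_index.map do |i|
    next nil if i < window_size - 1
    window = values[(i - window_size + 1)..i]
    window.sum / window.length
  end
end

rolling_mean([1.0, 8.0, 6.0, 2.0, 16.0, 10.0], 2)
# => [nil, 4.5, 7.0, 4.0, 9.0, 13.0]
```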

#rolling_mean_by(by, window_size, min_periods: 1, closed: "right", warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling - this method can cache the window size computation.

Apply a rolling mean based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
    {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling mean with the temporal windows closed on the right (default)

df_temporal.with_columns(
  rolling_row_mean: Polars.col("index").rolling_mean_by(
    "date", "2h"
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬──────────────────┐
# │ index ┆ date                ┆ rolling_row_mean │
# │ ---   ┆ ---                 ┆ ---              │
# │ u32   ┆ datetime[ns]        ┆ f64              │
# ╞═══════╪═════════════════════╪══════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0.0              │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.5              │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 1.5              │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 2.5              │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 3.5              │
# │ …     ┆ …                   ┆ …                │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 19.5             │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 20.5             │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 21.5             │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 22.5             │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 23.5             │
# └───────┴─────────────────────┴──────────────────┘

Compute the rolling mean with the closure of windows on both sides

df_temporal.with_columns(
  rolling_row_mean: Polars.col("index").rolling_mean_by(
    "date", "2h", closed: "both"
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬──────────────────┐
# │ index ┆ date                ┆ rolling_row_mean │
# │ ---   ┆ ---                 ┆ ---              │
# │ u32   ┆ datetime[ns]        ┆ f64              │
# ╞═══════╪═════════════════════╪══════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0.0              │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.5              │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 1.0              │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 2.0              │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 3.0              │
# │ …     ┆ …                   ┆ …                │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 19.0             │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 20.0             │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 21.0             │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 22.0             │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 23.0             │
# └───────┴─────────────────────┴──────────────────┘

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive), defaults to 'right'.

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if data is not known to be sorted by by column.

Returns:



5001
5002
5003
5004
5005
5006
5007
5008
5009
5010
5011
5012
5013
5014
5015
5016
5017
5018
# File 'lib/polars/expr.rb', line 5001

def rolling_mean_by(
  by,
  window_size,
  min_periods: 1,
  closed: "right",
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  wrap_expr(
    _rbexpr.rolling_mean_by(
      by,
      window_size,
      min_periods,
      closed
    )
  )
end
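The effect of closed in the two tables above can be sketched in plain Ruby (a simplified model with hourly rows whose index equals the hour): the window for row t covers (t - 2h, t] when closed: "right" and [t - 2h, t] when closed: "both", which is why row 2's mean changes from 1.5 to 1.0.

```ruby
# Sketch: mean of the index values whose hour falls inside row t's
# temporal window, with bounds governed by closed.
def windowed_mean(hours, t, width, closed:)
  members = hours.select do |h|
    left_ok  = %w[left both].include?(closed)  ? h >= t - width : h > t - width
    right_ok = %w[right both].include?(closed) ? h <= t : h < t
    left_ok && right_ok
  end
  members.sum.to_f / members.length
end

hours = (0..24).to_a
windowed_mean(hours, 2, 2, closed: "right")  # => 1.5  (rows 1 and 2)
windowed_mean(hours, 2, 2, closed: "both")   # => 1.0  (rows 0, 1, 2)
```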

#rolling_median(window_size, weights: nil, min_periods: nil, center: false) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using group_by_rolling - this method can cache the window size computation.

Compute a rolling median.

Examples:

df = Polars::DataFrame.new({"A" => [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]})
df.select(
  [
    Polars.col("A").rolling_median(3)
  ]
)
# =>
# shape: (6, 1)
# ┌──────┐
# │ A    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ null │
# │ null │
# │ 2.0  │
# │ 3.0  │
# │ 4.0  │
# │ 6.0  │
# └──────┘

Parameters:

  • window_size (Integer)

    The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1mo (1 calendar month)
    • 1y (1 calendar year)
    • 1i (1 index count)

    If a timedelta or the dynamic string language is used, the by and closed arguments must also be set.

  • weights (Array) (defaults to: nil)

    An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

  • min_periods (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If nil, it will be set equal to window_size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

Returns:



6183
6184
6185
6186
6187
6188
6189
6190
6191
6192
6193
6194
# File 'lib/polars/expr.rb', line 6183

def rolling_median(
  window_size,
  weights: nil,
  min_periods: nil,
  center: false
)
  wrap_expr(
    _rbexpr.rolling_median(
      window_size, weights, min_periods, center
    )
  )
end

#rolling_median_by(by, window_size, min_periods: 1, closed: "right", warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling - this method can cache the window size computation.

Compute a rolling median based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
  {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling median with the temporal windows closed on the right:

df_temporal.with_columns(
  rolling_row_median: Polars.col("index").rolling_median_by(
    "date", "2h"
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬────────────────────┐
# │ index ┆ date                ┆ rolling_row_median │
# │ ---   ┆ ---                 ┆ ---                │
# │ u32   ┆ datetime[ns]        ┆ f64                │
# ╞═══════╪═════════════════════╪════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0.0                │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.5                │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 1.5                │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 2.5                │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 3.5                │
# │ …     ┆ …                   ┆ …                  │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 19.5               │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 20.5               │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 21.5               │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 22.5               │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 23.5               │
# └───────┴─────────────────────┴────────────────────┘

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive), defaults to 'right'.

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if data is not known to be sorted by by column.

Returns:



5516
5517
5518
5519
5520
5521
5522
5523
5524
5525
5526
5527
5528
# File 'lib/polars/expr.rb', line 5516

def rolling_median_by(
  by,
  window_size,
  min_periods: 1,
  closed: "right",
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  wrap_expr(
    _rbexpr.rolling_median_by(by, window_size, min_periods, closed)
  )
end

#rolling_min(window_size, weights: nil, min_periods: nil, center: false) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using group_by_rolling - this method can cache the window size computation.

Apply a rolling min (moving min) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their min.

Examples:

df = Polars::DataFrame.new({"A" => [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
df.select(
  [
    Polars.col("A").rolling_min(2)
  ]
)
# =>
# shape: (6, 1)
# ┌──────┐
# │ A    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ null │
# │ 1.0  │
# │ 2.0  │
# │ 3.0  │
# │ 4.0  │
# │ 5.0  │
# └──────┘

Parameters:

  • window_size (Integer)

    The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1mo (1 calendar month)
    • 1y (1 calendar year)
    • 1i (1 index count)

    If a timedelta or the dynamic string language is used, the by and closed arguments must also be set.

  • weights (Array) (defaults to: nil)

    An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

  • min_periods (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If nil, it will be set equal to window_size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

Returns:



5713
5714
5715
5716
5717
5718
5719
5720
5721
5722
5723
5724
# File 'lib/polars/expr.rb', line 5713

def rolling_min(
  window_size,
  weights: nil,
  min_periods: nil,
  center: false
)
  wrap_expr(
    _rbexpr.rolling_min(
      window_size, weights, min_periods, center
    )
  )
end

#rolling_min_by(by, window_size, min_periods: 1, closed: "right", warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling - this method can cache the window size computation.

Apply a rolling min based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
  {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling min with the temporal windows closed on the right (default)

df_temporal.with_columns(
  rolling_row_min: Polars.col("index").rolling_min_by("date", "2h")
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_min │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ u32             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0               │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0               │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 1               │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 2               │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 3               │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 19              │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 20              │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 21              │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 22              │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 23              │
# └───────┴─────────────────────┴─────────────────┘

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive), defaults to 'right'.

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if data is not known to be sorted by by column.

Returns:



4741
4742
4743
4744
4745
4746
4747
4748
4749
4750
4751
4752
4753
# File 'lib/polars/expr.rb', line 4741

def rolling_min_by(
  by,
  window_size,
  min_periods: 1,
  closed: "right",
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  wrap_expr(
    _rbexpr.rolling_min_by(by, window_size, min_periods, closed)
  )
end

#rolling_quantile(quantile, interpolation: "nearest", window_size: 2, weights: nil, min_periods: nil, center: false) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using group_by_rolling - this method can cache the window size computation.

Compute a rolling quantile.

Examples:

df = Polars::DataFrame.new({"A" => [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]})
df.select(
  [
    Polars.col("A").rolling_quantile(0.33, window_size: 3)
  ]
)
# =>
# shape: (6, 1)
# ┌──────┐
# │ A    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ null │
# │ null │
# │ 2.0  │
# │ 3.0  │
# │ 4.0  │
# │ 6.0  │
# └──────┘

Parameters:

  • quantile (Float)

    Quantile between 0.0 and 1.0.

  • interpolation ("nearest", "higher", "lower", "midpoint", "linear") (defaults to: "nearest")

    Interpolation method.

  • window_size (Integer) (defaults to: 2)

    The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1mo (1 calendar month)
    • 1y (1 calendar year)
    • 1i (1 index count)

    If a timedelta or the dynamic string language is used, the by and closed arguments must also be set.

  • weights (Array) (defaults to: nil)

    An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

  • min_periods (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If nil, it will be set equal to window_size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

Returns:



6261
6262
6263
6264
6265
6266
6267
6268
6269
6270
6271
6272
6273
6274
# File 'lib/polars/expr.rb', line 6261

def rolling_quantile(
  quantile,
  interpolation: "nearest",
  window_size: 2,
  weights: nil,
  min_periods: nil,
  center: false
)
  wrap_expr(
    _rbexpr.rolling_quantile(
      quantile, interpolation, window_size, weights, min_periods, center
    )
  )
end
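One way to reproduce the "nearest" interpolation results in the example is to map the requested quantile to the closest rank in the sorted window, as the plain-Ruby sketch below does; this is a plausible reading of the default, and exact tie-breaking at half-way ranks may differ from the library.

```ruby
# Sketch of "nearest" quantile interpolation: pick the sorted value
# whose rank is closest to q * (n - 1).
def quantile_nearest(window, q)
  sorted = window.sort
  sorted[(q * (sorted.length - 1)).round]
end

quantile_nearest([1.0, 2.0, 3.0], 0.33)  # => 2.0, as in the example
quantile_nearest([2.0, 3.0, 4.0], 0.33)  # => 3.0
```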

#rolling_quantile_by(by, window_size, quantile:, interpolation: "nearest", min_periods: 1, closed: "right", warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling - this method can cache the window size computation.

Compute a rolling quantile based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
    {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling quantile with the temporal windows closed on the right:

df_temporal.with_columns(
  rolling_row_quantile: Polars.col("index").rolling_quantile_by(
    "date", "2h", quantile: 0.3
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬──────────────────────┐
# │ index ┆ date                ┆ rolling_row_quantile │
# │ ---   ┆ ---                 ┆ ---                  │
# │ u32   ┆ datetime[ns]        ┆ f64                  │
# ╞═══════╪═════════════════════╪══════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0.0                  │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.0                  │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 1.0                  │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 2.0                  │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 3.0                  │
# │ …     ┆ …                   ┆ …                    │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 19.0                 │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 20.0                 │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 21.0                 │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 22.0                 │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 23.0                 │
# └───────┴─────────────────────┴──────────────────────┘

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • quantile (Float)

    Quantile between 0.0 and 1.0.

  • interpolation ('nearest', 'higher', 'lower', 'midpoint', 'linear') (defaults to: "nearest")

    Interpolation method.

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive), defaults to 'right'.

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if data is not known to be sorted by the by column.

Returns:



5625
5626
5627
5628
5629
5630
5631
5632
5633
5634
5635
5636
5637
5638
5639
5640
5641
5642
5643
5644
5645
5646
# File 'lib/polars/expr.rb', line 5625

def rolling_quantile_by(
  by,
  window_size,
  quantile:,
  interpolation: "nearest",
  min_periods: 1,
  closed: "right",
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  wrap_expr(
    _rbexpr.rolling_quantile_by(
      by,
      quantile,
      interpolation,
      window_size,
      min_periods,
      closed
    )
  )
end

#rolling_skew(window_size, bias: true, min_samples: nil, center: false) ⇒ Expr

Compute a rolling skew.

Examples:

df = Polars::DataFrame.new({"a" => [1, 4, 2, 9]})
df.select(Polars.col("a").rolling_skew(3))
# =>
# shape: (4, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ null     │
# │ null     │
# │ 0.381802 │
# │ 0.47033  │
# └──────────┘

Parameters:

  • window_size (Integer)

    Integer size of the rolling window.

  • bias (Boolean) (defaults to: true)

    If false, the calculations are corrected for statistical bias.

  • min_samples (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If set to nil (default), it will be set equal to window_size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

Returns:



6369
6370
6371
# File 'lib/polars/expr.rb', line 6369

def rolling_skew(window_size, bias: true, min_samples: nil, center: false)
  wrap_expr(_rbexpr.rolling_skew(window_size, bias, min_samples, center))
end

#rolling_std(window_size, weights: nil, min_periods: nil, center: false, ddof: 1) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using group_by_rolling - this method can cache the window size computation.

Compute a rolling standard deviation.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their standard deviation.

Examples:

df = Polars::DataFrame.new({"A" => [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]})
df.select(
  [
    Polars.col("A").rolling_std(3)
  ]
)
# =>
# shape: (6, 1)
# ┌──────────┐
# │ A        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ null     │
# │ null     │
# │ 1.0      │
# │ 1.0      │
# │ 1.527525 │
# │ 2.0      │
# └──────────┘
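The values above can be checked with a plain-Ruby sample standard deviation over each length-3 window (the `sample_std` helper is illustrative, not Polars API); with min_periods unset, windows shorter than window_size yield null:

```ruby
# Rolling sample standard deviation, window_size 3, ddof 1.
def sample_std(xs, ddof = 1)
  mean = xs.sum(0.0) / xs.length
  Math.sqrt(xs.sum(0.0) { |x| (x - mean)**2 } / (xs.length - ddof))
end

values = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
rolling = values.each_index.map do |i|
  i < 2 ? nil : sample_std(values[(i - 2)..i])
end

rolling.map { |v| v&.round(6) }
# => [nil, nil, 1.0, 1.0, 1.527525, 2.0], matching the table above
```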

Parameters:

  • window_size (Integer)

    The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1mo (1 calendar month)
    • 1y (1 calendar year)
    • 1i (1 index count)

    If a timedelta or the dynamic string language is used, the by and closed arguments must also be set.

  • weights (Array) (defaults to: nil)

    An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

  • min_periods (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If nil, it will be set equal to window size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

  • ddof (Integer) (defaults to: 1)

    "Delta Degrees of Freedom": The divisor for a length N window is N - ddof

Returns:



6027
6028
6029
6030
6031
6032
6033
6034
6035
6036
6037
6038
6039
# File 'lib/polars/expr.rb', line 6027

def rolling_std(
  window_size,
  weights: nil,
  min_periods: nil,
  center: false,
  ddof: 1
)
  wrap_expr(
    _rbexpr.rolling_std(
      window_size, weights, min_periods, center, ddof
    )
  )
end

#rolling_std_by(by, window_size, min_periods: 1, closed: "right", ddof: 1, warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling - this method can cache the window size computation.

Compute a rolling standard deviation based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
    {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling std with the temporal windows closed on the right (default)

df_temporal.with_columns(
  rolling_row_std: Polars.col("index").rolling_std_by("date", "2h")
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_std │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ f64             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ null            │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.707107        │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 0.707107        │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 0.707107        │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 0.707107        │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 0.707107        │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 0.707107        │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 0.707107        │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 0.707107        │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 0.707107        │
# └───────┴─────────────────────┴─────────────────┘

Compute the rolling std with the temporal windows closed on both sides

df_temporal.with_columns(
  rolling_row_std: Polars.col("index").rolling_std_by(
    "date", "2h", closed: "both"
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_std │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ f64             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ null            │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.707107        │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 1.0             │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 1.0             │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 1.0             │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 1.0             │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 1.0             │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 1.0             │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 1.0             │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 1.0             │
# └───────┴─────────────────────┴─────────────────┘

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive), defaults to 'right'.

  • ddof (Integer) (defaults to: 1)

    "Delta Degrees of Freedom": The divisor for a length N window is N - ddof

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if data is not known to be sorted by the by column.

Returns:



5266
5267
5268
5269
5270
5271
5272
5273
5274
5275
5276
5277
5278
5279
5280
5281
5282
5283
5284
5285
# File 'lib/polars/expr.rb', line 5266

def rolling_std_by(
  by,
  window_size,
  min_periods: 1,
  closed: "right",
  ddof: 1,
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  wrap_expr(
    _rbexpr.rolling_std_by(
      by,
      window_size,
      min_periods,
      closed,
      ddof
    )
  )
end

#rolling_sum(window_size, weights: nil, min_periods: nil, center: false) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using group_by_rolling - this method can cache the window size computation.

Apply a rolling sum (moving sum) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their sum.

Examples:

df = Polars::DataFrame.new({"A" => [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
df.select(
  [
    Polars.col("A").rolling_sum(2)
  ]
)
# =>
# shape: (6, 1)
# ┌──────┐
# │ A    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ null │
# │ 3.0  │
# │ 5.0  │
# │ 7.0  │
# │ 9.0  │
# │ 11.0 │
# └──────┘
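A minimal plain-Ruby equivalent of the fixed-size case above (window_size 2; with min_periods unset, the first incomplete window yields null):

```ruby
# Rolling sum over a fixed-size window of 2.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
window_size = 2
rolling = values.each_index.map do |i|
  i < window_size - 1 ? nil : values[(i - window_size + 1)..i].sum
end

rolling  # => [nil, 3.0, 5.0, 7.0, 9.0, 11.0], matching the table above
```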

Parameters:

  • window_size (Integer)

    The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1mo (1 calendar month)
    • 1y (1 calendar year)
    • 1i (1 index count)

    If a timedelta or the dynamic string language is used, the by and closed arguments must also be set.

  • weights (Array) (defaults to: nil)

    An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

  • min_periods (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If nil, it will be set equal to window size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

Returns:



5947
5948
5949
5950
5951
5952
5953
5954
5955
5956
5957
5958
# File 'lib/polars/expr.rb', line 5947

def rolling_sum(
  window_size,
  weights: nil,
  min_periods: nil,
  center: false
)
  wrap_expr(
    _rbexpr.rolling_sum(
      window_size, weights, min_periods, center
    )
  )
end

#rolling_sum_by(by, window_size, min_periods: 1, closed: "right", warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling - this method can cache the window size computation.

Apply a rolling sum based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
    {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling sum with the temporal windows closed on the right (default)

df_temporal.with_columns(
  rolling_row_sum: Polars.col("index").rolling_sum_by("date", "2h")
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_sum │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ u32             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0               │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 1               │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 3               │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 5               │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 7               │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 39              │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 41              │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 43              │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 45              │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 47              │
# └───────┴─────────────────────┴─────────────────┘

Compute the rolling sum with the temporal windows closed on both sides

df_temporal.with_columns(
  rolling_row_sum: Polars.col("index").rolling_sum_by(
    "date", "2h", closed: "both"
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_sum │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ u32             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ 0               │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 1               │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 3               │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 6               │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 9               │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 57              │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 60              │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 63              │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 66              │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 69              │
# └───────┴─────────────────────┴─────────────────┘
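The difference between the two closures above comes down to whether the left edge of the temporal interval is included. A plain-Ruby sketch (the `sum_by` lambda is illustrative, not the Polars implementation):

```ruby
# closed "right" keeps (t - 2h, t]; closed "both" keeps [t - 2h, t].
dates = (0..24).map { |h| Time.utc(2001, 1, 1) + h * 3600 }
values = (0..24).to_a

sum_by = lambda do |closed|
  dates.each_index.map do |i|
    lo = dates[i] - 2 * 3600
    in_window = dates.each_index.select do |j|
      left_ok = closed == "both" ? dates[j] >= lo : dates[j] > lo
      left_ok && dates[j] <= dates[i]
    end
    in_window.sum { |j| values[j] }
  end
end

sum_by.call("right").first(5)  # => [0, 1, 3, 5, 7]
sum_by.call("both").first(5)   # => [0, 1, 3, 6, 9]
```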

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive), defaults to 'right'.

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if data is not known to be sorted by the by column.

Returns:



5135
5136
5137
5138
5139
5140
5141
5142
5143
5144
5145
5146
5147
# File 'lib/polars/expr.rb', line 5135

def rolling_sum_by(
  by,
  window_size,
  min_periods: 1,
  closed: "right",
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  wrap_expr(
    _rbexpr.rolling_sum_by(by, window_size, min_periods, closed)
  )
end

#rolling_var(window_size, weights: nil, min_periods: nil, center: false, ddof: 1) ⇒ Expr

Note:

This functionality is experimental and may change without it being considered a breaking change.

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using group_by_rolling - this method can cache the window size computation.

Compute a rolling variance.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weight vector. The resulting values will be aggregated to their variance.

Examples:

df = Polars::DataFrame.new({"A" => [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]})
df.select(
  [
    Polars.col("A").rolling_var(3)
  ]
)
# =>
# shape: (6, 1)
# ┌──────────┐
# │ A        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ null     │
# │ null     │
# │ 1.0      │
# │ 1.0      │
# │ 2.333333 │
# │ 4.0      │
# └──────────┘
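The values above can be checked with a plain-Ruby sample variance over each length-3 window (the `sample_var` helper is illustrative, not Polars API):

```ruby
# Rolling sample variance, window_size 3, ddof 1.
def sample_var(xs, ddof = 1)
  mean = xs.sum(0.0) / xs.length
  xs.sum(0.0) { |x| (x - mean)**2 } / (xs.length - ddof)
end

values = [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
rolling = values.each_index.map do |i|
  i < 2 ? nil : sample_var(values[(i - 2)..i])
end

rolling.map { |v| v&.round(6) }
# => [nil, nil, 1.0, 1.0, 2.333333, 4.0], matching the table above
```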

Parameters:

  • window_size (Integer)

    The length of the window. Can be a fixed integer size, or a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 day)
    • 1w (1 week)
    • 1mo (1 calendar month)
    • 1y (1 calendar year)
    • 1i (1 index count)

    If a timedelta or the dynamic string language is used, the by and closed arguments must also be set.

  • weights (Array) (defaults to: nil)

    An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

  • min_periods (Integer) (defaults to: nil)

    The number of values in the window that should be non-null before computing a result. If nil, it will be set equal to window size.

  • center (Boolean) (defaults to: false)

    Set the labels at the center of the window.

  • ddof (Integer) (defaults to: 1)

    "Delta Degrees of Freedom": The divisor for a length N window is N - ddof

Returns:



6108
6109
6110
6111
6112
6113
6114
6115
6116
6117
6118
6119
6120
# File 'lib/polars/expr.rb', line 6108

def rolling_var(
  window_size,
  weights: nil,
  min_periods: nil,
  center: false,
  ddof: 1
)
  wrap_expr(
    _rbexpr.rolling_var(
      window_size, weights, min_periods, center, ddof
    )
  )
end

#rolling_var_by(by, window_size, min_periods: 1, closed: "right", ddof: 1, warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling - this method can cache the window size computation.

Compute a rolling variance based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
    {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling var with the temporal windows closed on the right (default)

df_temporal.with_columns(
  rolling_row_var: Polars.col("index").rolling_var_by("date", "2h")
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_var │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ f64             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ null            │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.5             │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 0.5             │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 0.5             │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 0.5             │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 0.5             │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 0.5             │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 0.5             │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 0.5             │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 0.5             │
# └───────┴─────────────────────┴─────────────────┘

Compute the rolling var with the temporal windows closed on both sides

df_temporal.with_columns(
  rolling_row_var: Polars.col("index").rolling_var_by(
    "date", "2h", closed: "both"
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_var │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ f64             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ null            │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.5             │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 1.0             │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 1.0             │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 1.0             │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 1.0             │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 1.0             │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 1.0             │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 1.0             │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 1.0             │
# └───────┴─────────────────────┴─────────────────┘

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight savings). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive), defaults to 'right'.

  • ddof (Integer) (defaults to: 1)

    "Delta Degrees of Freedom": The divisor for a length N window is N - ddof

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if data is not known to be sorted by the by column.

Returns:



5404
5405
5406
5407
5408
5409
5410
5411
5412
5413
5414
5415
5416
5417
5418
5419
5420
5421
5422
5423
# File 'lib/polars/expr.rb', line 5404

def rolling_var_by(
  by,
  window_size,
  min_periods: 1,
  closed: "right",
  ddof: 1,
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  wrap_expr(
    _rbexpr.rolling_var_by(
      by,
      window_size,
      min_periods,
      closed,
      ddof
    )
  )
end

#round(decimals = 0, mode: "half_to_even") ⇒ Expr

Round underlying floating point data by decimals digits.

Examples:

df = Polars::DataFrame.new({"a" => [0.33, 0.52, 1.02, 1.17]})
df.select(Polars.col("a").round(1))
# =>
# shape: (4, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 0.3 │
# │ 0.5 │
# │ 1.0 │
# │ 1.2 │
# └─────┘

Parameters:

  • decimals (Integer) (defaults to: 0)

    Number of decimals to round by.

  • mode ('half_to_even', 'half_away_from_zero') (defaults to: "half_to_even")

    RoundMode.

    • half_to_even: round to the nearest even number
    • half_away_from_zero: round to the nearest number away from zero
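The two modes differ only in how ties (values exactly halfway between neighbours) are resolved. Ruby's own Float#round exposes the same tie-breaking strategies, which makes the difference easy to see:

```ruby
2.5.round(half: :even)   # => 2   half_to_even: ties go to the even neighbour
3.5.round(half: :even)   # => 4
2.5.round(half: :up)     # => 3   half_away_from_zero
-2.5.round(half: :up)    # => -3
```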

Returns:



1253
1254
1255
# File 'lib/polars/expr.rb', line 1253

def round(decimals = 0, mode: "half_to_even")
  wrap_expr(_rbexpr.round(decimals, mode))
end

#round_sig_figs(digits) ⇒ Expr

Round to a number of significant figures.

Examples:

df = Polars::DataFrame.new({"a" => [0.01234, 3.333, 1234.0]})
df.with_columns(Polars.col("a").round_sig_figs(2).alias("round_sig_figs"))
# =>
# shape: (3, 2)
# ┌─────────┬────────────────┐
# │ a       ┆ round_sig_figs │
# │ ---     ┆ ---            │
# │ f64     ┆ f64            │
# ╞═════════╪════════════════╡
# │ 0.01234 ┆ 0.012          │
# │ 3.333   ┆ 3.3            │
# │ 1234.0  ┆ 1200.0         │
# └─────────┴────────────────┘
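The rounding above can be sketched in plain Ruby by scaling the value so the desired significant digits sit before the decimal point, rounding, and scaling back. The `round_sig_figs` helper here is hypothetical, not the Polars implementation, and may differ from it in edge cases such as ties:

```ruby
def round_sig_figs(x, digits)
  return x if x.zero?
  # Shift so that `digits` significant digits land left of the decimal point.
  scale = 10.0**(digits - 1 - Math.log10(x.abs).floor)
  (x * scale).round / scale
end

[0.01234, 3.333, 1234.0].map { |x| round_sig_figs(x, 2) }
# => [0.012, 3.3, 1200.0], matching the table above
```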

Parameters:

  • digits (Integer)

    Number of significant figures to round to.

Returns:



1278
1279
1280
# File 'lib/polars/expr.rb', line 1278

def round_sig_figs(digits)
  wrap_expr(_rbexpr.round_sig_figs(digits))
end

#sample(frac: nil, with_replacement: true, shuffle: false, seed: nil, n: nil) ⇒ Expr

Sample from this expression.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").sample(frac: 1.0, with_replacement: true, seed: 1))
# =>
# shape: (3, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 3   │
# │ 3   │
# │ 1   │
# └─────┘
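In plain Ruby the two modes correspond roughly to the following (illustrative only; Polars uses its own RNG, so the drawn values differ from Ruby's):

```ruby
values = [1, 2, 3]
rng = Random.new(1)  # a seed makes the draw reproducible

# with_replacement: true allows the same element to be drawn repeatedly,
# so frac: 1.0 can still produce duplicates, as in the example above.
with_replacement = Array.new(values.length) { values.sample(random: rng) }

# n: 2 without replacement draws two distinct positions.
without_replacement = values.sample(2, random: rng)
```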

Parameters:

  • frac (Float) (defaults to: nil)

    Fraction of items to return. Cannot be used with n.

  • with_replacement (Boolean) (defaults to: true)

    Allow values to be sampled more than once.

  • shuffle (Boolean) (defaults to: false)

    Shuffle the order of sampled data points.

  • seed (Integer) (defaults to: nil)

    Seed for the random number generator. If set to nil (default), a random seed is used.

  • n (Integer) (defaults to: nil)

    Number of items to return. Cannot be used with frac.

Returns:



7274
7275
7276
7277
7278
7279
7280
7281
7282
7283
7284
7285
7286
7287
7288
7289
7290
7291
7292
7293
7294
7295
7296
7297
# File 'lib/polars/expr.rb', line 7274

def sample(
  frac: nil,
  with_replacement: true,
  shuffle: false,
  seed: nil,
  n: nil
)
  if !n.nil? && !frac.nil?
    raise ArgumentError, "cannot specify both `n` and `frac`"
  end

  if !n.nil? && frac.nil?
    n = Utils.parse_into_expression(n)
    return wrap_expr(_rbexpr.sample_n(n, with_replacement, shuffle, seed))
  end

  if frac.nil?
    frac = 1.0
  end
  frac = Utils.parse_into_expression(frac)
  wrap_expr(
    _rbexpr.sample_frac(frac, with_replacement, shuffle, seed)
  )
end

#search_sorted(element, side: "any", descending: false) ⇒ Expr

Find indices where elements should be inserted to maintain order.

Examples:

df = Polars::DataFrame.new(
  {
    "values" => [1, 2, 3, 5]
  }
)
df.select(
  [
    Polars.col("values").search_sorted(0).alias("zero"),
    Polars.col("values").search_sorted(3).alias("three"),
    Polars.col("values").search_sorted(6).alias("six")
  ]
)
# =>
# shape: (1, 3)
# ┌──────┬───────┬─────┐
# │ zero ┆ three ┆ six │
# │ ---  ┆ ---   ┆ --- │
# │ u32  ┆ u32   ┆ u32 │
# ╞══════╪═══════╪═════╡
# │ 0    ┆ 2     ┆ 4   │
# └──────┴───────┴─────┘
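Ruby's Array#bsearch_index computes the same leftmost insertion point on a sorted array, which is a handy way to sanity-check results (this corresponds to side: 'left'; the `insertion_index` helper is illustrative):

```ruby
values = [1, 2, 3, 5]

def insertion_index(sorted, element)
  # First index whose value is >= element; if none, append at the end.
  sorted.bsearch_index { |v| v >= element } || sorted.length
end

[0, 3, 6].map { |e| insertion_index(values, e) }
# => [0, 2, 4], matching zero, three and six above
```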

Parameters:

  • element (Object)

    Expression or scalar value.

  • side ('any', 'left', 'right') (defaults to: "any")

    If 'any', the index of the first suitable location found is given. If 'left', the index of the leftmost suitable location found is given. If 'right', the index of the rightmost suitable location found is given.

  • descending (Boolean) (defaults to: false)

    Boolean indicating whether the values are descending or not (they are required to be sorted either way).

Returns:



1888
1889
1890
1891
# File 'lib/polars/expr.rb', line 1888

def search_sorted(element, side: "any", descending: false)
  element = Utils.parse_into_expression(element, str_as_lit: false)
  wrap_expr(_rbexpr.search_sorted(element, side, descending))
end

#set_sorted(descending: false) ⇒ Expr

Note:

This can lead to incorrect results if this Series is not sorted! Use with care!

Flags the expression as 'sorted'.

Enables downstream code to use fast paths for sorted arrays.

Examples:

df = Polars::DataFrame.new({"values" => [1, 2, 3]})
df.select(Polars.col("values").set_sorted.max)
# =>
# shape: (1, 1)
# ┌────────┐
# │ values │
# │ ---    │
# │ i64    │
# ╞════════╡
# │ 3      │
# └────────┘

Parameters:

  • descending (Boolean) (defaults to: false)

    Whether the Series order is descending.

Returns:



7744
7745
7746
# File 'lib/polars/expr.rb', line 7744

def set_sorted(descending: false)
  wrap_expr(_rbexpr.set_sorted_flag(descending))
end

#shift(n = 1, fill_value: nil) ⇒ Expr

Shift the values by a given period.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2, 3, 4]})
df.select(Polars.col("foo").shift(1))
# =>
# shape: (4, 1)
# ┌──────┐
# │ foo  │
# │ ---  │
# │ i64  │
# ╞══════╡
# │ null │
# │ 1    │
# │ 2    │
# │ 3    │
# └──────┘
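The behaviour can be sketched in plain Ruby (`shift_array` is an illustrative helper, not Polars API): a positive n pads the front with fill values, a negative n pads the back.

```ruby
def shift_array(xs, n = 1, fill_value = nil)
  return xs.dup if n.zero?
  if n.positive?
    Array.new(n, fill_value) + xs[0...-n]
  else
    xs[-n..] + Array.new(-n, fill_value)
  end
end

shift_array([1, 2, 3, 4], 1)     # => [nil, 1, 2, 3]
shift_array([1, 2, 3, 4], -1)    # => [2, 3, 4, nil]
shift_array([1, 2, 3, 4], 1, 0)  # => [0, 1, 2, 3]
```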

Parameters:

  • n (Integer) (defaults to: 1)

    Number of places to shift (may be negative).

  • fill_value (Object) (defaults to: nil)

    Fill the resulting null values with this value.

Returns:



2058
2059
2060
2061
2062
2063
2064
# File 'lib/polars/expr.rb', line 2058

def shift(n = 1, fill_value: nil)
  if !fill_value.nil?
    fill_value = Utils.parse_into_expression(fill_value, str_as_lit: true)
  end
  n = Utils.parse_into_expression(n)
  wrap_expr(_rbexpr.shift(n, fill_value))
end

#shift_and_fill(periods, fill_value) ⇒ Expr

Shift the values by a given period and fill the resulting null values.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2, 3, 4]})
df.select(Polars.col("foo").shift_and_fill(1, "a"))
# =>
# shape: (4, 1)
# ┌─────┐
# │ foo │
# │ --- │
# │ str │
# ╞═════╡
# │ a   │
# │ 1   │
# │ 2   │
# │ 3   │
# └─────┘

Parameters:

  • periods (Integer)

    Number of places to shift (may be negative).

  • fill_value (Object)

    Fill nil values with the result of this expression.

Returns:



2090
2091
2092
# File 'lib/polars/expr.rb', line 2090

def shift_and_fill(periods, fill_value)
  shift(periods, fill_value: fill_value)
end

#shrink_dtypeExpr

Shrink numeric columns to the minimal required datatype.

Shrink to the dtype needed to fit the extrema of this Series. This can be used to reduce memory pressure.

Returns:



7779
7780
7781
7782
# File 'lib/polars/expr.rb', line 7779

def shrink_dtype
  warn "`Expr.shrink_dtype` is deprecated and is a no-op; use `Series.shrink_dtype` instead."
  self
end

#shuffle(seed: nil) ⇒ Expr

Shuffle the contents of this expr.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3]})
df.select(Polars.col("a").shuffle(seed: 1))
# =>
# shape: (3, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 2   │
# │ 3   │
# │ 1   │
# └─────┘

Parameters:

  • seed (Integer) (defaults to: nil)

    Seed for the random number generator. If set to nil (default), a random seed is generated.

Returns:



7237
7238
7239
7240
7241
7242
# File 'lib/polars/expr.rb', line 7237

def shuffle(seed: nil)
  if seed.nil?
    seed = rand(10000)
  end
  wrap_expr(_rbexpr.shuffle(seed))
end

#signExpr

Compute the element-wise indication of the sign.

Examples:

df = Polars::DataFrame.new({"a" => [-9.0, -0.0, 0.0, 4.0, nil]})
df.select(Polars.col("a").sign)
# =>
# shape: (5, 1)
# ┌──────┐
# │ a    │
# │ ---  │
# │ f64  │
# ╞══════╡
# │ -1.0 │
# │ -0.0 │
# │ 0.0  │
# │ 1.0  │
# │ null │
# └──────┘

Returns:



6849
6850
6851
# File 'lib/polars/expr.rb', line 6849

def sign
  wrap_expr(_rbexpr.sign)
end

#sinExpr

Compute the element-wise value for the sine.

Examples:

df = Polars::DataFrame.new({"a" => [0.0]})
df.select(Polars.col("a").sin)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 0.0 │
# └─────┘

Returns:



6869
6870
6871
# File 'lib/polars/expr.rb', line 6869

def sin
  wrap_expr(_rbexpr.sin)
end

#sinhExpr

Compute the element-wise value for the hyperbolic sine.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").sinh)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.175201 │
# └──────────┘

Returns:



7009
7010
7011
# File 'lib/polars/expr.rb', line 7009

def sinh
  wrap_expr(_rbexpr.sinh)
end
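
The trigonometric examples in this section (sin, sinh, tan, tanh) can be cross-checked against Ruby's standard Math module, which computes the same element-wise values:

```ruby
# Ruby's Math module reproduces the example outputs shown for the
# element-wise trigonometric expressions.
Math.sin(0.0)              # => 0.0
Math.sinh(1.0).round(6)    # => 1.175201
Math.tan(1.0).round(6)     # => 1.557408
Math.tanh(1.0).round(6)    # => 0.761594
```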

#skew(bias: true) ⇒ Expr

Compute the sample skewness of a data set.

For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to zero, statistically speaking.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 2, 1]})
df.select(Polars.col("a").skew)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 0.343622 │
# └──────────┘

Parameters:

  • bias (Boolean) (defaults to: true)

    If false, the calculations are corrected for statistical bias.

Returns:



6648
6649
6650
# File 'lib/polars/expr.rb', line 6648

def skew(bias: true)
  wrap_expr(_rbexpr.skew(bias))
end
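
The biased sample skewness shown in the example can be verified in plain Ruby (a sketch of the standard formula g1 = m3 / m2^1.5 with biased moment estimators, not Polars' internal implementation):

```ruby
# Biased sample skewness: third central moment over the second central
# moment raised to the 3/2 power, both divided by n (not n - 1).
def sample_skewness(xs)
  n = xs.length.to_f
  mean = xs.sum / n
  m2 = xs.sum { |x| (x - mean)**2 } / n
  m3 = xs.sum { |x| (x - mean)**3 } / n
  m3 / m2**1.5
end

sample_skewness([1, 2, 3, 2, 1]).round(6)
# => 0.343622
```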

#slice(offset, length = nil) ⇒ Expr

Get a slice of this expression.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [8, 9, 10, 11],
    "b" => [nil, 4, 4, 4]
  }
)
df.select(Polars.all.slice(1, 2))
# =>
# shape: (2, 2)
# ┌─────┬─────┐
# │ a   ┆ b   │
# │ --- ┆ --- │
# │ i64 ┆ i64 │
# ╞═════╪═════╡
# │ 9   ┆ 4   │
# │ 10  ┆ 4   │
# └─────┴─────┘

Parameters:

  • offset (Integer)

    Start index. Negative indexing is supported.

  • length (Integer) (defaults to: nil)

    Length of the slice. If set to nil, all rows starting at the offset will be selected.

Returns:



881
882
883
884
885
886
887
888
889
# File 'lib/polars/expr.rb', line 881

def slice(offset, length = nil)
  if !offset.is_a?(Expr)
    offset = Polars.lit(offset)
  end
  if !length.is_a?(Expr)
    length = Polars.lit(length)
  end
  wrap_expr(_rbexpr.slice(offset._rbexpr, length._rbexpr))
end

#sort(reverse: false, nulls_last: false) ⇒ Expr

Sort this column. In projection/selection context the whole column is sorted.

If used in a group by context, the groups are sorted.

Examples:

df = Polars::DataFrame.new(
  {
    "group" => [
      "one",
      "one",
      "one",
      "two",
      "two",
      "two"
    ],
    "value" => [1, 98, 2, 3, 99, 4]
  }
)
df.select(Polars.col("value").sort)
# =>
# shape: (6, 1)
# ┌───────┐
# │ value │
# │ ---   │
# │ i64   │
# ╞═══════╡
# │ 1     │
# │ 2     │
# │ 3     │
# │ 4     │
# │ 98    │
# │ 99    │
# └───────┘
df.group_by("group").agg(Polars.col("value").sort)
# =>
# shape: (2, 2)
# ┌───────┬────────────┐
# │ group ┆ value      │
# │ ---   ┆ ---        │
# │ str   ┆ list[i64]  │
# ╞═══════╪════════════╡
# │ two   ┆ [3, 4, 99] │
# │ one   ┆ [1, 2, 98] │
# └───────┴────────────┘

Parameters:

  • reverse (Boolean) (defaults to: false)

    false -> order from small to large. true -> order from large to small.

  • nulls_last (Boolean) (defaults to: false)

    If true nulls are considered to be larger than any valid value.

Returns:



1449
1450
1451
# File 'lib/polars/expr.rb', line 1449

def sort(reverse: false, nulls_last: false)
  wrap_expr(_rbexpr.sort_with(reverse, nulls_last))
end

#sort_by(by, *more_by, reverse: false, nulls_last: false, multithreaded: true, maintain_order: false) ⇒ Expr

Sort this column by the ordering of another column, or multiple other columns.

In projection/selection context the whole column is sorted. If used in a group by context, the groups are sorted.

Examples:

df = Polars::DataFrame.new(
  {
    "group" => [
      "one",
      "one",
      "one",
      "two",
      "two",
      "two"
    ],
    "value" => [1, 98, 2, 3, 99, 4]
  }
)
df.select(Polars.col("group").sort_by("value"))
# =>
# shape: (6, 1)
# ┌───────┐
# │ group │
# │ ---   │
# │ str   │
# ╞═══════╡
# │ one   │
# │ one   │
# │ two   │
# │ two   │
# │ one   │
# │ two   │
# └───────┘

Parameters:

  • by (Object)

    The column(s) used for sorting.

  • more_by (Array)

    Additional columns to sort by, specified as positional arguments.

  • reverse (Boolean) (defaults to: false)

    false -> order from small to large. true -> order from large to small.

  • nulls_last (Boolean) (defaults to: false)

    Place null values last; can specify a single boolean applying to all columns or a sequence of booleans for per-column control.

  • multithreaded (Boolean) (defaults to: true)

    Sort using multiple threads.

  • maintain_order (Boolean) (defaults to: false)

    Whether the order should be maintained if elements are equal.

Returns:



1944
1945
1946
1947
1948
1949
1950
1951
1952
1953
# File 'lib/polars/expr.rb', line 1944

def sort_by(by, *more_by, reverse: false, nulls_last: false, multithreaded: true, maintain_order: false)
  by = Utils.parse_into_list_of_expressions(by, *more_by)
  reverse = Utils.extend_bool(reverse, by.length, "reverse", "by")
  nulls_last = Utils.extend_bool(nulls_last, by.length, "nulls_last", "by")
  wrap_expr(
    _rbexpr.sort_by(
      by, reverse, nulls_last, multithreaded, maintain_order
    )
  )
end
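
The same reordering can be sketched in plain Ruby with Enumerable#sort_by, using the data from the example above (an illustration of the semantics only):

```ruby
# Sort one column by the ordering of another: pair the columns up,
# sort the pairs by the key column, then project the sorted column back out.
groups = ["one", "one", "one", "two", "two", "two"]
values = [1, 98, 2, 3, 99, 4]

sorted_groups = groups.zip(values).sort_by { |_, v| v }.map(&:first)
# => ["one", "one", "two", "two", "one", "two"]
```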

#sqrtExpr

Compute the square root of the elements.

Examples:

df = Polars::DataFrame.new({"values" => [1.0, 2.0, 4.0]})
df.select(Polars.col("values").sqrt)
# =>
# shape: (3, 1)
# ┌──────────┐
# │ values   │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.0      │
# │ 1.414214 │
# │ 2.0      │
# └──────────┘

Returns:



309
310
311
# File 'lib/polars/expr.rb', line 309

def sqrt
  wrap_expr(_rbexpr.sqrt)
end

#std(ddof: 1) ⇒ Expr

Get standard deviation.

Examples:

df = Polars::DataFrame.new({"a" => [-1, 0, 1]})
df.select(Polars.col("a").std)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.0 │
# └─────┘

Parameters:

  • ddof (Integer) (defaults to: 1)

    Degrees of freedom.

Returns:



2315
2316
2317
# File 'lib/polars/expr.rb', line 2315

def std(ddof: 1)
  wrap_expr(_rbexpr.std(ddof))
end
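
The role of ddof can be sketched in plain Ruby: the sum of squared deviations is divided by (n - ddof), so ddof: 1 gives the sample standard deviation. This is an illustration of the formula, not Polars' implementation:

```ruby
# Standard deviation with a delta-degrees-of-freedom parameter.
def std_demo(xs, ddof: 1)
  n = xs.length.to_f
  mean = xs.sum / n
  Math.sqrt(xs.sum { |x| (x - mean)**2 } / (n - ddof))
end

std_demo([-1, 0, 1]) # => 1.0
```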

#strStringExpr

Create an object namespace of all string-related methods.

Returns:



8364
8365
8366
# File 'lib/polars/expr.rb', line 8364

def str
  StringExpr.new(self)
end

#structStructExpr

Create an object namespace of all struct-related methods.

Returns:



8371
8372
8373
# File 'lib/polars/expr.rb', line 8371

def struct
  StructExpr.new(self)
end

#sub(other) ⇒ Expr

Method equivalent of subtraction operator expr - other.

Examples:

df = Polars::DataFrame.new({"x" => [0, 1, 2, 3, 4]})
df.with_columns(
  Polars.col("x").sub(2).alias("x-2"),
  Polars.col("x").sub(Polars.col("x").cum_sum).alias("x-expr"),
)
# =>
# shape: (5, 3)
# ┌─────┬─────┬────────┐
# │ x   ┆ x-2 ┆ x-expr │
# │ --- ┆ --- ┆ ---    │
# │ i64 ┆ i64 ┆ i64    │
# ╞═════╪═════╪════════╡
# │ 0   ┆ -2  ┆ 0      │
# │ 1   ┆ -1  ┆ 0      │
# │ 2   ┆ 0   ┆ -1     │
# │ 3   ┆ 1   ┆ -3     │
# │ 4   ┆ 2   ┆ -6     │
# └─────┴─────┴────────┘

Parameters:

  • other (Object)

    Numeric literal or expression value.

Returns:



4191
4192
4193
# File 'lib/polars/expr.rb', line 4191

def sub(other)
  self - other
end

#suffix(suffix) ⇒ Expr

Add a suffix to the root column name of the expression.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, 3],
    "b" => ["x", "y", "z"]
  }
)
df.with_columns(Polars.all.reverse.name.suffix("_reverse"))
# =>
# shape: (3, 4)
# ┌─────┬─────┬───────────┬───────────┐
# │ a   ┆ b   ┆ a_reverse ┆ b_reverse │
# │ --- ┆ --- ┆ ---       ┆ ---       │
# │ i64 ┆ str ┆ i64       ┆ str       │
# ╞═════╪═════╪═══════════╪═══════════╡
# │ 1   ┆ x   ┆ 3         ┆ z         │
# │ 2   ┆ y   ┆ 2         ┆ y         │
# │ 3   ┆ z   ┆ 1         ┆ x         │
# └─────┴─────┴───────────┴───────────┘

Returns:



528
529
530
# File 'lib/polars/expr.rb', line 528

def suffix(suffix)
  name.suffix(suffix)
end

#sumExpr

Note:

Dtypes in :i8, :u8, :i16, and :u16 are cast to :i64 before summing to prevent overflow issues.

Get sum value.

Examples:

df = Polars::DataFrame.new({"a" => [-1, 0, 1]})
df.select(Polars.col("a").sum)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 0   │
# └─────┘

Returns:



2442
2443
2444
# File 'lib/polars/expr.rb', line 2442

def sum
  wrap_expr(_rbexpr.sum)
end

#tail(n = 10) ⇒ Expr

Get the last n rows.

Examples:

df = Polars::DataFrame.new({"foo" => [1, 2, 3, 4, 5, 6, 7]})
df.tail(3)
# =>
# shape: (3, 1)
# ┌─────┐
# │ foo │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 5   │
# │ 6   │
# │ 7   │
# └─────┘

Parameters:

  • n (Integer) (defaults to: 10)

    Number of rows to return.

Returns:



3642
3643
3644
# File 'lib/polars/expr.rb', line 3642

def tail(n = 10)
  wrap_expr(_rbexpr.tail(n))
end

#tanExpr

Compute the element-wise value for the tangent.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").tan)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 1.557408 │
# └──────────┘

Returns:



6909
6910
6911
# File 'lib/polars/expr.rb', line 6909

def tan
  wrap_expr(_rbexpr.tan)
end

#tanhExpr

Compute the element-wise value for the hyperbolic tangent.

Examples:

df = Polars::DataFrame.new({"a" => [1.0]})
df.select(Polars.col("a").tanh)
# =>
# shape: (1, 1)
# ┌──────────┐
# │ a        │
# │ ---      │
# │ f64      │
# ╞══════════╡
# │ 0.761594 │
# └──────────┘

Returns:



7049
7050
7051
# File 'lib/polars/expr.rb', line 7049

def tanh
  wrap_expr(_rbexpr.tanh)
end

#to_physicalExpr

Cast to physical representation of the logical dtype.

  • :date -> :i32
  • :datetime -> :i64
  • :time -> :i64
  • :duration -> :i64
  • :cat -> :u32
  • Other data types will be left unchanged.

Examples:

Polars::DataFrame.new({"vals" => ["a", "x", nil, "a"]}).with_columns(
  [
    Polars.col("vals").cast(:cat),
    Polars.col("vals")
      .cast(:cat)
      .to_physical
      .alias("vals_physical")
  ]
)
# =>
# shape: (4, 2)
# ┌──────┬───────────────┐
# │ vals ┆ vals_physical │
# │ ---  ┆ ---           │
# │ cat  ┆ u32           │
# ╞══════╪═══════════════╡
# │ a    ┆ 0             │
# │ x    ┆ 1             │
# │ null ┆ null          │
# │ a    ┆ 0             │
# └──────┴───────────────┘

Returns:



216
217
218
# File 'lib/polars/expr.rb', line 216

def to_physical
  wrap_expr(_rbexpr.to_physical)
end

#to_sString Also known as: inspect

Returns a string representing the Expr.

Returns:



20
21
22
# File 'lib/polars/expr.rb', line 20

def to_s
  _rbexpr.to_str
end

#top_k(k: 5) ⇒ Expr

Return the k largest elements.

To return the k smallest elements, use #bottom_k.

Examples:

df = Polars::DataFrame.new(
  {
    "value" => [1, 98, 2, 3, 99, 4]
  }
)
df.select(
  [
    Polars.col("value").top_k.alias("top_k"),
    Polars.col("value").bottom_k.alias("bottom_k")
  ]
)
# =>
# shape: (5, 2)
# ┌───────┬──────────┐
# │ top_k ┆ bottom_k │
# │ ---   ┆ ---      │
# │ i64   ┆ i64      │
# ╞═══════╪══════════╡
# │ 99    ┆ 1        │
# │ 98    ┆ 2        │
# │ 4     ┆ 3        │
# │ 3     ┆ 4        │
# │ 2     ┆ 98       │
# └───────┴──────────┘

Parameters:

  • k (Integer) (defaults to: 5)

    Number of elements to return.

Returns:



1487
1488
1489
1490
# File 'lib/polars/expr.rb', line 1487

def top_k(k: 5)
  k = Utils.parse_into_expression(k)
  wrap_expr(_rbexpr.top_k(k))
end
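
The example output can be reproduced in plain Ruby with Enumerable#max(n) and #min(n), which return the n largest (descending) and n smallest (ascending) elements:

```ruby
# Plain-Ruby counterparts of top_k and bottom_k on the example data.
values = [1, 98, 2, 3, 99, 4]
values.max(5) # => [99, 98, 4, 3, 2]
values.min(5) # => [1, 2, 3, 4, 98]
```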

#top_k_by(by, k: 5, reverse: false) ⇒ Expr

Return the elements corresponding to the k largest elements of the by column(s).

Non-null elements are always preferred over null elements, regardless of the value of reverse. The output is not guaranteed to be in any particular order; call #sort after this function if you wish the output to be sorted.

Examples:

df = Polars::DataFrame.new(
  {
    "a" => [1, 2, 3, 4, 5, 6],
    "b" => [6, 5, 4, 3, 2, 1],
    "c" => ["Apple", "Orange", "Apple", "Apple", "Banana", "Banana"]
  }
)
# =>
# shape: (6, 3)
# ┌─────┬─────┬────────┐
# │ a   ┆ b   ┆ c      │
# │ --- ┆ --- ┆ ---    │
# │ i64 ┆ i64 ┆ str    │
# ╞═════╪═════╪════════╡
# │ 1   ┆ 6   ┆ Apple  │
# │ 2   ┆ 5   ┆ Orange │
# │ 3   ┆ 4   ┆ Apple  │
# │ 4   ┆ 3   ┆ Apple  │
# │ 5   ┆ 2   ┆ Banana │
# │ 6   ┆ 1   ┆ Banana │
# └─────┴─────┴────────┘

Get the top 2 rows by column a or b.

df.select(
  Polars.all.top_k_by("a", k: 2).name.suffix("_top_by_a"),
  Polars.all.top_k_by("b", k: 2).name.suffix("_top_by_b")
)
# =>
# shape: (2, 6)
# ┌────────────┬────────────┬────────────┬────────────┬────────────┬────────────┐
# │ a_top_by_a ┆ b_top_by_a ┆ c_top_by_a ┆ a_top_by_b ┆ b_top_by_b ┆ c_top_by_b │
# │ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        │
# │ i64        ┆ i64        ┆ str        ┆ i64        ┆ i64        ┆ str        │
# ╞════════════╪════════════╪════════════╪════════════╪════════════╪════════════╡
# │ 6          ┆ 1          ┆ Banana     ┆ 1          ┆ 6          ┆ Apple      │
# │ 5          ┆ 2          ┆ Banana     ┆ 2          ┆ 5          ┆ Orange     │
# └────────────┴────────────┴────────────┴────────────┴────────────┴────────────┘

Get the top 2 rows by multiple columns with given order.

df.select(
  Polars.all
  .top_k_by(["c", "a"], k: 2, reverse: [false, true])
  .name.suffix("_by_ca"),
  Polars.all
  .top_k_by(["c", "b"], k: 2, reverse: [false, true])
  .name.suffix("_by_cb")
)
# =>
# shape: (2, 6)
# ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
# │ a_by_ca ┆ b_by_ca ┆ c_by_ca ┆ a_by_cb ┆ b_by_cb ┆ c_by_cb │
# │ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     ┆ ---     │
# │ i64     ┆ i64     ┆ str     ┆ i64     ┆ i64     ┆ str     │
# ╞═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
# │ 2       ┆ 5       ┆ Orange  ┆ 2       ┆ 5       ┆ Orange  │
# │ 5       ┆ 2       ┆ Banana  ┆ 6       ┆ 1       ┆ Banana  │
# └─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘

Get the top 2 rows by column a in each group.

df.group_by("c", maintain_order: true)
  .agg(Polars.all.top_k_by("a", k: 2))
  .explode(Polars.all.exclude("c"))
# =>
# shape: (5, 3)
# ┌────────┬─────┬─────┐
# │ c      ┆ a   ┆ b   │
# │ ---    ┆ --- ┆ --- │
# │ str    ┆ i64 ┆ i64 │
# ╞════════╪═════╪═════╡
# │ Apple  ┆ 4   ┆ 3   │
# │ Apple  ┆ 3   ┆ 4   │
# │ Orange ┆ 2   ┆ 5   │
# │ Banana ┆ 6   ┆ 1   │
# │ Banana ┆ 5   ┆ 2   │
# └────────┴─────┴─────┘

Parameters:

  • by (Object)

    Column(s) used to determine the largest elements. Accepts expression input. Strings are parsed as column names.

  • k (Integer) (defaults to: 5)

    Number of elements to return.

  • reverse (Object) (defaults to: false)

    Consider the k smallest elements of the by column(s) (instead of the k largest). This can be specified per column by passing a sequence of booleans.

Returns:



1587
1588
1589
1590
1591
1592
1593
1594
1595
1596
# File 'lib/polars/expr.rb', line 1587

def top_k_by(
  by,
  k: 5,
  reverse: false
)
  k = Utils.parse_into_expression(k)
  by = Utils.parse_into_list_of_expressions(by)
  reverse = Utils.extend_bool(reverse, by.length, "reverse", "by")
  wrap_expr(_rbexpr.top_k_by(by, k, reverse))
end

#truediv(other) ⇒ Expr

Method equivalent of float division operator expr / other.

Examples:

df = Polars::DataFrame.new(
  {"x" => [-2, -1, 0, 1, 2], "y" => [0.5, 0.0, 0.0, -4.0, -0.5]}
)
df.with_columns(
  Polars.col("x").truediv(2).alias("x/2"),
  Polars.col("x").truediv(Polars.col("y")).alias("x/y")
)
# =>
# shape: (5, 4)
# ┌─────┬──────┬──────┬───────┐
# │ x   ┆ y    ┆ x/2  ┆ x/y   │
# │ --- ┆ ---  ┆ ---  ┆ ---   │
# │ i64 ┆ f64  ┆ f64  ┆ f64   │
# ╞═════╪══════╪══════╪═══════╡
# │ -2  ┆ 0.5  ┆ -1.0 ┆ -4.0  │
# │ -1  ┆ 0.0  ┆ -0.5 ┆ -inf  │
# │ 0   ┆ 0.0  ┆ 0.0  ┆ NaN   │
# │ 1   ┆ -4.0 ┆ 0.5  ┆ -0.25 │
# │ 2   ┆ -0.5 ┆ 1.0  ┆ -4.0  │
# └─────┴──────┴──────┴───────┘

Parameters:

  • other (Object)

    Numeric literal or expression value.

Returns:



4246
4247
4248
# File 'lib/polars/expr.rb', line 4246

def truediv(other)
  self / other
end
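
The -inf and NaN entries in the example follow IEEE 754 float division, which plain Ruby floats reproduce:

```ruby
# IEEE 754 division: a nonzero float divided by 0.0 is signed infinity,
# and 0.0 / 0.0 is NaN.
-1.0 / 0.0       # => -Infinity
(0.0 / 0.0).nan? # => true
1.0 / -4.0       # => -0.25
```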

#unique(maintain_order: false) ⇒ Expr

Get unique values of this expression.

Examples:

df = Polars::DataFrame.new({"a" => [1, 1, 2]})
df.select(Polars.col("a").unique(maintain_order: true))
# =>
# shape: (2, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ i64 │
# ╞═════╡
# │ 1   │
# │ 2   │
# └─────┘

Parameters:

  • maintain_order (Boolean) (defaults to: false)

    Maintain order of data. This requires more work.

Returns:



2660
2661
2662
2663
2664
2665
2666
# File 'lib/polars/expr.rb', line 2660

def unique(maintain_order: false)
  if maintain_order
    wrap_expr(_rbexpr.unique_stable)
  else
    wrap_expr(_rbexpr.unique)
  end
end

#unique_countsExpr

Return a count of the unique values in the order of appearance.

This method differs from value_counts in that it does not return the values, only the counts, and might be faster.

Examples:

df = Polars::DataFrame.new(
  {
    "id" => ["a", "b", "b", "c", "c", "c"]
  }
)
df.select(
  [
    Polars.col("id").unique_counts
  ]
)
# =>
# shape: (3, 1)
# ┌─────┐
# │ id  │
# │ --- │
# │ u32 │
# ╞═════╡
# │ 1   │
# │ 2   │
# │ 3   │
# └─────┘

Returns:



7580
7581
7582
# File 'lib/polars/expr.rb', line 7580

def unique_counts
  wrap_expr(_rbexpr.unique_counts)
end
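
"In the order of appearance" can be sketched with Ruby's Hash#tally, whose hash preserves first-appearance order, so its values are exactly the counts in appearance order:

```ruby
# Counts in order of first appearance, matching the example above.
ids = ["a", "b", "b", "c", "c", "c"]
ids.tally.values # => [1, 2, 3]
```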

#upper_boundExpr

Calculate the upper bound.

Returns a unit Series with the highest value possible for the dtype of this expression.

Examples:

df = Polars::DataFrame.new({"a" => [1, 2, 3, 2, 1]})
df.select(Polars.col("a").upper_bound)
# =>
# shape: (1, 1)
# ┌─────────────────────┐
# │ a                   │
# │ ---                 │
# │ i64                 │
# ╞═════════════════════╡
# │ 9223372036854775807 │
# └─────────────────────┘

Returns:



6825
6826
6827
# File 'lib/polars/expr.rb', line 6825

def upper_bound
  wrap_expr(_rbexpr.upper_bound)
end

#value_counts(sort: false, parallel: false, name: nil, normalize: false) ⇒ Expr

Count all unique values and create a struct mapping value to count.

Examples:

df = Polars::DataFrame.new(
  {
    "id" => ["a", "b", "b", "c", "c", "c"]
  }
)
df.select(
  [
    Polars.col("id").value_counts(sort: true),
  ]
)
# =>
# shape: (3, 1)
# ┌───────────┐
# │ id        │
# │ ---       │
# │ struct[2] │
# ╞═══════════╡
# │ {"c",3}   │
# │ {"b",2}   │
# │ {"a",1}   │
# └───────────┘

Parameters:

  • sort (Boolean) (defaults to: false)

    Sort the output by count in descending order. If set to false (default), the order of the output is random.

  • parallel (Boolean) (defaults to: false)

    Execute the computation in parallel.

  • name (String) (defaults to: nil)

    Give the resulting count column a specific name; if normalize is true, defaults to "proportion", otherwise defaults to "count".

  • normalize (Boolean) (defaults to: false)

    If true, gives relative frequencies of the unique values.

Returns:



7533
7534
7535
7536
7537
7538
7539
7540
7541
7542
7543
7544
7545
7546
7547
7548
7549
# File 'lib/polars/expr.rb', line 7533

def value_counts(
  sort: false,
  parallel: false,
  name: nil,
  normalize: false
)
  if name.nil?
    if normalize
      name = "proportion"
    else
      name = "count"
    end
  end
  wrap_expr(
    _rbexpr.value_counts(sort, parallel, name, normalize)
  )
end
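
The count/proportion behavior can be sketched in plain Ruby (an illustration of the semantics; the method and its name are hypothetical, not part of the Polars API):

```ruby
# Count unique values; with normalize, convert counts to relative
# frequencies by dividing each count by the total length.
def value_counts_demo(xs, normalize: false)
  counts = xs.tally
  normalize ? counts.transform_values { |c| c.fdiv(xs.length) } : counts
end

value_counts_demo(["a", "b", "b", "c", "c", "c"])
# => {"a"=>1, "b"=>2, "c"=>3}
value_counts_demo(["a", "b", "b", "c", "c", "c"], normalize: true)["c"]
# => 0.5
```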

#var(ddof: 1) ⇒ Expr

Get variance.

Examples:

df = Polars::DataFrame.new({"a" => [-1, 0, 1]})
df.select(Polars.col("a").var)
# =>
# shape: (1, 1)
# ┌─────┐
# │ a   │
# │ --- │
# │ f64 │
# ╞═════╡
# │ 1.0 │
# └─────┘

Parameters:

  • ddof (Integer) (defaults to: 1)

    Degrees of freedom.

Returns:



2338
2339
2340
# File 'lib/polars/expr.rb', line 2338

def var(ddof: 1)
  wrap_expr(_rbexpr.var(ddof))
end

#where(predicate) ⇒ Expr

Filter a single column.

Alias for #filter.

Examples:

df = Polars::DataFrame.new(
  {
    "group_col" => ["g1", "g1", "g2"],
    "b" => [1, 2, 3]
  }
)
(
  df.group_by("group_col").agg(
    [
      Polars.col("b").where(Polars.col("b") < 2).sum.alias("lt"),
      Polars.col("b").where(Polars.col("b") >= 2).sum.alias("gte")
    ]
  )
).sort("group_col")
# =>
# shape: (2, 3)
# ┌───────────┬─────┬─────┐
# │ group_col ┆ lt  ┆ gte │
# │ ---       ┆ --- ┆ --- │
# │ str       ┆ i64 ┆ i64 │
# ╞═══════════╪═════╪═════╡
# │ g1        ┆ 1   ┆ 2   │
# │ g2        ┆ 0   ┆ 3   │
# └───────────┴─────┴─────┘

Parameters:

  • predicate (Expr)

    Boolean expression.

Returns:



3367
3368
3369
# File 'lib/polars/expr.rb', line 3367

def where(predicate)
  filter(predicate)
end

#xor(other) ⇒ Expr

Method equivalent of bitwise exclusive-or operator expr ^ other.

Examples:

df = Polars::DataFrame.new(
  {"x" => [true, false, true, false], "y" => [true, true, false, false]}
)
df.with_columns(Polars.col("x").xor(Polars.col("y")).alias("x ^ y"))
# =>
# shape: (4, 3)
# ┌───────┬───────┬───────┐
# │ x     ┆ y     ┆ x ^ y │
# │ ---   ┆ ---   ┆ ---   │
# │ bool  ┆ bool  ┆ bool  │
# ╞═══════╪═══════╪═══════╡
# │ true  ┆ true  ┆ false │
# │ false ┆ true  ┆ true  │
# │ true  ┆ false ┆ true  │
# │ false ┆ false ┆ false │
# └───────┴───────┴───────┘

Parameters:

  • other (Object)

    Integer or boolean value; accepts expression input.

Returns:



4300
4301
4302
# File 'lib/polars/expr.rb', line 4300

def xor(other)
  self ^ other
end
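
Ruby's own ^ operator has the same truth table on booleans (and acts bitwise on integers), which is the behavior this method delegates to:

```ruby
# Exclusive or: true when exactly one operand is true; bitwise on integers.
true ^ false # => true
true ^ true  # => false
5 ^ 3        # => 6  (0b101 ^ 0b011 == 0b110)
```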

#|(other) ⇒ Expr

Bitwise OR.

Returns:



44
45
46
47
# File 'lib/polars/expr.rb', line 44

def |(other)
  other = Utils.parse_into_expression(other)
  wrap_expr(_rbexpr.or_(other))
end