Method: Polars::LazyFrame#unique

Defined in:
lib/polars/lazy_frame.rb

#unique(maintain_order: true, subset: nil, keep: "first") ⇒ LazyFrame

Drop duplicate rows from this LazyFrame.

Note that this fails if the LazyFrame or the subset contains a column of type List.

Examples:

lf = Polars::LazyFrame.new(
  {
    "foo" => [1, 2, 3, 1],
    "bar" => ["a", "a", "a", "a"],
    "ham" => ["b", "b", "b", "b"]
  }
)
lf.unique(maintain_order: true).collect
# =>
# shape: (3, 3)
# ┌─────┬─────┬─────┐
# │ foo ┆ bar ┆ ham │
# │ --- ┆ --- ┆ --- │
# │ i64 ┆ str ┆ str │
# ╞═════╪═════╪═════╡
# │ 1   ┆ a   ┆ b   │
# │ 2   ┆ a   ┆ b   │
# │ 3   ┆ a   ┆ b   │
# └─────┴─────┴─────┘
lf.unique(subset: ["bar", "ham"], maintain_order: true).collect
# =>
# shape: (1, 3)
# ┌─────┬─────┬─────┐
# │ foo ┆ bar ┆ ham │
# │ --- ┆ --- ┆ --- │
# │ i64 ┆ str ┆ str │
# ╞═════╪═════╪═════╡
# │ 1   ┆ a   ┆ b   │
# └─────┴─────┴─────┘
lf.unique(keep: "last", maintain_order: true).collect
# =>
# shape: (3, 3)
# ┌─────┬─────┬─────┐
# │ foo ┆ bar ┆ ham │
# │ --- ┆ --- ┆ --- │
# │ i64 ┆ str ┆ str │
# ╞═════╪═════╪═════╡
# │ 2   ┆ a   ┆ b   │
# │ 3   ┆ a   ┆ b   │
# │ 1   ┆ a   ┆ b   │
# └─────┴─────┴─────┘

Parameters:

  • maintain_order (Boolean) (defaults to: true)

    Keep the rows in the same order as the original LazyFrame. This requires more work to compute.

  • subset (Object) (defaults to: nil)

    Column name(s) to consider when identifying duplicate rows. When nil, all columns are compared.

  • keep ("first", "last") (defaults to: "first")

    Which of the duplicate rows to keep.
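
    The interaction between subset, keep, and maintain_order can be sketched in plain Ruby. This is only a rough model of the semantics under the assumption that rows are hashes; it is not how Polars implements the operation, and unique_rows is a hypothetical helper name that does not exist in the library.

```ruby
# Plain-Ruby model of the semantics; NOT the Polars implementation.
# Rows are modelled as hashes; `unique_rows` is a hypothetical helper.
def unique_rows(rows, subset: nil, keep: "first")
  unless ["first", "last"].include?(keep)
    raise ArgumentError, 'keep must be "first" or "last"'
  end

  # The comparison key is the whole row, or only the subset columns if given.
  key_of = ->(row) { subset.nil? ? row : row.slice(*Array(subset)) }

  # Map each key to the index of the row that should survive.
  winners = {}
  rows.each_with_index do |row, i|
    key = key_of.call(row)
    winners[key] = i if keep == "last" || !winners.key?(key)
  end

  # Emit survivors at their original positions (models maintain_order: true).
  winners.values.sort.map { |i| rows[i] }
end

rows = [
  {"foo" => 1, "bar" => "a"},
  {"foo" => 2, "bar" => "a"},
  {"foo" => 3, "bar" => "a"},
  {"foo" => 1, "bar" => "a"}
]

first_kept = unique_rows(rows).map { |r| r["foo"] }
last_kept = unique_rows(rows, keep: "last").map { |r| r["foo"] }
by_bar = unique_rows(rows, subset: ["bar"]).map { |r| r["foo"] }
```

    In this model, keep: "last" drops the first of the two identical rows, so the surviving foo == 1 row moves to the end, which mirrors the third example above.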

Returns:

  • (LazyFrame)

# File 'lib/polars/lazy_frame.rb', line 3925

def unique(maintain_order: true, subset: nil, keep: "first")
  selector_subset = nil
  if !subset.nil?
    selector_subset = Utils.parse_list_into_selector(subset)._rbselector
  end
  _from_rbldf(_ldf.unique(maintain_order, selector_subset, keep))
end