Class: Polars::CatExpr
Inherits: Object
Defined in: lib/polars/cat_expr.rb
Overview
Namespace for categorical-related expressions.
Instance Method Summary
- #ends_with(suffix) ⇒ Expr
  Check if string representations of values end with a substring.
- #get_categories ⇒ Expr
  Get the categories stored in this data type.
- #len_bytes ⇒ Expr
  Return the byte-length of the string representation of each value.
- #len_chars ⇒ Expr
  Return the number of characters of the string representation of each value.
- #slice(offset, length = nil) ⇒ Expr
  Extract a substring from the string representation of each value.
- #starts_with(prefix) ⇒ Expr
  Check if string representations of values start with a substring.
Instance Method Details
#ends_with(suffix) ⇒ Expr
Check if string representations of values end with a substring.
Note: Whereas str.ends_with allows expression inputs, cat.ends_with requires a literal string value.
# File 'lib/polars/cat_expr.rb', line 195
def ends_with(suffix)
  if !suffix.is_a?(::String)
    msg = "'suffix' must be a string; found #{suffix.inspect}"
    raise TypeError, msg
  end
  Utils.wrap_expr(_rbexpr.cat_ends_with(suffix))
end
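The literal-only rule can be tried without Polars installed. The following is a minimal plain-Ruby sketch of the same guard and suffix check. `suffix_mask` is a hypothetical helper, not part of the Polars API, and the array merely stands in for a categorical column. It also assumes that null values stay null (`nil` here) rather than becoming `false`.

```ruby
# Hypothetical stand-in for cat.ends_with; not part of Polars.
def suffix_mask(values, suffix)
  # Same guard as CatExpr#ends_with: only a literal String is accepted.
  unless suffix.is_a?(::String)
    raise TypeError, "'suffix' must be a string; found #{suffix.inspect}"
  end
  # Assumption: nil (null) inputs stay nil instead of becoming false.
  values.map { |v| v&.end_with?(suffix) }
end

fruits = ["apple", "mango", nil]
p suffix_mask(fruits, "go") # => [false, true, nil]

begin
  suffix_mask(fruits, :go) # a Symbol is not a String, so the guard fires
rescue TypeError => e
  puts e.message # => 'suffix' must be a string; found :go
end
```

Because the check runs before any expression is built, a non-string argument fails immediately at call time.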
#get_categories ⇒ Expr
Get the categories stored in this data type.
# File 'lib/polars/cat_expr.rb', line 32
def get_categories
  Utils.wrap_expr(_rbexpr.cat_get_categories)
end
#len_bytes ⇒ Expr
Return the byte-length of the string representation of each value.
Note: When working with non-ASCII text, the length in bytes is not the same as the length in characters. You may want to use len_chars instead. Note that len_bytes is much more performant (O(1)) than len_chars (O(n)).
# =>
# shape: (4, 3)
# ┌──────┬─────────┬─────────┐
# │ a    ┆ n_bytes ┆ n_chars │
# │ ---  ┆ ---     ┆ ---     │
# │ cat  ┆ u32     ┆ u32     │
# ╞══════╪═════════╪═════════╡
# │ Café ┆ 5       ┆ 4       │
# │ 345  ┆ 3       ┆ 3       │
# │ 東京 ┆ 6       ┆ 2       │
# │ null ┆ null    ┆ null    │
# └──────┴─────────┴─────────┘
# File 'lib/polars/cat_expr.rb', line 66
def len_bytes
  Utils.wrap_expr(_rbexpr.cat_len_bytes)
end
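The byte and character counts in the table above can be checked with plain Ruby strings, which are UTF-8 by default. This is only an illustration of the byte/character distinction, not a call into Polars; the array stands in for the categorical column `a`.

```ruby
# Plain Ruby strings (UTF-8) standing in for the categorical values.
values = ["Café", "345", "東京", nil]

rows = values.map do |v|
  # bytesize is the number of bytes; length is the number of characters.
  # The safe-navigation operator keeps nil as nil, like a null row.
  [v, v&.bytesize, v&.length]
end

rows.each do |value, n_bytes, n_chars|
  puts [value || "null", n_bytes || "null", n_chars || "null"].join(" | ")
end
# Café | 5 | 4
# 345 | 3 | 3
# 東京 | 6 | 2
# null | null | null
```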
#len_chars ⇒ Expr
Return the number of characters of the string representation of each value.
Note: When working with ASCII text, use len_bytes instead to achieve equivalent output with much better performance: len_bytes runs in O(1), while len_chars runs in O(n).
A character is defined as a Unicode scalar value. A single character is represented by a single byte when working with ASCII text, and a maximum of 4 bytes otherwise.
# File 'lib/polars/cat_expr.rb', line 103
def len_chars
  Utils.wrap_expr(_rbexpr.cat_len_chars)
end
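To make the definition of a character concrete, here is a small plain-Ruby sketch (no Polars involved). It relies on Ruby's `String#length` counting code points in a UTF-8 string, which for valid text matches the Unicode scalar value count described above. The sample strings are illustrative.

```ruby
# One character each, occupying 1 to 4 bytes in UTF-8.
samples = ["a", "\u00E9", "\u6771", "\u{1F600}"]
widths = samples.map { |s| [s.length, s.bytesize] }
p widths # => [[1, 1], [1, 2], [1, 3], [1, 4]]

# A base letter plus a combining accent renders as one glyph
# but consists of two scalar values.
combined = "e\u0301"
p combined.length                   # => 2
p combined.grapheme_clusters.length # => 1

# For ASCII-only text the byte count and character count coincide.
ascii = "345"
p ascii.ascii_only? && ascii.bytesize == ascii.length # => true
```

Text that looks like a single character on screen can therefore count as more than one character under this definition.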
#slice(offset, length = nil) ⇒ Expr
Extract a substring from the string representation of each value.
Note: Both the offset and length inputs are defined in terms of the number of characters in the (UTF-8) string. A character is defined as a Unicode scalar value. A single character is represented by a single byte when working with ASCII text, and a maximum of 4 bytes otherwise.
# File 'lib/polars/cat_expr.rb', line 256
def slice(offset, length = nil)
  Utils.wrap_expr(_rbexpr.cat_slice(offset, length))
end
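Character-based offsets matter most for multi-byte text. The sketch below uses a hypothetical plain-Ruby helper, `char_slice`, to mimic the `slice(offset, length = nil)` signature on a single string. It is not the Polars implementation, and it assumes that a `nil` length means "to the end of the string".

```ruby
# Hypothetical helper mirroring slice(offset, length = nil) on one String.
def char_slice(str, offset, length = nil)
  # String#[] indexes UTF-8 strings by character, not by byte.
  # Assumption: a nil length extends the slice to the end of the string.
  length.nil? ? str[offset..] : str[offset, length]
end

p char_slice("pineapple", 0, 4)     # => "pine"
puts char_slice("東京タワー", 0, 2) # => 東京
puts char_slice("東京タワー", 2)    # => タワー

# Byte-based slicing of the same text cuts through a 3-byte character.
p "東京タワー".byteslice(0, 2).valid_encoding? # => false
```

Counting in characters keeps every result a valid string, at the cost of scanning the text to locate character boundaries.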
#starts_with(prefix) ⇒ Expr
Check if string representations of values start with a substring.
Note: Whereas str.starts_with allows expression inputs, cat.starts_with requires a literal string value.
# File 'lib/polars/cat_expr.rb', line 148
def starts_with(prefix)
  if !prefix.is_a?(::String)
    msg = "'prefix' must be a string; found #{prefix.inspect}"
    raise TypeError, msg
  end
  Utils.wrap_expr(_rbexpr.cat_starts_with(prefix))
end