Module: Polars
- Extended by:
- Convert, Functions, IO, LazyFunctions
- Defined in:
- lib/polars.rb,
lib/polars/io.rb,
lib/polars/expr.rb,
lib/polars/plot.rb,
lib/polars/when.rb,
lib/polars/slice.rb,
lib/polars/utils.rb,
lib/polars/config.rb,
lib/polars/series.rb,
lib/polars/convert.rb,
lib/polars/version.rb,
lib/polars/cat_expr.rb,
lib/polars/group_by.rb,
lib/polars/functions.rb,
lib/polars/list_expr.rb,
lib/polars/meta_expr.rb,
lib/polars/name_expr.rb,
lib/polars/when_then.rb,
lib/polars/array_expr.rb,
lib/polars/data_frame.rb,
lib/polars/data_types.rb,
lib/polars/exceptions.rb,
lib/polars/lazy_frame.rb,
lib/polars/binary_expr.rb,
lib/polars/sql_context.rb,
lib/polars/string_expr.rb,
lib/polars/struct_expr.rb,
lib/polars/expr_dispatch.rb,
lib/polars/lazy_group_by.rb,
lib/polars/cat_name_space.rb,
lib/polars/date_time_expr.rb,
lib/polars/lazy_functions.rb,
lib/polars/list_name_space.rb,
lib/polars/array_name_space.rb,
lib/polars/dynamic_group_by.rb,
lib/polars/rolling_group_by.rb,
lib/polars/binary_name_space.rb,
lib/polars/string_name_space.rb,
lib/polars/struct_name_space.rb,
lib/polars/batched_csv_reader.rb,
lib/polars/date_time_name_space.rb
Defined Under Namespace
Modules: Convert, Functions, IO, LazyFunctions, Plot
Classes: Array, ArrayExpr, ArrayNameSpace, Binary, BinaryExpr, BinaryNameSpace, Boolean, CatExpr, CatNameSpace, Categorical, Config, DataFrame, DataType, Date, DateTimeExpr, DateTimeNameSpace, Datetime, Decimal, Duration, DynamicGroupBy, Expr, Field, Float32, Float64, FloatType, FractionalType, GroupBy, Int16, Int32, Int64, Int8, IntegralType, LazyFrame, LazyGroupBy, List, ListExpr, ListNameSpace, MetaExpr, NameExpr, NestedType, Null, NumericType, Object, RollingGroupBy, SQLContext, Series, String, StringExpr, StringNameSpace, Struct, StructExpr, StructNameSpace, TemporalType, Time, UInt16, UInt32, UInt64, UInt8, Unknown
Constant Summary
- Utf8 = String
Allow Utf8 as an alias for String.
Class Method Summary
-
.align_frames(*frames, on:, select: nil, reverse: false) ⇒ Object
extended
from Functions
Align a sequence of frames using the unique values from one or more columns as a key.
-
.all(name = nil) ⇒ Expr
extended
from LazyFunctions
Do one of two things.
-
.any(name) ⇒ Expr
extended
from LazyFunctions
Evaluate columnwise or elementwise with a bitwise OR operation.
-
.arg_sort_by(exprs, reverse: false) ⇒ Expr
(also: #argsort_by)
extended
from LazyFunctions
Find the indexes that would sort the columns.
-
.arg_where(condition, eager: false) ⇒ Expr, Series
extended
from LazyFunctions
Return indices where condition evaluates true.
-
.avg(column) ⇒ Expr, Float
extended
from LazyFunctions
Get the mean value.
-
.coalesce(exprs, *more_exprs) ⇒ Expr
extended
from LazyFunctions
Folds the expressions from left to right, keeping the first non-null value.
-
.col(name) ⇒ Expr
extended
from LazyFunctions
Return an expression representing a column in a DataFrame.
-
.collect_all(lazy_frames, type_coercion: true, predicate_pushdown: true, projection_pushdown: true, simplify_expression: true, string_cache: false, no_optimization: false, slice_pushdown: true, common_subplan_elimination: true, allow_streaming: false) ⇒ Array
extended
from LazyFunctions
Collect multiple LazyFrames at the same time.
-
.concat(items, rechunk: true, how: "vertical", parallel: true) ⇒ Object
extended
from Functions
Aggregate multiple DataFrames/Series into a single DataFrame/Series.
-
.concat_list(exprs) ⇒ Expr
extended
from LazyFunctions
Concat the arrays in a Series dtype List in linear time.
-
.concat_str(exprs, sep: "") ⇒ Expr
extended
from LazyFunctions
Horizontally concat Utf8 Series in linear time.
-
.count(column = nil) ⇒ Expr, Integer
extended
from LazyFunctions
Count the number of values in this column/context.
-
.cov(a, b) ⇒ Expr
extended
from LazyFunctions
Compute the covariance between two columns/expressions.
-
.cumfold(acc, f, exprs, include_init: false) ⇒ Object
extended
from LazyFunctions
Cumulatively accumulate over multiple columns horizontally/row wise with a left fold.
-
.cumsum(column) ⇒ Object
extended
from LazyFunctions
Cumulatively sum values in a column/Series, or horizontally across list of columns/expressions.
-
.date_range(start, stop, interval, lazy: false, closed: "both", name: nil, time_unit: nil, time_zone: nil) ⇒ Object
extended
from Functions
Create a range of type Datetime (or Date).
-
.duration(weeks: nil, days: nil, hours: nil, minutes: nil, seconds: nil, milliseconds: nil, microseconds: nil, nanoseconds: nil, time_unit: "us") ⇒ Expr
extended
from LazyFunctions
Create polars Duration from distinct time components.
-
.element ⇒ Expr
extended
from LazyFunctions
Alias for an element being evaluated in an eval expression.
-
.exclude(columns) ⇒ Object
extended
from LazyFunctions
Exclude certain columns from a wildcard/regex selection.
-
.first(column = nil) ⇒ Object
extended
from LazyFunctions
Get the first value.
-
.fold(acc, f, exprs) ⇒ Expr
extended
from LazyFunctions
Accumulate over multiple columns horizontally/row wise with a left fold.
-
.format(fstring, *args) ⇒ Expr
extended
from LazyFunctions
Format expressions as a string.
-
.from_epoch(column, unit: "s", eager: false) ⇒ Object
extended
from LazyFunctions
Utility function that parses an epoch timestamp (or Unix time) to Polars Date(time).
-
.from_hash(data, schema: nil, columns: nil) ⇒ DataFrame
extended
from Convert
Construct a DataFrame from a hash of sequences.
-
.get_dummies(df, columns: nil) ⇒ DataFrame
extended
from Functions
Convert categorical variables into dummy/indicator variables.
-
.groups(column) ⇒ Object
extended
from LazyFunctions
Syntactic sugar for Polars.col("foo").agg_groups.
-
.head(column, n = 10) ⇒ Object
extended
from LazyFunctions
Get the first n rows.
-
.int_range(start, stop, step: 1, eager: false, dtype: nil) ⇒ Expr, Series
(also: #arange)
extended
from LazyFunctions
Create a range expression (or Series).
-
.last(column = nil) ⇒ Object
extended
from LazyFunctions
Get the last value.
-
.lit(value, dtype: nil, allow_object: nil) ⇒ Expr
extended
from LazyFunctions
Return an expression representing a literal value.
-
.max(column) ⇒ Expr, Object
extended
from LazyFunctions
Get the maximum value.
-
.mean(column) ⇒ Expr, Float
extended
from LazyFunctions
Get the mean value.
-
.median(column) ⇒ Object
extended
from LazyFunctions
Get the median value.
-
.min(column) ⇒ Expr, Object
extended
from LazyFunctions
Get the minimum value.
-
.n_unique(column) ⇒ Object
extended
from LazyFunctions
Count unique values.
-
.ones(n, dtype: nil) ⇒ Series
extended
from Functions
Return a new Series of given length and type, filled with ones.
-
.pearson_corr(a, b, ddof: 1) ⇒ Expr
extended
from LazyFunctions
Compute the Pearson correlation between two columns.
-
.quantile(column, quantile, interpolation: "nearest") ⇒ Expr
extended
from LazyFunctions
Syntactic sugar for Polars.col("foo").quantile(...).
-
.read_avro(source, columns: nil, n_rows: nil) ⇒ DataFrame
extended
from IO
Read into a DataFrame from Apache Avro format.
-
.read_csv(source, has_header: true, columns: nil, new_columns: nil, sep: ",", comment_char: nil, quote_char: '"', skip_rows: 0, dtypes: nil, null_values: nil, ignore_errors: false, parse_dates: false, n_threads: nil, infer_schema_length: 100, batch_size: 8192, n_rows: nil, encoding: "utf8", low_memory: false, rechunk: true, storage_options: nil, skip_rows_after_header: 0, row_count_name: nil, row_count_offset: 0, sample_size: 1024, eol_char: "\n") ⇒ DataFrame
extended
from IO
Read a CSV file into a DataFrame.
-
.read_csv_batched(source, has_header: true, columns: nil, new_columns: nil, sep: ",", comment_char: nil, quote_char: '"', skip_rows: 0, dtypes: nil, null_values: nil, ignore_errors: false, parse_dates: false, n_threads: nil, infer_schema_length: 100, batch_size: 50_000, n_rows: nil, encoding: "utf8", low_memory: false, rechunk: true, skip_rows_after_header: 0, row_count_name: nil, row_count_offset: 0, sample_size: 1024, eol_char: "\n") ⇒ BatchedCsvReader
extended
from IO
Read a CSV file in batches.
-
.read_database(query) ⇒ DataFrame
(also: #read_sql)
extended
from IO
Read a SQL query into a DataFrame.
-
.read_ipc(source, columns: nil, n_rows: nil, memory_map: true, storage_options: nil, row_count_name: nil, row_count_offset: 0, rechunk: true) ⇒ DataFrame
extended
from IO
Read into a DataFrame from Arrow IPC (Feather v2) file.
-
.read_ipc_schema(source) ⇒ Hash
extended
from IO
Get a schema of the IPC file without reading data.
-
.read_json(source) ⇒ DataFrame
extended
from IO
Read into a DataFrame from a JSON file.
-
.read_ndjson(source) ⇒ DataFrame
extended
from IO
Read into a DataFrame from a newline delimited JSON file.
-
.read_parquet(source, columns: nil, n_rows: nil, storage_options: nil, parallel: "auto", row_count_name: nil, row_count_offset: 0, low_memory: false, use_statistics: true, rechunk: true) ⇒ DataFrame
extended
from IO
Read into a DataFrame from a parquet file.
-
.read_parquet_schema(source) ⇒ Hash
extended
from IO
Get a schema of the Parquet file without reading data.
-
.repeat(value, n, dtype: nil, eager: false, name: nil) ⇒ Expr
extended
from LazyFunctions
Repeat a single value n times.
-
.scan_csv(source, has_header: true, sep: ",", comment_char: nil, quote_char: '"', skip_rows: 0, dtypes: nil, null_values: nil, ignore_errors: false, cache: true, with_column_names: nil, infer_schema_length: 100, n_rows: nil, encoding: "utf8", low_memory: false, rechunk: true, skip_rows_after_header: 0, row_count_name: nil, row_count_offset: 0, parse_dates: false, eol_char: "\n") ⇒ LazyFrame
extended
from IO
Lazily read from a CSV file or multiple files via glob patterns.
-
.scan_ipc(source, n_rows: nil, cache: true, rechunk: true, row_count_name: nil, row_count_offset: 0, storage_options: nil, memory_map: true) ⇒ LazyFrame
extended
from IO
Lazily read from an Arrow IPC (Feather v2) file or multiple files via glob patterns.
-
.scan_ndjson(source, infer_schema_length: 100, batch_size: 1024, n_rows: nil, low_memory: false, rechunk: true, row_count_name: nil, row_count_offset: 0) ⇒ LazyFrame
extended
from IO
Lazily read from a newline delimited JSON file.
-
.scan_parquet(source, n_rows: nil, cache: true, parallel: "auto", rechunk: true, row_count_name: nil, row_count_offset: 0, storage_options: nil, low_memory: false) ⇒ LazyFrame
extended
from IO
Lazily read from a parquet file or multiple files via glob patterns.
-
.select(exprs) ⇒ DataFrame
extended
from LazyFunctions
Run polars expressions without a context.
-
.spearman_rank_corr(a, b, ddof: 1, propagate_nans: false) ⇒ Expr
extended
from LazyFunctions
Compute the Spearman rank correlation between two columns.
-
.std(column, ddof: 1) ⇒ Object
extended
from LazyFunctions
Get the standard deviation.
-
.struct(exprs, eager: false) ⇒ Object
extended
from LazyFunctions
Collect several columns into a Series of dtype Struct.
-
.sum(column) ⇒ Object
extended
from LazyFunctions
Sum values in a column/Series, or horizontally across list of columns/expressions.
-
.tail(column, n = 10) ⇒ Object
extended
from LazyFunctions
Get the last n rows.
-
.to_list(name) ⇒ Expr
extended
from LazyFunctions
Aggregate to list.
-
.var(column, ddof: 1) ⇒ Object
extended
from LazyFunctions
Get the variance.
-
.when(expr) ⇒ When
extended
from LazyFunctions
Start a "when, then, otherwise" expression.
-
.zeros(n, dtype: nil) ⇒ Series
extended
from Functions
Return a new Series of given length and type, filled with zeros.
Class Method Details
.align_frames(*frames, on:, select: nil, reverse: false) ⇒ Object Originally defined in module Functions
Align a sequence of frames using the unique values from one or more columns as a key.
Frames that do not contain the given key values have rows injected (with nulls filling the non-key columns), and each resulting frame is sorted by the key.
The original column order of input frames is not changed unless select is
specified (in which case the final column order is determined from that).
Note that this does not result in a joined frame - you receive the same number of frames back that you passed in, but each is now aligned by key and has the same number of rows.
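The alignment rule described above can be modeled in plain Ruby without the gem. This is only a sketch of the described behavior, not the library's implementation: `align_rows` is a hypothetical helper, each frame is represented as an array of row hashes, and a single key column with unique values per frame is assumed.

```ruby
# Plain-Ruby model of key alignment. Each frame is an array of row hashes.
# Assumes a single key column whose values are unique within each frame.
def align_rows(frames, on:)
  # Sorted union of the key values seen in any frame.
  keys = frames.flat_map { |rows| rows.map { |row| row[on] } }.uniq.sort

  frames.map do |rows|
    columns = rows.flat_map(&:keys).uniq
    by_key = rows.to_h { |row| [row[on], row] }
    keys.map do |key|
      # Keys missing from this frame get a row with nil in every non-key column.
      by_key.fetch(key) { columns.to_h { |c| [c, nil] }.merge(on => key) }
    end
  end
end

left = [{ "id" => 2, "x" => 10 }, { "id" => 1, "x" => 20 }]
right = [{ "id" => 3, "y" => "c" }]
aligned_left, aligned_right = align_rows([left, right], on: "id")

# Both frames now carry the same keys in the same order.
p aligned_left.map { |row| row["id"] }
p aligned_right.map { |row| row["id"] }
```

Two frames go in and two frames come out; nothing is joined, which matches the note above.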
.all(name = nil) ⇒ Expr Originally defined in module LazyFunctions
Do one of two things.
- function can do a columnwise or elementwise AND operation
- a wildcard column selection
.any(name) ⇒ Expr Originally defined in module LazyFunctions
Evaluate columnwise or elementwise with a bitwise OR operation.
.arg_sort_by(exprs, reverse: false) ⇒ Expr Also known as: argsort_by Originally defined in module LazyFunctions
Find the indexes that would sort the columns.
Argsort by multiple columns. The first column will be used for the ordering. If there are duplicates in the first column, the second column will be used to determine the ordering and so on.
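The tie-breaking order described above can be illustrated with a plain-Ruby model. The helper name `arg_sort_by_columns` is hypothetical, columns are plain arrays of comparable non-nil values, and the sketch does not call the gem.

```ruby
# Plain-Ruby model: columns are equal-length arrays of comparable, non-nil values.
def arg_sort_by_columns(columns, reverse: false)
  row_count = columns.first.length
  # Compare rows by the first column, then the second, and so on.
  # The row index is the final tie-breaker so the result is deterministic.
  order = (0...row_count).sort_by { |i| columns.map { |col| col[i] } << i }
  reverse ? order.reverse : order
end

first = [2, 1, 2, 1]
second = %w[b d a c]
p arg_sort_by_columns([first, second]) # => [3, 1, 2, 0]
```

Rows 1 and 3 tie on the first column, so the second column decides that row 3 ("c") precedes row 1 ("d").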
.arg_where(condition, eager: false) ⇒ Expr, Series Originally defined in module LazyFunctions
Return indices where condition evaluates true.
.avg(column) ⇒ Expr, Float Originally defined in module LazyFunctions
Get the mean value.
.coalesce(exprs, *more_exprs) ⇒ Expr Originally defined in module LazyFunctions
Folds the expressions from left to right, keeping the first non-null value.
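As a rough illustration of "keeping the first non-null value", the following plain-Ruby sketch applies the same rule row by row to ordinary arrays. `coalesce_columns` is a hypothetical helper, and `nil` stands in for a null value.

```ruby
# Plain-Ruby model: for each row, scan the columns left to right
# and keep the first value that is not nil.
def coalesce_columns(first, *rest)
  first.zip(*rest).map { |row| row.find { |value| !value.nil? } }
end

a = [1, nil, nil]
b = [nil, 2, nil]
c = [9, 9, 9]
p coalesce_columns(a, b, c) # => [1, 2, 9]
```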
.col(name) ⇒ Expr Originally defined in module LazyFunctions
Return an expression representing a column in a DataFrame.
.collect_all(lazy_frames, type_coercion: true, predicate_pushdown: true, projection_pushdown: true, simplify_expression: true, string_cache: false, no_optimization: false, slice_pushdown: true, common_subplan_elimination: true, allow_streaming: false) ⇒ Array Originally defined in module LazyFunctions
Collect multiple LazyFrames at the same time.
This runs all the computation graphs in parallel on the Polars thread pool.
.concat(items, rechunk: true, how: "vertical", parallel: true) ⇒ Object Originally defined in module Functions
Aggregate multiple DataFrames/Series into a single DataFrame/Series.
.concat_list(exprs) ⇒ Expr Originally defined in module LazyFunctions
Concat the arrays in a Series dtype List in linear time.
.concat_str(exprs, sep: "") ⇒ Expr Originally defined in module LazyFunctions
Horizontally concat Utf8 Series in linear time. Non-Utf8 columns are cast to Utf8.
.count(column = nil) ⇒ Expr, Integer Originally defined in module LazyFunctions
Count the number of values in this column/context.
.cov(a, b) ⇒ Expr Originally defined in module LazyFunctions
Compute the covariance between two columns/expressions.
.cumfold(acc, f, exprs, include_init: false) ⇒ Object Originally defined in module LazyFunctions
Cumulatively accumulate over multiple columns horizontally/row wise with a left fold.
Every cumulative result is added as a separate field in a Struct column.
Note: If you simply want the first encountered expression as accumulator, consider using cumreduce.
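A cumulative horizontal left fold can be sketched in plain Ruby as follows. This models the behavior per row on ordinary arrays and is an assumption-laden illustration, not the gem's code: `cumfold_rows` is a hypothetical helper, the block plays the role of `f`, and each inner array stands in for the fields of one Struct value.

```ruby
# Plain-Ruby model: fold across the columns of each row, keeping every
# intermediate accumulator value instead of only the final one.
def cumfold_rows(init, columns, include_init: false)
  columns.first.each_index.map do |i|
    acc = init
    steps = columns.map { |col| acc = yield(acc, col[i]) }
    include_init ? [init, *steps] : steps
  end
end

a = [1, 2]
b = [10, 20]
c = [100, 200]
p cumfold_rows(0, [a, b, c]) { |acc, x| acc + x } # => [[1, 11, 111], [2, 22, 222]]
```

The last entry of each inner array is what a plain (non-cumulative) left fold over the same columns would return for that row.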
.cumsum(column) ⇒ Object Originally defined in module LazyFunctions
Cumulatively sum values in a column/Series, or horizontally across list of columns/expressions.
.date_range(start, stop, interval, lazy: false, closed: "both", name: nil, time_unit: nil, time_zone: nil) ⇒ Object Originally defined in module Functions
Create a range of type Datetime (or Date).
Note: If both start and stop are passed as Date types (not Datetime), and the interval granularity is no finer than 1d, the returned range is also of type Date. All other permutations return a Datetime Series.
.duration(weeks: nil, days: nil, hours: nil, minutes: nil, seconds: nil, milliseconds: nil, microseconds: nil, nanoseconds: nil, time_unit: "us") ⇒ Expr Originally defined in module LazyFunctions
Create polars Duration from distinct time components.
.element ⇒ Expr Originally defined in module LazyFunctions
Alias for an element being evaluated in an eval expression.
.exclude(columns) ⇒ Object Originally defined in module LazyFunctions
Exclude certain columns from a wildcard/regex selection.
.first(column = nil) ⇒ Object Originally defined in module LazyFunctions
Get the first value.
.fold(acc, f, exprs) ⇒ Expr Originally defined in module LazyFunctions
Accumulate over multiple columns horizontally/row wise with a left fold.
.format(fstring, *args) ⇒ Expr Originally defined in module LazyFunctions
Format expressions as a string.
.from_epoch(column, unit: "s", eager: false) ⇒ Object Originally defined in module LazyFunctions
Utility function that parses an epoch timestamp (or Unix time) to Polars Date(time).
Depending on the unit provided, this function will return a different dtype:
- unit: "d" returns pl.Date
- unit: "s" returns pl.Datetime["us"]
- unit: "ms" returns pl.Datetime["ms"]
- unit: "us" returns pl.Datetime["us"]
- unit: "ns" returns pl.Datetime["ns"]
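To make the unit handling concrete, here is a plain-Ruby sketch that interprets a single epoch value under each unit using only the standard library. It returns Ruby `Date` and `Time` objects rather than Polars dtypes, and `from_epoch_value` is a hypothetical helper, not part of the gem.

```ruby
require "date"

# Plain-Ruby model: interpret one epoch value according to its unit.
# "d" yields a Date; every other unit yields a UTC Time.
def from_epoch_value(value, unit: "s")
  case unit
  when "d"  then Date.new(1970, 1, 1) + value
  when "s"  then Time.at(value).utc
  when "ms" then Time.at(0, value, :millisecond).utc
  when "us" then Time.at(0, value, :microsecond).utc
  when "ns" then Time.at(0, value, :nanosecond).utc
  else raise ArgumentError, "unsupported unit: #{unit}"
  end
end

p from_epoch_value(365, unit: "d")                # one year after the epoch
p from_epoch_value(1_600_000_000)                 # seconds since the epoch
p from_epoch_value(1_600_000_000_000, unit: "ms") # same instant in milliseconds
```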
.from_hash(data, schema: nil, columns: nil) ⇒ DataFrame Originally defined in module Convert
Construct a DataFrame from a hash of sequences.
This operation clones data, unless you pass in a Hash<String, Series>.
.get_dummies(df, columns: nil) ⇒ DataFrame Originally defined in module Functions
Convert categorical variables into dummy/indicator variables.
.groups(column) ⇒ Object Originally defined in module LazyFunctions
Syntactic sugar for Polars.col("foo").agg_groups.
.head(column, n = 10) ⇒ Object Originally defined in module LazyFunctions
Get the first n rows.
.int_range(start, stop, step: 1, eager: false, dtype: nil) ⇒ Expr, Series Also known as: arange Originally defined in module LazyFunctions
Create a range expression (or Series).
This can be used in a select, with_column, etc. Be sure that the resulting
range size is equal to the length of the DataFrame you are collecting.
.last(column = nil) ⇒ Object Originally defined in module LazyFunctions
Get the last value.
Depending on the input type this function does different things:
- nil -> expression to take last column of a context.
- String -> syntactic sugar for Polars.col(..).last
- Series -> Take last value in Series
.lit(value, dtype: nil, allow_object: nil) ⇒ Expr Originally defined in module LazyFunctions
Return an expression representing a literal value.
.max(column) ⇒ Expr, Object Originally defined in module LazyFunctions
Get the maximum value.
.mean(column) ⇒ Expr, Float Originally defined in module LazyFunctions
Get the mean value.
.median(column) ⇒ Object Originally defined in module LazyFunctions
Get the median value.
.min(column) ⇒ Expr, Object Originally defined in module LazyFunctions
Get the minimum value.
.n_unique(column) ⇒ Object Originally defined in module LazyFunctions
Count unique values.
.ones(n, dtype: nil) ⇒ Series Originally defined in module Functions
Return a new Series of given length and type, filled with ones.
Note: In the lazy API you should probably not use this, but use lit(1) instead.
.pearson_corr(a, b, ddof: 1) ⇒ Expr Originally defined in module LazyFunctions
Compute the Pearson correlation between two columns.
.quantile(column, quantile, interpolation: "nearest") ⇒ Expr Originally defined in module LazyFunctions
Syntactic sugar for Polars.col("foo").quantile(...).
.read_avro(source, columns: nil, n_rows: nil) ⇒ DataFrame Originally defined in module IO
Read into a DataFrame from Apache Avro format.
.read_csv(source, has_header: true, columns: nil, new_columns: nil, sep: ",", comment_char: nil, quote_char: '"', skip_rows: 0, dtypes: nil, null_values: nil, ignore_errors: false, parse_dates: false, n_threads: nil, infer_schema_length: 100, batch_size: 8192, n_rows: nil, encoding: "utf8", low_memory: false, rechunk: true, storage_options: nil, skip_rows_after_header: 0, row_count_name: nil, row_count_offset: 0, sample_size: 1024, eol_char: "\n") ⇒ DataFrame Originally defined in module IO
Read a CSV file into a DataFrame.
Note: This operation defaults to a rechunk operation at the end, meaning that all data will be stored contiguously in memory. Set rechunk: false if you are benchmarking the CSV reader. A rechunk is an expensive operation.
.read_csv_batched(source, has_header: true, columns: nil, new_columns: nil, sep: ",", comment_char: nil, quote_char: '"', skip_rows: 0, dtypes: nil, null_values: nil, ignore_errors: false, parse_dates: false, n_threads: nil, infer_schema_length: 100, batch_size: 50_000, n_rows: nil, encoding: "utf8", low_memory: false, rechunk: true, skip_rows_after_header: 0, row_count_name: nil, row_count_offset: 0, sample_size: 1024, eol_char: "\n") ⇒ BatchedCsvReader Originally defined in module IO
Read a CSV file in batches.
Upon creation of the BatchedCsvReader, polars will gather statistics and determine the file chunks. After that, work will only be done if next_batches is called.
.read_database(query) ⇒ DataFrame Also known as: read_sql Originally defined in module IO
Read a SQL query into a DataFrame.
.read_ipc(source, columns: nil, n_rows: nil, memory_map: true, storage_options: nil, row_count_name: nil, row_count_offset: 0, rechunk: true) ⇒ DataFrame Originally defined in module IO
Read into a DataFrame from Arrow IPC (Feather v2) file.
.read_ipc_schema(source) ⇒ Hash Originally defined in module IO
Get a schema of the IPC file without reading data.
.read_json(source) ⇒ DataFrame Originally defined in module IO
Read into a DataFrame from a JSON file.
.read_ndjson(source) ⇒ DataFrame Originally defined in module IO
Read into a DataFrame from a newline delimited JSON file.
.read_parquet(source, columns: nil, n_rows: nil, storage_options: nil, parallel: "auto", row_count_name: nil, row_count_offset: 0, low_memory: false, use_statistics: true, rechunk: true) ⇒ DataFrame Originally defined in module IO
Read into a DataFrame from a parquet file.
Note: This operation defaults to a rechunk operation at the end, meaning that all data will be stored contiguously in memory. Set rechunk: false if you are benchmarking the parquet reader. A rechunk is an expensive operation.
.read_parquet_schema(source) ⇒ Hash Originally defined in module IO
Get a schema of the Parquet file without reading data.
.repeat(value, n, dtype: nil, eager: false, name: nil) ⇒ Expr Originally defined in module LazyFunctions
Repeat a single value n times.
.scan_csv(source, has_header: true, sep: ",", comment_char: nil, quote_char: '"', skip_rows: 0, dtypes: nil, null_values: nil, ignore_errors: false, cache: true, with_column_names: nil, infer_schema_length: 100, n_rows: nil, encoding: "utf8", low_memory: false, rechunk: true, skip_rows_after_header: 0, row_count_name: nil, row_count_offset: 0, parse_dates: false, eol_char: "\n") ⇒ LazyFrame Originally defined in module IO
Lazily read from a CSV file or multiple files via glob patterns.
This allows the query optimizer to push down predicates and projections to the scan level, thereby potentially reducing memory overhead.
.scan_ipc(source, n_rows: nil, cache: true, rechunk: true, row_count_name: nil, row_count_offset: 0, storage_options: nil, memory_map: true) ⇒ LazyFrame Originally defined in module IO
Lazily read from an Arrow IPC (Feather v2) file or multiple files via glob patterns.
This allows the query optimizer to push down predicates and projections to the scan level, thereby potentially reducing memory overhead.
.scan_ndjson(source, infer_schema_length: 100, batch_size: 1024, n_rows: nil, low_memory: false, rechunk: true, row_count_name: nil, row_count_offset: 0) ⇒ LazyFrame Originally defined in module IO
Lazily read from a newline delimited JSON file.
This allows the query optimizer to push down predicates and projections to the scan level, thereby potentially reducing memory overhead.
.scan_parquet(source, n_rows: nil, cache: true, parallel: "auto", rechunk: true, row_count_name: nil, row_count_offset: 0, storage_options: nil, low_memory: false) ⇒ LazyFrame Originally defined in module IO
Lazily read from a parquet file or multiple files via glob patterns.
This allows the query optimizer to push down predicates and projections to the scan level, thereby potentially reducing memory overhead.
.select(exprs) ⇒ DataFrame Originally defined in module LazyFunctions
Run polars expressions without a context.
.spearman_rank_corr(a, b, ddof: 1, propagate_nans: false) ⇒ Expr Originally defined in module LazyFunctions
Compute the Spearman rank correlation between two columns.
Missing data will be excluded from the computation.
.std(column, ddof: 1) ⇒ Object Originally defined in module LazyFunctions
Get the standard deviation.
.struct(exprs, eager: false) ⇒ Object Originally defined in module LazyFunctions
Collect several columns into a Series of dtype Struct.
.sum(column) ⇒ Object Originally defined in module LazyFunctions
Sum values in a column/Series, or horizontally across list of columns/expressions.
.tail(column, n = 10) ⇒ Object Originally defined in module LazyFunctions
Get the last n rows.
.to_list(name) ⇒ Expr Originally defined in module LazyFunctions
Aggregate to list.
.var(column, ddof: 1) ⇒ Object Originally defined in module LazyFunctions
Get the variance.
.when(expr) ⇒ When Originally defined in module LazyFunctions
Start a "when, then, otherwise" expression.