Module: Polars::LazyFunctions
Included in: Polars
Defined in: lib/polars/lazy_functions.rb
Instance Method Summary
-
#all(name = nil) ⇒ Expr
Select all columns, or apply a columnwise or elementwise AND operation.
-
#any(name) ⇒ Expr
Evaluate columnwise or elementwise with a bitwise OR operation.
-
#arg_sort_by(exprs, reverse: false) ⇒ Expr
(also: #argsort_by)
Find the indexes that would sort the columns.
-
#arg_where(condition, eager: false) ⇒ Expr, Series
Return indices where condition evaluates true.
-
#avg(column) ⇒ Expr, Float
Get the mean value.
-
#coalesce(exprs, *more_exprs) ⇒ Expr
Folds the expressions from left to right, keeping the first non-null value.
-
#col(name) ⇒ Expr
Return an expression representing a column in a DataFrame.
-
#collect_all(lazy_frames, type_coercion: true, predicate_pushdown: true, projection_pushdown: true, simplify_expression: true, string_cache: false, no_optimization: false, slice_pushdown: true, common_subplan_elimination: true, allow_streaming: false) ⇒ Array
Collect multiple LazyFrames at the same time.
-
#concat_list(exprs) ⇒ Expr
Concatenate the arrays in a Series of dtype List in linear time.
-
#concat_str(exprs, sep: "") ⇒ Expr
Horizontally concat Utf8 Series in linear time.
-
#count(column = nil) ⇒ Expr, Integer
Count the number of values in this column/context.
-
#cov(a, b) ⇒ Expr
Compute the covariance between two columns/expressions.
-
#cumfold(acc, f, exprs, include_init: false) ⇒ Object
Cumulatively accumulate over multiple columns horizontally/row wise with a left fold.
-
#cumsum(column) ⇒ Object
Cumulatively sum values in a column/Series, or horizontally across a list of columns/expressions.
-
#duration(weeks: nil, days: nil, hours: nil, minutes: nil, seconds: nil, milliseconds: nil, microseconds: nil, nanoseconds: nil, time_unit: "us") ⇒ Expr
Create polars Duration from distinct time components.
-
#element ⇒ Expr
Alias for an element being evaluated in an eval expression.
-
#exclude(columns) ⇒ Object
Exclude certain columns from a wildcard/regex selection.
-
#first(column = nil) ⇒ Object
Get the first value.
-
#fold(acc, f, exprs) ⇒ Expr
Accumulate over multiple columns horizontally/row wise with a left fold.
-
#format(fstring, *args) ⇒ Expr
Format expressions as a string.
-
#from_epoch(column, unit: "s", eager: false) ⇒ Object
Utility function that parses an epoch timestamp (or Unix time) to Polars Date(time).
-
#groups(column) ⇒ Object
Syntactic sugar for Polars.col("foo").agg_groups.
-
#head(column, n = 10) ⇒ Object
Get the first n rows.
-
#int_range(start, stop, step: 1, eager: false, dtype: nil) ⇒ Expr, Series
(also: #arange)
Create a range expression (or Series).
-
#last(column = nil) ⇒ Object
Get the last value.
-
#lit(value, dtype: nil, allow_object: nil) ⇒ Expr
Return an expression representing a literal value.
-
#max(column) ⇒ Expr, Object
Get the maximum value.
-
#mean(column) ⇒ Expr, Float
Get the mean value.
-
#median(column) ⇒ Object
Get the median value.
-
#min(column) ⇒ Expr, Object
Get the minimum value.
-
#n_unique(column) ⇒ Object
Count unique values.
-
#pearson_corr(a, b, ddof: 1) ⇒ Expr
Compute the Pearson correlation between two columns.
-
#quantile(column, quantile, interpolation: "nearest") ⇒ Expr
Syntactic sugar for Polars.col("foo").quantile(...).
-
#repeat(value, n, dtype: nil, eager: false, name: nil) ⇒ Expr
Repeat a single value n times.
-
#select(exprs) ⇒ DataFrame
Run polars expressions without a context.
-
#spearman_rank_corr(a, b, ddof: 1, propagate_nans: false) ⇒ Expr
Compute the Spearman rank correlation between two columns.
-
#std(column, ddof: 1) ⇒ Object
Get the standard deviation.
-
#struct(exprs, eager: false) ⇒ Object
Collect several columns into a Series of dtype Struct.
-
#sum(column) ⇒ Object
Sum values in a column/Series, or horizontally across a list of columns/expressions.
-
#tail(column, n = 10) ⇒ Object
Get the last n rows.
-
#to_list(name) ⇒ Expr
Aggregate to list.
-
#var(column, ddof: 1) ⇒ Object
Get the variance.
-
#when(expr) ⇒ When
Start a "when, then, otherwise" expression.
Instance Method Details
#all(name = nil) ⇒ Expr
Do one of two things, depending on the argument:
- with a column name, perform a columnwise or elementwise AND operation
- with no argument, make a wildcard selection of all columns
# File 'lib/polars/lazy_functions.rb', line 576
def all(name = nil)
  if name.nil?
    col("*")
  elsif Utils.strlike?(name)
    col(name).all
  else
    raise Todo
  end
end
#any(name) ⇒ Expr
Evaluate columnwise or elementwise with a bitwise OR operation.
# File 'lib/polars/lazy_functions.rb', line 481
def any(name)
  if Utils.strlike?(name)
    col(name).any
  else
    fold(lit(false), ->(a, b) { a.cast(:bool) | b.cast(:bool) }, name).alias("any")
  end
end
#arg_sort_by(exprs, reverse: false) ⇒ Expr Also known as: argsort_by
Find the indexes that would sort the columns.
Argsort by multiple columns. The first column will be used for the ordering. If there are duplicates in the first column, the second column will be used to determine the ordering and so on.
# File 'lib/polars/lazy_functions.rb', line 662
def arg_sort_by(exprs, reverse: false)
  if !exprs.is_a?(::Array)
    exprs = [exprs]
  end
  if reverse == true || reverse == false
    reverse = [reverse] * exprs.length
  end
  exprs = Utils.selection_to_rbexpr_list(exprs)
  Utils.wrap_expr(RbExpr.arg_sort_by(exprs, reverse))
end
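The tie-breaking rule described above can be illustrated in plain Ruby. This is only a sketch: arg_sort_by_sketch is a hypothetical helper that works on ordinary arrays, not the polars implementation, and it assumes equally long columns without nulls.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): returns the row indices
# that would sort the rows by the first column, then the second, and so on.
def arg_sort_by_sketch(columns, reverse: false)
  # Broadcast a single boolean to one flag per column, mirroring arg_sort_by.
  reverse = [reverse] * columns.length if reverse == true || reverse == false
  (0...columns.first.length).sort do |i, j|
    cmp = 0
    columns.each_with_index do |column, k|
      cmp = column[i] <=> column[j]
      cmp = -cmp if reverse[k]
      break unless cmp.zero?
    end
    # Rows that tie on every column keep their original relative order.
    cmp.zero? ? (i <=> j) : cmp
  end
end

a = [1, 1, 0]
b = [2, 1, 9]
p arg_sort_by_sketch([a, b])                 # => [2, 1, 0]
p arg_sort_by_sketch([a, b], reverse: true)  # => [0, 1, 2]
```

Rows 0 and 1 tie on the first column, so the second column decides their order.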
#arg_where(condition, eager: false) ⇒ Expr, Series
Return indices where condition evaluates true.
# File 'lib/polars/lazy_functions.rb', line 1048
def arg_where(condition, eager: false)
  if eager
    if !condition.is_a?(Series)
      raise ArgumentError, "expected 'Series' in 'arg_where' if 'eager=True', got #{condition.class.name}"
    end
    condition.to_frame.select(arg_where(Polars.col(condition.name))).to_series
  else
    condition = Utils.expr_to_lit_or_expr(condition, str_to_lit: true)
    Utils.wrap_expr(_arg_where(condition._rbexpr))
  end
end
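As a rough illustration of what "indices where condition evaluates true" means, here is a plain-Ruby sketch over an array of booleans. arg_where_sketch is a hypothetical name, and the array stands in for a boolean Series; null handling is not modelled.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): positions of the
# entries that are exactly true.
def arg_where_sketch(mask)
  mask.each_index.select { |i| mask[i] == true }
end

p arg_where_sketch([false, true, false, true])  # => [1, 3]
```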
#avg(column) ⇒ Expr, Float
Get the mean value.
# File 'lib/polars/lazy_functions.rb', line 165
def avg(column)
  mean(column)
end
#coalesce(exprs, *more_exprs) ⇒ Expr
Folds the expressions from left to right, keeping the first non-null value.
# File 'lib/polars/lazy_functions.rb', line 1090
def coalesce(exprs, *more_exprs)
  exprs = Utils.selection_to_rbexpr_list(exprs)
  if more_exprs.any?
    exprs.concat(Utils.selection_to_rbexpr_list(more_exprs))
  end
  Utils.wrap_expr(_coalesce_exprs(exprs))
end
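The "first non-null value, left to right" rule can be sketched with plain Ruby arrays, using nil in place of null. coalesce_sketch is a hypothetical helper for illustration only and assumes all columns have the same length.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): for each row, take the
# first value that is not nil, scanning the columns from left to right.
def coalesce_sketch(*columns)
  columns.first.each_index.map do |i|
    columns.map { |column| column[i] }.find { |value| !value.nil? }
  end
end

a = [nil, 2, nil]
b = [1, 5, nil]
c = [9, 9, 9]
p coalesce_sketch(a, b, c)  # => [1, 2, 9]
```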
#col(name) ⇒ Expr
Return an expression representing a column in a DataFrame.
# File 'lib/polars/lazy_functions.rb', line 6
def col(name)
  if name.is_a?(Series)
    name = name.to_a
  end

  if name.is_a?(Class) && name < DataType
    name = [name]
  end

  if name.is_a?(DataType)
    Utils.wrap_expr(_dtype_cols([name]))
  elsif name.is_a?(::Array)
    if name.length == 0 || Utils.strlike?(name[0])
      name = name.map { |v| v.is_a?(Symbol) ? v.to_s : v }
      Utils.wrap_expr(RbExpr.cols(name))
    elsif Utils.is_polars_dtype(name[0])
      Utils.wrap_expr(_dtype_cols(name))
    else
      raise ArgumentError, "Expected list values to be all `str` or all `DataType`"
    end
  else
    name = name.to_s if name.is_a?(Symbol)
    Utils.wrap_expr(RbExpr.col(name))
  end
end
#collect_all(lazy_frames, type_coercion: true, predicate_pushdown: true, projection_pushdown: true, simplify_expression: true, string_cache: false, no_optimization: false, slice_pushdown: true, common_subplan_elimination: true, allow_streaming: false) ⇒ Array
Collect multiple LazyFrames at the same time.
This runs all the computation graphs in parallel on the Polars thread pool.
# File 'lib/polars/lazy_functions.rb', line 889
def collect_all(
  lazy_frames,
  type_coercion: true,
  predicate_pushdown: true,
  projection_pushdown: true,
  simplify_expression: true,
  string_cache: false,
  no_optimization: false,
  slice_pushdown: true,
  common_subplan_elimination: true,
  allow_streaming: false
)
  if no_optimization
    predicate_pushdown = false
    projection_pushdown = false
    slice_pushdown = false
    common_subplan_elimination = false
  end

  prepared = []

  lazy_frames.each do |lf|
    ldf = lf._ldf.optimization_toggle(
      type_coercion,
      predicate_pushdown,
      projection_pushdown,
      simplify_expression,
      slice_pushdown,
      common_subplan_elimination,
      allow_streaming,
      false
    )
    prepared << ldf
  end

  out = _collect_all(prepared)

  # wrap the rbdataframes into dataframe
  result = out.map { |rbdf| Utils.wrap_df(rbdf) }

  result
end
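The way no_optimization overrides the other flags in the listing above can be summarised with a small plain-Ruby sketch. resolve_optimizations_sketch and OPTIMIZATION_DEFAULTS are hypothetical names, not part of the library; the defaults are copied from the signature.

```ruby
# Defaults copied from the collect_all signature.
OPTIMIZATION_DEFAULTS = {
  type_coercion: true,
  predicate_pushdown: true,
  projection_pushdown: true,
  simplify_expression: true,
  slice_pushdown: true,
  common_subplan_elimination: true,
  allow_streaming: false
}.freeze

# Hypothetical helper (not part of polars): resolves the effective flags.
def resolve_optimizations_sketch(no_optimization: false, **flags)
  resolved = OPTIMIZATION_DEFAULTS.merge(flags)
  if no_optimization
    # Only these four flags are forced off; type_coercion and
    # simplify_expression keep whatever value the caller passed.
    %i[predicate_pushdown projection_pushdown slice_pushdown common_subplan_elimination].each do |key|
      resolved[key] = false
    end
  end
  resolved
end

p resolve_optimizations_sketch(no_optimization: true)
```

Note that, going by the listing, no_optimization does not touch type_coercion or simplify_expression.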
#concat_list(exprs) ⇒ Expr
Concatenate the arrays in a Series of dtype List in linear time.
# File 'lib/polars/lazy_functions.rb', line 858
def concat_list(exprs)
  exprs = Utils.selection_to_rbexpr_list(exprs)
  Utils.wrap_expr(RbExpr.concat_lst(exprs))
end
#concat_str(exprs, sep: "") ⇒ Expr
Horizontally concat Utf8 Series in linear time. Non-Utf8 columns are cast to Utf8.
# File 'lib/polars/lazy_functions.rb', line 797
def concat_str(exprs, sep: "")
  exprs = Utils.selection_to_rbexpr_list(exprs)
  return Utils.wrap_expr(RbExpr.concat_str(exprs, sep))
end
#count(column = nil) ⇒ Expr, Integer
Count the number of values in this column/context.
# File 'lib/polars/lazy_functions.rb', line 66
def count(column = nil)
  if column.nil?
    return Utils.wrap_expr(RbExpr.count)
  end

  if column.is_a?(Series)
    column.len
  else
    col(column).count
  end
end
#cov(a, b) ⇒ Expr
Compute the covariance between two columns/expressions.
# File 'lib/polars/lazy_functions.rb', line 413
def cov(a, b)
  if Utils.strlike?(a)
    a = col(a)
  end
  if Utils.strlike?(b)
    b = col(b)
  end
  Utils.wrap_expr(RbExpr.cov(a._rbexpr, b._rbexpr))
end
#cumfold(acc, f, exprs, include_init: false) ⇒ Object
If you simply want the first encountered expression as accumulator,
consider using cumreduce.
Cumulatively accumulate over multiple columns horizontally/row wise with a left fold.
Every cumulative result is added as a separate field in a Struct column.
# File 'lib/polars/lazy_functions.rb', line 465
def cumfold(acc, f, exprs, include_init: false)
  acc = Utils.expr_to_lit_or_expr(acc, str_to_lit: true)
  if exprs.is_a?(Expr)
    exprs = [exprs]
  end

  exprs = Utils.selection_to_rbexpr_list(exprs)
  Utils.wrap_expr(RbExpr.cumfold(acc._rbexpr, f, exprs, include_init))
end
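To make the "every cumulative result is kept" idea concrete, here is a plain-Ruby sketch of a cumulative left fold over arrays. cumfold_sketch is a hypothetical helper; it returns one array per step rather than a Struct column, and it assumes include_init simply prepends the initial accumulator.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): a cumulative left fold
# across columns that keeps every intermediate result.
def cumfold_sketch(acc, f, columns, include_init: false)
  # Broadcast the scalar accumulator to one value per row.
  running = Array.new(columns.first.length, acc)
  steps = include_init ? [running] : []
  columns.each do |column|
    running = running.zip(column).map { |a, b| f.call(a, b) }
    steps << running
  end
  steps
end

a = [1, 2]
b = [10, 20]
c = [100, 200]
add = ->(x, y) { x + y }
p cumfold_sketch(0, add, [a, b, c])
# => [[1, 2], [11, 22], [111, 222]]
p cumfold_sketch(0, add, [a, b, c], include_init: true)
# => [[0, 0], [1, 2], [11, 22], [111, 222]]
```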
#cumsum(column) ⇒ Object
Cumulatively sum values in a column/Series, or horizontally across a list of columns/expressions.
# File 'lib/polars/lazy_functions.rb', line 349
def cumsum(column)
  if column.is_a?(Series)
    column.cumsum
  elsif Utils.strlike?(column)
    col(column).cumsum
  else
    cumfold(lit(0).cast(:u32), ->(a, b) { a + b }, column).alias("cumsum")
  end
end
#duration(weeks: nil, days: nil, hours: nil, minutes: nil, seconds: nil, milliseconds: nil, microseconds: nil, nanoseconds: nil, time_unit: "us") ⇒ Expr
Create polars Duration from distinct time components.
# File 'lib/polars/lazy_functions.rb', line 706
def duration(
  weeks: nil,
  days: nil,
  hours: nil,
  minutes: nil,
  seconds: nil,
  milliseconds: nil,
  microseconds: nil,
  nanoseconds: nil,
  time_unit: "us"
)
  if !weeks.nil?
    weeks = Utils.expr_to_lit_or_expr(weeks, str_to_lit: false)._rbexpr
  end
  if !days.nil?
    days = Utils.expr_to_lit_or_expr(days, str_to_lit: false)._rbexpr
  end
  if !hours.nil?
    hours = Utils.expr_to_lit_or_expr(hours, str_to_lit: false)._rbexpr
  end
  if !minutes.nil?
    minutes = Utils.expr_to_lit_or_expr(minutes, str_to_lit: false)._rbexpr
  end
  if !seconds.nil?
    seconds = Utils.expr_to_lit_or_expr(seconds, str_to_lit: false)._rbexpr
  end
  if !milliseconds.nil?
    milliseconds = Utils.expr_to_lit_or_expr(milliseconds, str_to_lit: false)._rbexpr
  end
  if !microseconds.nil?
    microseconds = Utils.expr_to_lit_or_expr(microseconds, str_to_lit: false)._rbexpr
  end
  if !nanoseconds.nil?
    nanoseconds = Utils.expr_to_lit_or_expr(nanoseconds, str_to_lit: false)._rbexpr
  end

  Utils.wrap_expr(
    _rb_duration(
      weeks,
      days,
      hours,
      minutes,
      seconds,
      milliseconds,
      microseconds,
      nanoseconds,
      time_unit
    )
  )
end
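A duration built from distinct components is, conceptually, the sum of those components in a common unit. The plain-Ruby sketch below illustrates that arithmetic for the default "us" time unit. duration_us_sketch and MICROSECONDS_PER are hypothetical names, and the assumption that components are simply added is ours, not something the listing confirms; nanoseconds are left out because they do not divide evenly into microseconds.

```ruby
# Microseconds in one of each component.
MICROSECONDS_PER = {
  weeks: 604_800_000_000,
  days: 86_400_000_000,
  hours: 3_600_000_000,
  minutes: 60_000_000,
  seconds: 1_000_000,
  milliseconds: 1_000,
  microseconds: 1
}.freeze

# Hypothetical plain-Ruby helper (not part of polars): total length of the
# given components in microseconds, assuming they are simply added up.
def duration_us_sketch(**parts)
  parts.sum { |unit, amount| MICROSECONDS_PER.fetch(unit) * amount }
end

p duration_us_sketch(days: 1, hours: 2)  # => 93600000000
```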
#element ⇒ Expr
Alias for an element being evaluated in an eval expression.
# File 'lib/polars/lazy_functions.rb', line 52
def element
  col("")
end
#exclude(columns) ⇒ Object
Exclude certain columns from a wildcard/regex selection.
# File 'lib/polars/lazy_functions.rb', line 548
def exclude(columns)
  col("*").exclude(columns)
end
#first(column = nil) ⇒ Object
Get the first value.
# File 'lib/polars/lazy_functions.rb', line 194
def first(column = nil)
  if column.nil?
    return Utils.wrap_expr(RbExpr.first)
  end

  if column.is_a?(Series)
    if column.len > 0
      column[0]
    else
      raise IndexError, "The series is empty, so no first value can be returned."
    end
  else
    col(column).first
  end
end
#fold(acc, f, exprs) ⇒ Expr
Accumulate over multiple columns horizontally/row wise with a left fold.
# File 'lib/polars/lazy_functions.rb', line 432
def fold(acc, f, exprs)
  acc = Utils.expr_to_lit_or_expr(acc, str_to_lit: true)
  if exprs.is_a?(Expr)
    exprs = [exprs]
  end

  exprs = Utils.selection_to_rbexpr_list(exprs)
  Utils.wrap_expr(RbExpr.fold(acc._rbexpr, f, exprs))
end
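A horizontal left fold combines the accumulator with each column in turn, row by row. The following plain-Ruby sketch shows the idea on ordinary arrays. fold_sketch is a hypothetical helper, not the polars implementation, and it assumes a scalar accumulator that is broadcast to every row.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): left fold across
# columns, applied row by row.
def fold_sketch(acc, f, columns)
  start = Array.new(columns.first.length, acc)
  columns.reduce(start) do |running, column|
    running.zip(column).map { |a, b| f.call(a, b) }
  end
end

# Row-wise sum, similar in spirit to the fallback branch of sum.
p fold_sketch(0, ->(a, b) { a + b }, [[1, 2], [10, 20], [100, 200]])
# => [111, 222]

# Row-wise OR, similar in spirit to the fallback branch of any.
p fold_sketch(false, ->(a, b) { a | b }, [[false, true], [false, false]])
# => [false, true]
```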
#format(fstring, *args) ⇒ Expr
Format expressions as a string.
# File 'lib/polars/lazy_functions.rb', line 835
def format(fstring, *args)
  if fstring.scan("{}").length != args.length
    raise ArgumentError, "number of placeholders should equal the number of arguments"
  end

  exprs = []

  arguments = args.each
  fstring.split(/(\{\})/).each do |s|
    if s == "{}"
      e = Utils.expr_to_lit_or_expr(arguments.next, str_to_lit: false)
      exprs << e
    elsif s.length > 0
      exprs << lit(s)
    end
  end

  concat_str(exprs, sep: "")
end
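The placeholder handling in the listing relies on String#split with a capturing group, which keeps the "{}" separators in the result. The plain-Ruby sketch below isolates that step. format_pieces_sketch is a hypothetical helper that tags each piece instead of building expressions.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): splits a format string
# into literal pieces and argument slots, in order.
def format_pieces_sketch(fstring, *args)
  if fstring.scan("{}").length != args.length
    raise ArgumentError, "number of placeholders should equal the number of arguments"
  end

  arguments = args.each
  # The capturing group makes split return the "{}" markers as well.
  fstring.split(/(\{\})/).filter_map do |piece|
    if piece == "{}"
      [:arg, arguments.next]
    elsif piece.length > 0
      [:lit, piece]
    end
  end
end

p format_pieces_sketch("foo_{}_bar_{}", "a", "b")
# => [[:lit, "foo_"], [:arg, "a"], [:lit, "_bar_"], [:arg, "b"]]
```

In format, each literal piece becomes lit(s), each slot becomes the corresponding argument expression, and the pieces are joined with concat_str using an empty separator.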
#from_epoch(column, unit: "s", eager: false) ⇒ Object
Utility function that parses an epoch timestamp (or Unix time) to Polars Date(time).
Depending on the unit provided, this function will return a different dtype:
- unit: "d" returns pl.Date
- unit: "s" returns pl.Datetime["us"]
- unit: "ms" returns pl.Datetime["ms"]
- unit: "us" returns pl.Datetime["us"]
- unit: "ns" returns pl.Datetime["ns"]
# File 'lib/polars/lazy_functions.rb', line 1129
def from_epoch(column, unit: "s", eager: false)
  if Utils.strlike?(column)
    column = col(column)
  elsif !column.is_a?(Series) && !column.is_a?(Expr)
    column = Series.new(column)
  end

  if unit == "d"
    expr = column.cast(Date)
  elsif unit == "s"
    expr = (column.cast(Int64) * 1_000_000).cast(Datetime.new("us"))
  elsif Utils::DTYPE_TEMPORAL_UNITS.include?(unit)
    expr = column.cast(Datetime.new(unit))
  else
    raise ArgumentError, "'unit' must be one of {{'ns', 'us', 'ms', 's', 'd'}}, got '#{unit}'."
  end

  if eager
    if !column.is_a?(Series)
      raise ArgumentError, "expected Series or Array if eager: true, got #{column.class.name}"
    else
      column.to_frame.select(expr).to_series
    end
  else
    expr
  end
end
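The unit handling above can be mirrored with Ruby's standard Date and Time classes. The sketch below converts a single integer rather than a column. from_epoch_sketch and UNITS_PER_SECOND are hypothetical names, and the results are plain Ruby objects in UTC, not Polars dtypes.

```ruby
require "date"

# How many of each unit make up one second.
UNITS_PER_SECOND = {"s" => 1, "ms" => 1_000, "us" => 1_000_000, "ns" => 1_000_000_000}.freeze

# Hypothetical plain-Ruby helper (not part of polars): converts one epoch
# value to a Date (unit "d") or a UTC Time (all other units).
def from_epoch_sketch(value, unit: "s")
  if unit == "d"
    Date.new(1970, 1, 1) + value
  elsif UNITS_PER_SECOND.key?(unit)
    # A Rational keeps sub-second precision exact.
    Time.at(Rational(value, UNITS_PER_SECOND[unit])).utc
  else
    raise ArgumentError, "'unit' must be one of 'ns', 'us', 'ms', 's', 'd', got '#{unit}'."
  end
end

puts from_epoch_sketch(1_666_683_077).strftime("%Y-%m-%d %H:%M:%S")  # 2022-10-25 07:31:17
puts from_epoch_sketch(19_290, unit: "d")                            # 2022-10-25
```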
#groups(column) ⇒ Object
Syntactic sugar for Polars.col("foo").agg_groups.
# File 'lib/polars/lazy_functions.rb', line 589
def groups(column)
  col(column).agg_groups
end
#head(column, n = 10) ⇒ Object
Get the first n rows.
# File 'lib/polars/lazy_functions.rb', line 242
def head(column, n = 10)
  if column.is_a?(Series)
    column.head(n)
  else
    col(column).head(n)
  end
end
#int_range(start, stop, step: 1, eager: false, dtype: nil) ⇒ Expr, Series Also known as: arange
Create a range expression (or Series).
This can be used in a select, with_column, etc. Be sure that the resulting
range size is equal to the length of the DataFrame you are collecting.
# File 'lib/polars/lazy_functions.rb', line 635
def int_range(start, stop, step: 1, eager: false, dtype: nil)
  start = Utils.parse_as_expression(start)
  stop = Utils.parse_as_expression(stop)
  dtype ||= Int64
  dtype = dtype.to_s if dtype.is_a?(Symbol)
  result = Utils.wrap_expr(RbExpr.int_range(start, stop, step, dtype)).alias("arange")

  if eager
    return select(result).to_series
  end

  result
end
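For intuition about the values such a range contains, here is a plain-Ruby equivalent using an exclusive Range. int_range_sketch is a hypothetical helper; it assumes a positive step and that stop is excluded.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): integers from start up
# to, but not including, stop.
def int_range_sketch(start, stop, step: 1)
  (start...stop).step(step).to_a
end

p int_range_sketch(0, 3)            # => [0, 1, 2]
p int_range_sketch(0, 10, step: 3)  # => [0, 3, 6, 9]
```

The length of the result is what has to match the DataFrame height when the range is used as a column.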
#last(column = nil) ⇒ Object
Get the last value.
Depending on the input type this function does different things:
- nil -> expression to take last column of a context.
- String -> syntactic sugar for Polars.col(..).last
- Series -> Take last value in Series
# File 'lib/polars/lazy_functions.rb', line 219
def last(column = nil)
  if column.nil?
    return Utils.wrap_expr(_last)
  end

  if column.is_a?(Series)
    if column.len > 0
      return column[-1]
    else
      raise IndexError, "The series is empty, so no last value can be returned"
    end
  end
  col(column).last
end
#lit(value, dtype: nil, allow_object: nil) ⇒ Expr
Return an expression representing a literal value.
# File 'lib/polars/lazy_functions.rb', line 269
def lit(value, dtype: nil, allow_object: nil)
  if value.is_a?(::Time) || value.is_a?(::DateTime)
    time_unit = dtype&.time_unit || "ns"
    time_zone = dtype&.time_zone
    e = lit(Utils.(value, time_unit)).cast(Datetime.new(time_unit))
    if time_zone
      return e.dt.replace_time_zone(time_zone.to_s)
    else
      return e
    end
  elsif value.is_a?(::Date)
    return lit(::Time.utc(value.year, value.month, value.day)).cast(Date)
  elsif value.is_a?(Polars::Series)
    name = value.name
    value = value._s
    e = Utils.wrap_expr(RbExpr.lit(value, allow_object))
    if name == ""
      return e
    end
    return e.alias(name)
  elsif (defined?(Numo::NArray) && value.is_a?(Numo::NArray)) || value.is_a?(::Array)
    return lit(Series.new("", value))
  elsif dtype
    return Utils.wrap_expr(RbExpr.lit(value, allow_object)).cast(dtype)
  end

  Utils.wrap_expr(RbExpr.lit(value, allow_object))
end
#max(column) ⇒ Expr, Object
Get the maximum value.
# File 'lib/polars/lazy_functions.rb', line 113
def max(column)
  if column.is_a?(Series)
    column.max
  else
    col(column).max
  end
end
#mean(column) ⇒ Expr, Float
Get the mean value.
# File 'lib/polars/lazy_functions.rb', line 154
def mean(column)
  if column.is_a?(Series)
    column.mean
  else
    col(column).mean
  end
end
#median(column) ⇒ Object
Get the median value.
# File 'lib/polars/lazy_functions.rb', line 172
def median(column)
  if column.is_a?(Series)
    column.median
  else
    col(column).median
  end
end
#min(column) ⇒ Expr, Object
Get the minimum value.
# File 'lib/polars/lazy_functions.rb', line 127
def min(column)
  if column.is_a?(Series)
    column.min
  else
    col(column).min
  end
end
#n_unique(column) ⇒ Object
Count unique values.
# File 'lib/polars/lazy_functions.rb', line 183
def n_unique(column)
  if column.is_a?(Series)
    column.n_unique
  else
    col(column).n_unique
  end
end
#pearson_corr(a, b, ddof: 1) ⇒ Expr
Compute the Pearson correlation between two columns.
# File 'lib/polars/lazy_functions.rb', line 395
def pearson_corr(a, b, ddof: 1)
  if Utils.strlike?(a)
    a = col(a)
  end
  if Utils.strlike?(b)
    b = col(b)
  end
  Utils.wrap_expr(RbExpr.pearson_corr(a._rbexpr, b._rbexpr, ddof))
end
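For reference, the Pearson correlation is the covariance of the two columns divided by the product of their standard deviations. The plain-Ruby sketch below computes the textbook formula on arrays. pearson_corr_sketch is a hypothetical helper, not the library's implementation, and it omits ddof because the normalisation factor cancels between numerator and denominator.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): textbook Pearson
# correlation of two equally long numeric arrays.
def pearson_corr_sketch(a, b)
  n = a.length.to_f
  mean_a = a.sum / n
  mean_b = b.sum / n
  covariance = a.zip(b).sum { |x, y| (x - mean_a) * (y - mean_b) }
  spread_a = a.sum { |x| (x - mean_a)**2 }
  spread_b = b.sum { |y| (y - mean_b)**2 }
  covariance / Math.sqrt(spread_a * spread_b)
end

p pearson_corr_sketch([1, 2, 3], [2, 4, 6])  # perfectly correlated, close to 1.0
p pearson_corr_sketch([1, 2, 3], [3, 2, 1])  # perfectly anti-correlated, close to -1.0
```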
#quantile(column, quantile, interpolation: "nearest") ⇒ Expr
Syntactic sugar for Polars.col("foo").quantile(...).
# File 'lib/polars/lazy_functions.rb', line 603
def quantile(column, quantile, interpolation: "nearest")
  col(column).quantile(quantile, interpolation: interpolation)
end
#repeat(value, n, dtype: nil, eager: false, name: nil) ⇒ Expr
Repeat a single value n times.
# File 'lib/polars/lazy_functions.rb', line 1005
def repeat(value, n, dtype: nil, eager: false, name: nil)
  if !name.nil?
    warn "the `name` argument is deprecated. Use the `alias` method instead."
  end

  if n.is_a?(Integer)
    n = lit(n)
  end

  value = Utils.parse_as_expression(value, str_as_lit: true)
  expr = Utils.wrap_expr(RbExpr.repeat(value, n._rbexpr, dtype))
  if !name.nil?
    expr = expr.alias(name)
  end
  if eager
    return select(expr).to_series
  end
  expr
end
#select(exprs) ⇒ DataFrame
Run polars expressions without a context.
# File 'lib/polars/lazy_functions.rb', line 935
def select(exprs)
  DataFrame.new([]).select(exprs)
end
#spearman_rank_corr(a, b, ddof: 1, propagate_nans: false) ⇒ Expr
Compute the Spearman rank correlation between two columns.
Missing data will be excluded from the computation.
# File 'lib/polars/lazy_functions.rb', line 375
def spearman_rank_corr(a, b, ddof: 1, propagate_nans: false)
  if Utils.strlike?(a)
    a = col(a)
  end
  if Utils.strlike?(b)
    b = col(b)
  end
  Utils.wrap_expr(RbExpr.spearman_rank_corr(a._rbexpr, b._rbexpr, ddof, propagate_nans))
end
#std(column, ddof: 1) ⇒ Object
Get the standard deviation.
# File 'lib/polars/lazy_functions.rb', line 88
def std(column, ddof: 1)
  if column.is_a?(Series)
    column.std(ddof: ddof)
  else
    col(column).std(ddof: ddof)
  end
end
#struct(exprs, eager: false) ⇒ Object
Collect several columns into a Series of dtype Struct.
# File 'lib/polars/lazy_functions.rb', line 985
def struct(exprs, eager: false)
  if eager
    return Polars.select(struct(exprs, eager: false)).to_series
  end
  exprs = Utils.selection_to_rbexpr_list(exprs)
  Utils.wrap_expr(_as_struct(exprs))
end
#sum(column) ⇒ Object
Sum values in a column/Series, or horizontally across a list of columns/expressions.
# File 'lib/polars/lazy_functions.rb', line 138
def sum(column)
  if column.is_a?(Series)
    column.sum
  elsif Utils.strlike?(column)
    col(column.to_s).sum
  elsif column.is_a?(::Array)
    exprs = Utils.selection_to_rbexpr_list(column)
    Utils.wrap_expr(_sum_horizontal(exprs))
  else
    fold(lit(0).cast(:u32), ->(a, b) { a + b }, column).alias("sum")
  end
end
#tail(column, n = 10) ⇒ Object
Get the last n rows.
# File 'lib/polars/lazy_functions.rb', line 258
def tail(column, n = 10)
  if column.is_a?(Series)
    column.tail(n)
  else
    col(column).tail(n)
  end
end
#to_list(name) ⇒ Expr
Aggregate to list.
# File 'lib/polars/lazy_functions.rb', line 81
def to_list(name)
  col(name).list
end
#var(column, ddof: 1) ⇒ Object
Get the variance.
# File 'lib/polars/lazy_functions.rb', line 99
def var(column, ddof: 1)
  if column.is_a?(Series)
    column.var(ddof: ddof)
  else
    col(column).var(ddof: ddof)
  end
end
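The ddof ("delta degrees of freedom") parameter shared by var and std controls the divisor, which is n - ddof. The plain-Ruby sketch below shows the effect on a small array. var_sketch is a hypothetical helper that uses the standard statistical definition, not the library's code.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): variance with a
# configurable divisor of n - ddof.
def var_sketch(values, ddof: 1)
  n = values.length
  mean = values.sum.to_f / n
  values.sum { |v| (v - mean)**2 } / (n - ddof)
end

data = [1, 2, 3, 4]
p var_sketch(data)           # sample variance, divisor n - 1
p var_sketch(data, ddof: 0)  # population variance, divisor n => 1.25
```

The standard deviation with the same ddof is the square root of this value.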
#when(expr) ⇒ When
Start a "when, then, otherwise" expression.
# File 'lib/polars/lazy_functions.rb', line 1175
def when(expr)
  expr = Utils.expr_to_lit_or_expr(expr)
  pw = RbExpr.when(expr._rbexpr)
  When.new(pw)
end
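Conceptually, a "when, then, otherwise" expression picks, for every row, one of two values depending on a boolean condition. The plain-Ruby sketch below shows that selection on arrays. when_then_otherwise_sketch is a hypothetical helper and does not use the When class returned by the method above.

```ruby
# Hypothetical plain-Ruby helper (not part of polars): row-wise choice between
# two columns based on a boolean mask.
def when_then_otherwise_sketch(mask, then_values, otherwise_values)
  mask.each_index.map do |i|
    mask[i] ? then_values[i] : otherwise_values[i]
  end
end

values = [1, 5, 10]
mask = values.map { |v| v > 3 }
p when_then_otherwise_sketch(mask, values, [0, 0, 0])  # => [0, 5, 10]
```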