Class: Sequel::Unionize::Unionizer::Chunk

Inherits:
Object
Defined in:
lib/sequel/extensions/unionize.rb

Overview

Represents a chunk of datasets to be combined via UNION.

Each chunk handles a subset of datasets, creates a temporary table/view for the combined result, and provides access to the unified dataset.

Constructor Details

#initialize(db, dses, opts) ⇒ Chunk

Creates a new chunk instance.

Parameters:

  • db (Sequel::Database)

    The database connection

  • dses (Array<Sequel::Dataset>)

    The datasets to combine

  • opts (Hash)

    Options for the union operation



# File 'lib/sequel/extensions/unionize.rb', line 53

def initialize(db, dses, opts)
  @db = db
  @dses = dses
  @opts = opts
end

Instance Attribute Details

#db ⇒ Sequel::Database (readonly)

Returns the database connection.

Returns:

  • (Sequel::Database)

    The database connection



# File 'lib/sequel/extensions/unionize.rb', line 46

def db
  @db
end

#dses ⇒ Array<Sequel::Dataset> (readonly)

Returns the datasets in this chunk.

Returns:

  • (Array<Sequel::Dataset>)

    The datasets in this chunk



# File 'lib/sequel/extensions/unionize.rb', line 46

attr_reader :db, :dses, :opts

#opts ⇒ Object (readonly)

Returns the options for the union operation, as passed to the constructor.



# File 'lib/sequel/extensions/unionize.rb', line 46

attr_reader :db, :dses, :opts

Instance Method Details

#create ⇒ void

This method returns an undefined value.

Creates a temporary table or view for this chunk’s union result.

The method used depends on the database type:

  • Spark: Creates a temporary view

  • DuckDB: Creates a temporary table

Raises:

  • (RuntimeError)

    If the database type is not supported



# File 'lib/sequel/extensions/unionize.rb', line 84

def create
  if db.database_type == :spark
    db.create_view(name, union, temp: true)
  elsif db.database_type == :duckdb
    db.create_table(name, temp: true, as: union)
  else
    raise "Unsupported database type: #{db.database_type}"
  end
end
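
The dispatch above can be exercised without a real connection. The following is a minimal sketch, assuming a hypothetical `RecordingDB` stand-in that only logs which method was called; it is not part of Sequel or of this extension, and `create_chunk` is an illustrative free function that mirrors the branching in `#create`.

```ruby
# RecordingDB is a hypothetical stand-in for Sequel::Database.
# It records calls instead of issuing any SQL.
class RecordingDB
  attr_reader :database_type, :calls

  def initialize(database_type)
    @database_type = database_type
    @calls = []
  end

  def create_view(name, _source, _opts = {})
    @calls << [:create_view, name]
  end

  def create_table(name, _opts = {})
    @calls << [:create_table, name]
  end
end

# Illustrative helper with the same branching as Chunk#create.
def create_chunk(db, name, union)
  if db.database_type == :spark
    db.create_view(name, union, temp: true)
  elsif db.database_type == :duckdb
    db.create_table(name, temp: true, as: union)
  else
    raise "Unsupported database type: #{db.database_type}"
  end
end

spark = RecordingDB.new(:spark)
duckdb = RecordingDB.new(:duckdb)
create_chunk(spark, :tmp_chunk, "SELECT 1")
create_chunk(duckdb, :tmp_chunk, "SELECT 1")

# Any other database type falls through to the RuntimeError branch.
error_message =
  begin
    create_chunk(RecordingDB.new(:postgres), :tmp_chunk, "SELECT 1")
    nil
  rescue RuntimeError => e
    e.message
  end
```

Because the check is an explicit `if`/`elsif` on `database_type`, supporting another adapter means adding a branch; callers on other databases should expect the `RuntimeError`.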

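#name ⇒ Symbol

Generates a unique name for the temporary table/view.

The name is the :temp_table_prefix option followed by an underscore and the SHA-1 hex digest of the union's SQL. Identical unions therefore map to the same name, and different unions are very unlikely to collide when multiple unionize operations are running. The result is memoized.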

Returns:

  • (Symbol)

    The temporary table/view name



# File 'lib/sequel/extensions/unionize.rb', line 72

def name
  @name ||= :"#{opts[:temp_table_prefix]}_#{Digest::SHA1.hexdigest(union.sql)}"
end
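
To illustrate the naming scheme, here is a minimal sketch that uses only Ruby's standard library. The `temp_name` helper and the `"unionize"` prefix are assumptions made for the example; the real prefix comes from `opts[:temp_table_prefix]`.

```ruby
require "digest/sha1"

# Hypothetical helper that mirrors the expression used in Chunk#name.
def temp_name(prefix, sql)
  :"#{prefix}_#{Digest::SHA1.hexdigest(sql)}"
end

name_a = temp_name("unionize", "SELECT * FROM a UNION SELECT * FROM b")
name_b = temp_name("unionize", "SELECT * FROM a UNION SELECT * FROM c")
name_a_again = temp_name("unionize", "SELECT * FROM a UNION SELECT * FROM b")

puts name_a           # prefix, underscore, then 40 hex characters
puts name_a == name_b # different SQL gives a different name
```

Since the digest depends only on the SQL text, two chunks that produce the same SQL would share a name; whether that matters depends on how the caller creates and drops the temporary objects.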

#union ⇒ Sequel::Dataset

Returns the unified dataset created by combining all datasets in this chunk.

Returns:

  • (Sequel::Dataset)

    The combined dataset



# File 'lib/sequel/extensions/unionize.rb', line 62

def union
  @union ||= dses.reduce { |a, b| a.union(b, all: opts[:all], from_self: opts[:from_self]) }
end
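
The fold performed by `reduce` can be seen with a small stand-in. This sketch assumes a hypothetical `FakeDataset` struct that only concatenates SQL text; it is not `Sequel::Dataset`, and the SQL a real dataset generates will differ in quoting and aliasing.

```ruby
# FakeDataset is a hypothetical stand-in for Sequel::Dataset.
# Its #union only builds a SQL string so the fold is easy to inspect.
FakeDataset = Struct.new(:sql) do
  def union(other, all: false, from_self: true)
    keyword = all ? "UNION ALL" : "UNION"
    combined = "#{sql} #{keyword} #{other.sql}"
    combined = "SELECT * FROM (#{combined}) AS t1" if from_self
    FakeDataset.new(combined)
  end
end

opts = { all: true, from_self: false }
dses = %w[a b c].map { |table| FakeDataset.new("SELECT * FROM #{table}") }

# Same expression as in Chunk#union: fold the datasets left to right.
union = dses.reduce { |a, b| a.union(b, all: opts[:all], from_self: opts[:from_self]) }
puts union.sql

# Edge cases of reduce without an initial value:
single = [dses.first].reduce { |a, b| a.union(b) } # the lone element, unchanged
empty = [].reduce { |a, b| a.union(b) }            # nil
```

Note that `reduce` without an initial value returns `nil` for an empty array, so a chunk built with no datasets would have no dataset to name or create.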