SeaDuck

Apache Iceberg for Ruby, powered by libduckdb

Build Status

Installation

First, install libduckdb. For Homebrew, use:

brew install duckdb

Then add this line to your application’s Gemfile:

gem "seaduck"

Getting Started

Create a client for an Iceberg catalog

catalog = SeaDuck::S3TablesCatalog.new(arn: "arn:aws:s3tables:...")

Note: SeaDuck requires a default namespace, which is main by default. This namespace is created if it does not exist. Pass default_namespace to use a different one.

Create a table

catalog.sql("CREATE TABLE events (id bigint, name text)")

Load data from a file

catalog.sql("COPY events FROM 'data.csv'")

You can also load data directly from other data sources

catalog.attach("blog", "postgres://localhost:5432/blog")
catalog.sql("INSERT INTO events SELECT * FROM blog.ahoy_events")

Query the data

catalog.sql("SELECT COUNT(*) FROM events").to_a

Namespaces

List namespaces

catalog.list_namespaces

Create a namespace

catalog.create_namespace("main")

Check if a namespace exists

catalog.namespace_exists?("main")

Drop a namespace

catalog.drop_namespace("main")

Tables

List tables

catalog.list_tables

Check if a table exists

catalog.table_exists?("events")

Drop a table

catalog.drop_table("events")

Snapshots

Get snapshots for a table

catalog.snapshots("events")

Query the data at a specific snapshot version or time

catalog.sql("SELECT * FROM events AT (VERSION => ?)", [3])
# or
catalog.sql("SELECT * FROM events AT (TIMESTAMP => ?)", [Date.today - 7])

SQL Safety

Use parameterized queries when possible

catalog.sql("SELECT * FROM events WHERE id = ?", [1])

For places that do not support parameters, use quote or quote_identifier

quoted_table = catalog.quote_identifier("events")
quoted_file = catalog.quote("path/to/data.csv")
catalog.sql("COPY #{quoted_table} FROM #{quoted_file}")

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

To get started with development:

git clone https://github.com/ankane/seaduck.git
cd seaduck
bundle install

# REST catalog
docker compose up
bundle exec rake test:rest

# S3 Tables catalog
bundle exec rake test:s3tables

# Glue catalog
bundle exec rake test:glue