Red Arrow - Apache Arrow Ruby

Red Arrow provides the Ruby bindings for Apache Arrow. It is based on GObject Introspection.

Apache Arrow is an in-memory columnar data format. Many products use it for data analytics.

GObject Introspection is middleware for building language bindings for C libraries. It can generate language bindings automatically at runtime.

Red Arrow uses Apache Arrow GLib and the gobject-introspection gem to generate the Ruby bindings for Apache Arrow.

Apache Arrow GLib is a C wrapper for Apache Arrow C++. GObject Introspection can't use Apache Arrow C++ directly, so Apache Arrow GLib acts as a bridge between Apache Arrow C++ and GObject Introspection.

The gobject-introspection gem provides the Ruby bindings for GObject Introspection. Red Arrow uses GObject Introspection via the gobject-introspection gem.

Install

Install Apache Arrow GLib before installing Red Arrow. Use packages.red-data-tools.org to install Apache Arrow GLib.

Note that the Apache Arrow GLib packages are "unofficial". "Official" packages will be released in the future.

Install Red Arrow after you install Apache Arrow GLib:

% gem install red-arrow
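
To check that both Apache Arrow GLib and the gem are usable, a short script along these lines may help. It is only a sketch: red_arrow_status is a hypothetical helper, not part of Red Arrow, and the assumption is that a missing Apache Arrow GLib shows up as an error raised while requiring "arrow".

```ruby
# Hypothetical helper (not part of Red Arrow): reports whether the gem loads.
def red_arrow_status
  require "arrow"
  "Red Arrow #{Arrow::VERSION} is available"
rescue LoadError, StandardError => error
  # LoadError: the red-arrow gem itself could not be found.
  # Other errors: assumed to mean Apache Arrow GLib is missing or unusable.
  "Red Arrow could not be loaded: #{error.class}: #{error.message}"
end

puts red_arrow_status
```

If the message reports a load failure, installing Apache Arrow GLib first and then reinstalling the gem is the likely fix.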

Usage

require "arrow"

table = Arrow::Table.load("/dev/shm/data.arrow")
# Process data in table
table.save("/dev/shm/data-processed.arrow")
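
The load and save calls above can also be tried without an existing data file. The following sketch builds a small table in memory, writes it to a temporary directory, and reads it back. It assumes a reasonably recent Red Arrow in which Arrow::Table.new accepts a Hash of plain Ruby arrays. The round_trip helper and the "count" column are made up for illustration.

```ruby
require "tmpdir"

# Hypothetical helper: saves a table to path, then loads it back.
def round_trip(table, path)
  table.save(path)
  Arrow::Table.load(path)
end

begin
  require "arrow"
rescue LoadError, StandardError => error
  # Skip the example when Red Arrow or Apache Arrow GLib is unavailable.
  warn "Skipping example: #{error.message}"
else
  # Assumption: Arrow::Table.new accepts column name => Ruby array pairs.
  table = Arrow::Table.new("count" => [1, 2, 3])
  Dir.mktmpdir do |dir|
    loaded = round_trip(table, File.join(dir, "data.arrow"))
    puts "Loaded #{loaded.n_rows} rows and #{loaded.n_columns} column(s)"
  end
end
```

A temporary directory is used here instead of /dev/shm so that the sketch does not depend on a Linux-specific path.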