Red Arrow - Apache Arrow Ruby

Red Arrow is the Ruby bindings of Apache Arrow. Red Arrow is based on GObject Introspection.

Apache Arrow is an in-memory columnar data store. It’s used by many products for data analytics.

GObject Introspection is middleware for building language bindings of C libraries. GObject Introspection can generate language bindings automatically at runtime.

Red Arrow uses Apache Arrow GLib and the gobject-introspection gem to generate Ruby bindings of Apache Arrow.

Apache Arrow GLib is a C wrapper for Apache Arrow C++. GObject Introspection can’t use Apache Arrow C++ directly. Apache Arrow GLib is a bridge between Apache Arrow C++ and GObject Introspection.

The gobject-introspection gem provides the Ruby bindings of GObject Introspection. Red Arrow uses GObject Introspection via the gobject-introspection gem.

Install

Install Apache Arrow GLib before installing Red Arrow. See the Apache Arrow install document for details.

Install Red Arrow after you install Apache Arrow GLib:

```console
% gem install red-arrow
```

Usage

```ruby
require "arrow"

table = Arrow::Table.load("/dev/shm/data.arrow")
# Process data in table
table.save("/dev/shm/data-processed.arrow")
```

Development

Note that you need to install Apache Arrow C++ and Apache Arrow GLib built from the main branch before preparing Red Arrow. See also:

  • For Apache Arrow C++: https://arrow.apache.org/docs/developers/cpp/building.html
  • For Apache Arrow GLib: https://github.com/apache/arrow/blob/main/c_glib/README.md

```console
$ cd ruby/red-arrow
$ bundle install
$ bundle exec rake test
```

For macOS with Homebrew

```console
$ cd ruby/red-arrow
$ bundle install
$ brew install apache-arrow --head
$ brew install apache-arrow-glib --head
$ bundle exec rake test
```