Red Arrow - Apache Arrow Ruby

Red Arrow is the Ruby bindings for Apache Arrow. It is built on GObject Introspection.

Apache Arrow is an in-memory columnar data format and a set of libraries for working with it. It's used by many products for data analytics.
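To make "columnar" concrete, here is a minimal sketch in plain Ruby. It does not use Red Arrow or reflect Apache Arrow's actual memory layout; the `to_columns` helper and the sample records are made up for illustration. It only shows the idea of storing each field as its own array instead of storing whole records together.

```ruby
# Row-oriented layout: each record keeps all of its fields together.
rows = [
  { name: "alice", score: 90 },
  { name: "bob",   score: 72 },
  { name: "carol", score: 85 },
]

# Hypothetical helper (not a Red Arrow API): regroup the records so that
# each field becomes one array holding that field's values for every record.
def to_columns(rows)
  columns = Hash.new { |hash, key| hash[key] = [] }
  rows.each do |row|
    row.each { |key, value| columns[key] << value }
  end
  columns
end

columns = to_columns(rows)

# An aggregate over one field only needs to read that field's array.
scores = columns[:score]
average_score = scores.sum.to_f / scores.size
puts "average score: #{average_score.round(2)}"
```

Analytic queries typically read a few fields across many records, which is why a layout like this suits them better than one record at a time.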

GObject Introspection is middleware for building language bindings of C libraries. It can generate language bindings automatically at runtime.

Red Arrow uses Apache Arrow GLib and the gobject-introspection gem to generate Ruby bindings for Apache Arrow.

Apache Arrow GLib is a C wrapper for Apache Arrow C++. GObject Introspection can't use Apache Arrow C++ directly. Apache Arrow GLib is a bridge between Apache Arrow C++ and GObject Introspection.

The gobject-introspection gem provides Ruby bindings for GObject Introspection. Red Arrow uses GObject Introspection via the gobject-introspection gem.

Install

You need to install Apache Arrow GLib before you can install Red Arrow. You can automate this by enabling rubygems-requirements-system. If you want to install Apache Arrow GLib manually, see the Apache Arrow installation documentation for details.

If you want to install Red Arrow with Bundler, add the following to your Gemfile:

plugin "rubygems-requirements-system"

gem "red-arrow"

If you want to install Red Arrow with RubyGems, use the following command line:

$ gem install rubygems-requirements-system red-arrow
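After running the command, one way to confirm that RubyGems can see both gems is a small check like the following sketch. The `gem_installed?` helper is hypothetical and not part of Red Arrow; it relies only on RubyGems' `Gem::Specification.find_all_by_name`. It reports whether the gems are registered, not whether Apache Arrow GLib itself loads correctly.

```ruby
require "rubygems"

# Hypothetical helper (not part of Red Arrow): returns true when RubyGems
# knows at least one installed version of the named gem.
def gem_installed?(name)
  !Gem::Specification.find_all_by_name(name).empty?
end

%w[rubygems-requirements-system red-arrow].each do |name|
  status = gem_installed?(name) ? "installed" : "missing"
  puts "#{name}: #{status}"
end
```

When run under Bundler, the check only sees gems that are part of the current bundle.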

Usage

require "arrow"

table = Arrow::Table.load("/dev/shm/data.arrow")
# Process data in table
table.save("/dev/shm/data-processed.arrow")

Development

Note that you need to install Apache Arrow C++ and Apache Arrow GLib built from the master branch before preparing Red Arrow. See also:

$ cd ruby/red-arrow
$ bundle install
$ bundle exec rake test

For macOS with Homebrew

$ brew install apache-arrow --HEAD
$ brew install apache-arrow-glib --HEAD
$ cd ruby/red-arrow
$ bundle install
$ bundle exec rake test