Red Arrow - Apache Arrow Ruby
Red Arrow provides the Ruby bindings for Apache Arrow. Red Arrow is based on GObject Introspection.
Apache Arrow is an in-memory columnar data format. It’s used by many products for data analytics.
GObject Introspection is middleware for creating language bindings for C libraries. GObject Introspection can generate language bindings automatically at runtime.
Red Arrow uses Apache Arrow GLib and the gobject-introspection gem to generate the Ruby bindings for Apache Arrow.
Apache Arrow GLib is a C wrapper for Apache Arrow C++. GObject Introspection can’t use Apache Arrow C++ directly, so Apache Arrow GLib acts as a bridge between Apache Arrow C++ and GObject Introspection.
The gobject-introspection gem provides the Ruby bindings for GObject Introspection. Red Arrow uses GObject Introspection via the gobject-introspection gem.
Install
Install Apache Arrow GLib before installing Red Arrow. See the Apache Arrow install document for details.
Install Red Arrow after you install Apache Arrow GLib:
```console
% gem install red-arrow
```
Usage
```ruby
require "arrow"

table = Arrow::Table.load("/dev/shm/data.arrow")
# Process data in table
table.save("/dev/shm/data-processed.arrow")
```
Development
Note that you need to install Apache Arrow C++ and Apache Arrow GLib built from the main branch before preparing Red Arrow for development. See also:
- For Apache Arrow C++: https://arrow.apache.org/docs/developers/cpp/building.html
- For Apache Arrow GLib: https://github.com/apache/arrow/blob/main/c_glib/README.md
```console
$ cd ruby/red-arrow
$ bundle install
$ bundle exec rake test
```
For macOS with Homebrew
```console
$ cd ruby/red-arrow
$ bundle install
$ brew install apache-arrow --HEAD
$ brew install apache-arrow-glib --HEAD
$ bundle exec rake test
```