Cumo

Cumo (pronounced like "koomo") is a CUDA-aware numerical library whose interface is highly compatible with Ruby Numo. It lets you speed up existing Numo code on a GPU by replacing Numo with Cumo, which requires only small code changes.


Requirements

  • Ruby 2.5 or later
  • NVIDIA GPU Compute Capability 6.0 (Pascal) or later
  • CUDA 9.0 or later

Preparation

Install CUDA and set up the environment variables as follows:

export CUDA_PATH="/usr/local/cuda"
export CPATH="$CUDA_PATH/include:$CPATH"
export LD_LIBRARY_PATH="$CUDA_PATH/lib64:$CUDA_PATH/lib:$LD_LIBRARY_PATH"
export PATH="$CUDA_PATH/bin:$PATH"
export LIBRARY_PATH="$CUDA_PATH/lib64:$CUDA_PATH/lib:$LIBRARY_PATH"
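Before installing, it can help to confirm that the variables above actually took effect in the current shell. The following is a minimal sketch, not part of Cumo: the helper name missing_cuda_entries and the constant CUDA_PATH_VARS are hypothetical, and the check only looks for entries that start with the CUDA directory.

```ruby
# Variables from the Preparation section that should contain a CUDA directory.
# (CUDA_PATH_VARS and missing_cuda_entries are hypothetical names for this sketch.)
CUDA_PATH_VARS = %w[CPATH LD_LIBRARY_PATH PATH LIBRARY_PATH].freeze

# Returns the names of the variables that have no entry under cuda_path.
# env may be ENV or any Hash-like object that responds to fetch.
def missing_cuda_entries(env, cuda_path)
  CUDA_PATH_VARS.reject do |name|
    env.fetch(name, '').split(':').any? { |entry| entry.start_with?(cuda_path) }
  end
end

cuda_path = ENV.fetch('CUDA_PATH', '/usr/local/cuda')
missing = missing_cuda_entries(ENV, cuda_path)
if missing.empty?
  puts 'CUDA environment looks complete'
else
  puts "No CUDA entry found in: #{missing.join(', ')}"
end
```

Passing the environment in as an argument keeps the check easy to try against a plain Hash.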

Installation

Add the following line to your Gemfile:

gem 'cumo'

And then execute:

$ bundle

Or install it yourself as:

$ gem install cumo

How To Use

Quick start

An example:

[1] pry(main)> require "cumo/narray"
=> true
[2] pry(main)> a = Cumo::DFloat.new(3,5).seq
=> Cumo::DFloat#shape=[3,5]
[[0, 1, 2, 3, 4],
 [5, 6, 7, 8, 9],
 [10, 11, 12, 13, 14]]
[3] pry(main)> a.shape
=> [3, 5]
[4] pry(main)> a.ndim
=> 2
[5] pry(main)> a.class
=> Cumo::DFloat
[6] pry(main)> a.size
=> 15

How to switch from Numo to Cumo

Basically, the following command should make your code work with Cumo:

find . -type f | xargs sed -i -e 's/Numo/Cumo/g' -e 's/numo/cumo/g'

If you want to switch between Numo and Cumo dynamically, the following snippet should work:

if gpu
  require 'cumo/narray'
  xm = Cumo
else
  require 'numo/narray'
  xm = Numo
end

a = xm::DFloat.new(3,5).seq
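The same switch can be wrapped in a small helper so that the rest of the program only sees the selected module. This is a sketch under assumptions: the helper names narray_backend and load_narray_backend and the USE_GPU environment variable are hypothetical and not part of Cumo or Numo.

```ruby
# Maps the GPU flag to the feature to require and the module to use.
# (narray_backend and load_narray_backend are hypothetical helper names.)
def narray_backend(gpu)
  gpu ? ['cumo/narray', :Cumo] : ['numo/narray', :Numo]
end

# Requires the selected library and returns its top-level module.
# Raises LoadError if the corresponding gem is not installed.
def load_narray_backend(gpu)
  feature, const_name = narray_backend(gpu)
  require feature
  Object.const_get(const_name)
end

# Example (USE_GPU is a hypothetical variable name):
#   xm = load_narray_backend(ENV['USE_GPU'] == '1')
#   a = xm::DFloat.new(3, 5).seq
```

Only one of the two libraries is required, so a machine without a GPU does not need Cumo installed.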

Incompatibility With Numo

The following methods behave differently from Numo by default, for performance reasons:

  • extract
  • []
  • count_true
  • count_false

Numo returns a Ruby numeric object for a 0-dimensional NArray, whereas Cumo returns the 0-dimensional NArray itself. This avoids synchronization between CPU and GPU, which would hurt performance.

You may set the CUMO_COMPATIBLE_MODE=ON environment variable to force Cumo NArray to behave compatibly with Numo NArray.

You may also enable or disable compatible_mode in code:

require 'cumo'
Cumo.enable_compatible_mode # enable
Cumo.compatible_mode_enabled? #=> true
Cumo.disable_compatible_mode # disable
Cumo.compatible_mode_enabled? #=> false

You can also use the following methods, which behave like Numo NArray's methods. Their behavior does not depend on compatible_mode.

  • extract_cpu
  • aref_cpu(*idx)
  • count_true_cpu
  • count_false_cpu
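To make the difference concrete, the sketch below reads the same element with [] and with aref_cpu. It uses only the methods named above; the wrapper name compare_element_access is hypothetical, and the comments describe the behavior stated in this section rather than verified output. It returns nil when Cumo is not installed.

```ruby
# Reads element (0, 0) in two ways and returns both results.
# (compare_element_access is a hypothetical name for this sketch.)
def compare_element_access
  require 'cumo/narray'
  a = Cumo::DFloat.new(3, 5).seq
  {
    default: a[0, 0],          # per this section: a 0-dimensional NArray unless compatible_mode is on
    cpu:     a.aref_cpu(0, 0)  # per this section: behaves like Numo's [], giving a Ruby numeric
  }
rescue LoadError
  nil # Cumo is not installed on this machine
end

result = compare_element_access
if result
  puts "a[0, 0] returned a #{result[:default].class}"
  puts "a.aref_cpu(0, 0) returned a #{result[:cpu].class}"
end
```

Since the *_cpu methods return values on the CPU side, it is probably best to keep them out of hot loops.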

Select a GPU device ID

Set the CUDA_VISIBLE_DEVICES=id environment variable, or

require 'cumo'
Cumo::CUDA::Runtime.cudaSetDevice(id)

where id is an integer.
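If the device ID comes from configuration, it has to be converted to an integer before it is passed to cudaSetDevice. A minimal sketch, assuming a hypothetical GPU_ID environment variable and a hypothetical helper named parse_device_id:

```ruby
# Converts a string such as "1" into an integer device ID.
# Returns default when the value is missing or blank; raises ArgumentError
# when the value is not a base-10 integer.
# (parse_device_id and GPU_ID are hypothetical names for this sketch.)
def parse_device_id(value, default = 0)
  return default if value.nil? || value.strip.empty?

  Integer(value, 10)
end

# Example (requires Cumo and a GPU):
#   require 'cumo'
#   Cumo::CUDA::Runtime.cudaSetDevice(parse_device_id(ENV['GPU_ID']))
```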

Disable GPU Memory Pool

The GPU memory pool is enabled by default. To disable it, set the CUMO_MEMORY_POOL=OFF environment variable, or

require 'cumo'
Cumo::CUDA::MemoryPool.disable

Documentation

See https://github.com/ruby-numo/numo-narray#documentation and replace Numo with Cumo.

Contributions

This project is still under development. See the issues for future work.

Development

Install Ruby dependencies:

bundle install --path vendor/bundle

Compile:

bundle exec rake compile

Run tests:

bundle exec rake test

Generate docs:

bundle exec rake docs

Advanced Tips on Development

ccache

ccache is useful for speeding up compilation. Install ccache and set it up as:

export PATH="$HOME/opt/ccache/bin:$PATH"
ln -sf "$HOME/opt/ccache/bin/ccache" "$HOME/opt/ccache/bin/gcc"
ln -sf "$HOME/opt/ccache/bin/ccache" "$HOME/opt/ccache/bin/g++"
ln -sf "$HOME/opt/ccache/bin/ccache" "$HOME/opt/ccache/bin/nvcc"

Build in parallel

Use the MAKEFLAGS environment variable to specify make command options. For example, to build in parallel:

bundle exec env MAKEFLAGS=-j8 rake compile
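Rather than hard-coding the job count, the MAKEFLAGS value can be derived from the number of processors. This is a sketch using Ruby's standard Etc module; the helper names parallel_makeflags and compile_env are hypothetical.

```ruby
require 'etc'

# Builds a MAKEFLAGS value that runs one make job per available processor.
# (parallel_makeflags and compile_env are hypothetical helper names.)
def parallel_makeflags(jobs = Etc.nprocessors)
  "-j#{jobs}"
end

# Environment hash to pass to the compile command.
def compile_env(jobs = Etc.nprocessors)
  { 'MAKEFLAGS' => parallel_makeflags(jobs) }
end

# Example:
#   system(compile_env, 'bundle', 'exec', 'rake', 'compile')
```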

Specify nvcc --generate-code options

bundle exec env CUMO_NVCC_GENERATE_CODE=arch=compute_60,code=sm_60 rake compile

This is useful even during development because it skips the JIT compilation of PTX to cubin that would otherwise occur at runtime.
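The value passed to CUMO_NVCC_GENERATE_CODE follows the pattern arch=compute_XY,code=sm_XY for compute capability X.Y. A minimal sketch that builds it from a capability string; the helper name nvcc_generate_code is hypothetical and is not provided by Cumo.

```ruby
# Converts a compute capability such as "6.0" into a value of the form
# "arch=compute_60,code=sm_60".
# (nvcc_generate_code is a hypothetical helper name.)
def nvcc_generate_code(compute_capability)
  match = /\A(\d+)\.(\d+)\z/.match(compute_capability.to_s)
  raise ArgumentError, "expected MAJOR.MINOR, got #{compute_capability.inspect}" unless match

  arch = "#{match[1]}#{match[2]}"
  "arch=compute_#{arch},code=sm_#{arch}"
end

# Example:
#   env = { 'CUMO_NVCC_GENERATE_CODE' => nvcc_generate_code('6.0') }
#   system(env, 'bundle', 'exec', 'rake', 'compile')
```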

Run tests with gdb

Compile with the debug option:

bundle exec env DEBUG=1 rake compile

Run tests with gdb:

bundle exec gdb -x run.gdb --args ruby test/narray_test.rb

You may set a breakpoint by calling cumo_debug_breakpoint() in the C source code.

Run only the test at a specific line

The --location option is available:

bundle exec ruby test/narray_test.rb --location 121

Compile and run tests for only a specific type

The DTYPE environment variable is available:

bundle exec env DTYPE=dfloat rake compile
bundle exec env DTYPE=dfloat ruby test/narray_test.rb

Run a program with CPU and GPU always synchronized

export CUDA_LAUNCH_BLOCKING=1

Show GPU synchronization warnings

Cumo shows warnings when CPU and GPU synchronization occurs if you set:

export CUMO_SHOW_WARNING=ON

By default, a warning that occurs at the same place is shown only once. To show the warning every time instead, set:

export CUMO_SHOW_WARNING=ON
export CUMO_SHOW_WARNING_ONCE=OFF

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/sonots/cumo.

License