Xgb

XGBoost - the high-performance machine learning library - for Ruby

:fire: Uses the C API for blazing performance

Installation

First, install XGBoost. On Mac, copy lib/libxgboost.dylib to /usr/local/lib.

Add this line to your application’s Gemfile:

gem 'xgb'

Getting Started

This library follows the Core Data Structure, Learning and Scikit-Learn APIs of the Python library. Some methods and options are missing at the moment. PRs welcome!

Learning API

Train a model

params = {objective: "reg:squarederror"}
dtrain = Xgb::DMatrix.new(x_train, label: y_train)
booster = Xgb.train(params, dtrain)

Predict

booster.predict(x_test)
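
The training and prediction snippets above assume that x_train, y_train, and x_test already exist. As a minimal sketch of one way to produce them from plain Ruby arrays, the helper below shuffles the rows with a fixed seed and holds out a fraction for testing. train_test_split, test_ratio, and seed are hypothetical names used for illustration and are not part of this gem.

```ruby
# Hypothetical helper, not part of Xgb: shuffles row indices with a fixed seed
# and splits features and labels into train and test portions.
def train_test_split(x, y, test_ratio: 0.25, seed: 42)
  raise ArgumentError, "x and y must have the same length" unless x.size == y.size

  indices = (0...x.size).to_a.shuffle(random: Random.new(seed))
  test_size = (x.size * test_ratio).round
  test_idx = indices.first(test_size)
  train_idx = indices.drop(test_size)

  # Rows and labels are selected with the same indices, so pairs stay aligned.
  [x.values_at(*train_idx), x.values_at(*test_idx),
   y.values_at(*train_idx), y.values_at(*test_idx)]
end

x = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [1, 2, 3, 4]
x_train, x_test, y_train, y_test = train_test_split(x, y)
```

The resulting x_train and y_train can then be passed to Xgb::DMatrix.new as shown above, and x_test to booster.predict.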

Save the model to a file

booster.save_model("my.model")

Load the model from a file

booster = Xgb::Booster.new(model_file: "my.model")

Get the importance of features

booster.score

Early stopping

Xgb.train(params, dtrain, evals: [[dtrain, "train"], [dtest, "eval"]], early_stopping_rounds: 5)
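
To illustrate what early_stopping_rounds: 5 means, here is a conceptual sketch of the usual stopping rule: training halts once the evaluation metric has gone that many rounds without improving. This is not the gem's implementation. early_stopping_point and the sample scores are made up for illustration, and the sketch assumes a metric where lower is better, such as RMSE.

```ruby
# Conceptual sketch only, not Xgb's code: given one evaluation score per
# boosting round (lower is better), report the best round and the round at
# which training would stop.
def early_stopping_point(scores, patience)
  best_round = 0
  scores.each_with_index do |score, round|
    best_round = round if score < scores[best_round]
    # Stop once `patience` rounds have passed without a new best score.
    return { best_round: best_round, stopped_at: round } if round - best_round >= patience
  end
  # The metric kept improving often enough, so every round was used.
  { best_round: best_round, stopped_at: scores.size - 1 }
end

# Illustrative scores: the last value is never reached because training stops first.
rmse_per_round = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64, 0.65, 0.5]
result = early_stopping_point(rmse_per_round, 5)
# result is { best_round: 2, stopped_at: 7 }
```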

Cross-validation (CV)

Xgb.cv(params, dtrain, nfold: 3, verbose_eval: true)
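
For readers new to cross-validation, the sketch below shows the general idea behind nfold: 3. The rows are partitioned into three folds, and each fold serves once as validation data while the other two are used for training. It is a conceptual illustration in plain Ruby, not how Xgb.cv is implemented; kfold_indices and its seed argument are hypothetical.

```ruby
# Conceptual sketch only, not Xgb's code: partitions row indices 0...n into
# nfold validation folds and pairs each with the remaining training indices.
def kfold_indices(n, nfold, seed: 0)
  shuffled = (0...n).to_a.shuffle(random: Random.new(seed))
  folds = Array.new(nfold) { [] }
  # Deal the shuffled rows round-robin so fold sizes differ by at most one.
  shuffled.each_with_index { |row, i| folds[i % nfold] << row }
  folds.map { |valid| { train: shuffled - valid, valid: valid } }
end

splits = kfold_indices(10, 3)
splits.each { |s| puts "train=#{s[:train].size} valid=#{s[:valid].size}" }
```

Every row appears in exactly one validation fold, so each observation is used for evaluation once.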

Scikit-Learn API

Prep your data

x = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [1, 2, 3, 4]

Train a model

model = Xgb::Regressor.new
model.fit(x, y)

For classification, use Xgb::Classifier

Predict

model.predict(x)

For classification, use predict_proba for probabilities

Save the model to a file

model.save_model("my.model")

Load the model from a file

model.load_model("my.model")

Get the importance of features

model.feature_importances

Data

Data can be an array of arrays

[[1, 2, 3], [4, 5, 6]]

Or a Daru data frame

Daru::DataFrame.from_csv("houses.csv")

Or a Numo NArray

Numo::DFloat.new(3, 2).seq
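
Records often arrive as hashes rather than in one of the shapes listed above. As a rough sketch, they can be flattened into the array-of-arrays form with a fixed column order. to_rows and the sample records are hypothetical and not part of this gem.

```ruby
# Hypothetical helper, not part of Xgb: turns an array of hashes into
# feature rows (an array of arrays) plus a matching label array.
def to_rows(records, features:, label:)
  # fetch raises KeyError on a missing column instead of silently yielding nil.
  x = records.map { |r| features.map { |f| Float(r.fetch(f)) } }
  y = records.map { |r| Float(r.fetch(label)) }
  [x, y]
end

houses = [
  { rooms: 3, area: 120, price: 250_000 },
  { rooms: 2, area: 80, price: 180_000 }
]
x, y = to_rows(houses, features: [:rooms, :area], label: :price)
# x is [[3.0, 120.0], [2.0, 80.0]] and y is [250000.0, 180000.0]
```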

Helpful Resources

  • LightGBM - LightGBM for Ruby
  • Eps - Machine Learning for Ruby

Credits

Thanks to the xgboost gem for serving as an initial reference, and Selva Prabhakaran for the test datasets.

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help: