# Xgb

XGBoost - the high performance machine learning library - for Ruby

:fire: Uses the C API for blazing performance
## Installation

First, install XGBoost. On Mac, copy `lib/libxgboost.dylib` to `/usr/local/lib`.

Add this line to your application's Gemfile:

```ruby
gem 'xgb'
```
## Getting Started

This library follows the Core Data Structure, Learning, and Scikit-Learn APIs of the Python library. Some methods and options are missing at the moment. PRs welcome!
## Learning API

Train a model

```ruby
params = {objective: "reg:squarederror"}
dtrain = Xgb::DMatrix.new(x_train, label: y_train)
booster = Xgb.train(params, dtrain)
```
Predict

```ruby
booster.predict(x_test)
```
Save the model to a file

```ruby
booster.save_model("my.model")
```
Load the model from a file

```ruby
booster = Xgb::Booster.new(model_file: "my.model")
```
Get the importance of features

```ruby
booster.score
```
Early stopping

```ruby
Xgb.train(params, dtrain, evals: [[dtrain, "train"], [dtest, "eval"]], early_stopping_rounds: 5)
```
CV

```ruby
Xgb.cv(params, dtrain, nfold: 3, verbose_eval: true)
```
## Scikit-Learn API

Prep your data

```ruby
x = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [1, 2, 3, 4]
```
Train a model

```ruby
model = Xgb::Regressor.new
model.fit(x, y)
```

For classification, use `Xgb::Classifier`.
Predict

```ruby
model.predict(x)
```

For classification, use `predict_proba` for probabilities.
Save the model to a file

```ruby
model.save_model("my.model")
```
Load the model from a file

```ruby
model.load_model("my.model")
```
Get the importance of features

```ruby
model.feature_importances
```
## Data

Data can be an array of arrays

```ruby
[[1, 2, 3], [4, 5, 6]]
```

Or a Daru data frame

```ruby
Daru::DataFrame.from_csv("houses.csv")
```

Or a Numo NArray

```ruby
Numo::DFloat.new(3, 2).seq
```
## Helpful Resources
## Related Projects
## Credits
Thanks to the xgboost gem for serving as an initial reference, and Selva Prabhakaran for the test datasets.
## History

View the changelog
## Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features