# LIBMF Ruby

[LIBMF](https://github.com/cjlin1/libmf) - large-scale sparse matrix factorization - for Ruby

Check out [Disco](https://github.com/ankane/disco) for higher-level collaborative filtering
## Installation

Add this line to your application's Gemfile:

```ruby
gem "libmf"
```
## Getting Started

Prep your data in the format `row_index, column_index, value`

```ruby
data = Libmf::Matrix.new
data.push(0, 0, 5.0)
data.push(0, 2, 3.5)
data.push(1, 1, 4.0)
```
Create a model

```ruby
model = Libmf::Model.new
model.fit(data)
```
Make predictions

```ruby
model.predict(row_index, column_index)
```
Get the latent factors (these approximate the training matrix)

```ruby
model.p_factors
model.q_factors
```
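To see why the factors approximate the training matrix, here is a plain-Ruby sketch (no gem required): each training entry is approximated by the dot product of that row's latent vector from P and that column's latent vector from Q. The factor values below are made up for illustration.

```ruby
# Hypothetical latent vectors for row 0 and column 0 (illustrative values,
# not output from a real model).
p_row = [0.8, 1.2, 0.5]
q_col = [1.0, 2.0, 3.0]

# The model's estimate for entry (0, 0) is the dot product of the two vectors.
approximation = p_row.zip(q_col).sum { |a, b| a * b }
# 0.8 * 1.0 + 1.2 * 2.0 + 0.5 * 3.0 = 4.7
```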
Get the bias (average of all elements in the training matrix)

```ruby
model.bias
```
Save the model to a file

```ruby
model.save("model.txt")
```
Load the model from a file

```ruby
model = Libmf::Model.load("model.txt")
```
Pass a validation set

```ruby
model.fit(data, eval_set: eval_set)
```
## Cross-Validation

Perform cross-validation

```ruby
model.cv(data)
```

Specify the number of folds

```ruby
model.cv(data, folds: 5)
```
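A plain-Ruby sketch (no gem required) of what k-fold cross-validation does with the `(row, column, value)` entries: split them into k folds, then repeatedly train on k - 1 folds and evaluate on the held-out fold. The entries and fold size below are made up for illustration.

```ruby
# Five illustrative entries split into 2 folds of at most 3 entries each.
entries = [[0, 0, 5.0], [0, 2, 3.5], [1, 1, 4.0], [1, 2, 2.0], [2, 0, 1.5]]
folds = entries.each_slice(3).to_a

folds.each do |held_out|
  training = (folds - [held_out]).flatten(1)
  # here you would fit a model on `training` and score it on `held_out`
end
```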
## Parameters

Pass parameters - default values below

```ruby
Libmf::Model.new(
  loss: :real_l2,     # loss function
  factors: 8,         # number of latent factors
  threads: 12,        # number of threads used
  bins: 25,           # number of bins
  iterations: 20,     # number of iterations
  lambda_p1: 0,       # coefficient of L1-norm regularization on P
  lambda_p2: 0.1,     # coefficient of L2-norm regularization on P
  lambda_q1: 0,       # coefficient of L1-norm regularization on Q
  lambda_q2: 0.1,     # coefficient of L2-norm regularization on Q
  learning_rate: 0.1, # learning rate
  alpha: 1,           # importance of negative entries
  c: 0.0001,          # desired value of negative entries
  nmf: false,         # perform non-negative MF (NMF)
  quiet: false        # no outputs to stdout
)
```
## Loss Functions

For real-valued matrix factorization

- `:real_l2` - squared error (L2-norm)
- `:real_l1` - absolute error (L1-norm)
- `:real_kl` - generalized KL-divergence

For binary matrix factorization

- `:binary_log` - logarithmic error
- `:binary_l2` - squared hinge loss
- `:binary_l1` - hinge loss

For one-class matrix factorization

- `:one_class_row` - row-oriented pair-wise logarithmic loss
- `:one_class_col` - column-oriented pair-wise logarithmic loss
- `:one_class_l2` - squared error (L2-norm)
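A plain-Ruby sketch (no gem required) of how the three real-valued losses score a single entry, using an illustrative actual value of 5.0 and prediction of 4.0:

```ruby
actual = 5.0
predicted = 4.0

# :real_l2 - squared error
l2 = (actual - predicted)**2

# :real_l1 - absolute error
l1 = (actual - predicted).abs

# :real_kl - generalized KL-divergence for non-negative values
kl = actual * Math.log(actual / predicted) - actual + predicted
```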
## Metrics

Calculate RMSE (for real-valued MF)

```ruby
model.rmse(data)
```

Calculate MAE (for real-valued MF)

```ruby
model.mae(data)
```

Calculate generalized KL-divergence (for non-negative real-valued MF)

```ruby
model.gkl(data)
```

Calculate logarithmic loss (for binary MF)

```ruby
model.logloss(data)
```

Calculate accuracy (for binary MF)

```ruby
model.accuracy(data)
```

Calculate MPR (for one-class MF)

```ruby
model.mpr(data, transpose)
```

Calculate AUC (for one-class MF)

```ruby
model.auc(data, transpose)
```
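A plain-Ruby sketch (no gem required) of the two real-valued metrics: RMSE is the square root of the mean squared residual, and MAE is the mean absolute residual. The actual and predicted values below are made up for illustration.

```ruby
actuals     = [5.0, 3.5, 4.0]
predictions = [4.5, 3.0, 4.0]

# Residuals between actual and predicted values
residuals = actuals.zip(predictions).map { |a, p| a - p }

# Root mean squared error and mean absolute error
rmse = Math.sqrt(residuals.sum { |r| r * r } / residuals.size)
mae  = residuals.sum { |r| r.abs } / residuals.size
```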
## Performance

For performance, read data directly from files

```ruby
model.fit("train.txt", eval_set: "validate.txt")
model.cv("train.txt")
```

Data should be in the format `row_index column_index value`:

```txt
0 0 5.0
0 2 3.5
1 1 4.0
```
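If your entries start out in memory, a plain-Ruby sketch (no gem required) of writing them to disk in that space-separated format, so `fit` and `cv` can read the file directly:

```ruby
require "tmpdir"

entries = [[0, 0, 5.0], [0, 2, 3.5], [1, 1, 4.0]]

# Write one "row_index column_index value" line per entry.
path = File.join(Dir.mktmpdir, "train.txt")
File.write(path, entries.map { |row| row.join(" ") }.join("\n") + "\n")
```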
## Numo

Get latent factors as [Numo](https://github.com/ruby-numo/numo-narray) arrays

```ruby
model.p_factors(format: :numo)
model.q_factors(format: :numo)
```
## Resources

## History

View the changelog
## Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features

To get started with development:

```sh
git clone https://github.com/ankane/libmf-ruby.git
cd libmf-ruby
bundle install
bundle exec rake vendor:all
bundle exec rake test
```