Class: SupervisedLearning::LinearRegression
Inherits: Object
Defined in: lib/supervised_learning.rb
Overview
This class uses linear regression to make predictions based on a training set. For datasets with fewer than 1000 columns, use #predict, since it gives the most accurate prediction. For larger datasets, where #predict becomes too slow, use #predict_advanced. The algorithms in #predict and #predict_advanced were provided by Andrew Ng (Stanford University).
Instance Method Summary
- #initialize(training_set) ⇒ LinearRegression (constructor)
  Initializes a LinearRegression object with a training set.
- #predict(prediction) ⇒ Object
  Makes a prediction using the normal equation.
- #predict_advanced(prediction, learning_rate = 0.01, iterations = 1000, debug = false) ⇒ Object
  Makes a prediction using gradient descent.
Constructor Details
#initialize(training_set) ⇒ LinearRegression
Initializes a LinearRegression object with a training set.
# File 'lib/supervised_learning.rb', line 17

def initialize(training_set)
  @training_set = training_set
  raise ArgumentError, 'input is not a Matrix' unless @training_set.is_a? Matrix
  raise ArgumentError, 'Matrix must have at least 2 columns and 1 row' unless @training_set.column_size > 1
  @number_of_features = @training_set.column_size - 1
  @number_of_training_examples = @training_set.row_size
  @feature_set = @training_set.clone
  @feature_set.hpop # remove output set
  @output_set = @training_set.column_vectors.last
end
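As a rough usage sketch (the require path and the housing figures below are assumptions for illustration, not taken from the gem's documentation), the training set is a Matrix whose last column holds the output values and whose other columns hold the features:

require 'matrix'
require 'supervised_learning' # assumed require path, based on lib/supervised_learning.rb

# Hypothetical housing data: square footage, number of rooms, and (last column) price.
training_set = Matrix[
  [2104, 3, 399_900],
  [1600, 3, 329_900],
  [2400, 3, 369_000],
  [1416, 2, 232_000]
]

program = SupervisedLearning::LinearRegression.new(training_set)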
Instance Method Details
#predict(prediction) ⇒ Object
Makes a prediction using the normal equation. This algorithm is the most accurate, but with large sets (more than 1000 columns) it may take too long to compute.
# File 'lib/supervised_learning.rb', line 35

def predict(prediction)
  # add ones to feature set
  feature_set = Matrix.hconcat(Matrix.one(@number_of_training_examples, 1), @feature_set)
  validate_prediction_input(prediction)
  transposed_feature_set = feature_set.transpose # only transpose once for efficiency
  theta = (transposed_feature_set * feature_set).inverse * transposed_feature_set * @output_set
  # add column of ones to prediction
  prediction = Matrix.hconcat(Matrix.one(prediction.row_size, 1), prediction)
  result_vectorized = prediction * theta
  result = result_vectorized.to_a.first.to_f
end
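In effect, #predict solves the normal equation theta = (X^T X)^-1 X^T y in one step, which is why it slows down as the number of columns grows. Continuing the hypothetical training set from the constructor sketch above, usage might look like this (an illustrative example, not part of the gem's documented examples):

# One row per prediction, one column per feature; the output column is omitted.
prediction = Matrix[[1650, 3]]
estimated_price = program.predict(prediction)
# estimated_price is a Float with the predicted price for a 1650 sq ft, 3-room house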
#predict_advanced(prediction, learning_rate = 0.01, iterations = 1000, debug = false) ⇒ Object
Makes a prediction using gradient descent. This algorithm requires less computing power than #predict but is less accurate since it uses approximation.
# File 'lib/supervised_learning.rb', line 55

def predict_advanced(prediction, learning_rate = 0.01, iterations = 1000, debug = false)
  validate_prediction_input(prediction)
  feature_set = normalize_feature_set(@feature_set)
  # add ones to feature set after normalization
  feature_set = Matrix.hconcat(Matrix.one(@number_of_training_examples, 1), feature_set)
  # prepare theta column vector with zeros
  theta = Matrix.zero(@number_of_features + 1, 1)
  iterations.times do
    theta = theta - (learning_rate * (1.0 / @number_of_training_examples) * (feature_set * theta - @output_set).transpose * feature_set).transpose
    if debug
      puts "Theta: #{theta}"
      puts "Cost: #{calculate_cost(feature_set, theta)}"
    end
  end
  # normalize prediction
  prediction = normalize_prediction(prediction)
  # add column of ones to prediction
  prediction = Matrix.hconcat(Matrix.one(prediction.row_size, 1), prediction)
  result_vectorized = prediction * theta
  result = result_vectorized[0, 0]
end
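#predict_advanced instead runs batch gradient descent on a normalized feature set, repeatedly applying the update theta := theta - learning_rate * (1/m) * X^T (X theta - y), so learning_rate and iterations may need tuning for your data. A minimal sketch continuing the same hypothetical example (the parameter values shown are illustrative, not recommendations):

prediction = Matrix[[1650, 3]]
# 0.05 and 2000 are illustrative values for learning_rate and iterations;
# debug = true prints theta and the cost each iteration so you can check that the cost is decreasing.
estimated_price = program.predict_advanced(prediction, 0.05, 2000, true)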