Class: ClassicBandit::EpsilonGreedy
- Inherits:
- Object
- ClassicBandit::EpsilonGreedy
- Includes:
- ArmUpdatable
- Defined in:
- lib/classic_bandit/epsilon_greedy.rb
Overview
Implements the Epsilon-Greedy algorithm for multi-armed bandit problems. With probability epsilon the algorithm explores by choosing an arm uniformly at random; with probability 1 - epsilon it exploits by choosing the arm with the highest mean reward. Until at least one arm has been tried, the choice is always random.
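As a worked form of the rule above, here is a sketch of the per-call selection probability. It assumes K arms, a uniform random choice during exploration, at least one arm already tried, and a unique best arm. The symbols K, a*, and the hat-mu notation are introduced here for illustration only and do not appear in the library.

```latex
P(\text{select } a) = \frac{\epsilon}{K} + (1 - \epsilon)\,\mathbf{1}[a = a^{*}],
\qquad a^{*} = \arg\max_{a} \hat{\mu}_{a}
```

For example, with K = 3 and epsilon = 0.1, the best arm is selected with probability 0.1/3 + 0.9 ≈ 0.933 and each of the other two arms with probability 0.1/3 ≈ 0.033. The best arm can also be picked during exploration, so it is chosen slightly more often than 1 - epsilon.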
Instance Attribute Summary
- #arms ⇒ Object (readonly)
  Returns the value of attribute arms.
- #epsilon ⇒ Object (readonly)
  Returns the value of attribute epsilon.
Instance Method Summary
- #initialize(arms:, epsilon: 0.1) ⇒ EpsilonGreedy (constructor)
  A new instance of EpsilonGreedy.
- #select_arm ⇒ Object
Methods included from ArmUpdatable
Constructor Details
#initialize(arms:, epsilon: 0.1) ⇒ EpsilonGreedy
Returns a new instance of EpsilonGreedy.
    # File 'lib/classic_bandit/epsilon_greedy.rb', line 21
    def initialize(arms:, epsilon: 0.1)
      @arms = arms
      @epsilon = epsilon
      validate_epsilon!
    end
Instance Attribute Details
#arms ⇒ Object (readonly)
Returns the value of attribute arms.
    # File 'lib/classic_bandit/epsilon_greedy.rb', line 19
    def arms
      @arms
    end
#epsilon ⇒ Object (readonly)
Returns the value of attribute epsilon.
    # File 'lib/classic_bandit/epsilon_greedy.rb', line 19
    def epsilon
      @epsilon
    end
Instance Method Details
#select_arm ⇒ Object
    # File 'lib/classic_bandit/epsilon_greedy.rb', line 28
    def select_arm
      # If no arms have been tried, do random selection
      return @arms.sample if @arms.all? { |arm| arm.trials.zero? }

      if rand < @epsilon
        # Exploration: random selection
        @arms.sample
      else
        # Exploitation: select arm with highest mean reward
        @arms.max_by(&:mean_reward)
      end
    end
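To see the decision rule of select_arm in isolation, the following self-contained sketch mirrors it outside the library. SketchArm and sketch_select_arm are hypothetical names used only for illustration, not part of ClassicBandit. The sketch also assumes that an untried arm reports a mean reward of 0.0, which the real arm class may or may not do. The random source is injected so that runs are reproducible.

```ruby
# Hypothetical stand-in for the library's arm objects. Only the two readers
# that select_arm relies on (trials, mean_reward) are modeled.
SketchArm = Struct.new(:name, :trials, :successes) do
  # Assumption: an untried arm has mean reward 0.0.
  def mean_reward
    trials.zero? ? 0.0 : successes.to_f / trials
  end
end

# Same decision rule as select_arm, with the random number generator passed
# in so that results can be reproduced with a fixed seed.
def sketch_select_arm(arms, epsilon, rng: Random.new)
  # No data yet: nothing to exploit, so pick uniformly at random.
  return arms.sample(random: rng) if arms.all? { |arm| arm.trials.zero? }

  if rng.rand < epsilon
    arms.sample(random: rng)     # exploration
  else
    arms.max_by(&:mean_reward)   # exploitation
  end
end

arms = [
  SketchArm.new("a", 10, 2), # mean reward 0.2
  SketchArm.new("b", 10, 7), # mean reward 0.7
  SketchArm.new("c", 0, 0)   # untried
]

# Tally 1000 selections with epsilon = 0.1 and a fixed seed.
rng = Random.new(42)
counts = Hash.new(0)
1000.times { counts[sketch_select_arm(arms, 0.1, rng: rng).name] += 1 }
counts.sort.each { |name, n| puts "#{name}: #{n}" }
```

Arm "b" should dominate the tally, since it wins every exploitation step and a third of the exploration steps. Setting epsilon to 0.0 removes exploration entirely once any arm has data, so the rule then always returns the arm with the highest mean reward.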