Class: ClassicBandit::Ucb1
- Inherits:
- Object
- Includes:
- ArmUpdatable
- Defined in:
- lib/classic_bandit/ucb1.rb
Overview
Implements the UCB1 (Upper Confidence Bound) algorithm for multi-armed bandit problems. This algorithm selects arms based on their mean rewards plus a confidence term, balancing exploration and exploitation without requiring an explicit epsilon parameter.
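For reference, the textbook UCB1 index can be written as below. This is the standard form of the algorithm, not a transcription of this library's `ucb_score` helper, whose body is not shown on this page. The symbols are introduced here for illustration only.

```latex
\mathrm{score}_i = \bar{x}_i + \sqrt{\frac{2 \ln n}{n_i}}
```

Here $\bar{x}_i$ is the observed mean reward of arm $i$, $n_i$ is the number of times arm $i$ has been tried, and $n$ is the total number of trials across all arms. The square-root term shrinks as an arm is tried more often, so rarely tried arms get a larger bonus.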
Instance Attribute Summary
- #arms ⇒ Array<Arm> (readonly)
  Available arms for selection.
Instance Method Summary
- #initialize(arms:) ⇒ Ucb1 (constructor)
  Initialize a new UCB1 bandit.
- #select_arm ⇒ Arm
  Select an arm using the UCB1 algorithm.
Methods included from ArmUpdatable
Constructor Details
#initialize(arms:) ⇒ Ucb1
Initialize a new UCB1 bandit
    # File 'lib/classic_bandit/ucb1.rb', line 24
    def initialize(arms:)
      @arms = arms
    end
Instance Attribute Details
#arms ⇒ Array<Arm> (readonly)
Returns the arms available for selection.
    # File 'lib/classic_bandit/ucb1.rb', line 20
    def arms
      @arms
    end
Instance Method Details
#select_arm ⇒ Arm
Select an arm using the UCB1 algorithm. Any arm that has not yet been tried is returned first; once every arm has at least one trial, the arm with the highest UCB1 score is returned.
    # File 'lib/classic_bandit/ucb1.rb', line 31
    def select_arm
      # use untried arm if exists.
      untried_arm = @arms.find { |arm| arm.trials.zero? }
      return untried_arm if untried_arm

      total_trials = @arms.sum(&:trials)
      @arms.max_by { |arm| ucb_score(arm, total_trials) }
    end
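The selection logic above can be tried in isolation with a minimal, self-contained sketch. It does not use the library itself: `SketchArm` and `SketchUcb1` are hypothetical stand-ins, only the `trials` reader is confirmed by the source above, and the body of `ucb_score` is an assumption based on the textbook UCB1 formula (mean reward plus `sqrt(2 * ln(n) / n_i)`). The real helper may differ.

```ruby
# Hypothetical stand-in for the library's Arm class. Only `trials` appears in
# the documented source; `name`, `successes`, and `mean_reward` are assumptions.
SketchArm = Struct.new(:name, :trials, :successes) do
  def mean_reward
    trials.zero? ? 0.0 : successes.to_f / trials
  end
end

# Mirrors the documented select_arm, with an assumed scoring helper.
class SketchUcb1
  attr_reader :arms

  def initialize(arms:)
    @arms = arms
  end

  def select_arm
    # Return any arm that has never been tried.
    untried_arm = @arms.find { |arm| arm.trials.zero? }
    return untried_arm if untried_arm

    total_trials = @arms.sum(&:trials)
    @arms.max_by { |arm| ucb_score(arm, total_trials) }
  end

  # Assumed textbook UCB1 score: mean reward plus an exploration bonus
  # that shrinks as the arm accumulates trials.
  def ucb_score(arm, total_trials)
    arm.mean_reward + Math.sqrt(2.0 * Math.log(total_trials) / arm.trials)
  end
end

arms = [
  SketchArm.new("a", 10, 5),
  SketchArm.new("b", 10, 7),
  SketchArm.new("c", 0, 0)
]
bandit = SketchUcb1.new(arms: arms)

# Arm "c" has no trials yet, so it is chosen regardless of the other means.
first_pick = bandit.select_arm

# Give "c" the same number of trials as the others. The exploration bonus is
# then identical for every arm, so the highest mean reward decides.
arms[2].trials = 10
arms[2].successes = 2
second_pick = bandit.select_arm
```

In this sketch `first_pick` is the untried arm and `second_pick` is the arm with the best observed mean, since equal trial counts cancel out the exploration bonus.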