Class: ClassicBandit::Ucb1

Inherits:
Object
Includes:
ArmUpdatable
Defined in:
lib/classic_bandit/ucb1.rb

Overview

Implements the UCB1 (Upper Confidence Bound) algorithm for multi-armed bandit problems. The algorithm scores each arm by its mean reward plus a confidence term that shrinks as the arm accumulates trials, then selects the highest-scoring arm. This balances exploration and exploitation without requiring an explicit epsilon parameter.
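
For reference, the textbook form of the UCB1 score for arm i after t total trials is commonly written as below. The body of ucb_score is not shown on this page, so whether this class uses exactly this constant is an assumption.

```latex
\mathrm{UCB1}_i(t) = \bar{x}_i + \sqrt{\frac{2 \ln t}{n_i}}
```

Here \bar{x}_i is the mean reward observed for arm i, n_i is the number of times arm i has been tried, and t is the total number of trials across all arms.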

Examples:

Create and use a UCB1 bandit

arms = [
  ClassicBandit::Arm.new(id: 1, name: "banner_a"),
  ClassicBandit::Arm.new(id: 2, name: "banner_b")
]
bandit = ClassicBandit::Ucb1.new(arms: arms)
selected_arm = bandit.select_arm
bandit.update(selected_arm, reward: 1)

Instance Attribute Summary

Instance Method Summary

Methods included from ArmUpdatable

#update

Constructor Details

#initialize(arms:) ⇒ Ucb1

Initialize a new UCB1 bandit

Parameters:

  • arms (Array<Arm>)

    List of arms to choose from



# File 'lib/classic_bandit/ucb1.rb', line 24

def initialize(arms:)
  @arms = arms
end

Instance Attribute Details

#arms ⇒ Array<Arm> (readonly)

Returns the arms available for selection.

Returns:

  • (Array<Arm>)

    Available arms for selection



# File 'lib/classic_bandit/ucb1.rb', line 20

def arms
  @arms
end

Instance Method Details

#select_arm ⇒ Arm

Selects an arm using the UCB1 algorithm. Any arm that has not yet been tried is returned first; once every arm has at least one trial, the arm with the highest UCB1 score is selected.

Returns:

  • (Arm)

    Selected arm



# File 'lib/classic_bandit/ucb1.rb', line 31

def select_arm
  # Try every arm once first: return the first arm with no trials, if any.
  untried_arm = @arms.find { |arm| arm.trials.zero? }
  return untried_arm if untried_arm

  # Otherwise pick the arm with the highest UCB1 score.
  total_trials = @arms.sum(&:trials)
  @arms.max_by { |arm| ucb_score(arm, total_trials) }
end
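
The selection logic in #select_arm above can be tried in isolation with a minimal, self-contained sketch. SketchArm, sketch_ucb_score, and sketch_select_arm are hypothetical stand-ins, not part of the gem: the real Arm class and the private ucb_score helper are not shown on this page, so the textbook UCB1 score is assumed here.

```ruby
# Hypothetical stand-in for ClassicBandit::Arm (the real class is not shown here).
SketchArm = Struct.new(:name, :trials, :successes) do
  # Empirical mean reward; 0.0 for an arm that has never been tried.
  def mean_reward
    trials.zero? ? 0.0 : successes.to_f / trials
  end
end

# Textbook UCB1 score: empirical mean plus an exploration bonus.
# Assumption: the gem's ucb_score may differ in detail.
def sketch_ucb_score(arm, total_trials)
  arm.mean_reward + Math.sqrt(2.0 * Math.log(total_trials) / arm.trials)
end

# Same control flow as #select_arm: untried arms first, then the best score.
def sketch_select_arm(arms)
  untried_arm = arms.find { |arm| arm.trials.zero? }
  return untried_arm if untried_arm

  total_trials = arms.sum(&:trials)
  arms.max_by { |arm| sketch_ucb_score(arm, total_trials) }
end

banner_a = SketchArm.new("banner_a", 100, 60) # well explored, mean 0.6
banner_b = SketchArm.new("banner_b", 5, 2)    # barely explored, mean 0.4
banner_c = SketchArm.new("banner_c", 0, 0)    # never tried

# An untried arm is always returned before any scoring happens.
first_pick = sketch_select_arm([banner_a, banner_b, banner_c])

# With every arm tried, the exploration bonus can outweigh a higher mean.
second_pick = sketch_select_arm([banner_a, banner_b])

puts first_pick.name
puts second_pick.name
```

In the second call, banner_b is chosen even though its mean reward is lower, because its small trial count gives it a much larger exploration bonus under the assumed formula.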