Module: ClassicBandit::ArmUpdatable

Included in:
EpsilonGreedy, Softmax, ThompsonSampling, Ucb1
Defined in:
lib/classic_bandit/arm_updatable.rb

Overview

Provides the shared #update method that bandit algorithms use to record an observed reward in an arm's statistics (its trial and success counts).

Examples:

Update an arm with a reward

class MyBandit
  include ClassicBandit::ArmUpdatable
end

bandit = MyBandit.new
bandit.update(selected_arm, 1)

Instance Method Summary

Instance Method Details

#update(arm, reward) ⇒ Object

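Updates the selected arm with the observed reward: validates the reward, increments the arm's trial count by one, and adds the reward to its success count.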

Parameters:

  • arm (Arm)

    The arm that was selected

  • reward (Integer)

    The observed reward (0 or 1)



# File 'lib/classic_bandit/arm_updatable.rb', line 18

def update(arm, reward)
  validate_reward!(reward)

  arm.trials += 1
  arm.successes += reward
end
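
The following is a minimal, self-contained sketch of how the counters evolve over several calls to #update. It does not load the gem: Arm, ArmUpdatableSketch, SketchBandit, and mean_reward are hypothetical stand-ins, and the body of validate_reward! is an assumption, since its real implementation is not shown on this page. Only the body of update mirrors the source listing above.

```ruby
# Hypothetical stand-in for the gem's Arm class; only the two counters
# that #update touches are modelled here.
Arm = Struct.new(:trials, :successes) do
  # Observed success rate; 0.0 before any trial to avoid dividing by zero.
  # (Illustrative helper, not part of the documented API.)
  def mean_reward
    trials.zero? ? 0.0 : successes.to_f / trials
  end
end

# Stand-in module that copies the update body from the listing above.
module ArmUpdatableSketch
  def update(arm, reward)
    validate_reward!(reward)

    arm.trials += 1
    arm.successes += reward
  end

  private

  # Assumed behaviour: reject anything other than 0 or 1.
  # The real validate_reward! is not shown in this page.
  def validate_reward!(reward)
    return if [0, 1].include?(reward)

    raise ArgumentError, "reward must be 0 or 1, got #{reward.inspect}"
  end
end

class SketchBandit
  include ArmUpdatableSketch
end

bandit = SketchBandit.new
arm = Arm.new(0, 0)

# Record four observations: three successes and one failure.
[1, 0, 1, 1].each { |reward| bandit.update(arm, reward) }

puts "trials=#{arm.trials} successes=#{arm.successes} mean=#{arm.mean_reward}"
```

Under these assumptions the script prints trials=4 successes=3 mean=0.75. Because validation runs before either counter is touched, a rejected reward leaves the arm's statistics unchanged.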