Class: Vanity::Experiment::AbTest

Inherits:
Base
  • Object
Defined in:
lib/vanity/experiment/ab_test.rb

Overview

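The core experiment type: an A/B test that assigns each identity to one of two or more alternatives and tracks conversions against one or more metrics.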

Constant Summary

DEFAULT_SCORE_METHOD =
:z_score

Instance Attribute Summary

Attributes inherited from Base

#id, #name, #playground

Class Method Summary

Instance Method Summary

Methods inherited from Base

#active?, #complete_if, #completed_at, #created_at, #description, #identify, load, #on_assignment, #reject, #type, type

Constructor Details

#initialize(*args) ⇒ AbTest

Returns a new instance of AbTest.



# File 'lib/vanity/experiment/ab_test.rb', line 24

def initialize(*args)
  super
  @score_method = DEFAULT_SCORE_METHOD
  @use_probabilities = nil
  @is_default_set = false
end

Class Method Details

.friendly_name ⇒ Object



# File 'lib/vanity/experiment/ab_test.rb', line 17

def friendly_name
  "A/B Test"
end

.probability(score) ⇒ Object

Convert z-score to probability.



# File 'lib/vanity/experiment/ab_test.rb', line 11

def probability(score)
  score = score.abs
  probability = AbTest::Z_TO_PROBABILITY.find { |z, _p| score >= z }
  probability ? probability.last : 0
end
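
As a rough illustration of the lookup, the sketch below mirrors the method against a hypothetical table. The names Z_TABLE and z_to_probability are invented here, and the thresholds are standard one-tailed normal critical values assumed for the example; they are not necessarily the contents of AbTest::Z_TO_PROBABILITY.

```ruby
# Hypothetical table of [z threshold, probability %] pairs, highest threshold
# first so that `find` returns the strongest level the score reaches.
Z_TABLE = [[3.0902, 99.9], [2.3263, 99], [1.6449, 95], [1.2816, 90]].freeze

# Map a z-score to a probability bucket, or 0 when it is below every threshold.
def z_to_probability(score)
  score = score.abs
  row = Z_TABLE.find { |z, _probability| score >= z }
  row ? row.last : 0
end

puts z_to_probability(2.0)  # between the 95% and 99% thresholds
puts z_to_probability(-3.5) # sign is ignored
puts z_to_probability(0.5)  # below every threshold
```

Because the table is ordered from the highest threshold down, the first match is always the highest level the score clears.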

Instance Method Details

#alternative(value) ⇒ Object

Returns an Alternative with the specified value.

Examples:

alternative(:red) == alternatives[0]
alternative(:blue) == alternatives[2]


# File 'lib/vanity/experiment/ab_test.rb', line 137

def alternative(value)
  alternatives.find { |alt| alt.value == value }
end

#alternatives(*args) ⇒ Object

Call this method once to set alternative values for this experiment (requires at least two values). Call without arguments to obtain the current list of alternatives. Call with a hash to set custom probabilities. If you provide a hash of alternatives, you may need to specify a default unless your hashes are ordered (Ruby >= 1.9).

Examples:

Define A/B test with three alternatives

ab_test "Background color" do
  metrics :coolness
  alternatives "red", "blue", "orange"
end

Define A/B test with custom probabilities

ab_test "Background color" do
  metrics :coolness
  alternatives "red" => 10, "blue" => 5, "orange" => 1
  default "red"
end

Find out which alternatives this test uses

alts = experiment(:background_color).alternatives
puts "#{alts.count} alternatives, with the colors: #{alts.map(&:value).join(", ")}"


# File 'lib/vanity/experiment/ab_test.rb', line 124

def alternatives(*args)
  if has_alternative_weights?(args)
    build_alternatives_with_weights(args)
  else
    build_alternatives(args)
  end
end

#bayes_bandit_score(_probability = 90) ⇒ Object

Scores alternatives based on the current tracking data, using Bayesian estimates of the best binomial bandit. Based on the R bandit package (cran.r-project.org/web/packages/bandit), which is in turn based on Steven L. Scott, "A modern Bayesian look at the multi-armed bandit", Appl. Stochastic Models Bus. Ind. 2010; 26:639-658 (www.economics.uci.edu/~ivan/asmb.874.pdf).

This method returns a structure with the following attributes:

:alts

Ordered list of alternatives, populated with scoring info.

:base

Second best performing alternative.

:least

Least performing alternative (but more than zero conversion).

:choice

Choice alternative, either the outcome or best alternative.

Alternatives returned by this method are populated with the following attributes:

:probability

Probability (probability this is the best alternative).

:difference

Difference from the least performant alternative.

The choice alternative is set only if its probability is higher or equal to the specified probability (default is 90%).



# File 'lib/vanity/experiment/ab_test.rb', line 341

def bayes_bandit_score(_probability = 90)
  begin
    require "backports/1.9.1/kernel/define_singleton_method" if RUBY_VERSION < "1.9"
    require "integration"
    require "rubystats"
  rescue LoadError
    raise("to use bayes_bandit_score, install integration and rubystats gems")
  end

  begin
    require "gsl"
  rescue LoadError
    Vanity.logger.warn("for better integration performance, install gsl gem")
  end

  BayesianBanditScore.new(alternatives, outcome).calculate!
end
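
To give a feel for what "probability this is the best alternative" means, here is a minimal standalone sketch. It is not Vanity's implementation, which depends on the integration and rubystats gems; it swaps in Monte Carlo sampling from Beta posteriors with a uniform prior. The helpers sample_gamma, sample_beta, and probability_best are hypothetical names.

```ruby
# Draw from Gamma(k, 1) for a positive integer k as a sum of k exponentials.
def sample_gamma(k, rng)
  (1..k).sum { -Math.log(1.0 - rng.rand) }
end

# Draw from Beta(a, b) for positive integers a and b using two Gamma draws.
def sample_beta(a, b, rng)
  x = sample_gamma(a, rng)
  y = sample_gamma(b, rng)
  x / (x + y)
end

# stats is a list of [conversions, participants] pairs, one per alternative.
# Returns, per alternative, the percentage of draws in which it had the
# highest sampled conversion rate.
def probability_best(stats, draws: 2_000, seed: 42)
  rng = Random.new(seed)
  wins = Array.new(stats.size, 0)
  draws.times do
    samples = stats.map do |conversions, participants|
      sample_beta(conversions + 1, participants - conversions + 1, rng)
    end
    wins[samples.index(samples.max)] += 1
  end
  wins.map { |count| 100.0 * count / draws }
end

puts probability_best([[50, 100], [10, 100]]).inspect
```

With 50 conversions out of 100 against 10 out of 100, nearly every draw favors the first alternative, so its percentage lands well above the 90% default threshold.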

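#calculate_score ⇒ Object

Calculates the score using the configured #score_method, falling back to #score when the experiment does not respond to a method of that name.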



# File 'lib/vanity/experiment/ab_test.rb', line 267

def calculate_score
  if respond_to?(score_method)
    send(score_method)
  else
    score
  end
end
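
The dispatch can be tried in isolation with a small stand-in class. ScoreDispatcher and its return values are hypothetical; the sketch only mirrors the respond_to?/send pattern shown above, and assumes the object defines no method named z_score.

```ruby
# Stand-in for an experiment: two scoring methods and a configurable selector.
class ScoreDispatcher
  attr_reader :score_method

  def initialize(score_method)
    @score_method = score_method
  end

  def score
    :z_score_result
  end

  def bayes_bandit_score
    :bandit_result
  end

  # Call the configured method if it exists, otherwise fall back to #score.
  def calculate_score
    respond_to?(score_method) ? send(score_method) : score
  end
end

puts ScoreDispatcher.new(:bayes_bandit_score).calculate_score
puts ScoreDispatcher.new(:z_score).calculate_score
```

Under that assumption, a selector of :z_score (the DEFAULT_SCORE_METHOD value) matches no method and takes the fallback branch.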

#choose(request = nil) ⇒ Object

Chooses a value for this experiment. You probably want to use the Rails helper method ab_test instead.

This method picks an alternative for the current identity and returns the alternative’s value. It will consistently choose the same alternative for the same identity, and randomly split alternatives between different identities.

Examples:

color = experiment(:which_blue).choose


# File 'lib/vanity/experiment/ab_test.rb', line 188

def choose(request = nil)
  if @playground.collecting?
    if active?
      if enabled? # rubocop:todo Style/GuardClause
        return assignment_for_identity(request)
      else
        # Show the default if experiment is disabled.
        return default
      end
    else
      # If inactive, always show the outcome. Fallback to generation if one can't be found.
      index = connection.ab_get_outcome(@id) || alternative_for(identity)
    end
  else
    # If collecting=false, show the alternative, but don't track anything.
    identity = identity()
    @showing ||= {}
    @showing[identity] ||= alternative_for(identity)
    index = @showing[identity]
  end

  alternatives[index.to_i]
end
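
The listing above does not show how alternative_for maps an identity to an index. One common way to get an assignment that is stable per identity yet spread across identities is to hash the identity. The sketch below assumes that approach; it is not taken from Vanity, and alternative_index is a hypothetical helper.

```ruby
require "digest/md5"

# Derive a stable index in 0...count from the experiment name and identity.
# The same inputs always give the same index.
def alternative_index(experiment_name, identity, count)
  Digest::MD5.hexdigest("#{experiment_name}/#{identity}").to_i(16) % count
end

first  = alternative_index("background_color", "user-42", 3)
second = alternative_index("background_color", "user-42", 3)
puts first == second # same identity, same alternative
```

Including the experiment name in the hashed string keeps one identity from landing in the same bucket across every experiment.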

#chooses(value, request = nil) ⇒ Object

Forces this experiment to use a particular alternative. This may be used in test cases to force a specific alternative and obtain a deterministic test. This method is also used in the add_participant callback action when adding participants via vanity_js.

Examples:

Setup test to red button

setup do
  experiment(:button_color).chooses(:red)
end

def test_shows_red_button
  . . .
end

Use nil to clear selection

teardown do
  experiment(:green_button).chooses(nil)
end


# File 'lib/vanity/experiment/ab_test.rb', line 232

def chooses(value, request = nil)
  if @playground.collecting?
    if value.nil?
      connection.ab_not_showing @id, identity
    else
      index = @alternatives.index(value)
      save_assignment(identity, index, request) unless filter_visitor?(request, identity)

      raise ArgumentError, "No alternative #{value.inspect} for #{name}" unless index

      if (connection.ab_showing(@id, identity) && connection.ab_showing(@id, identity) != index) ||
         alternative_for(identity) != index
        connection.ab_show(@id, identity, index)
      end
    end
  else
    @showing ||= {}
    @showing[identity] = value.nil? ? nil : @alternatives.index(value)
  end
  self
end

#complete!(outcome = nil) ⇒ Object



# File 'lib/vanity/experiment/ab_test.rb', line 476

def complete!(outcome = nil)
  # This statement is equivalent to: return unless collecting?
  return unless @playground.collecting? && active?

  self.enabled = false
  super

  unless outcome
    if defined?(@outcome_is)
      begin
        result = @outcome_is.call
        outcome = result.id if result.is_a?(Alternative) && result.experiment == self
      rescue StandardError => e
        Vanity.logger.warn("Error in AbTest#complete!: #{e}")
      end
    else
      best = score.best
      outcome = best.id if best
    end
  end
  # TODO: logging
  connection.ab_set_outcome(@id, outcome || 0)
end

#conclusion(score = score()) ⇒ Object

Use the result of #score or #bayes_bandit_score to derive a conclusion. Returns an array of claims.



# File 'lib/vanity/experiment/ab_test.rb', line 361

def conclusion(score = score())
  claims = []
  participants = score.alts.inject(0) { |t, alt| t + alt.participants }
  claims << if participants.zero?
              I18n.t('vanity.no_participants')
            else
              I18n.t('vanity.experiment_participants', count: participants)
            end
  # only interested in sorted alternatives with conversion
  sorted = score.alts.select { |alt| alt.measure > 0.0 }.sort_by(&:measure).reverse
  if sorted.size > 1
    # start with alternatives that have conversion, from best to worst,
    # then alternatives with no conversion.
    sorted |= score.alts
    # we want a result that's clearly better than 2nd best.
    best = sorted[0]
    second = sorted[1]
    if best.measure > second.measure
      diff = ((best.measure - second.measure) / second.measure * 100).round
      better = I18n.t('vanity.better_alternative_than', probability: diff.to_i, alternative: second.name) if diff > 0
      claims << I18n.t('vanity.best_alternative_measure', best_alternative: best.name, measure: format('%.1f', (best.measure * 100)), better_than: better)
      claims << if score.method == :bayes_bandit_score
                  if best.probability >= 90
                    I18n.t('vanity.best_alternative_probability', probability: score.best.probability.to_i)
                  else
                    I18n.t('vanity.low_result_confidence')
                  end
                elsif best.probability >= 90
                  I18n.t('vanity.best_alternative_is_significant', probability: score.best.probability.to_i)
                else
                  I18n.t('vanity.result_isnt_significant')
                end
      sorted.delete best
    end
    sorted.each do |alt|
      claims << if alt.measure > 0.0
                  I18n.t('vanity.converted_percentage', alternative: alt.name.sub(/^\w/, &:upcase), percentage: format('%.1f', (alt.measure * 100)))
                else
                  I18n.t('vanity.didnt_convert', alternative: alt.name.sub(/^\w/, &:upcase))
                end
    end
  else
    claims << I18n.t('vanity.no_clear_winner')
  end
  claims << I18n.t('vanity.selected_as_best', alternative: score.choice.name.sub(/^\w/, &:upcase)) if score.choice
  claims
end

#default(value) ⇒ Object

Call this method once to set a default alternative. Call without arguments to obtain the current default. If default is not specified, the first alternative is used.

Examples:

Set the default alternative

ab_test "Background color" do
  alternatives "red", "blue", "orange"
  default "red"
end

Get the default alternative

assert experiment(:background_color).default == "red"


# File 'lib/vanity/experiment/ab_test.rb', line 45

def default(value)
  @default = value
  @is_default_set = true
  class << self
    define_method :default do |*args|
      raise ArgumentError, "default has already been set to #{@default.inspect}" unless args.empty?

      alternative(@default)
    end
  end
  nil
end

#destroy ⇒ Object

Removes this experiment's data from the datastore.



502
503
504
505
# File 'lib/vanity/experiment/ab_test.rb', line 502

def destroy
  connection.destroy_experiment(@id)
  super
end

#enabled=(bool) ⇒ Object

Enable or disable the experiment. Only works if the playground is collecting and this experiment is active.

Note: Do not set the enabled/disabled status of an experiment until it exists in the database. Make sure #save has been invoked on the experiment before any enabled= call.



# File 'lib/vanity/experiment/ab_test.rb', line 71

def enabled=(bool)
  return unless @playground.collecting? && active?

  if created_at.nil?
    Vanity.logger.warn(
      'DB has no created_at for this experiment! This most likely means' \
      'you didn\'t call #save before calling enabled=, which you should.'
    )
  else
    connection.set_experiment_enabled(@id, bool)
  end
end

#enabled? ⇒ Boolean

Returns true if experiment is enabled, false if disabled.

Returns:

  • (Boolean)


# File 'lib/vanity/experiment/ab_test.rb', line 61

def enabled?
  !@playground.collecting? || (active? && connection.is_experiment_enabled?(@id))
end

#false_true ⇒ Object (also known as: true_false)

Defines an A/B test with two alternatives: false and true. This is the default pair of alternatives, so just syntactic sugar for those who love being explicit.

Examples:

ab_test "More bacon" do
  metrics :yummyness
  false_true
end


# File 'lib/vanity/experiment/ab_test.rb', line 166

def false_true
  alternatives false, true
end

#fingerprint(alternative) ⇒ Object

Returns a fingerprint (hash) for the given alternative. Can be used to look up an alternative for the experiment without revealing which values are available (e.g. choosing an alternative from an HTTP query parameter).



# File 'lib/vanity/experiment/ab_test.rb', line 174

def fingerprint(alternative)
  Digest::MD5.hexdigest("#{id}:#{alternative.id}")[-10, 10]
end
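
A standalone sketch of the round trip, assuming alternatives are identified by small integer ids. The helpers fingerprint_for and find_by_fingerprint are hypothetical; only the MD5 expression is copied from the method above.

```ruby
require "digest/md5"

# Same expression as the method body: last 10 hex characters of the MD5 digest.
def fingerprint_for(experiment_id, alternative_id)
  Digest::MD5.hexdigest("#{experiment_id}:#{alternative_id}")[-10, 10]
end

# Recover the alternative id by recomputing each candidate's fingerprint.
def find_by_fingerprint(experiment_id, alternative_ids, fingerprint)
  alternative_ids.find { |id| fingerprint_for(experiment_id, id) == fingerprint }
end

token = fingerprint_for(:background_color, 1)
puts token.length
puts find_by_fingerprint(:background_color, [0, 1, 2], token).inspect
```

The token can be passed in a URL, and the server resolves it by comparing against the fingerprints of its own alternatives.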

#metrics(*args) ⇒ Object

Tells A/B test which metric we’re measuring, or returns metric in use.

Examples:

Define A/B test against coolness metric

ab_test "Background color" do
  metrics :coolness
  alternatives "red", "blue", "orange"
end

Find metric for A/B test

puts "Measures: " + experiment(:background_color).metrics.map(&:name).join(", ")


# File 'lib/vanity/experiment/ab_test.rb', line 95

def metrics(*args)
  @metrics = args.map { |id| @playground.metric(id) } unless args.empty?
  @metrics
end

#outcome ⇒ Object

Alternative chosen when this experiment completed.



# File 'lib/vanity/experiment/ab_test.rb', line 469

def outcome
  return unless @playground.collecting?

  outcome = connection.ab_get_outcome(@id)
  outcome && alternatives[outcome]
end

#outcome_is(&block) ⇒ Object

Defines how the experiment can choose the optimal outcome on completion.

By default, Vanity will take the best alternative (highest conversion rate) and use that as the outcome. Your experiment may have different needs: maybe you want the least performing alternative, or want to factor cost into the equation.

For example, a custom rule that accounts for cost might read like this:

outcome_is do
  a, b = alternatives
  # a is expensive, only choose a if it performs 2x better than b
  a.measure > b.measure * 2 ? a : b
end

Raises:

  • (ArgumentError)


# File 'lib/vanity/experiment/ab_test.rb', line 461

def outcome_is(&block)
  raise ArgumentError, "Missing block" unless block
  raise "outcome_is already called on this experiment" if defined?(@outcome_is)

  @outcome_is = block
end

#rebalance! ⇒ Object

Force experiment to rebalance.



# File 'lib/vanity/experiment/ab_test.rb', line 439

def rebalance!
  return unless @playground.collecting?

  score_results = bayes_bandit_score
  set_alternative_probabilities score_results.alts if score_results.method == :bayes_bandit_score
end

#rebalance_frequency(rf = nil) ⇒ Object

Sets or returns how often (as a function of number of people assigned) to rebalance. For example:

ab_test "Simple" do
  rebalance_frequency 100
end

puts "The experiment will automatically rebalance after every #{experiment(:simple).rebalance_frequency} users are assigned."


# File 'lib/vanity/experiment/ab_test.rb', line 429

def rebalance_frequency(rf = nil) # rubocop:todo Naming/MethodParameterName
  if rf
    @assignments_since_rebalancing = 0
    @rebalance_frequency = rf
    rebalance!
  end
  @rebalance_frequency
end

#reset ⇒ Object

Clears all collected data for the experiment.



# File 'lib/vanity/experiment/ab_test.rb', line 508

def reset
  return unless @playground.collecting?

  connection.destroy_experiment(@id)
  connection.set_experiment_created_at(@id, Time.now)
  @outcome = @completed_at = nil
  self
end

#save ⇒ Object

Set up tracking for metrics and ensure that the attributes of the ab_test are valid (e.g. has alternatives, has a default, has metrics). If collecting, this method will also store this experiment into the db. In most cases, you call this method right after the experiment’s been instantiated and declared.



# File 'lib/vanity/experiment/ab_test.rb', line 522

def save
  if defined?(@saved)
    Vanity.logger.warn("Experiment #{name} has already been saved")
    return
  end
  @saved = true
  true_false unless defined?(@alternatives)
  raise "Experiment #{name} needs at least two alternatives" unless @alternatives.size >= 2

  if !@is_default_set
    default(@alternatives.first)
    Vanity.logger.warn("No default alternative specified; choosing #{@default} as default.")
  elsif alternative(@default).nil?
    # Specified a default that wasn't listed as an alternative; warn and override.
    Vanity.logger.warn("Attempted to set unknown alternative #{@default} as default! Using #{@alternatives.first} instead.")
    # Set the instance variable directly since default(value) is no longer defined
    @default = @alternatives.first
  end
  super
  if !defined?(@metrics) || @metrics.nil? || @metrics.empty?
    Vanity.logger.warn("Please use metrics method to explicitly state which metric you are measuring against.")
    default_metric = @playground.metrics[id] ||= Vanity::Metric.new(@playground, name)
    @metrics = [default_metric]
  end
  @metrics.each do |metric|
    metric.hook(&method(:track!)) # rubocop:todo Performance/MethodObjectAsBlock
  end
end

#score(probability = 90) ⇒ Object

Scores alternatives based on the current tracking data. This method returns a structure with the following attributes:

:alts

Ordered list of alternatives, populated with scoring info.

:base

Second best performing alternative.

:least

Least performing alternative (but more than zero conversion).

:choice

Choice alternative, either the outcome or best alternative.

Alternatives returned by this method are populated with the following attributes:

:z_score

Z-score (relative to the base alternative).

:probability

Probability (z-score mapped to 0, 90, 95, 99 or 99.9%).

:difference

Difference from the least performant alternative.

The choice alternative is set only if its probability is higher or equal to the specified probability (default is 90%).



# File 'lib/vanity/experiment/ab_test.rb', line 290

def score(probability = 90)
  alts = alternatives
  # sort by conversion rate to find second best and 2nd best
  sorted = alts.sort_by(&:measure)
  base = sorted[-2]
  # calculate z-score
  pc = base.measure
  nc = base.participants
  alts.each do |alt|
    p = alt.measure
    n = alt.participants
    alt.z_score = (p - pc) / (((p * (1 - p) / n) + (pc * (1 - pc) / nc)).abs**0.5)
    alt.probability = AbTest.probability(alt.z_score)
  end
  # difference is measured from least performant
  if least = sorted.find { |alt| alt.measure > 0 } # rubocop:todo Lint/AssignmentInCondition
    alts.each do |alt|
      alt.difference = (alt.measure - least.measure) / least.measure * 100 if alt.measure > least.measure
    end
  end
  # best alternative is one with highest conversion rate (best shot).
  # choice alternative can only pick best if we have high probability (>90%).
  best = sorted.last if sorted.last.measure > 0.0
  choice = if outcome
             alts[outcome.id]
           else
             (best && best.probability >= probability ? best : nil)
           end
  Struct.new(:alts, :best, :base, :least, :choice, :method).new(alts, best, base, least, choice, :score) # rubocop:todo Lint/StructNewOverride
end
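
The z-score line in the listing is a two-proportion comparison against the base alternative. The sketch below pulls that expression into a standalone function with made-up numbers; the function name z_score and the sample figures are assumptions for illustration.

```ruby
# p, n:   conversion rate and participant count of the alternative being scored
# pc, nc: conversion rate and participant count of the base alternative
def z_score(p, n, pc, nc)
  standard_error = Math.sqrt(((p * (1 - p) / n) + (pc * (1 - pc) / nc)).abs)
  (p - pc) / standard_error
end

# 15% conversion vs. a 10% base, 1000 participants each.
z = z_score(0.15, 1000, 0.10, 1000)
puts format("%.2f", z)
```

Scoring the base alternative against itself gives a numerator of zero, so its z-score is 0 whenever it has any variance.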

#score_method(method = nil) ⇒ Object

Sets or returns the method used for calculating the score. The default is :z_score (see DEFAULT_SCORE_METHOD), but it can also be set to :bayes_bandit_score to calculate the probability of each alternative being the best.

Examples:

Define A/B test which uses bayes_bandit_score in reporting

ab_test "noodle_test" do
  alternatives "spaghetti", "linguine"
  metrics :signup
  score_method :bayes_bandit_score
end



# File 'lib/vanity/experiment/ab_test.rb', line 151

def score_method(method = nil)
  @score_method = method if method
  @score_method
end

#set_alternative_probabilities(alternative_probabilities) ⇒ Object

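Sets unequal assignment probabilities. Takes alternatives that carry a probability (in percent) and stores each one paired with its cumulative probability on a 0 to 1 scale.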



# File 'lib/vanity/experiment/ab_test.rb', line 411

def set_alternative_probabilities(alternative_probabilities) # rubocop:todo Naming/AccessorMethodName
  # create @use_probabilities as a function to go from [0,1] to outcome
  cumulative_probability = 0.0
  new_probabilities = alternative_probabilities.map { |am| [am, (cumulative_probability += am.probability) / 100.0] }
  @use_probabilities = new_probabilities
end
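
A small standalone sketch of the cumulative table and one plausible way to consume it. Alt, cumulative_probabilities, and pick are hypothetical; the mapping step follows the method above, while the lookup by a roll in [0, 1) is an assumption about how such a table would be used and is not shown in this listing.

```ruby
# Minimal stand-in for an alternative that carries a probability in percent.
Alt = Struct.new(:name, :probability)

# Pair each alternative with its running total, scaled from percent to 0..1.
def cumulative_probabilities(alts)
  cumulative = 0.0
  alts.map { |alt| [alt, (cumulative += alt.probability) / 100.0] }
end

# Return the first alternative whose upper bound exceeds the roll; fall back
# to the last one so a roll of exactly 1.0 still resolves.
def pick(thresholds, roll)
  pair = thresholds.find { |_alt, upper| roll < upper }
  (pair || thresholds.last).first
end

alts = [Alt.new("red", 60.0), Alt.new("blue", 30.0), Alt.new("orange", 10.0)]
table = cumulative_probabilities(alts)
puts table.map(&:last).inspect
puts pick(table, 0.75).name
```

Storing cumulative bounds instead of raw weights turns selection into a single scan with one random number.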

#showing?(alternative) ⇒ Boolean

True if this alternative is currently showing (see #chooses).

Returns:

  • (Boolean)


# File 'lib/vanity/experiment/ab_test.rb', line 255

def showing?(alternative)
  identity = identity()
  if @playground.collecting?
    (connection.ab_showing(@id, identity) || alternative_for(identity)) == alternative.id
  else
    @showing ||= {}
    @showing[identity] == alternative.id
  end
end

#track!(_metric_id, _timestamp, count, *args) ⇒ Object

Called via a hook by the associated metric.



# File 'lib/vanity/experiment/ab_test.rb', line 552

def track!(_metric_id, _timestamp, count, *args)
  return unless active? && enabled?

  identity = args.last[:identity] if args.last.is_a?(Hash)
  identity ||= begin
    identity()
  rescue StandardError
    nil
  end
  if identity # rubocop:todo Style/GuardClause
    return if connection.ab_showing(@id, identity)

    index = alternative_for(identity)
    connection.ab_add_conversion(@id, index, identity, count)
    check_completion!
  end
end