Experiment Guide

Experiments can be conducted by any GitLab team, but are most often conducted by the teams from the Growth Sub-department.

Experiments are run as A/B/n tests and are evaluated by the data they generate. The team reviews the data, determines which variant performed most effectively, and promotes that variant as the new default code path (or reverts to the control). In all cases, an experiment is cleaned up once it has generated enough data to evaluate performance and is considered resolved.

Process and tracking issue

Each experiment should have a related Experiment tracking issue created, which is intended to track the experiment from deployment and rollout through to resolution and cleanup.

At the time an experiment is rolled out, the due date of the tracking issue should be specified. The timeline depends on the experiment and can be up to several weeks in the future.

When an experiment is resolved, post the outcome to the tracking issue along with the reasoning for the decision. All non-relevant experiment code should be removed, and any review concerns should be addressed during cleanup. After cleanup, the tracking issue can be closed.

Implementing an experiment

For the sake of our example, let's say we want to run an experiment for how to cancel a subscription. In our control (current world) we show a toggle that reads "Auto-renew", and in our experiment candidate we want to show a "Cancel subscription" button with a confirmation. Ultimately the behavior is the same, but the interface will be considerably different.

Defining the feature flag

Let's name our experiment subscription_cancellation. It's important to understand that this name is prefixed, becoming growth_experiment_subscription_cancellation in our Unleash feature detection and in our Snowplow tracking calls. We prefix our experiments so we can consistently identify them as experiments and clean them up over time.
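As a sketch of this naming convention, the flag name is simply the experiment name with the growth_experiment_ prefix (the helper method here is hypothetical; the actual prefixing happens inside the experiment implementation):

```ruby
# Hypothetical helper illustrating the naming convention described above.
GROWTH_EXPERIMENT_PREFIX = "growth_experiment_"

def feature_flag_name(experiment_name)
  "#{GROWTH_EXPERIMENT_PREFIX}#{experiment_name}"
end

feature_flag_name(:subscription_cancellation)
# => "growth_experiment_subscription_cancellation"
```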

This means that you'll need to go to the Feature Flags interface (for the project you're working on), add a growth_experiment_subscription_cancellation feature flag, and define which environment(s) it should be rolled out to for initial deployment, for instance, gitlab-staging or customers-staging.

You can include yourself and others in a list and assign that list using the User List rollout strategy, or you can add your user id to the experiment manually. You can use this to include or exclude yourself from an experiment for verification before rolling it out to others.

Implementation

When you implement an experiment in code you'll need to provide the name that you've given it in the feature flags interface, and a context -- which will usually include something like a user or user id, but may also include several other aspects.

class SubscriptionsController < ApplicationController
  include Gitlab::GrowthExperiment::Interface

  def show
    experiment(:subscription_cancellation, user_id: user.id) do |e|
      e.use { render_toggle_button } # control
      e.try { render_cancel_button } # candidate
    end
  end
end

You can also provide different variants for the experience if you've defined variants in the Feature Flag interface (not yet available).

experiment(:subscription_cancellation, user_id: user.id) do |e|
  e.use { render_toggle_button } # control
  e.try(:variant_one) { render_cancel_button(confirmation: true) }
  e.try(:variant_two) { render_cancel_button(confirmation: false) }
end

Later, and elsewhere in the code, you can use the same experiment call to track events on the experiment. The important detail is to use the same context between your calls to experiment. If the context is the same, we can consistently track the event in a way that associates it with the variant being presented -- which may be based on the user we're presenting it to.

exp = experiment(:subscription_cancellation, user_id: user.id)
exp.track('clicked_button')

You can also use the lower level class or instance interfaces.

### Class level interface using `.run`

```ruby
exp = Gitlab::GrowthExperiment.run(:subscription_cancellation, user_id: user.id) do |e|
  # context can be passed to `experiment`, `.run`, `new`, or after the fact like here.
  # context must be added before `#run` or `#track` calls.
  e.context(project_id: project.id)

  e.use { toggle_button_interface } # control
  e.try { cancel_button_interface } # candidate
end

# track an event on the experiment we've defined.
exp.track(:clicked_button)
```

While `Gitlab::GrowthExperiment.run` is what we document, you can also use `Gitlab::GrowthExperiment.experiment`.

### Instance level interface

```ruby
exp = Gitlab::GrowthExperiment.new(:subscription_cancellation, user_id: user.id)

# context can be passed to `.new`, or after the fact like here.
# context must be added before `#run` or `#track` calls.
exp.context(project_id: project.id)

exp.use { toggle_button_interface } # control
exp.try { cancel_button_interface } # candidate
exp.run

# track an event on the experiment we've defined.
exp.track(:clicked_button)
```
You can also define custom classes.

### Custom class

```ruby
class CancellationExperiment < Gitlab::GrowthExperiment
  def initialize(variant_name = nil, **context, &block)
    super(:subscription_cancellation, variant_name, **context, &block)
  end
end

exp = CancellationExperiment.run(user_id: user.id) do |e|
  # context can be passed to `.run`, or after the fact like here.
  # context must be added before `#run` or `#track` calls.
  e.context(project_id: project.id)

  e.use { toggle_button_interface } # control
  e.try { cancel_button_interface } # candidate
end

# track an event on the experiment we've defined.
exp.track(:clicked_button)
```
You can also hard specify the variant to use.

### Specifying which variant to use

This should generally be discouraged, as it can change the experience users have during rollout, and may confuse generating reports from the tracking calls. It is possible however, and may be useful if you understand the implications.

```ruby
experiment(:subscription_cancellation, :no_interface, user_id: user.id) do |e|
  e.use { toggle_button_interface } # control
  e.try { cancel_button_interface } # candidate
  e.try(:no_interface) { no_interface! } # variant
end
```

Or you can set the variant within the block.

```ruby
experiment(:subscription_cancellation, user_id: user.id) do |e|
  e.variant(:variant) # set the variant
  # ...
end
```

The experiment method, and the underlying Gitlab::GrowthExperiment, is an implementation on top of Scientist. Generally speaking you can use the DSL that Scientist defines, but for experiments we use experiment instead of science, and specify the variant on initialization (or via #variant) rather than in the call to #run. The interface is otherwise the same, even though not every aspect of Scientist makes sense for experiments.

Context migrations

There are times when we may need to add new values or change something that we're providing in context while an experiment is running. We make this possible by passing the migrated_from context key.

Say, for instance, that you're using version: 1 in your context. If you want to migrate this to version: 2, you just need to provide the previous context in a migrated_from context key. This way, a given experiment experience can be resolved back through any number of migrations.

experiment(:my_experiment, version: 2, migrated_from: { version: 1 })

It's important to understand that this can bucket a user in a new experience (depending on the rollout strategy being used and what is changing in the context), so you should investigate how this might impact your experiment before using it.
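To make the idea concrete, here's a standalone sketch of resolving a context back through its previous shapes. This is not GitLab's actual implementation; the `experience_key` derivation (a hash of the sorted context) is an assumption used purely for illustration:

```ruby
require 'digest'

# Hypothetical key derivation: a one-way hash of the sorted context.
def experience_key(context)
  Digest::SHA256.hexdigest(context.sort.to_h.to_s)
end

# Collect every key a context could have been tracked under, newest first,
# by walking the migrated_from chain.
def candidate_keys(context)
  keys = []
  current = context.dup
  while current
    migrated_from = current.delete(:migrated_from)
    keys << experience_key(current)
    current = migrated_from
  end
  keys
end

keys = candidate_keys(version: 2, migrated_from: { version: 1 })
# keys[0] is the key for the new context, keys[1] for the old one.
```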

When there isn't a user

When there isn't a user, we typically fall back to another concept to provide a consistent experiment experience. What this means is that once we assign someone to a bucket, we always assign them to that same bucket.

We do this by using cookies.... [document more]
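As a conceptual sketch only (not GitLab's implementation, which is not yet documented above): a first-time visitor gets a random id persisted in a cookie, and hashing that id deterministically assigns a bucket, so the same visitor always lands in the same bucket on later visits:

```ruby
require 'digest'
require 'securerandom'

# Hypothetical illustration of stable bucketing from a cookie value.
def bucket_for(cookie_id, bucket_count: 2)
  Digest::SHA256.hexdigest(cookie_id).to_i(16) % bucket_count
end

cookie_id = SecureRandom.uuid # would be stored in the visitor's cookie on first visit
bucket_for(cookie_id) == bucket_for(cookie_id) # deterministic: always true
```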

Tracking, anonymity and GDPR

We intentionally don't, and shouldn't, track things like user ids. What we can and do track is what we consider an "experiment experience" key. This key is generated from the context we pass to the experiment implementation. If we consistently pass the same context to an experiment, we're able to consistently track events generated in that experience. A context can contain things like a user or a project -- so, if you only include a user in the context, that user gets the same experience across all projects they view, but if you also include the currently viewed project in the context, the user would potentially have a different experience on each of their projects. Either can be desirable given the objectives of the experiment.
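The following sketch illustrates the idea of an experience key. The hashing scheme here is an assumption for illustration, not the actual derivation used by the experiment implementation:

```ruby
require 'digest'

# Hypothetical "experiment experience" key: derived from the context via a
# one-way hash, so tracking events never carry raw user or project ids.
def experience_key(context)
  Digest::SHA256.hexdigest(context.sort.to_h.to_s)
end

# Same context, same key -- events can be associated consistently.
experience_key(user_id: 42) == experience_key(user_id: 42) # => true

# Adding the project to the context yields a distinct per-project experience.
experience_key(user_id: 42) == experience_key(user_id: 42, project_id: 7) # => false
```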

Code quality expectations

Since experiment code is inherently short-lived, our stated goal is to iterate quickly so we can generate and evaluate performance data.

This goal prioritizes iteration and resolution over code quality, which means that experiment code may not always meet our code standards guidelines. It must, however, not negatively impact the availability of GitLab or contribute to bad data. Even though experiments are deployed to a minority of users, we still expect a flawless experience for those users; therefore, good test coverage is still required.

Reviewers and maintainers are encouraged to note when code doesn't meet our code standards guidelines. Please mention your concerns and include or link to them on the experiment tracking issue. The experiment author(s) are responsible for addressing these concerns when the experiment is resolved.