This is a tool to help run statistical analyses of A/B testing experiments we run here at Songkick.
We use this util mainly to generate CSV files that we can plot using Google Docs in order to determine if an A/B test is a success or a failure.
Install skab by running `gem install skab`
You can run the util by using the `skab` command line
Command line arguments
skab [output] [model] [model_args]
The command line accepts a variable number of arguments:
`output` is the name of the output module to use to print data
`model` is the name of the model used to model the process to analyse
All other arguments are model dependent and are passed to the model
Skab is able to output different statistics, all based on the model used to generate the distribution.
We currently support two main outputs:
Distribution: the discrete probability distribution for each group, based on the model used to represent the process
Differential: the discrete probability distribution for Xb - Xa
Skab currently supports two models to generate a distribution of the mean depending on the actual observed values:
Poisson model, working with rate of events on a specific period of time
Binomial model, working with success rates
The poisson model
The poisson model accepts two integer parameters: A and B. Each parameter corresponds to the measured number of events occuring in group A or B, respectively.
The distribution outputs a list of probability for each mean depending on the A or B group, according to the poisson law of small numbers.
Here is an example, with 1450 events observed for group A and 1430 for group B:
skab distribution poisson 1450 1430
It is worth noting that the Poisson distribution is expensive to compute for large numbers (> 100), so this model uses an approximation using a normal distribution (using a variance of delta).
The binomial model
The binomial model is used to generate a distribution of success rates depending on a number of trials and successes for each group A and B.
The distribution outputs a list of probable success rates and their respective probability for groups A and B.
For example, this command generate the binomial distribution with:
200 successes out of 450 trials for group A
220 successes out of 470 trials for group B
skab distribution binomial 450 200 470 220
This software relies on Hash ordering to display values in the correct order. On Ruby versions older than 1.9, hash ordering wasn't guaranteed, and this will cause some output to be inconsistent (mainly differential CSV and summary outputs).
The MIT License
Copyright © 2012 Songkick
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.