Flakey Spec Catcher

About:

This gem is intended to catch and prevent the merging of flakey RSpec tests.

There are two primary use cases for flakey_spec_catcher (FSC):

  1. Git Detection - Make changes to specs, add them to a git commit and then allow FSC to detect changes and re-run only the new/edited tests in your commit many times.

  2. Manual Re-runs - Specify a test to re-run many times regardless of whether it has corresponding changes in your commit.

FSC detects changes by running the equivalent of a git diff between the current branch's commit and the HEAD of Source Control Management (SCM) to detect any changes to files that contain _spec.rb. For these files, it will re-run either the changed tests or the whole spec file's test suite a specified number of times based on user configurations.
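As a rough illustration of the file-selection step (not FSC's actual implementation), the sketch below filters a list of changed paths, such as the text printed by git diff --name-only, down to the ones that contain _spec.rb. The method name changed_spec_files and the sample paths are hypothetical.

```ruby
# Hypothetical helper: keep only changed paths that contain "_spec.rb".
# `diff_output` stands in for the text printed by `git diff --name-only`.
def changed_spec_files(diff_output)
  diff_output.lines
             .map(&:strip)
             .reject(&:empty?)
             .select { |path| path.include?('_spec.rb') }
end

# Made-up diff output used only for illustration.
sample_diff = <<~DIFF
  lib/user.rb
  api/spec/user_spec.rb
  README.md
  ui/spec/login_spec.rb
DIFF

puts changed_spec_files(sample_diff)
```

Only the two spec paths survive the filter; every other changed file is skipped.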

If all re-runs that FSC triggers pass, the result of running it will be a status code of 0. If a single re-run fails, a non-zero exit status will result.

Note that flakey_spec_catcher will only detect committed changes, so make sure to add any changed spec files to the commit and amend it if you're testing locally.

All that's required is to install the gem and then call its executable, flakey_spec_catcher.

Why use FSC?

FSC allows you to specify a usage with which to re-run tests. The default usage inserts all test cases manually into the RSpec::Core::Runner API and clears all example and test data between re-runs. So long as RSpec is configured not to reset browser+server configurations between test runs or test suites, re-running tests this way is the fastest way to ensure that tests are not flakey. If you provide a custom usage, FSC will run each re-run as a separate process and will likely incur more significant overhead; however, it may require less tweaking of your existing RSpec configuration.

If you have set up your environment to use the default usage (RSpec::Core::Runner), FSC will use an RSpec listener to combine your test results and give you a consolidated summary of how many re-runs of each test case failed with a specific assertion.
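The consolidation idea can be sketched in plain Ruby. This is not FSC's listener; summarize_reruns and the sample messages are hypothetical, and Array#tally assumes Ruby 2.7 or newer.

```ruby
# Hypothetical input: one entry per re-run of a single example,
# nil for a passing run or the failure message for a failing run.
def summarize_reruns(results)
  failures = results.compact
  {
    total: results.size,
    failed: failures.size,
    by_message: failures.tally # message => number of re-runs that failed with it
  }
end

results = [nil, 'expected: 1 got: 0', nil, 'expected: 1 got: 2', 'expected: 1 got: 0']
summary = summarize_reruns(results)

puts "FAILED #{summary[:failed]} / #{summary[:total]} times"
summary[:by_message].each do |message, count|
  puts "  #{count} times with exception message: #{message}"
end
```

Grouping by message is what lets a summary distinguish one flaky assertion from several unrelated failures in the same example.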

FSC is ideal for testing in continuous integration since it will run only created/edited tests and can run them many times in an efficient manner. So long as silent mode is not enabled, FSC will return an exit status that can be used to fail a build if flakey specs are detected.
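A minimal sketch of the exit-status rule described here, assuming a hypothetical helper named fsc_exit_status and using 1 as a stand-in for whatever non-zero status FSC actually returns:

```ruby
# Hypothetical helper mirroring the documented behaviour:
# 0 when every re-run passed or silent mode is on, non-zero otherwise.
# The value 1 is an assumption; FSC may return a different non-zero status.
def fsc_exit_status(rerun_passed, silent_mode: false)
  return 0 if silent_mode

  rerun_passed.all? ? 0 : 1
end

puts fsc_exit_status([true, true, true])
puts fsc_exit_status([true, false, true])
puts fsc_exit_status([true, false, true], silent_mode: true)
```

A CI job can then fail the build simply by propagating this status.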

What do I need to run FSC?

You'll need to be running in a git repository for FSC to detect changes. You'll also need to have at least one commit merged in SCM so that FSC has a baseline for comparison.

Ensure that your gem dependencies meet the base requirements specified by flakey_spec_catcher.gemspec.

Tips for running FSC efficiently

To run FSC in the most efficient way via RSpec::Core::Runner, you'll need to ensure that your browser+server connections do not reset before/after each example or example group if using Selenium. If RSpec has additional dependencies, you'll also need to ensure that FSC has access to any needed gems. It's best to do this by adding flakey_spec_catcher to your Gemfile via bundler and then running it with bundle exec flakey_spec_catcher.

In what cases is FSC not suitable?

Since FSC by default re-runs tests at the smallest possible level of change (a single test case or example), it is not suitable for non-idempotent tests or any other kind of test that cannot be re-run on its own repeatedly. For such tests, the applicable test files may be excluded, or specific examples may be tagged and those tags excluded at runtime.

Install:

gem install flakey_spec_catcher

Or add it using bundler:

bundle add flakey_spec_catcher
bundle install

Configuration:

Environment Variables

In some cases, certain environment variables can be configured:

  • FSC_REPEAT_FACTOR: Number of times the examples in each file will be run. Defaults to 20.

  • FSC_IGNORE_FILES: Specify changed files that will be exempt from FSC re-runs. Regex matching is allowed.

  • FSC_IGNORE_BRANCHES: Disable running flakey_spec_catcher in git detection mode on any commits occurring on a specified branch.

  • FSC_SILENT_MODE: If 'true', FSC will return an exit status of 0 regardless of the result. If 'false', FSC will return a non-zero exit status if flakey specs are detected.

  • FSC_RERUN_FILE_ONLY: If 'true', FSC will re-run the whole spec file that contains changes rather than only re-running each individual, changed example or test.

  • FSC_USAGE_PATTERNS: Specify RSpec re-run usage patterns for specific directories or paths. For example, if you want FSC to re-run your changes in 'spec/ui' with bundle exec rspec and want your 'spec/api' changes to be re-run with parallel_rspec, you could specify the following value: FSC_USAGE_PATTERNS='{ spec/ui/** => bundle exec rspec }, { spec/api => parallel_rspec }' Note that by specifying a usage pattern, you will be running a separate process for each re-run and runtimes will suffer. This should only be used when re-running via RSpec::Core::Runner (default usage) is not suitable.

  • FSC_EXCLUDED_TAGS: Specify tags to exclude from re-runs. If a given example/testcase matches any of the specified tags, then that testcase will be excluded from the re-run queue. Tags may contain a corresponding value that is either a symbol or string, and multiple tag name/value pairs may be specified in a comma separated list. For example, if a testcase contains a description such as it 'tests controller functionality', :tag1 => 'enabled' do then this test could be excluded using FSC_EXCLUDED_TAGS=':tag1 => enabled'. Note: examples do not inherit the tags from their parent example groups or parent contexts.

  • FSC_OUTPUT_FILE: Specify a relative path to a file to which the output of re-runs will be written.
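To make the 'true'/'false' flags and the regex matching of FSC_IGNORE_FILES concrete, here is a minimal sketch. The helpers read_fsc_flags and ignored? are hypothetical, and treating FSC_IGNORE_FILES as a comma separated list of regular expressions is an assumption rather than documented behaviour.

```ruby
# Hypothetical reader for a few FSC settings; `env` defaults to the real ENV
# but any Hash with the same keys can be passed in for experimentation.
def read_fsc_flags(env = ENV)
  {
    silent_mode: env['FSC_SILENT_MODE'] == 'true',
    rerun_file_only: env['FSC_RERUN_FILE_ONLY'] == 'true',
    # Assumption: a comma separated list of regular expressions.
    ignore_files: env.fetch('FSC_IGNORE_FILES', '')
                     .split(',')
                     .map { |pattern| Regexp.new(pattern.strip) }
  }
end

# True when the changed path matches any of the ignore patterns.
def ignored?(path, flags)
  flags[:ignore_files].any? { |pattern| pattern.match?(path) }
end

flags = read_fsc_flags({ 'FSC_RERUN_FILE_ONLY' => 'true', 'FSC_IGNORE_FILES' => 'test_spec.rb' })
puts ignored?('spec/test_spec.rb', flags)
puts ignored?('spec/user_spec.rb', flags)
```

Comparing against the literal string 'true' means that an unset variable, or any other value, leaves the flag off.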

Any of these environment variables can be overridden in the commit message by adding the environment variable key and usage/value according to the accepted values specified above.

Example:

Commit Message

FSC_REPEAT_FACTOR='10'
FSC_USAGE_PATTERNS = '{ spec/ui => bundle exec rspec }, { spec/api => parallel_rspec }'
...
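One way such overrides could be picked out of a commit message is sketched below. The constant OVERRIDE_LINE, the method commit_message_overrides, and the assumed one-assignment-per-line, single-quoted format are all hypothetical; FSC's own parsing may differ.

```ruby
# Assumed format: one KEY='value' pair per line, optional spaces around "=".
OVERRIDE_LINE = /\A\s*(FSC_[A-Z_]+)\s*=\s*'(.*)'\s*\z/

# Hypothetical parser: collect FSC_* assignments from a commit message.
def commit_message_overrides(message)
  message.each_line.with_object({}) do |line, overrides|
    match = OVERRIDE_LINE.match(line.chomp)
    overrides[match[1]] = match[2] if match
  end
end

# Made-up commit message used only for illustration.
message = <<~MSG
  Fix login spec timing

  FSC_REPEAT_FACTOR='10'
  FSC_USAGE_PATTERNS = '{ spec/ui => bundle exec rspec }, { spec/api => parallel_rspec }'
MSG

commit_message_overrides(message).each { |key, value| puts "#{key} -> #{value}" }
```

Lines that do not look like an FSC_ assignment, such as the subject line, are simply skipped.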

Command Line Arguments

  -t, --test=TEST_NAME             Specify one or more specs in comma separated list
  -u, --usage=USAGE                Specify a re-run usage for the manual re-run
  -r, --repeat=REPEAT_FACTOR       Specify a repeat factor for the manual re-run(s)
  -e, --excluded-tags=EXCLUDED     Specify tags to exclude in a comma separated list
  -o, --output=PATH_TO_OUTPUT      Direct all re-run output to a specific file
  -v, --version                    Prints current flakey_spec_catcher_version
  -h, --help                       Displays available flakey_spec_catcher cli overrides

Examples:

# Re-run spec/test_spec.rb:4 50 times using RSpec::Core::Runner
flakey_spec_catcher --test='spec/test_spec.rb:4' --repeat='50'

# Re-run all tests in spec/test_spec.rb 50 times each using 'bundle exec rspec' for
#  each re-run (at a file level since no line number is specified)
flakey_spec_catcher --test='spec/test_spec.rb' --repeat='50' --usage='bundle exec rspec'

# Re-run all tests found by git detection and exclude tests that
#  have the tags :flakey or :unsuccessful => true
flakey_spec_catcher --excluded-tags=':flakey, :unsuccessful => true'
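The tag list format can be illustrated with a small parser. This is a sketch under assumptions, not FSC's code: parse_excluded_tags and excluded? are hypothetical names, bare tags are treated as matching any value, and tag values are compared as strings.

```ruby
# Hypothetical parser for a tag list such as ":flakey, :unsuccessful => true".
# A bare tag is stored with the value true; explicit values are kept as strings.
def parse_excluded_tags(tag_string)
  tag_string.split(',').each_with_object({}) do |entry, tags|
    name, value = entry.split('=>', 2).map(&:strip)
    next if name.nil? || name.empty?

    tags[name.delete_prefix(':').to_sym] = value.nil? ? true : value
  end
end

# Assumption: an example is excluded when any of its own metadata matches a tag.
def excluded?(example_metadata, excluded_tags)
  excluded_tags.any? do |name, value|
    example_metadata.key?(name) && (value == true || example_metadata[name].to_s == value)
  end
end

tags = parse_excluded_tags(':flakey, :unsuccessful => true')
puts excluded?({ unsuccessful: true }, tags)
puts excluded?({ slow: true }, tags)
```

Because examples do not inherit tags from their parent groups, only the example's own metadata is checked here.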

Manually re-running a test:

Once the gem has been installed, you may manually specify a test to run along with a custom usage (if none is provided, the specified test will run with RSpec::Core::Runner) and a repeat_factor. If no repeat_factor is provided, FSC will check ENV['FSC_REPEAT_FACTOR'] to see if a value has been configured. If none was provided, it will fall back on the default of 20 re-runs.
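The lookup order for the repeat factor can be sketched as follows. The method resolve_repeat_factor and the constant DEFAULT_REPEAT_FACTOR are hypothetical names for illustration; only the order (command line, then environment, then 20) comes from the description above.

```ruby
DEFAULT_REPEAT_FACTOR = 20

# Hypothetical resolver: the --repeat value first, then FSC_REPEAT_FACTOR,
# then the default of 20. `env` defaults to the real ENV.
def resolve_repeat_factor(cli_value, env = ENV)
  raw = cli_value || env['FSC_REPEAT_FACTOR']
  (raw.nil? || raw.strip.empty?) ? DEFAULT_REPEAT_FACTOR : Integer(raw)
end

puts resolve_repeat_factor('50', { 'FSC_REPEAT_FACTOR' => '10' })
puts resolve_repeat_factor(nil, { 'FSC_REPEAT_FACTOR' => '10' })
puts resolve_repeat_factor(nil, {})
```

Using Integer() rather than to_i makes a malformed value raise an error instead of silently becoming 0.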

Usage Examples:

# Re-run api/spec/user_spec.rb:3 using RSpec::Core::Runner (default usage) 15 times
flakey_spec_catcher --test='api/spec/user_spec.rb:3' --repeat='15'

# Re-run all files matching the glob api/spec/*_spec.rb in a separate process using
# 'bundle exec parallel_rspec' FSC_REPEAT_FACTOR times each
flakey_spec_catcher --test='api/spec/*_spec.rb' --usage='bundle exec parallel_rspec'

# Re-run api/spec/admin_spec.rb in a separate process using rspec, 10 times each
flakey_spec_catcher --test='api/spec/admin_spec.rb' --repeat='10' --usage='rspec'

Additional Git Detection Examples

# Run git detection mode and exclude tests that have the tags :flakey or :unsuccessful => true;
#  each remaining test is run 20 times (default)
bundle exec flakey_spec_catcher --excluded-tags=':flakey, :unsuccessful => true'

# Run git detection mode while ignoring files that match /test_spec.rb/;
#  if a change is found in any other *_spec.rb file, all tests in that file will be re-run 10 times each
export FSC_REPEAT_FACTOR='10'
export FSC_RERUN_FILE_ONLY='true'
export FSC_IGNORE_FILES='test_spec.rb'
bundle exec flakey_spec_catcher

Full example

  1. In a repo, changes are made to spec files that are then added to a git commit.
  2. Run flakey_spec_catcher with any desired user configurations

Example Output:

```
Flakey Spec Catcher Settings:
  Current Branch: master
  Remote: origin
  Current Sha: 12341234123412341234123412341234
  Base Sha: 56785678567856785678567856785678
  Repeat factor: 20
  Changed Specs Detected: ["api/spec/changed_spec1.rb:2", "api/spec/changed_spec.rb:3", "ui/spec/new_spec.rb:3", "ui/spec/new_spec.rb:5"]

Re-run Preview
  Running api/spec/changed_spec1.rb:2 20 times
  Running api/spec/changed_spec1.rb:3 20 times
  Running ui/spec/new_spec.rb:3 20 times
  Running ui/spec/new_spec.rb:5 20 times

********** SUMMARY **********
2 example(s) ran 20 times without any failures

          No Flakiness Detected!
```

If any of the tests were flakey, the output would resemble the following:

```
********** SUMMARY **********
1 example(s) ran 20 times without any failures

Test Description (./api/spec/test_spec.rb:2)

FAILED 15 / 20 times
  5 times with exception message:
    expected: 1
        got: 0

    (compared using ==)
  5 times with exception message:
    expected: 1
        got: 2

    (compared using ==)
  5 times with exception message:
    expected: 1
        got: 3

    (compared using ==)
```

How FSC works:

  1. GitController runs a git diff and creates a ChangeSummary object to represent the git diff for each changed spec file.
  2. A ChangeCapsule uses the ChangeSummary object to go through each changed file and identify ChangeContext objects.
  3. After representing all changed code blocks as ChangeContext objects within a ChangeCapsule (one per file), ChangeCapsules are stored in the CapsuleManager.
  4. Command line interface overrides are retrieved using the CLIOverride class and a reference to a CLIOverride instance is set in the UserConfig instance.
  5. User Configurations are parsed using the UserConfig class for environment variables and are set for any user configurations that don't have an existing CLIOverride provided value.
  6. UserConfig and CapsuleManager objects are used to initialize a RerunManager object which creates a RerunCapsule object for each ChangeContext and pairs it with an rspec usage based on user configurations.
  7. RerunManager determines the re-runs based on user settings and passes back the identified file/test changes to the Runner along with their specified usages.
  8. Runner will re-run each of the changes at a file level or the specific context-level of each test and will exit with a 0 if all changes passed all of their re-runs, or with a non-zero exit status (passed from RSpec) otherwise.

See class documentation for more details.
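As a much-simplified illustration of the last few steps (it does not use FSC's real classes), the sketch below turns a list of changed locations into a re-run queue and runs every entry the requested number of times. The names build_rerun_queue and run_queue, and the callable that stands in for an actual RSpec run, are hypothetical.

```ruby
# Hypothetical stand-in for the queue-building step.
# `changed` holds "path:line" strings; with file_only, whole files are re-run.
def build_rerun_queue(changed, repeat_factor:, file_only: false)
  targets = file_only ? changed.map { |location| location.split(':').first } : changed
  targets.uniq.map { |target| { target: target, times: repeat_factor } }
end

# Assumption: `run_once` is any callable that returns true when a single run passes.
# Every re-run is executed, and the result is true only if all of them passed.
def run_queue(queue, run_once)
  outcomes = queue.flat_map do |entry|
    Array.new(entry[:times]) { run_once.call(entry[:target]) }
  end
  outcomes.all?
end

changed = ['api/spec/a_spec.rb:2', 'api/spec/a_spec.rb:3', 'ui/spec/b_spec.rb:5']
queue = build_rerun_queue(changed, repeat_factor: 3, file_only: true)
queue.each { |entry| puts "Running #{entry[:target]} #{entry[:times]} times" }

# A fake runner that always passes, standing in for a real RSpec invocation.
puts run_queue(queue, ->(_target) { true })
```

With file_only enabled, two changed lines in the same file collapse into a single queue entry, which mirrors the FSC_RERUN_FILE_ONLY behaviour described earlier.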

Release:

We release flakey_spec_catcher to RubyGems.org.

To release a new gem version:

1) Bump the gem version as appropriate in lib/flakey_spec_catcher/version.rb. Follow semantic versioning principles. Commit, push to Gerrit, and merge.

2) Pull master and install dependencies for good measure:

git checkout master
git pull
bundle install

3) Now release the gem!

bundle exec rake release

This will build the gem, git tag HEAD with the current gem version, and push the gem to RubyGems.org.

To clean up after yourself:

bundle exec rake clobber

Testing FSC:

Since FSC relies on git commit history to correctly identify the specs that it will re-run, we have several specs that will create a temporary commit with staged changes, run the corresponding model specs to verify functionality, and then remove the temporary commit and staged changes. Upon merging a commit, code health and code coverage are both assessed in SonarQube.

See: https://sonarqube.core.inseng.net/dashboard?id=tab%3Aflakey_spec_catcher

To test your FSC changes, we've set up some scripts that will build docker images and run the FSC specs in the latest ruby versions.

bash bin/build
bash bin/test

To build the gem locally and test out FSC in another repository:

gem build flakey_spec_catcher.gemspec
gem install flakey_spec_catcher-<MAJOR-MINOR-PATCH>.gem
<cd to repo for testing>
flakey_spec_catcher <ARGS>

License:

MIT. See LICENSE.txt for details.