Backend Process Flaky Tests - TISTATechnologies/caseflow GitHub Wiki
Owner | Date |
---|---|
Ferris | 2/19/2020 |
This page describes Caseflow's definition of a flaky test, and the process for tracking and fixing flaky tests.
A "flaky test" is defined as any test that does not reliably pass or fail. Flaky tests are an issue because they intermittently cause Caseflow's test suite to fail, which can prevent PRs from getting merged into master.
There are two spellings of "flaky" that appear in our codebase:
- flaky
- flakey
For consistency, we have chosen to go with the spelling "flaky" because it is the recommended spelling when doing a Google search.
You can identify a flaky test by re-running, in your local environment, a test that failed in CircleCI. If the test failed in CircleCI but passes locally, it is most likely a flaky test. More generally, any test that does not consistently pass or fail is considered a flaky test. If you need your build to pass, you can typically re-run the test suite as a workaround for a flaky test.
For a more thorough description of how to identify and debug a flaky test, see the documentation here: https://github.com/department-of-veterans-affairs/caseflow/wiki/Flakey-Test-Remedies#debugging-flakey-tests
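As a rough illustration of "does not consistently pass or fail", the sketch below re-runs a single spec several times and classifies the outcomes. It is not part of Caseflow's tooling: the function names are hypothetical, and the runner assumes that `bundle exec rspec <path>` works in your local environment.

```python
import subprocess
from typing import Callable, List


def classify_runs(results: List[bool]) -> str:
    """Classify repeated pass/fail outcomes for one test."""
    if not results:
        raise ValueError("need at least one run")
    if all(results):
        return "passes"   # every run passed
    if not any(results):
        return "fails"    # every run failed, so this is a real failure
    return "flaky"        # mixed outcomes


def run_repeatedly(run_once: Callable[[], bool], attempts: int = 5) -> List[bool]:
    """Call run_once `attempts` times and collect the outcomes in order."""
    return [run_once() for _ in range(attempts)]


def rspec_runner(spec_path: str) -> Callable[[], bool]:
    """Build a runner that shells out to RSpec (assumed local setup)."""
    def run_once() -> bool:
        completed = subprocess.run(["bundle", "exec", "rspec", spec_path])
        return completed.returncode == 0  # exit code 0 means the spec passed
    return run_once
```

For example, `classify_runs(run_repeatedly(rspec_runner("spec/models/example_spec.rb"), attempts=10))` would report on a hypothetical spec file. Note that a result of "passes" locally, for a test that failed in CircleCI, still points toward flakiness as described above.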
Written on 2/19/2020
We used to track flaky tests on a single GitHub issue. The advantages of this process were that it was very lightweight and allowed developers to quickly track flaky tests and search for them in a centralized place. The downside was that it was difficult to integrate the work of fixing flaky tests into the sprint planning process, so the work was not being tracked or prioritized.
While developing this new process for tracking flaky tests, we wanted to preserve both of the advantages of the old process, while also allowing sprint teams to easily find, estimate, and incorporate fixing flaky tests into their sprint planning processes.
See the notes from the backend workgroup discussion on 2/18/2020.
When you encounter a flaky test, check if there is an open flaky test ticket with the same error message. If there is not, create a new issue in GitHub using the "Flaky test task" template.
Fill out the issue template, noting the error message and providing a link to the failing CircleCI build.
If a CircleCI build is failing from multiple untracked flaky tests, create a separate ticket for each unless they are failing from the same underlying issue. It can be challenging to determine whether multiple tests are failing due to the same underlying issue, so creating multiple tickets is appropriate in most cases. The issue template has a section where you can document that the flaky tests might be related.
Tag the issue with the labels: if they are not tagged by the template.
Example of a GitHub issue for a flaky test
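The check for an existing ticket with the same error message can be sketched as a simple text match. This is only an illustration under stated assumptions: the open tickets are assumed to be already fetched into a list of dicts with hypothetical `number` and `body` keys, and the normalization rules (masking numbers and hex addresses) are a guess at what varies between runs.

```python
import re
from typing import Dict, List


def normalize(message: str) -> str:
    """Lowercase and mask volatile details so near-identical errors compare equal."""
    text = message.lower()
    text = re.sub(r"0x[0-9a-f]+", "<hex>", text)  # object addresses
    text = re.sub(r"\d+", "<n>", text)            # ids, line numbers, timings
    return re.sub(r"\s+", " ", text).strip()      # collapse whitespace


def find_matching_tickets(error_message: str, open_tickets: List[Dict]) -> List[Dict]:
    """Return open tickets whose body contains the normalized error message."""
    wanted = normalize(error_message)
    if not wanted:
        return []  # an empty message would otherwise match every ticket
    return [t for t in open_tickets if wanted in normalize(t["body"])]
```

If the function returns an empty list, that suggests no ticket exists yet and a new one should be created from the template. A match is only a hint, so it is still worth reading the ticket before adding your build link to it.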
Tech leads for each sprint should aim to include flaky test tickets during sprint planning, and treat them separately from tech improvement tickets.
Each ticket should be estimated and timeboxed.
If the developer hasn't made any progress on the ticket in the allotted time, they should create a PR to skip the flaky test. After the PR gets merged, they should create a follow-up issue and check the box for "Has the test already been skipped in the code?"