# Flakey Test Remedies (TISTATechnologies/caseflow GitHub Wiki)
Remedies for common pitfalls that can cause flakey tests. Also check out Testing Best Practices.
## find vs. find_all

In spec/feature/queue/colocated_task_queue_spec.rb:

```diff
- page.find_all("table#case-timeline-table button", text: view_text).each(&:click)
+ page.find("table#case-timeline-table button", text: view_text).click
  expect(page).to have_content("INSTRUCTIONS FROM VLJ", wait: 15)
```

`find_all` will not fail if it finds no matching element; it simply returns an empty result, so there is nothing to click and the task instructions never open. If the page does not load right away, the click silently does nothing, which is why we were seeing a flake. By changing this to `find`, Capybara waits until the "View instructions" element is loaded, then clicks it.
## let vs. before

Prefer a `let` fixture over a `before` block for setting up test data, since `before` is always run whether or not the example needs the data.
```diff
  before do
    organization.add_user(organization_user)
    User.authenticate!(user: organization_user)
-   create_list(:privacy_act_task, unassigned_count, :in_progress, assigned_to: organization)
-   create_list(:privacy_act_task, assigned_count, :on_hold, assigned_to: organization)
  end

+ let!(:unassigned_tasks) do
+   privacy_tasks = create_list(
+     :privacy_act_task,
+     unassigned_count / 2,
+     :in_progress,
+     assigned_to: organization
+   )
+   translation_tasks = create_list(
+     :translation_task,
+     translation_task_count,
+     :in_progress,
+     assigned_to: organization
+   )
+   privacy_tasks + translation_tasks
+ end
```
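The key behavioral difference is that `let` is lazy and memoized per example, while `before` always runs. Here is a minimal plain-Ruby sketch of that memoization, for illustration only (the `ExampleGroup` class and `$setup_calls` counter are made up, not RSpec's real implementation):

```ruby
# Toy model of RSpec's `let`: the block runs only on first access,
# and the result is memoized for the rest of the example.
class ExampleGroup
  def self.let(name, &block)
    define_method(name) do
      @__memo ||= {}
      @__memo.fetch(name) { @__memo[name] = instance_exec(&block) }
    end
  end
end

$setup_calls = 0

class MySpec < ExampleGroup
  let(:record) { $setup_calls += 1; "some expensive fixture" }
end

example = MySpec.new
example.record    # first access runs the block
example.record    # second access hits the memo; block not run again
puts $setup_calls # => 1
```

A fresh example instance gets a fresh memo, so each example still sees its own fixture.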
More details are described in this StackOverflow answer and this article.
## Use before(:each) instead of before(:all)

An important detail of `before(:all)` is that it is not DB-transactional: anything persisted within `before(:all)` stays in the DB, and you must manually tear it down in an `after(:all)` block. This means that after a test suite completes, its changes are not reverted for later tests, which can cause complicated bugs and cross-contamination of data. Worse, if an exception is thrown, the `after(:all)` callback is never called.

`before(:each)`, by contrast, runs inside a DB transaction that is rolled back after each example.

More here
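The difference can be sketched in plain Ruby (the `DB` array and `transaction` helper below are stand-ins for a real database and transactional fixtures, not Rails APIs):

```ruby
# Toy model: before(:each) work happens inside a transaction that is
# rolled back after the example; before(:all) work happens outside it.
DB = []

def transaction
  snapshot = DB.dup
  yield
ensure
  DB.replace(snapshot) # roll back everything written inside the block
end

# before(:all) analog: runs once, OUTSIDE any transaction
DB << "record from before(:all)"

# Two examples, each wrapped in a rolled-back transaction,
# as before(:each) setup would be.
2.times do
  transaction do
    DB << "record from before(:each)"
    # ... example body ...
  end
end

puts DB.inspect # => ["record from before(:all)"] -- only the :all record leaks
```

Everything written inside the transaction is gone afterwards; the `before(:all)` record leaks into every later test until something manually deletes it.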
## Use skip_callback and set_callback in a before and after block

`skip_callback` will persist to other specs if it is run inside a spec, as seen below.

```ruby
scenario "BGS stuff is happening" do
  BgsPowerOfAttorney.skip_callback(:foo, :before, :bar!)
end
```

To ensure that `skip_callback` doesn't have undesired effects, use a `before` block for `skip_callback` and an `after` block for `set_callback`. Below is an example of this set-up.
```ruby
before do
  BgsPowerOfAttorney.skip_callback(:foo, :before, :bar!)
end

after do
  BgsPowerOfAttorney.set_callback(:foo, :before, :bar!)
end
```
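The reason the skip leaks is that callbacks live in class-level state shared by every spec in the process. A toy model (this `Model` class and its one-argument `skip_callback`/`set_callback` are simplified stand-ins, not ActiveSupport's real API):

```ruby
# Callbacks are stored on the class, so removing one affects every
# spec that runs afterwards until it is restored with set_callback.
class Model
  @callbacks = [:bar!]

  class << self
    attr_reader :callbacks

    def skip_callback(name)
      callbacks.delete(name)
    end

    def set_callback(name)
      callbacks << name unless callbacks.include?(name)
    end
  end
end

# "Spec 1" skips the callback but never restores it...
Model.skip_callback(:bar!)
puts Model.callbacks.inspect # => [] -- "Spec 2", run later, sees it gone too

# Restoring in an after block leaves the class as it was found.
Model.set_callback(:bar!)
puts Model.callbacks.inspect # => [:bar!]
```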
## Using let! outside context

We should be wary of `let!(:foo)` outside of any context scoping, particularly at the start of a file: the factory will always be instantiated, and it runs in declaration order relative to any top-level `before` blocks. In the case below, we need to declare the `before` block before anything else, so that time is frozen before any factories are called.

The bug was the order in which we declared `let!` vs `before` at the top of the file: the `before` was executing after the `let!`. https://github.com/department-of-veterans-affairs/caseflow/pull/13085/files#diff-7de594cd05a73f5334aad3c44048c1f0R3-R14

```diff
  describe Appeal, :all_dbs do
    include IntakeHelpers

-   let!(:appeal) { create(:appeal) }
    before do
      Timecop.freeze(Time.utc(2019, 1, 1, 12, 0, 0))
    end
+   let!(:appeal) { create(:appeal) } # must be *after* Timecop.freeze

    context "includes PrintsTaskTree concern" do
      context "#structure" do
        let!(:root_task) { create(:root_task, appeal: appeal) }
```
## Debugging flakey tests
### Check Knapsack Pro for Test Order

I don't believe you need to do the 'binary search' section below, since we recently added Knapsack Pro. If you go into CircleCI and find the failing test, you should see something like the following information in the terminal. This info includes:

- Directions on how to run the same subset of tests in the same order in dev.
- A list of the files that are tested.
```
D, [2021-03-02T16:42:42.288202 #840] DEBUG -- : [knapsack_pro] POST https://api.knapsackpro.com/v1/queues/queue
D, [2021-03-02T16:42:42.288288 #840] DEBUG -- : [knapsack_pro] API request UUID: 6f6b1959-bb02-4156-a30a-8f3411e6fed2
D, [2021-03-02T16:42:42.288319 #840] DEBUG -- : [knapsack_pro] API response:
D, [2021-03-02T16:42:42.288371 #840] DEBUG -- : [knapsack_pro] {"queue_name"=>"1783:edd88486c51be049a6c2b31c7b627c02", "build_subset_id"=>nil, "test_files"=>[{"path"=>"spec/feature/hearings/change_hearing_disposition_spec.rb", "time_execution"=>169.1206240653992}, {"path"=>"spec/models/distribution_spec.rb", "time_execution"=>161.0852148532867}]}
I, [2021-03-02T16:42:42.288472 #840] INFO -- : [knapsack_pro] To retry in development the subset of tests fetched from API queue please run below command on your machine. If you use --order random then remember to add proper --seed 123 that you will find at the end of rspec command.
I, [2021-03-02T16:42:42.288514 #840] INFO -- : [knapsack_pro] bundle exec rspec --format documentation --no-color --format RspecJunitFormatter --out tmp/rspec.xml --default-path spec "spec/feature/hearings/change_hearing_disposition_spec.rb" "spec/models/distribution_spec.rb"
D, [2021-03-02T16:42:43.709751 #840] DEBUG -- : [knapsack_pro] Test suite time execution queue recording enabled.
Run options: include {:focus=>true}
```
### Binary search your way to victory

You can try the `rspec --bisect` feature, which is ideal when you are running tests in random order. Otherwise, if you are running a predictable order of tests, and assuming you can reproduce the errors, apply a binary search pattern by hand (which is essentially what `rspec --bisect` does for you).
#### Identify the first failing spec file

Access the XML output, `rspec.xml`, from the node that failed. Download the XML and parse it with nokogiri to extract an ordered list of the test files. Below is an example.

```ruby
require "nokogiri"

doc = Nokogiri::XML(File.read("rspec.xml"))
testcases = doc.xpath("//testcase")
result = testcases.collect { |c| c["file"] }
deduped = result.uniq
deduped.join(" ")
```
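If nokogiri isn't installed, Ruby's stdlib REXML can do the same extraction. A sketch using an inline XML string in place of the downloaded file (the file paths here are made up for illustration):

```ruby
require "rexml/document"

# Stand-in for File.read("rspec.xml")
xml = <<~XML
  <testsuite>
    <testcase file="spec/models/a_spec.rb"/>
    <testcase file="spec/models/a_spec.rb"/>
    <testcase file="spec/feature/b_spec.rb"/>
  </testsuite>
XML

doc = REXML::Document.new(xml)
# Collect the file attribute of every testcase, deduplicated in order.
files = REXML::XPath.match(doc, "//testcase").map { |c| c.attributes["file"] }.uniq
puts files.join(" ") # => spec/models/a_spec.rb spec/feature/b_spec.rb
```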
Comment out half the spec filenames prior to the failure. Then, re-run the tests using the new filelist.
#### Each time you get a failure, repeat the process

The binary search pattern means you keep commenting out half of the remaining active lines in the file until the tests all pass. Once they pass, whatever change you made last to the filelist is the one that matters: you've narrowed it down to the combination of tests, in the correct order, that triggers the failure.
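The narrowing step can be sketched as a generic binary search over the ordered file list, assuming a single earlier file pollutes state for the failing spec. The `fake_runner` lambda and all file names below are invented stand-ins for an actual `bundle exec rspec <files>` run:

```ruby
# Toy bisect: `runner` stands in for "run rspec against these files";
# it returns true when the run reproduces the failure.
def find_culprit(candidates, failing_spec, runner)
  files = candidates
  while files.length > 1
    half = files.take(files.length / 2)
    if runner.call(half + [failing_spec]) # failure reproduces with this half?
      files = half
    else
      files = files.drop(files.length / 2) # culprit must be in the other half
    end
  end
  files.first
end

# Simulated run: the flake appears only when c_spec runs before the
# failing spec (all file names are made up).
culprit = "spec/models/c_spec.rb"
ordered = %w[spec/a_spec.rb spec/b_spec.rb spec/models/c_spec.rb spec/d_spec.rb]
fake_runner = ->(files) { files.include?(culprit) }

puts find_culprit(ordered, "spec/failing_spec.rb", fake_runner)
# => spec/models/c_spec.rb
```

Real-world flakes can involve combinations of files rather than a single culprit, which is why `rspec --bisect` is more sophisticated, but the halving idea is the same.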
#### When you've identified the culprit, debug and repeat

Now you know the file(s) that caused the flake. You'll have to inspect each one to see what might trigger the bug. Often it's databases not being cleaned up, or global variables being set (things like `FeatureToggle` or `RequestStore` are often culprits).
### Use ensure_stable to re-run a test

When trying to replicate a flake, or to check whether a flake fix is working, you may want to wrap the test with `ensure_stable` to run it many times.
## OS-dependencies

- Use `round` to avoid precision differences:

  ```ruby
  let(:time) { 1.day.ago.round }
  ```

- Why? Timestamp truncation: https://www.toptal.com/ruby-on-rails/timestamp-truncation-rails-activerecord-tale
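The effect is visible with stdlib `Time` alone (`1.day.ago` is ActiveSupport; a plain UTC timestamp is used here for illustration). Ruby keeps sub-second precision that a database column may truncate or round, so equality comparisons between the in-memory value and the reloaded record can differ by microseconds unless you round up front:

```ruby
# Rational seconds keep the fractional part exact for the demo.
t = Time.utc(2021, 3, 2, 16, 42, Rational(42_288_202, 1_000_000))

puts t.nsec       # => 288202000 (sub-second precision kept by Ruby)
puts t.round.nsec # => 0 (Time#round drops it to the whole second)
puts t == t.round # => false -- exactly the mismatch that flakes comparisons
```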