
fix flaky test on test_dictionary.py::test_dictionary_looping #2105

Open
wants to merge 5 commits into main

Conversation

lonly7star

This PR aims to fix the flaky test test_dictionary.py::test_dictionary_looping so that the test can pass across multiple test runs.

The result

The test passes when run only once, but fails when the test suite is run multiple times. When asserting that the key pair does not exist in global_err_dicts or global_pairs, the pair already exists starting from the second test run.

Steps to reproduce the issue

  1. Install pytest-flakefinder with `pip install pytest-flakefinder`
  2. Run pytest with flake-finder: `pytest -k test_dictionary.py --flake-finder`

Issue of the code

The code asserts that each key pair or error read from the file does not already exist. However, global_err_dicts and global_pairs are only initialized once per pytest session, so starting from the second run of test_dictionary_looping, the assertion fails.
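
To make the failure mode concrete, here is a hypothetical, self-contained illustration. The names and data are placeholders, not the actual assertions in test_dictionary.py, which check pairs read from the dictionary files; the point is that module-level state is created once per pytest session, so a second pass over the same parameters trips the `not in` assertion.

```python
import pytest

# Module-level state: created once per pytest session, never reset.
global_pairs = set()


@pytest.mark.parametrize("pair", ["abandonned->abandoned", "abberation->aberration"])
def test_dictionary_looping(pair):
    # Passes on the first pass over the parameters; when pytest-flakefinder
    # re-runs the test in the same session, the pair is already present.
    assert pair not in global_pairs
    global_pairs.add(pair)
```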

Proposed solution

Add a variable that tracks how many txt files have been read, plus a pytest fixture that re-initializes both `global_err_dicts` and `global_pairs` when the variable indicates a new test run is starting (see the sketch below).
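
A minimal sketch of that idea, under assumptions: `global_err_dicts` and `global_pairs` stand in for the existing module-level containers, `NUM_DICTIONARIES` is a placeholder for the number of parametrized dictionary files, and the fixture and counter names are illustrative rather than the exact code in this PR.

```python
import pytest

global_err_dicts = set()   # placeholder for existing module-level state
global_pairs = set()
NUM_DICTIONARIES = 8       # placeholder for the number of parametrized files

files_read = 0             # how many parametrized cases have completed


@pytest.fixture()
def fresh_global_state():
    """Reset the shared sets only once a full pass over all files has finished,
    so a repeated run in the same session (e.g. under pytest-flakefinder)
    starts clean while test_ran_all still sees the accumulated state."""
    global files_read
    if files_read == NUM_DICTIONARIES:
        global_err_dicts.clear()
        global_pairs.clear()
        files_read = 0
    yield
    files_read += 1


def test_dictionary_looping(fresh_global_state):
    ...  # existing parametrized body, unchanged; it only requests the fixture
```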

@lonly7star (Author)

Should I re-format the code to meet codespell's formatting requirements and submit a new PR?

@peternewman (Collaborator)

This PR aims to fix the flaky test test_dictionary.py::test_dictionary_looping so that the test can pass across multiple test runs.

The test passes when run only once, but fails when the test suite is run multiple times. When asserting that the key pair does not exist in global_err_dicts or global_pairs, the pair already exists starting from the second test run.

I assume you've not had this test fail on a single run, so it's not flaky in the traditional sense but it's not compatible with re-running the test multiple times in the same session?

I'm not targeting this at you, but this fix seems like quite a bodge to me. Aren't there some setup and teardown functions for the test (rather than the test suite) that we could handle this in? Flaky ( https://github.com/box/flaky ) seems to support this sort of thing, based on issues like box/flaky#124, box/flaky#109, box/flaky#53 and box/flaky#39. Maybe pytest-flakefinder does too and we just need to add setup/tearDown, or add these to it?
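
For reference, the conventional per-test teardown would look roughly like the hedged sketch below (names are placeholders, not this project's code). As explained further down in the thread, this shape doesn't quite fit here: clearing after every parametrized case wipes the state that test_ran_all relies on.

```python
import pytest

global_err_dicts = set()   # placeholders for the module-level state
global_pairs = set()


@pytest.fixture()
def clean_state():
    yield
    # Teardown after each test: too early for this suite, because
    # test_ran_all inspects what test_dictionary_looping accumulated.
    global_err_dicts.clear()
    global_pairs.clear()
```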

Should I re-format the code to meet codespell's formatting requirements and submit a new PR?

You can just fix it in this PR if you'd prefer.

@lonly7star (Author)

I assume you've not had this test fail on a single run, so it's not flaky in the traditional sense but it's not compatible with re-running the test multiple times in the same session?

Thank you for replying to my PR! I'm a master's student trying to fix flaky tests as a course project. By definition, a flaky test is one that can produce different results without any change to the test code. In our case, a test function that fails when re-run multiple times in the same session is considered flaky.

Aren't there some setup and teardown functions for the test (rather than the test suite) that we could handle this in?

So a general teardown/fixture does not really fit test_dictionary_looping: it is a parametrized test and test_ran_all depends on its result. When we re-run either the full test file or just this test function, test_dictionary_looping will fail if we don't clean up after all of the parameters have been fed in, and since test_ran_all depends on the accumulated result, it will fail if we clean up too early.

That is why I used a counter for the loop count in a fixture, so I could clean up at the right time. On re-runs, using @pytest.mark.dependency only controls running priority; it does not guarantee that cleanup happens right after the desired number of loops, once all parameters have been fed in.
I only raised this PR from the perspective of flaky tests, and it is totally fine if you feel re-runs are not a concern for this project. Alternatively, we could discuss a manual tear-down that makes re-runs pass.
