Unable to simulate TDD with pytest #1853
Tests created using unittest.TestCase are sorted lexicographically by default. Can you give an example of a particular exercise where the sort order you're getting is that far off ideal?
Here's the first test from the word count exercise test suite:
And here's the first one that gets run:
I have no doubt that the canonical test order isn't a 100% perfect approximation of the TDD experience, but I would think that running the 11th test in the file before the first, as is the case here, is pretty much always going to result in an experience that is not even close. But if it's indeed the case that renaming every test is the only solution, then yeah, I don't mind just dealing with it myself by adding decorators.
@glvno thank you for the example, that's an exercise that is indeed reasonably well crafted so that the order of tests can be seen as significant. It's an interesting problem: general practice in unit testing is that the order of the tests should not matter. In fact many testing suites run tests in parallel as a matter of practice, so the order cannot matter. This is a little less true of integration tests, though even there it's -- again generally -- considered best if the tests are decoupled and order is irrelevant. Good TDD practice implies adding tests in a natural order, but it doesn't mean running them in that order, if you see what I mean.

That said, the developers of test runners tend to choose an order as a matter of implementation, and the unittest developers chose lexicographic sort order. Implementing the exercises as subclasses of unittest.TestCase is and was sensible for the Python track, because that's the only "universal" approach that all the various Python test runners adhere to, but it leaves us with lexicographic sort order, for better or worse, as the de facto ordering of tests.

Where we could fix this is in test generation, which right now is a manual process. We'd have to do that by taking the sort order received from the canonical data and generating tests whose names sort in that order lexicographically. @cmccandless should we add that to the list for test generation? It's going to result in some hideous method names, but it would make an iterative workflow possible.

That said, my concern with doing so is that I've noticed several instances in which the ordering of the tests seems to heavily influence the solutions some students are providing -- in that their solution, for better or worse, follows from trying to progress through the tests in the order they're presented, rather than engaging with the ideas -- when it should not have that influence. In the "real" world tests will not be curated to present in a specific order, though I'm realizing from the debate in exercism/problem-specifications#1553 that "real world" isn't necessarily a concern here.
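To make the naming idea concrete, here's a hypothetical sketch (the method names below are made up, not taken from any exercise): a zero-padded numeric prefix makes the lexicographic order coincide with the canonical order.

```python
import unittest


class WordCountTest(unittest.TestCase):
    # Zero-padded prefixes: lexicographic sort now matches canonical order.
    def test_001_count_one_word(self):
        self.assertTrue(True)  # placeholder body

    def test_002_count_one_of_each_word(self):
        self.assertTrue(True)  # placeholder body

    def test_011_multiple_occurrences_of_a_word(self):
        self.assertTrue(True)  # placeholder body


if __name__ == "__main__":
    unittest.main()
```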
I have no idea how test generation works, so maybe this isn't feasible, but it seems like using commented-out skips would address both issues as well as keep the test method names from getting too gnarly. The experience would be essentially identical to writing tests in a natural order in that each time a test is uncommented ('written') it gets run with every other uncommented test in an arbitrary order.
Test generation is not, as of yet, working, so any usage of it for this purpose is a bit of a reach goal at the moment. The names wouldn't be too egregious; they'd basically all have to become numbered. In theory we could put in marks for the skips, but marking all the tests in all the exercises with a commented skip -- that is, pushing out a change that disables all but the first test on all 117 exercises -- is, unfortunately, a clear non-starter, as that would break the existing semantics for mentors and students alike.
@glvno I've talked this through with @cmccandless and we agree that this would be a sensible thing to try to address, but after a bit of investigation I've come up with some complications.
Also there's an ambiguity we'd need to resolve. Some of the test files have more than one TestCase, so would it be best to:
I'm leaning towards the second option, but it adds more noise to the file. That said, having never actually checked, I'm not sure what happens if all of a TestCase's methods are marked to skip.
For other people who are having this problem: I have written a Python script, format_test.py, that rewrites the test file so that the tests run in the order they appear. Usage:
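As a rough, hypothetical sketch of how a renaming helper along those lines might work (an illustration, not the actual format_test.py): give each test method a zero-padded index based on its position in the file, so that the default lexicographic ordering matches file order.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of a test-renaming helper (not the original script).

Renames every test method to test_NNN_<original_name> in definition order,
so that lexicographic sorting matches the order in the file. Rewrites the
given files in place, so keep a copy or use version control.
"""
import re
import sys


def number_tests(path):
    with open(path) as handle:
        source = handle.read()

    counter = 0

    def rename(match):
        nonlocal counter
        counter += 1
        indent, name = match.groups()
        return f"{indent}def test_{counter:03d}_{name}("

    # Match "def test_<name>(" at any indentation, one method at a time.
    renamed = re.sub(r"(^\s*)def test_(\w+)\(", rename, source, flags=re.MULTILINE)

    with open(path, "w") as handle:
        handle.write(renamed)


if __name__ == "__main__":
    for test_file in sys.argv[1:]:
        number_tests(test_file)
```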
@yawpitch, this slipped my attention when we were implementing the generator as it stands today. What are your thoughts on this now?
Right, so my thoughts haven't advanced too much on this one, but here's the current state of things.

Automatic test generation is working, however we did not add any logic for setting a specific test ordering, or adding numerically advancing names, or anything of that sort. We didn't do that for a good reason: we simply cannot rely on the ordering of the canonical data.

Now, the automated test runner, which is intended for the features coming to the V3 version of exercism launching later this year, has already been built to apply the test definition order as requested above (https://github.com/exercism/python-test-runner/blob/master/runner/sort.py), to whatever good that does us. However this is really only a solution for tests submitted to exercism.io infrastructure for processing ... there's no good and reliable way to enforce the same ordering on tests run locally, even if we had confidence that the order had been carefully curated.

Now, after investigating how to do this while developing the automated runner I can say that, in theory, we could potentially offer an "official" way to do this locally by making available (or shipping with the exercises) a conftest.py. But the difficulty there is that again we're effectively forcing the use of pytest, and if we give a hard requirement on pytest we lose the runner-agnostic reasoning that made unittest.TestCase sensible in the first place.

The script @h2oboi89 has provided above is a bit of a merciless hack, but it will work for people who want to use it, even if it does rewrite the test files -- and woe be unto anyone that accidentally gives it a path to a system critical file -- so there is a workaround. I'm not sure there's a need for a more "official" solution, but I'm open to someone investigating the conftest.py approach if they'd like.
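For anyone who wants to experiment with that locally anyway, a minimal conftest.py along these lines (a sketch under the assumption that pytest is the runner, not the official test runner's implementation) would re-sort the collected tests into file order:

```python
# conftest.py -- drop this next to the test file you're working on.
# Sketch only: re-sorts whatever pytest collected so tests run in file
# order (path, then definition line) instead of the alphabetical order
# used for unittest.TestCase methods.


def pytest_collection_modifyitems(session, config, items):
    def file_order(item):
        # item.location is (relative file path, 0-based line number, test name)
        path, line, _name = item.location
        return (path, line if line is not None else 0)

    items.sort(key=file_order)
```

That leaves the test files themselves untouched, but of course it only helps people who run their tests with pytest.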
With unittest you can specify a sort function. Assuming the test file defines them in the order you like, the way I'd probably do it is sort the functions by line number, instead of name. Something along the lines of:
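A minimal sketch of that idea, assuming a single TestCase class per file and running the file directly with unittest rather than through pytest:

```python
import unittest


class WordCountTest(unittest.TestCase):
    def test_count_one_word(self):
        self.assertTrue(True)  # placeholder body

    def test_alternate_separators(self):
        self.assertTrue(True)  # placeholder body


def _definition_line(name):
    # Line number where the named test method is defined in this file.
    return getattr(WordCountTest, name).__code__.co_firstlineno


if __name__ == "__main__":
    loader = unittest.TestLoader()
    # Compare test method names by definition line instead of alphabetically.
    loader.sortTestMethodsUsing = lambda a, b: _definition_line(a) - _definition_line(b)
    unittest.main(testLoader=loader)
```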
That's another workaround that can be used, but I still don't believe we should introduce it into the tests we distribute to students. My reasoning remains the same as described above: any assumption that the tests in the various canonical data files have been entered into those files in a curated, TDD-friendly order is categorically False ... no such reliable order exists, or is ever likely to be introduced, and introducing a line-number ordering to the tests is just going to provide a false sense that such an ordering exists, when it does not.
We agree on the underlying assumption (whether the canonical data should have such an order is a different question); I just wanted to point out that it is possible to get unittest to run tests in whatever order you want using arbitrary functions, in case you do come up with such an order in the future, some exercise requires it, or you want local tests to run in the same order as the automated runner you referenced.
And that's helpful, however I'm of the mind we should close this Issue as wontfix because of that lack of guarantees in the upstream data. The canonical data repo is currently locked to major changes and it remains to be seen precisely how it will be incorporated into V3, so I don't think we're able to meaningfully move on this right now, except by introducing something like that logic, which without those same guarantees really just slows down tests to provide a false sense of order. Anyone strongly disagree with that?
In favor of closing as wontfix.
Closing as wontfix. |
I'm not sure how it is with previous versions, but with Python 3.7.3 and Pytest 5.0.1, tests are run in alphabetical order rather than in the order they appear in the test file. As I understand it, Exercism's explicit pedagogical aim is to simulate TDD, and so it seems pretty important that the tests get run in the order they appear in the file, such that pytest -x stops at the simplest of the failing test cases -- i.e., the case a test-driven developer would have written first. As it stands now, the first failing test is often for an edge case that isn't really useful until the program's basic functionality is up and running.
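A tiny illustration of the problem (the names here are hypothetical, not the actual word-count tests): pytest collects unittest.TestCase methods alphabetically, so the later method below runs first.

```python
import unittest


class ExampleTest(unittest.TestCase):
    def test_count_one_word(self):
        # Written first and intended to run first...
        self.assertTrue(True)

    def test_a_sentence_with_punctuation(self):
        # ...but this name sorts earlier alphabetically, so pytest runs it
        # first, and pytest -x reports this edge case before the basic one.
        self.assertTrue(True)
```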
In the minitest files for the Ruby track, if I remember correctly, # skip lines are included for all but the first test, and the learner is instructed to uncomment the skip for the next test once they get a given test to pass. I would imagine there's a more elegant fix for this than simply going back through all the track's test files and adding hundreds of # @pytest.mark.skip decorators, but I've yet to figure out what it is. Maybe there's some way to configure pytest to run the tests in the order they appear in the test file that I'm just unaware of? I've spent a couple of hours looking into this to no avail so far.
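For anyone who does want to gate the tests by hand in the meantime, here is a sketch of the decorator-based workaround (method names illustrative, assertion bodies left as placeholders): leave the first test unmarked, mark the rest, and delete or comment out each mark as you get the previous test passing.

```python
import unittest

import pytest


class WordCountTest(unittest.TestCase):
    def test_count_one_word(self):
        self.assertTrue(True)  # placeholder for the real assertion

    @pytest.mark.skip(reason="remove this mark once the previous test passes")
    def test_count_one_of_each_word(self):
        self.assertTrue(True)  # placeholder for the real assertion

    @pytest.mark.skip(reason="remove this mark once the previous test passes")
    def test_multiple_occurrences_of_a_word(self):
        self.assertTrue(True)  # placeholder for the real assertion
```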