[CR] Randomize test order: catch interaction, etc bugs #46473
Randomize ordering of tests to check for problems such as seen in CleverRaven#46439.
Need to work on getting build.sh to use make check.
Do all of these _have_ to be using separate test-initialization methods?
I may do a PR to note in TESTING.md exactly where one needs to change file lines to globally modify what cata_test (or various alternate names) does...
I have asked in Discussions (#46476) if someone with a Linux box can analyze the GitHub "artifacts".
Thanks to @anothersimulacrum - converts from crash to failure, allowing more analysis.
environmental_revert_effect removes hunger - it does not do anything (currently?) regarding stored kcal. (It is possible vitamins also need to be reset - will have to see.)
Current status:
Originally gave an error about being unable to find an adjacent square to place the manhack in.
This is to avoid errors of not being able to place an NPC.
Currently:
May want to alter the various cata_test calls to have
Make sure the starting season length is correct (and that "set_eternal_season" doesn't change it when eternal season is turned on or off).
Fixing, and making sure the fix worked.
monster_test.cpp was altering `trigdist` and not restoring it. (Most notably visible in `vision_single_tile_skylight`.)
Just put `trigdist` back - the option itself should be reset when `override_option` goes out of (variable) scope and its destructor gets called. (A thank-you to @anothersimulacrum for pointing this out to me.)
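The scope-based reset described above can be sketched in isolation. This is only an illustration of the general pattern, not the project's `override_option`: the names `scoped_restore`, `trigdist_demo`, `test_that_flips_trigdist`, and `check_restore` are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Hypothetical stand-in for a global toggle such as `trigdist`.
static bool trigdist_demo = false;

// Saves a variable's value on construction and writes it back on destruction,
// so a test cannot leak its change into whichever test happens to run next.
template <typename T>
class scoped_restore
{
    public:
        explicit scoped_restore( T &target ) : target_( target ), saved_( target ) {}
        ~scoped_restore() {
            target_ = saved_;
        }
        scoped_restore( const scoped_restore & ) = delete;
        scoped_restore &operator=( const scoped_restore & ) = delete;
    private:
        T &target_;
        T saved_;
};

// A "test" that changes the global; the guard undoes the change on scope exit.
inline bool test_that_flips_trigdist()
{
    scoped_restore<bool> guard( trigdist_demo );
    trigdist_demo = true;
    return trigdist_demo; // evaluated before the guard's destructor runs
}

// Self-check: the change is visible inside the test and gone afterwards.
inline void check_restore()
{
    trigdist_demo = false;
    const bool seen_inside = test_that_flips_trigdist();
    assert( seen_inside );
    assert( !trigdist_demo );
    ( void )seen_inside; // silences the unused-variable warning under NDEBUG
}
```

Because the restore happens in a destructor, it also runs when a test exits early, which a manual reset at the end of the test body would not guarantee.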
So, errors still seen:
Expand debug_weary_info (in character.cpp) to give enough information to reconstruct reasons for weary_threshold variations.
Intermittent errors in weary tests (see CleverRaven#46256) are hard to debug without more information. While this information has been expanded in CleverRaven#46473, this does not give enough examples, nor is it able to tell what in "normal" builds is causing intermittent weary test problems.
Give more detailed information on weariness (tracker and intake), to try to figure out why it keeps going up and down during tests. Since the "healthy" stored calories are at BMI 25, set stored calories to the healthy value minus the calories of debug_nutrition, to prevent going over.
tests/player_helpers.cpp
@@ -72,7 +72,9 @@ void clear_character( player &dummy )
    // This sets HP to max, clears addictions and morale,
    // and sets hunger, thirst, fatigue and such to zero
    dummy.environmental_revert_effect();
    dummy.set_stored_kcal( dummy.get_healthy_kcal() ); // But not stored kcal
    // However, the above does not set stored kcal;
    // 2170 is calories of debug_nutrition
Stick a `REQUIRE` or something here guaranteeing that?
Point - but it's erroring in stomach contents bmr requirements, so I may need to revert that anyway. (I was trying to keep bmi in the 18.5-25 range, even after stomach contents were absorbed.)
The Appveyor results from the above are the most informative regarding weary; Travis also has one of immediate interest, although approximately duplicated in Appveyor; general is from the mixed-work/food (and other non-weary), which is harder to interpret. I'm going to try removing the -2170 (and also do a slight bit of reformatting so messages don't wrap) next, somewhat to confirm these results (from which it appears the
The caloric subtraction (minus calories for debug_nutrition) is causing errors in other tests, and it is also desirable to make sure it isn't doing anything to the weariness tests themselves (weary intake). With the new information (weary tracker and intake), the summarize transition output is line-wrapping; trying to prevent that.
The general, travis, and appveyor test failures are indicating that the 24-hour-dig task's fluctuating from 1 to 0 to 1 in weariness level is due to intake going up (as calories are absorbed from the guts). They unfortunately do not give information for the 8-hours-dig, 8-hours-wait A local test using
I suspect the above is due to However... @I-am-Erk, @anothersimulacrum: I am unclear on the exact purpose of factoring caloric intake into weariness. (Currently, one would almost say the character has hypoglycemia...)
Really glad to see someone working on getting randomly ordered tests working. It's something that was on my todo list but I probably never would have got around to. I just want to make sure you're aware of If there are specific things you'd like to see tested on Linux, let me know, and I'll try to get to them.
Understand! Things are about to get busier for me at work (teaching Anatomy & Physiology), but I'll do as much as I can. (Once the parallel tests stabilize a bit, I will see about adapting it for randomized test orders - should not be difficult; sticking
Thanks! I am on a Mac, so that works; will take a look at now.
Thanks!
@anothersimulacrum: Feel free to put a
…into patch-3; not running cases (android on Travis; appveyor) without testing, nor single-test jobs
I suppose this PR is abandoned, since the last commit was a year and a half ago? I'm not sure if it should stay in the PR tracker, especially with this many unresolved conflicts.
Closing as abandoned.
Summary
SUMMARY: Infrastructure "Randomize test order to catch interactions, other bugs"
Purpose of change
As seen in #46439 and #46404, some test faults are triggered by differing test orders. Once this and other existing faults are fixed using data from testing this PR, it can be merged to help with future testing.
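The kind of order-dependent fault described here can be illustrated with a minimal, self-contained sketch. Nothing below is taken from cata_test or the linked issues: `shared_setting`, `leaky_test`, `fragile_test`, `run_in_order`, and `shuffled` are hypothetical names, and the tiny runner only stands in for a real test framework.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <random>
#include <string>
#include <vector>

// Hypothetical shared state, standing in for a game-wide global or option.
static int shared_setting = 0;

struct named_test {
    std::string name;
    std::function<bool()> run; // returns true on pass
};

// Changes the shared state and never puts it back.
inline bool leaky_test()
{
    shared_setting = 5;
    return true;
}

// Silently assumes the shared state still has its default value.
inline bool fragile_test()
{
    return shared_setting == 0;
}

// Runs the tests in the given order and returns the names of those that failed.
inline std::vector<std::string> run_in_order( const std::vector<named_test> &tests )
{
    shared_setting = 0; // models a freshly started test process
    std::vector<std::string> failed;
    for( const named_test &t : tests ) {
        assert( t.run ); // every entry must carry a callable
        if( !t.run() ) {
            failed.push_back( t.name );
        }
    }
    return failed;
}

// Shuffles with an explicit seed so that a failing order can be replayed later.
inline std::vector<named_test> shuffled( std::vector<named_test> tests, unsigned seed )
{
    std::mt19937 rng( seed );
    std::shuffle( tests.begin(), tests.end(), rng );
    return tests;
}
```

With `fragile_test` before `leaky_test`, `run_in_order` reports no failures; with the two swapped, it reports `fragile`. A fixed declaration order can therefore hide the leak indefinitely, while a seeded shuffle exposes it and still allows the failing order to be reproduced. Catch2 offers `--rng-seed` for the same purpose.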
Describe the solution
Add `--order rand` when invoking cata_test.
Describe alternatives you've considered
I can't see any practical ones.
Testing
See #46439.
Additional context
I plan on using the results from this PR to fix the tests, test them on this PR, then put in other PRs for the individual fixes. (Unless someone has a better idea?)