tests: add experimental-report-test-result-info option to the test goal #21171
Conversation
src/python/pants/core/goals/test.py
"""Save a JSON file with the information about the source of test result.""" | ||
timestamp = datetime.now().strftime("%Y_%m_%d_%H_%M_%S") | ||
obj = json.dumps({"timestamp": timestamp, "run_id": run_id, "sources": results}) | ||
with safe_open(f"test_result_source_report_runid{run_id}_{timestamp}.json", "w") as fh: |
This should be concurrent-safe since every invocation of the test goal would have a unique id, so multiple files can be created in parallel with no racing. This closely follows the naming pattern of the workunit-logger files, e.g. pants_run_2024_05_11_23_39_48_438_cf61f6cd0266404bb307a9ff3103cea4. I don't think a GUID is justified here, and the run id should suffice.
Also, relying on the run id provides a hint at the order in which the goals were run (in case there are multiple test goal invocations as part of the build without a pantsd restart).
After running the test goal a few times one after another:
$ ls test_result* -1
test_result_source_report_runid7_2024_07_14_23_30_02.json
test_result_source_report_runid8_2024_07_14_23_30_12.json
test_result_source_report_runid9_2024_07_14_23_30_17.json
A few example files produced:
$ cat test_result_source_report_runid9_2024_07_14_23_30_17.json | jq
{
"timestamp": "2024_07_14_23_30_17",
"run_id": 9,
"sources": {
"src/python/pants/core/goals/test_test.py:tests": "memoized"
}
}
$ cat test_result_source_report_runid0_2024_07_14_23_29_40.json | jq
{
"timestamp": "2024_07_14_23_29_40",
"run_id": 0,
"sources": {
"helloworld/greet/greeting_test.py:tests": "hit_locally",
"helloworld/translator/translator_test.py:tests": "hit_locally"
}
}
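As an aside (not part of this PR): once a few of these report files exist, a short script along these lines could aggregate them to see how often results come from each source. The glob pattern and the field names simply follow the file names and example JSON shown above:

```python
import glob
import json
from collections import Counter

counts: Counter[str] = Counter()
for path in glob.glob("test_result_source_report_runid*.json"):
    with open(path) as fh:
        report = json.load(fh)
    # "sources" maps each test target to where its result came from,
    # e.g. "memoized" or "hit_locally", as in the examples above.
    counts.update(report["sources"].values())

for source, n in counts.most_common():
    print(f"{source}: {n}")
```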
Interesting!
The workunit-logger backend does provide some information, such as which tests were executed, but it doesn't tell you anything about the other modules (was it memoized? cached locally? cached remotely?)
I wonder if we could also try to extract even more information into the workunit logger? For instance, cache look-ups take time, so in theory those could/should appear as spans too?
While I imagine tests are usually the slowest part of a build and so the most interesting, I could imagine linting/type checking/packaging can also take a while in some cases, so having a "more general" infrastructure for tracking processes and their runs seems like it may be useful?
I've also been torn about whether this should be part of the generic framework or just an extension to the test goal. So perhaps there's a need for some generic framework to track everything that happens within a Pants run. But I also see the value of having this option in the test goal specifically.
Oh, I hadn't realised. And I just started using it in my work repo... 🤔 Do you happen to have some numbers about the perf impact?
Ah interesting. I was imagining that someone might have this running constantly and ingested into some sort of dashboard/metrics.
Yeah, maybe it is! Along that line of thinking, it'd be cool to get this in and let people start using it so we can learn how people use it. To unblock that, I think I'd be more than happy with the changes we're discussing in #21171 (comment).
What do you think?
That shouldn't be a problem in small/medium-size repos. I've seen only ~10% performance degradation in local experiments, so I wouldn't worry too much. But on a huge repo like mine, I just can't afford to collect any extra debugging information when all I care about is the caching efficiency of the test runs.
Sorry, my wording was poor. I meant collecting the source of test results into the dashboard/metrics database - yes! But no to running the workunit-logger constantly.
Sure, this is what I've now done :) The option and the file are renamed, and I've updated the docs accordingly.
👍 Total tangent from this PR, but maybe the workunit logger could/should support filtering by level, so that it can just log the workunits at the level of interest.
Ah, right!
Nice
@huonw it seems there's already a setting to do that, see https://www.pantsbuild.org/2.23/reference/global-options#streaming_workunits_level. I can confirm that the verbosity of the messages in the files is different.
A new option `--report-test-result-source` is added to the `[test]` config section. Enabling this option will produce a file on disk with information that tells you where the results of tests come from. The tests might have been executed locally or remotely, or they might have been retrieved from the local or remote cache, or be memoized. Knowing where the test results come from may be useful when evaluating the efficiency of the caching strategy and the nature of the changes in the source code that may lead to frequent cache invalidations.
Why?
After running tests, one can see where the test results came from, but this information is available only as plain text in the console output.
It would be helpful to store, in a parseable form, the fact that `helloworld/greet/greeting_test.py` was executed and `helloworld/translator/translator_test.py` was cached locally. I would want to have this produced in every CI build and stored in the metrics database, to be able to see how effectively the cache is used. For instance, imagine you have a nice remote cache setup, but every PR modifies a few files that are transitive dependencies of the majority of tests - then the cache won't really help. If one can see which tests are executed most often (instead of being picked up from the cache), they might be good candidates for refactoring.
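As a sketch of that last point (not from the PR itself): the report files could be scanned to count, per test target, how often the result was actually executed rather than served from a cache or memoized. The set of "cached" source values beyond the ones visible in the examples above ("memoized", "hit_locally") is an assumption:

```python
import glob
import json
from collections import Counter

# Source values that mean the result was *not* freshly executed. "memoized" and
# "hit_locally" appear in the example reports above; "hit_remotely" is an assumption.
CACHED_SOURCES = {"memoized", "hit_locally", "hit_remotely"}

executed: Counter[str] = Counter()
total: Counter[str] = Counter()

for path in glob.glob("test_result_source_report_runid*.json"):
    with open(path) as fh:
        report = json.load(fh)
    for target, source in report["sources"].items():
        total[target] += 1
        if source not in CACHED_SOURCES:
            executed[target] += 1

# Targets that are executed most often despite caching are candidates for a closer look.
for target in sorted(total, key=lambda t: executed[t], reverse=True):
    print(f"{target}: executed {executed[target]} of {total[target]} runs")
```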
Some of the caching stats are available via the `[stats]` subsystem, and I've recently added support for JSON Lines output for the `stats` subsystem (see #20579), but that has no information about the test modules themselves.
The `workunit-logger` backend does provide some information, such as which tests were executed, but it doesn't tell you anything about the other modules (was it memoized? cached locally? cached remotely?).
@sureshjoshi also had an idea:
so this is another use case, too.