-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachtest: create perf artifacts concept and collect them #38889
roachtest: create perf artifacts concept and collect them #38889
Conversation
Needs https://github.com/cockroachdb/roachperf/pull/38 to land in order to properly parse the new artifacts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 11 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @andreimatei)
pkg/cmd/roachtest/kv.go, line 67 at r1 (raw file):
sequential = " --sequential" } t.ArtifactsDir()
Was this intentional?
pkg/cmd/roachtest/test.go, line 74 at r1 (raw file):
// directory from all of the nodes if it is non-empty. // It will not cause an error if the perf artifacts directory does not exist. HasPerfArtifacts bool
I'm a little surprised that we don't just try to pull from this directory unconditionally. It seems like an easy thing to miss and have to debug every few months. Are you worried that doing so would be too expensive? Is it because of the logging you discuss below?
Before the roachtest rewrite in cockroachdb#30977 we used to collect the logs directory from remote hosts unconditionally. We also used to write stats.json histogram files to the logs directory. This combination enabled the collection of histogram files for roachperf. Since the new roachtest only copies logs to the test runner in the case of failure, we have not gotten any perf artifacts since the change. This log policy is sane for logs and other debug artifacts like heap profiles that exist there but has been a bit of a bummer as we've missed out on the dopamine rush due to seeing charts go up and to the right in bespoke D3 glory. The truth is that we weren't all that excited about the previous behavior with regards to perf data either. Sometimes tests would fail and their lousy results would mess with our pretty pictures (https://github.com/cockroachdb/roachperf/issues/37). This PR introduces a new concept, perf artifacts, which live in a different directory, the perfArtifactsDir ("perf") and has the opposite fetching policy from the logs; the "perf" directory is copied from each remove host into the test's artifactsDir only upon successful completion of the test. One downside of this as implemented is that it can spam the test's log (but fortunately not roachtest's logs anymore, <3 @andreimatei). cockroachdb#38890 attempts to address this somewhat. It shouldn't been too big of a problem because it only happens on tests that already passed. This change also necessitates a small change to roachperf to look in this new "perf" directory (and respect the rewrite's test artifact directory structure). Fixes cockroachdb#38871. Release note: None
0492bb9
to
3cf2306
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)
pkg/cmd/roachtest/kv.go, line 67 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Was this intentional?
No. Removed.
pkg/cmd/roachtest/test.go, line 74 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm a little surprised that we don't just try to pull from this directory unconditionally. It seems like an easy thing to miss and have to debug every few months. Are you worried that doing so would be too expensive? Is it because of the logging you discuss below?
Done. The error was a big part of the reason. I thought about typing code to have the testSpec tell the runner which nodes should have perf artifacts but it didn't seem super clean to generalize as tests like in tpcc determine their load nodes when they are run based on the shape of the cluster. I made #38890 to make it somewhat less egregious, plus it's only after tests pass.
A more drastic thing that could work is to change roachprod get
to use rsync
instead of scp
and would enable fancier features like --ignore-missing-args
. Not for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 11 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei)
TFTR! bors r=nvanbenschoten |
38848: roachpb: further clarify correctness of observed timestamps r=nvanbenschoten a=nvanbenschoten This commit expands on the explanation added in b7cc205. Specifically, it focuses on the inability to bound a transaction's uncertainty interval below the starting timestamp of the lease that it is executing a request on. This limitation was added in 85d4627. I tend to forget about it once every few months and get very confused how this all works. Release note: None 38889: roachtest: create perf artifacts concept and collect them r=nvanbenschoten a=ajwerner Before the roachtest rewrite in #30977 we used to collect the logs directory from remote hosts unconditionally. We also used to write stats.json histogram files to the logs directory. This combination enabled the collection of histogram files for roachperf. Since the new roachtest does not unconditionally copy logs to the test runner, we have not gotten any perf artifacts since the change. This PR introduces a new concept, perf artifacts, which live in a different directory, the perfArtifactsDir. If a testSpec indicates that it HasPerfArtifacts then that directory is copied to the test runner's artifactsDir for that test from each node in the cluster upon success. This change also necessitates a small change to roachperf to look in this new "perf" directory. Fixes #38871. Release note: None 38895: roachprod: disable ssh logging r=ajwerner a=tbg This was never fully introduced and there is currently no impetus to push it over the finish line. It would be useful to diagnose issues that are readily reproducible, but we haven't seen any of those. Release note: None Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Tobias Schottdorf <[email protected]>
Build succeeded |
Before the roachtest rewrite in #30977 we used to collect the logs directory
from remote hosts unconditionally. We also used to write stats.json histogram
files to the logs directory. This combination enabled the collection of
histogram files for roachperf. Since the new roachtest does not unconditionally
copy logs to the test runner, we have not gotten any perf artifacts since the
change.
This PR introduces a new concept, perf artifacts, which live in a different
directory, the perfArtifactsDir. If a testSpec indicates that it
HasPerfArtifacts then that directory is copied to the test runner's
artifactsDir for that test from each node in the cluster upon success.
This change also necessitates a small change to roachperf to look in this new
"perf" directory.
Fixes #38871.
Release note: None