-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachtest: use separate log dirs for separate runs #30146
Conversation
760ebbb
to
2273d57
Compare
0f4e404
to
167dd05
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with this change conceptually, but I'm not following why so many changes were necessary to accomplish it. I feel like there is a much more targeted change to the creation of the rootLogger
that could accomplish the same thing. Something along the lines of passing in an optional suffix to rootLogger
where the suffix is constructed from the run number just as you're doing in this PR. My worry with the change as it stands is that roachtest
itself has inadequate test coverage. You might inadvertently be breaking something.
Reviewable status: complete! 0 of 0 LGTMs obtained
846155d
to
6adae0e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL, I've reworked this a bit.
Reviewable status: complete! 0 of 0 LGTMs obtained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks better. Just a question about how child logger stuff is working.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/cmd/roachtest/cluster.go, line 560 at r1 (raw file):
logPath := filepath.Join(t.ArtifactsDir(), "test.log") l, err := rootLogger(logPath)
This is odd. Before we were passing in a directory name, and now you're passing in <dir>/test.log
. How is that doing the right thing with child loggers?
pkg/cmd/roachtest/test.go, line 68 at r1 (raw file):
// point, SubtestName will be populated to the original value of Name. Name string SubtestName string
Is SubtestName
part of the "public" interface? If not, let's rename it to subtestName
.
pkg/cmd/roachtest/test.go, line 652 at r1 (raw file):
} // Each subtest gets its own subdir in the parent's artifacts dir. artifactsDir := filepath.Join(parentDir, teamCityNameEscape(spec.SubtestName), artifactsSuffix)
I think you could compute SubtestName
here as strings.TrimPrefix(spec.Name, parent.Name + "/")
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/cmd/roachtest/cluster.go, line 560 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
This is odd. Before we were passing in a directory name, and now you're passing in
<dir>/test.log
. How is that doing the right thing with child loggers?
I think maybe I've misunderstood the point of child loggers... I thought they were just adding a prefix, but I guess they were both adding a prefix and logging to a different file...
What is the point of child loggers exactly? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/cmd/roachtest/cluster.go, line 560 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I think maybe I've misunderstood the point of child loggers... I thought they were just adding a prefix, but I guess they were both adding a prefix and logging to a different file...
What is the point of child loggers exactly? :)
Child loggers are used by test which want to output something to a separate log file. Usually the output is a bit spammy or would overwhelm the normal output in some way (e.g. workload output from multiple concurrent workers).
6adae0e
to
28ae150
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/cmd/roachtest/cluster.go, line 560 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Child loggers are used by test which want to output something to a separate log file. Usually the output is a bit spammy or would overwhelm the normal output in some way (e.g. workload output from multiple concurrent workers).
ok, it took me a bit to figure out what's going on with the prefix & friends. Please see now.
28ae150
to
1d95072
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you verify that the artifacts generated before and after this change are the same for a test which uses child loggers? (Just grep around for one).
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
Before this patch, roachtest run --count <n> would overwrite the logs dirs n times (in fact I think it was technically worse - there's one dir per node and inside there non-unique file would be overwritten and otherwise stale files would be left from the run the produced it). This patch moves to unique dirs per run, and otherwise rationalizes the code around the log and artifact locations. Each top level tests gets different dirs for the different runs. If `--count` is not specified, nothing changes. After this patch, when running: roachtest run clock acceptance/event-log acceptance/status-server --count=2 you get the following tree structure. artifacts ├── acceptance │ ├── run_1 │ │ ├── event-log │ │ ├── status-server │ ├── run_2 │ │ ├── event-log │ │ ├── status-server └── clock ├── run_1 │ ├── jump │ │ ├── large_backward_disabled │ │ ├── large_backward_enabled └── run_2 │ ├── large_backward_disabled ├── jump │ ├── large_backward_disabled │ ├── large_forward_enabled When --count is not specified, nothing changes. Release note: None
1d95072
to
8fdb36e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you verify that the artifacts generated before and after this change are the same for a test which uses child loggers? (Just grep around for one).
I had.
Merging on Monday. TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
pkg/cmd/roachtest/test.go, line 68 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
Is
SubtestName
part of the "public" interface? If not, let's rename it tosubtestName
.
done
pkg/cmd/roachtest/test.go, line 652 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
I think you could compute
SubtestName
here asstrings.TrimPrefix(spec.Name, parent.Name + "/")
.
I could, but I think maintaining subtestName
explicitly is cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
Build failed (retrying...) |
30146: roachtest: use separate log dirs for separate runs r=andreimatei a=andreimatei Before this patch, roachtest run --count <n> would overwrite the logs dirs n times (in fact I think it was technically worse - there's one dir per node and inside there non-unique file would be overwritten and otherwise stale files would be left from the run the produced it). This patch moves to unique dirs per run, and otherwise rationalizes the code around the log and artifact locations. Each top level tests gets different dirs for the different runs. If `--count` is not specified, nothing changes. After this patch, when running: roachtest run clock acceptance/event-log acceptance/status-server --count=2 you get the following tree structure. artifacts ├── acceptance │ ├── run_1 │ │ ├── event-log │ │ ├── status-server │ ├── run_2 │ │ ├── event-log │ │ ├── status-server └── clock ├── run_1 │ ├── jump │ │ ├── large_backward_disabled │ │ ├── large_backward_enabled └── run_2 │ ├── large_backward_disabled ├── jump │ ├── large_backward_disabled │ ├── large_forward_enabled When --count is not specified, nothing changes. Release note: None Co-authored-by: Andrei Matei <[email protected]>
Build succeeded |
Before this patch, roachtest run --count would overwrite the logs
dirs n times (in fact I think it was technically worse - there's one dir
per node and inside there non-unique file would be overwritten and
otherwise stale files would be left from the run the produced it).
This patch moves to unique dirs per run, and otherwise rationalizes the
code around the log and artifact locations.
Each top level tests gets different dirs for the different runs. If
--count
is not specified, nothing changes.After this patch, when running:
roachtest run clock acceptance/event-log acceptance/status-server --count=2
you get the following tree structure.
artifacts
├── acceptance
│ ├── run_1
│ │ ├── event-log
│ │ ├── status-server
│ ├── run_2
│ │ ├── event-log
│ │ ├── status-server
└── clock
├── run_1
│ ├── jump
│ │ ├── large_backward_disabled
│ │ ├── large_backward_enabled
└── run_2
│ ├── large_backward_disabled
├── jump
│ ├── large_backward_disabled
│ ├── large_forward_enabled
When --count is not specified, nothing changes.
Release note: None