Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: use separate log dirs for separate runs #30146

Merged
merged 1 commit into from
Oct 8, 2018

Conversation

andreimatei
Copy link
Contributor

@andreimatei andreimatei commented Sep 12, 2018

Before this patch, roachtest run --count would overwrite the logs
dirs n times (in fact I think it was technically worse - there's one dir
per node and inside there non-unique file would be overwritten and
otherwise stale files would be left from the run the produced it).
This patch moves to unique dirs per run, and otherwise rationalizes the
code around the log and artifact locations.
Each top level tests gets different dirs for the different runs. If
--count is not specified, nothing changes.
After this patch, when running:
roachtest run clock acceptance/event-log acceptance/status-server --count=2

you get the following tree structure.

artifacts
├── acceptance
│ ├── run_1
│ │ ├── event-log
│ │ ├── status-server
│ ├── run_2
│ │ ├── event-log
│ │ ├── status-server
└── clock
├── run_1
│ ├── jump
│ │ ├── large_backward_disabled
│ │ ├── large_backward_enabled
└── run_2
│ ├── large_backward_disabled
├── jump
│ ├── large_backward_disabled
│ ├── large_forward_enabled

When --count is not specified, nothing changes.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@andreimatei andreimatei force-pushed the roachtest-logs branch 2 times, most recently from 0f4e404 to 167dd05 Compare October 2, 2018 18:07
Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this change conceptually, but I'm not following why so many changes were necessary to accomplish it. I feel like there is a much more targeted change to the creation of the rootLogger that could accomplish the same thing. Something along the lines of passing in an optional suffix to rootLogger where the suffix is constructed from the run number just as you're doing in this PR. My worry with the change as it stands is that roachtest itself has inadequate test coverage. You might inadvertently be breaking something.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL, I've reworked this a bit.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better. Just a question about how child logger stuff is working.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/cmd/roachtest/cluster.go, line 560 at r1 (raw file):

	logPath := filepath.Join(t.ArtifactsDir(), "test.log")
	l, err := rootLogger(logPath)

This is odd. Before we were passing in a directory name, and now you're passing in <dir>/test.log. How is that doing the right thing with child loggers?


pkg/cmd/roachtest/test.go, line 68 at r1 (raw file):

	// point, SubtestName will be populated to the original value of Name.
	Name        string
	SubtestName string

Is SubtestName part of the "public" interface? If not, let's rename it to subtestName.


pkg/cmd/roachtest/test.go, line 652 at r1 (raw file):

	}
	// Each subtest gets its own subdir in the parent's artifacts dir.
	artifactsDir := filepath.Join(parentDir, teamCityNameEscape(spec.SubtestName), artifactsSuffix)

I think you could compute SubtestName here as strings.TrimPrefix(spec.Name, parent.Name + "/").

Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/cmd/roachtest/cluster.go, line 560 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This is odd. Before we were passing in a directory name, and now you're passing in <dir>/test.log. How is that doing the right thing with child loggers?

I think maybe I've misunderstood the point of child loggers... I thought they were just adding a prefix, but I guess they were both adding a prefix and logging to a different file...
What is the point of child loggers exactly? :)

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/cmd/roachtest/cluster.go, line 560 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think maybe I've misunderstood the point of child loggers... I thought they were just adding a prefix, but I guess they were both adding a prefix and logging to a different file...
What is the point of child loggers exactly? :)

Child loggers are used by test which want to output something to a separate log file. Usually the output is a bit spammy or would overwhelm the normal output in some way (e.g. workload output from multiple concurrent workers).

Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/cmd/roachtest/cluster.go, line 560 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Child loggers are used by test which want to output something to a separate log file. Usually the output is a bit spammy or would overwhelm the normal output in some way (e.g. workload output from multiple concurrent workers).

ok, it took me a bit to figure out what's going on with the prefix & friends. Please see now.

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Can you verify that the artifacts generated before and after this change are the same for a test which uses child loggers? (Just grep around for one).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

Before this patch, roachtest run --count <n> would overwrite the logs
dirs n times (in fact I think it was technically worse - there's one dir
per node and inside there non-unique file would be overwritten and
otherwise stale files would be left from the run the produced it).
This patch moves to unique dirs per run, and otherwise rationalizes the
code around the log and artifact locations.
Each top level tests gets different dirs for the different runs. If
`--count` is not specified, nothing changes.
After this patch, when running:
roachtest run clock acceptance/event-log acceptance/status-server --count=2

you get the following tree structure.

artifacts
├── acceptance
│   ├── run_1
│   │   ├── event-log
│   │   ├── status-server
│   ├── run_2
│   │   ├── event-log
│   │   ├── status-server
└── clock
    ├── run_1
    │   ├── jump
    │   │   ├── large_backward_disabled
    │   │   ├── large_backward_enabled
    └── run_2
        │   ├── large_backward_disabled
        ├── jump
        │   ├── large_backward_disabled
        │   ├── large_forward_enabled

When --count is not specified, nothing changes.

Release note: None
Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you verify that the artifacts generated before and after this change are the same for a test which uses child loggers? (Just grep around for one).

I had.

Merging on Monday. TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/test.go, line 68 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Is SubtestName part of the "public" interface? If not, let's rename it to subtestName.

done


pkg/cmd/roachtest/test.go, line 652 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

I think you could compute SubtestName here as strings.TrimPrefix(spec.Name, parent.Name + "/").

I could, but I think maintaining subtestName explicitly is cleaner.

Copy link
Contributor Author

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

@craig
Copy link
Contributor

craig bot commented Oct 8, 2018

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request Oct 8, 2018
30146: roachtest: use separate log dirs for separate runs r=andreimatei a=andreimatei

Before this patch, roachtest run --count <n> would overwrite the logs
dirs n times (in fact I think it was technically worse - there's one dir
per node and inside there non-unique file would be overwritten and
otherwise stale files would be left from the run the produced it).
This patch moves to unique dirs per run, and otherwise rationalizes the
code around the log and artifact locations.
Each top level tests gets different dirs for the different runs. If
`--count` is not specified, nothing changes.
After this patch, when running:
roachtest run clock acceptance/event-log acceptance/status-server --count=2

you get the following tree structure.

artifacts
├── acceptance
│   ├── run_1
│   │   ├── event-log
│   │   ├── status-server
│   ├── run_2
│   │   ├── event-log
│   │   ├── status-server
└── clock
    ├── run_1
    │   ├── jump
    │   │   ├── large_backward_disabled
    │   │   ├── large_backward_enabled
    └── run_2
        │   ├── large_backward_disabled
        ├── jump
        │   ├── large_backward_disabled
        │   ├── large_forward_enabled

When --count is not specified, nothing changes.

Release note: None

Co-authored-by: Andrei Matei <[email protected]>
@craig
Copy link
Contributor

craig bot commented Oct 8, 2018

Build succeeded

@craig craig bot merged commit 8fdb36e into cockroachdb:master Oct 8, 2018
@andreimatei andreimatei deleted the roachtest-logs branch October 8, 2018 15:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants