Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test rework, part 4 #3492

Merged
merged 8 commits into from
Oct 13, 2024
Merged

Test rework, part 4 #3492

merged 8 commits into from
Oct 13, 2024

Conversation

apostasie
Copy link
Contributor

@apostasie apostasie commented Oct 3, 2024

This is the next part of the test rewriting project.

Background and motivation:

See:

and the initial commit message from 2c95d09
along with the linked issues from above PRs.

The gist of it is really simple: our testing situation is not good (see prior discussions for full list of issues - but then, the sheer fact we have to retry our CI multiple times to get a green tree should say enough).

This here is the part 4 of the effort to rewrite our tests using a new, custom developed testing toolkit that emphasizes test expressiveness, debugability and better isolation.

Changes

Individual commits messages have extensive details about the changes where appropriate.

Status

a. the following has been ran locally 100x+ with no failure:

b. the following have ran locally 20x+ with no failure:

  • ipfs (~ 100 secs)

c. the following are failing after about 20 iterations:

  • image (~40 secs)

d. the following are NOT part of this PR:

  • builder
  • container
  • compose
  • login

Also left out of this PR: disabling retries for non flaky tests.
We do now have the infrastructure for that (eg: -test.only-flaky) but this will come in another PR after this.

Note that while no-failure locally with 100+ runs is no guarantee that things are fine: I do run local tests directly on the host, not inside the dockerfile - running inside the dockerfile on the CI is likely to bring up a lot more failures, either because things are inherently slower, or because the extra docker layer is adding specific conditions that will make us fail.


Review

Close to 10k lines is indeed a chunky PR, but then, as with the previous parts, there is absolutely no "code" or "logic" changes in nerdctl itself - this is "just" tests: if they run, and if the CI is green...

PR has been split in as many commits as feasible to facilitate review, along with a number of inline comments in the github UI where appropriate.

Let me know if I can make this easier in any way.

The north star for this whole effort remains:

  • we must get to a consistently green CI without retries - with no or very few flaky test.
  • go test -p 1 ./cmd/nerdctl/... must work locally out of the box on developers laptop without randomly breaking for missing requirements

@apostasie apostasie force-pushed the IGNORE branch 5 times, most recently from 8946f8f to fe54f32 Compare October 5, 2024 07:42
@apostasie apostasie changed the title [IGNORE] CI testing [WIP] Test rework, part 4 Oct 5, 2024
Command: test.RunCommand("volume", "create"),
Expected: test.Expects(0, nil, nil),
Command: test.Command("volume", "create"),
Expected: test.Expects(0, nil, test.Match(regexp.MustCompile("^[a-f0-9]{64}\n$"))),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enhance the test a little bit.

Errors: []error{errdefs.ErrInvalidArgument},
}
},
// NOTE: docker returns 125 on this
Copy link
Contributor Author

@apostasie apostasie Oct 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use compact syntax.

assert.NilError(t, err, "File creation failed")
testCase := nerdtest.Setup()

testCase.Require = nerdtest.BrokenTest("This test assumes that the host-side of a volume can be written into, "+
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking this test as broken (for Docker without root)

Setup: func(data test.Data, helpers test.Helpers) {
var vol1, vol2, vol3, vol4 = data.Identifier() + "-1", data.Identifier() + "-2", data.Identifier() + "-3", data.Identifier() + "-4"
var label1, label2, label3, label4 = data.Identifier() + "=label-1", data.Identifier() + "=label-2", data.Identifier() + "=label-3", data.Identifier() + "-group=label-4"
testCase.Require = nerdtest.BrokenTest("This test assumes that the host-side of a volume can be written into, "+
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking test as broken.

},
Expected: test.Expects(0, nil, test.Equals("uninitialized-setup-command")),
},
},
Expected: test.Expects(0, nil, test.Equals("uninitialized-setup")),
},
{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithEnv is no longer exposed. This tests are now moot.

Description: "TOML > Default",
Command: test.RunCommand("info", "-f", "{{.Driver}}"),
Expected: test.Expects(0, nil, test.Equals("dummy-snapshotter-via-toml\n")),
Data: test.WithConfig(nerdtest.NerdctlToml, `snapshotter = "dummy-snapshotter-via-toml"`),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config is now separate from Data.

return helpers.
Command("run", "-it", "--rm", "--net=host", testutil.AlpineImage, "echo", "this was always working").
WithWrapper("unbuffer")
Command: func(data test.Data, helpers test.Helpers) test.TestableCommand {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chaining was a mistake.

@@ -14,7 +14,7 @@
limitations under the License.
*/

package main
package issues
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this file out of the root, where it does not belong.

func TestIssue3425(t *testing.T) {
nerdtest.Setup()

var registry *testregistry.RegistryServer

var ipfsPath string
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now embedded inside the Requirement IPFS

),
Env: map[string]string{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nerdtest.IPFS requirement now takes care of this.

@apostasie apostasie force-pushed the IGNORE branch 6 times, most recently from a374cf1 to f7cfb10 Compare October 6, 2024 02:43
@apostasie apostasie force-pushed the IGNORE branch 6 times, most recently from c00c943 to d0ca83a Compare October 7, 2024 06:40
@apostasie apostasie force-pushed the IGNORE branch 2 times, most recently from f772e84 to a7bb358 Compare October 10, 2024 21:19
@apostasie
Copy link
Contributor Author

There is clearly a deadlock happening in image tests, but only for arm64.

Debugging right now - marking draft until I find the culprit.

@apostasie apostasie marked this pull request as draft October 10, 2024 21:20
@apostasie apostasie force-pushed the IGNORE branch 6 times, most recently from 498bdfb to a71ecf3 Compare October 10, 2024 22:42
Signed-off-by: apostasie <[email protected]>
This commit bundles a few minor fixes that are necessary following changes in test tooling.

Signed-off-by: apostasie <[email protected]>
@apostasie apostasie marked this pull request as ready for review October 10, 2024 23:35
@apostasie
Copy link
Contributor Author

Fixed.

@apostasie apostasie mentioned this pull request Oct 11, 2024
@AkihiroSuda AkihiroSuda requested a review from a team October 11, 2024 08:59
@@ -51,7 +51,7 @@ func TestVolumeInspect(t *testing.T) {
&test.Requirement{
Check: func(data test.Data, helpers test.Helpers) (bool, string) {
isDocker, _ := nerdtest.Docker.Check(data, helpers)
return !isDocker || test.IsRoot(), "docker cli needs to be run as root"
return !isDocker || os.Geteuid() == 0, "docker cli needs to be run as root"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on reviewers feedback.

@@ -94,7 +95,7 @@ func TestVolumeLsFilter(t *testing.T) {
&test.Requirement{
Check: func(data test.Data, helpers test.Helpers) (bool, string) {
isDocker, _ := nerdtest.Docker.Check(data, helpers)
return !isDocker || test.IsRoot(), "docker cli needs to be run as root"
return !isDocker || os.Geteuid() == 0, "docker cli needs to be run as root"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on reviewers feedback.

@@ -32,7 +32,7 @@ import (
"github.com/containerd/nerdctl/v2/pkg/testutil/test"
)

var BuildkitHost test.ConfigKey = "bkHost"
var BuildkitHost test.ConfigKey = "BuildkitHost"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on reviewers feedback.

)

// IsRoot returns true if we are root... simple
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on reviewers feedback.

@@ -562,7 +562,7 @@ func M(m *testing.M) {
flag.BoolVar(&flagTestKillDaemon, "test.allow-kill-daemon", false, "enable tests that kill the daemon")
flag.BoolVar(&flagTestIPv6, "test.only-ipv6", false, "enable tests on IPv6")
flag.BoolVar(&flagTestKube, "test.only-kubernetes", false, "enable tests on Kubernetes")
flag.BoolVar(&flagTestFlaky, "test.only-flaky", false, "enable testing of flaky tests only")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on reviewers feedback.


func environmentIsForFlaky() bool {
return testutil.GetFlakyEnvironment()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just call testutil functions directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we have to bridge between the legacy tests, and the code handling flag parsing, and the proposed new framework, which unfortunately implies a certain level of duplication.
I will clean that up once all tests are migrated.

RegistryImageNext = testutil.RegistryImageNext
KuboImage = testutil.KuboImage
DockerAuthImage = testutil.DockerAuthImage
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't just use testutil constants

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are stuck in a bad situation right now with a mix of _platform_test.go files (which should disappear) and Skip(platform) tests, that must have the corresponding helpers available.

This will get cleaned-up later on, but rn a level of indirection is necessary (or alternatively add a bunch of stuff in the legacy files).

}

// RootLess marks a test as suitable only for the rootless environment
var RootLess = &test.Requirement{
Copy link
Member

@AkihiroSuda AkihiroSuda Oct 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe s/RootLess/Rootless/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was wondering about that...

I am happy either way - I will take another look at that aspect (RootLess, RootFul) with the next round then.

"github.com/containerd/nerdctl/v2/pkg/testutil/test"
)

func IsDocker() bool {
return testutil.GetTarget() == "docker"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just call testutil.GetTarget directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Migrating to the proposed new framework, I would rather have everything sourced from there to help cleaning things up.
And yes, eventually, they will be merged.

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'm merging this, but this amount of PR was almost unreviewable and even causing Safari to almost hang up 😓

In the next series of the PR please consider splitting the PR to smaller parts.

Also consider reducing the amount of the helper functions and constants (see the comments)

@AkihiroSuda AkihiroSuda merged commit 6abe279 into containerd:main Oct 13, 2024
22 checks passed
@apostasie
Copy link
Contributor Author

Thanks, I'm merging this, but this amount of PR was almost unreviewable and even causing Safari to almost hang up 😓

@AkihiroSuda I really appreciate your help and support on this.
I could not have done it otherwise.

In the next series of the PR please consider splitting the PR to smaller parts.

Upcoming parts are roughly:

  • github actions clean-up with non-flaky/flaky(with retry) split up - this will give us a lot of bang for the buck and likely a good speed-up of the total testrun time
  • migration of container tests
  • migration of builder tests
  • migration of login tests
  • migration of compose tests

I think the test framework is mature enough now that we can have a separate PR for each of these ^ - making it much less painful for everybody (me included 😓).

Also consider reducing the amount of the helper functions and constants (see the comments)

Eventually all of it will fuse into one - rn I have to make sure existing tests still work unaltered, and the new ones are not constrained... It is tough. I almost gave up on the effort multiple times 😰. Debugging some of these issues was mind altering.

Thanks again!
I am absolutely convinced we will be in a much better situation soon, and the quality of 2.0 will rock!

@AkihiroSuda AkihiroSuda added the area/ci e.g., CI failure label Oct 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ci e.g., CI failure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants