[WIP] Disable retries on the CI #3400

Closed
wants to merge 6 commits into from

apostasie
Contributor

@apostasie apostasie commented Sep 3, 2024

There is a lot of context about this in #3303.

TL;DR:
Retrying tests was a good idea for making the CI less painful to work with, but the side effect is that it makes things worse: tests (and code) that fail half of the time can still "pass".

Merging this PR will make things very painful for us for some time as we'll have to "pay back" and fix all the things that have probably been failing for months, but it will make things much better in the long run.

LMK your thoughts...

@apostasie
Contributor Author

Note: obviously, I can't make this go green right now...
I would still advocate that we merge this so that people can engage and fix these bugs.

@apostasie
Contributor Author

@ktock maybe you could help with these?

TestRunStargz is failing twice on this run:

=== RUN   TestRunStargz
    container_run_stargz_linux_test.go:36: assertion failed: res.ExitCode is not exitCode: 
        Command:  /usr/local/bin/nerdctl --namespace=nerdctl-test --snapshotter=stargz run --rm ghcr.io/stargz-containers/fedora:30-esgz ls /.stargz-snapshotter
        ExitCode: 1
        Error:    exit status 1
        Stdout:   
        Stderr:   ghcr.io/stargz-containers/fedora:30-esgz: resolving      |--------------------------------------| 
        elapsed: 0.1 s                            total:   0.0 B (0.0 B/s)                                         
        ghcr.io/stargz-containers/fedora:30-esgz:                                      resolved       |++++++++++++++++++++++++++++++++++++++| 
        index-sha256:5286767fa09878e16acd75ab13bfa5b985473a7cf5442599c837153050b4122c: downloading    |--------------------------------------|    0.0 B/235.0 B 
        elapsed: 0.2 s                                                                 total:   0.0 B (0.0 B/s)                                         
        time="2024-09-03T16:21:58Z" level=error msg="server \"ghcr.io\" does not seem to support HTTPS" error="failed to prepare extraction snapshot \"extract-426242182-kaxG sha256:0e9db48c579d098b58a078ee45fc6490b2c297d677ee63a3da3ae032c59eb4d6\": connection error: desc = \"transport: Error while dialing: dial unix /run/user/1001/containerd-stargz-grpc/containerd-stargz-grpc.sock: connect: connection refused\": unavailable"
        time="2024-09-03T16:21:58Z" level=info msg="Hint: you may want to try --insecure-registry to allow plain HTTP (if you are in a trusted network)"
        time="2024-09-03T16:21:58Z" level=fatal msg="failed to prepare extraction snapshot \"extract-426242182-kaxG sha256:0e9db48c579d098b58a078ee45fc6490b2c297d677ee63a3da3ae032c59eb4d6\": connection error: desc = \"transport: Error while dialing: dial unix /run/user/1001/containerd-stargz-grpc/containerd-stargz-grpc.sock: connect: connection refused\": unavailable"
        
        
--- FAIL: TestRunStargz (0.39s)

Thanks in advance.

This was referenced Sep 3, 2024
@apostasie apostasie force-pushed the disable-retries branch 8 times, most recently from 3251c46 to 450817d, September 25, 2024 20:11
@apostasie apostasie marked this pull request as draft September 25, 2024 20:23
@apostasie apostasie changed the title from "Disable retries on the CI" to "[WIP] Disable retries on the CI" Sep 25, 2024
As we make progress rewriting tests, the new tooling needs to adapt.
In a nutshell, this:
- introduces (more) `Requirements`, with a better API
- updates documentation
- fixes some t.Helper calls
- fixes the broken stdin implementation
- cleans up custom namespaces properly
- changes the hashing function
- stops "private" from implying a custom data root, which is more trouble than it is worth
- makes minor cleanups

Signed-off-by: apostasie <[email protected]>
@apostasie
Contributor Author

Closing.
This is getting subsumed into the next "testing" PR with the introduction of the Flaky marker.

@apostasie apostasie closed this Oct 4, 2024
@apostasie apostasie mentioned this pull request Oct 16, 2024