roachtest: admission-control/index-backfill failed #105260

Closed
cockroach-teamcity opened this issue Jun 21, 2023 · 13 comments · Fixed by #105639
cockroach-teamcity (Member) commented on Jun 21, 2023:

roachtest.admission-control/index-backfill failed with artifacts on master @ 1eb628e8c8a7eb1cbf9bfa2bd6c31982c25cbba0:

(tpce.go:168).runTPCE: EOF
(test_runner.go:1122).func1: 4 dead node(s) detected
test artifacts and logs in: /artifacts/admission-control/index-backfill/run_1

Parameters: ROACHTEST_arch=amd64, ROACHTEST_cloud=gce, ROACHTEST_cpu=8, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=false, ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

/cc @cockroachdb/kv-triage

This test on roachdash

Jira issue: CRDB-28947

cockroach-teamcity added the branch-master, C-test-failure, O-roachtest, O-robot, release-blocker, and T-kv labels on Jun 21, 2023
cockroach-teamcity added this to the 23.1 milestone on Jun 21, 2023
erikgrinaker (Contributor) commented:

Something is severely messed up here; we're seeing multiple nodes erroring out due to Raft log data loss:

F230621 06:43:42.993570 267 go.etcd.io/raft/v3/log.go:322 ⋮ [T1,n3,s3,r8411/4:{-}] 62  tocommit(684) is out of range [lastIndex(0)]. Was the raft log corrupted, truncated, or lost?
F230621 06:48:50.253740 321 go.etcd.io/raft/v3/log.go:322 ⋮ [T1,n4,s4,r5691/3:‹/Table/136/2{3-4}›] 244076  tocommit(95) is out of range [lastIndex(11)]. Was the raft log corrupted, truncated, or lost?
F230621 06:48:52.062827 334 go.etcd.io/raft/v3/log.go:322 ⋮ [T1,n1,s1,r5264/1:‹/Table/107/1{0-1}›] 244441  tocommit(108) is out of range [lastIndex(24)]. Was the raft log corrupted, truncated, or lost?
F230621 06:48:52.062854 315 go.etcd.io/raft/v3/log.go:322 ⋮ [T1,n8,s8,r5156/2:‹/Table/120/4/"A{ANBPR…-BF-d"…}›] 3834  tocommit(187) is out of range [lastIndex(82)]. Was the raft log corrupted, truncated, or lost?
F230621 06:48:52.072214 362 go.etcd.io/raft/v3/log.go:322 ⋮ [T1,n8,s8,r883/5:‹/Table/134/1/{68167-70001}›] 3938  tocommit(122) is out of range [lastIndex(77)]. Was the raft log corrupted, truncated, or lost?
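
For context, this fatal comes from etcd/raft's commit-index guard, which panics when the leader asks a replica to commit past the end of its local log. A rough, self-contained sketch of that check (simplified; not the verbatim go.etcd.io/raft/v3 code):

```go
package main

import "log"

// raftLogView is a minimal stand-in for etcd/raft's raftLog, holding only what
// the commit-index guard needs.
type raftLogView struct {
	committed uint64 // highest index known to be committed locally
	lastIndex uint64 // last index present in the local log
}

// commitTo is a simplified version of the guard behind the
// "tocommit(...) is out of range [lastIndex(...)]" fatal above.
func (l *raftLogView) commitTo(tocommit uint64) {
	if l.committed < tocommit { // never move the commit index backwards
		if l.lastIndex < tocommit {
			// The leader claims entries up to tocommit are durable on this
			// replica, but the local log ends earlier: the log was corrupted,
			// truncated, or lost (e.g. replaced by stale on-disk state).
			log.Panicf("tocommit(%d) is out of range [lastIndex(%d)]. Was the raft log corrupted, truncated, or lost?",
				tocommit, l.lastIndex)
		}
		l.committed = tocommit
	}
}

func main() {
	l := &raftLogView{committed: 0, lastIndex: 0}
	l.commitTo(684) // panics, matching the n3/r8411 crash above
}
```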

We also saw this in #105261.

However, I'm running admission-control/index-backfill again now, and not seeing any failures so far.

Let's see if this comes up in any further roachtest failures (ones that are easier to bisect). If it's as widespread as we're seeing here, I'd expect it to blow up all over the place.

The following PRs merged yesterday, and could possibly be relevant:

erikgrinaker (Contributor) commented:

Hm, it looks like we hit some other panics here too, while acquiring RPC peer metrics around here:

peerMetrics: rpcCtx.metrics.acquire(k),

@tbg made some recent changes here in #99191.

panic: child [3 10.142.1.101:26257 default] already exists [recovered]
        panic: child [3 10.142.1.101:26257 default] already exists

goroutine 1399700 [running]:
panic({0x5a2a400, 0xc017b2f7a0})
        GOROOT/src/runtime/panic.go:987 +0x3ba fp=0xc00d05b388 sp=0xc00d05b2c8 pc=0x49e43a
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).recover(0xe9a0aa?, {0x766e250, 0xc0146360f0})
        github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:230 +0x6a fp=0xc00d05b3d0 sp=0xc00d05b388 pc=0x11cba4a
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2.3()
        github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:478 +0x2e fp=0xc00d05b3f8 sp=0xc00d05b3d0 pc=0x11cd00e
runtime.deferCallSave(0xc00d05b4c8, 0xc018dd5f90?)
        GOROOT/src/runtime/panic.go:796 +0x88 fp=0xc00d05b408 sp=0xc00d05b3f8 pc=0x49e028
runtime.runOpenDeferFrame(0xc00d93e1e0?, 0xc0009be140)
        GOROOT/src/runtime/panic.go:769 +0x1a5 fp=0xc00d05b450 sp=0xc00d05b408 pc=0x49de45
panic({0x5a2a400, 0xc017b2f7a0})
        GOROOT/src/runtime/panic.go:884 +0x212 fp=0xc00d05b510 sp=0xc00d05b450 pc=0x49e292
github.com/cockroachdb/cockroach/pkg/util/metric/aggmetric.(*childSet).add(0xc0098cecc0, {0x7663540, 0xc009adf860})
        github.com/cockroachdb/cockroach/pkg/util/metric/aggmetric/agg_metric.go:107 +0x1d7 fp=0xc00d05b5f8 sp=0xc00d05b510 pc=0x1a9da17
github.com/cockroachdb/cockroach/pkg/util/metric/aggmetric.(*AggGauge).AddChild(...)
        github.com/cockroachdb/cockroach/pkg/util/metric/aggmetric/gauge.go:87
github.com/cockroachdb/cockroach/pkg/rpc.(*Metrics).acquire(0xc0005b7d38, {{0xc000a7dec0?, 0x56869e0?}, 0x10e58f0?, 0xc0?})
        github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/metrics.go:189 +0x31f fp=0xc00d05b6e0 sp=0xc00d05b5f8 pc=0x1b0715f
github.com/cockroachdb/cockroach/pkg/rpc.(*Context).newPeer(0xc0005b7c00, {{0xc000a7dec0?, 0xc000a7dec0?}, 0xb7?, 0x0?})
        github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/peer.go:178 +0x65 fp=0xc00d05b888 sp=0xc00d05b6e0 pc=0x1b079e5
github.com/cockroachdb/cockroach/pkg/rpc.(*Context).grpcDialNodeInternal(0xc0005b7c00, {0xc000a7dec0, 0x12}, 0x3, 0x0)
        github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:2093 +0x453 fp=0xc00d05bb08 sp=0xc00d05b888 pc=0x1b03dd3
github.com/cockroachdb/cockroach/pkg/rpc.(*Context).GRPCDialNode(0xc0122f4c10?, {0xc000a7dec0?, 0xc0008e68c0?}, 0x3?, 0x0?)
        github.com/cockroachdb/cockroach/pkg/rpc/pkg/rpc/context.go:2050 +0x12e fp=0xc00d05bb90 sp=0xc00d05bb08 pc=0x1b038ce
github.com/cockroachdb/cockroach/pkg/rpc/nodedialer.(*Dialer).dial(0xc0038fb860, {0x766e250, 0xc009adf7a0}, 0x30?, {0x7641a50?, 0xc004902698}, 0x1, 0x0?)
        github.com/cockroachdb/cockroach/pkg/rpc/nodedialer/nodedialer.go:170 +0xb1 fp=0xc00d05bc40 sp=0xc00d05bb90 pc=0x1b1afd1
github.com/cockroachdb/cockroach/pkg/rpc/nodedialer.(*Dialer).Dial(0xc0038fb860, {0x766e250, 0xc009adf7a0}, 0x5c4aba0?, 0x0?)
        github.com/cockroachdb/cockroach/pkg/rpc/nodedialer/nodedialer.go:103 +0x11c fp=0xc00d05bcb8 sp=0xc00d05bc40 pc=0x1b1aa5c
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*RaftTransport).startProcessNewQueue.func2({0x766e250, 0xc009adf7a0})
        github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/raft_transport.go:925 +0x226 fp=0xc00d05be18 sp=0xc00d05bcb8 pc=0x3994426
runtime/pprof.Do({0x766e250?, 0xc0146360f0?}, {{0xc016fd0fe0?, 0xc0122f4eb8?, 0xc63326?}}, 0xc01a845960)
        GOROOT/src/runtime/pprof/runtime.go:40 +0xa3 fp=0xc00d05be88 sp=0xc00d05be18 pc=0xe9a003
github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*RaftTransport).startProcessNewQueue.func3({0x766e250, 0xc0146360f0})
        github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/raft_transport.go:947 +0x1c8 fp=0xc00d05bf30 sp=0xc00d05be88 pc=0x39941a8
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
        github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:484 +0x146 fp=0xc00d05bfe0 sp=0xc00d05bf30 pc=0x11ccec6
runtime.goexit()
        GOROOT/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc00d05bfe8 sp=0xc00d05bfe0 pc=0x4d35c1
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
        github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:475 +0x43b
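
The panic itself is straightforward: the aggregated metric's child set refuses to register a second child with the same label values (node ID, address, class). A minimal toy reproduction of that pattern (hypothetical types, not the actual pkg/util/metric/aggmetric code):

```go
package main

import "fmt"

// childSet is a toy stand-in for aggmetric's map of per-label-value children.
type childSet struct {
	children map[string]struct{}
}

// add registers a child keyed by its label values and, like the real code,
// panics if a child with those label values already exists.
func (s *childSet) add(labelVals ...string) {
	key := fmt.Sprint(labelVals)
	if _, ok := s.children[key]; ok {
		panic(fmt.Sprintf("child %v already exists", labelVals))
	}
	s.children[key] = struct{}{}
}

func main() {
	s := &childSet{children: map[string]struct{}{}}
	// A peer is registered for n3 at one address...
	s.add("3", "10.142.1.101:26257", "default")
	// ...and then again for the same (node, address, class) tuple, which is
	// what happens in the trace above when a second peer is created for n3.
	s.add("3", "10.142.1.101:26257", "default")
}
```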

erikgrinaker (Contributor) commented on Jun 22, 2023:

We actually saw this across several nodes, which did not see the Raft panics:

logs/2.unredacted/cockroach-stderr.log
8:panic: child [3 10.142.1.101:26257 default] already exists [recovered]
10811:panic: child [3 10.142.1.101:26257 default] already exists [recovered]

logs/5.unredacted/cockroach-stderr.log
8:panic: child [3 10.142.1.101:26257 default] already exists [recovered]
9155:panic: child [3 10.142.1.101:26257 default] already exists [recovered]

logs/9.unredacted/cockroach-stderr.log
8:panic: child [3 10.142.1.101:26257 default] already exists [recovered]

erikgrinaker (Contributor) commented:

Rough timeline:

06:43:35 cluster initialized
06:43:42 n3 raft panic
06:48:43 n5 rpc panic
06:48:43 n9 rpc panic
06:48:44 n2 rpc panic
06:48:50 n4 raft panic
06:48:52 n1 raft panic
06:48:52 n8 raft panic

n6 no panic
n7 no panic

erikgrinaker (Contributor) commented:

Unfortunately, a lot of the interesting logs here have rotated out, because the logs are spammed with trace events from #102793.

erikgrinaker (Contributor) commented:

n3 died right after it started up. It managed to apply 2 snapshots, then it errored out.

W230621 06:43:41.630936 1086 kv/kvserver/replica_raftstorage.go:389 ⋮ [T1,n3,s3,r211/7:‹/Table/117/1/200000698{2…-4…}›] 55  unable to retrieve conf reader, cannot determine range MaxBytes
I230621 06:43:41.630974 1086 kv/kvserver/replica_raftstorage.go:514 ⋮ [T1,n3,s3,r211/7:‹/Table/117/1/200000698{2…-4…}›] 56  applied INITIAL snapshot 128ab07f from (n1,s1):4 at applied index 68 (total=32ms data=9.2 MiB ingestion=6@29ms)
W230621 06:43:42.070220 2562 kv/kvserver/replica_raftstorage.go:389 ⋮ [T1,n3,s3,r211/7:‹/Table/117/1/200000698{2…-4…}›] 57  unable to retrieve conf reader, cannot determine range MaxBytes
I230621 06:43:42.070264 2562 kv/kvserver/replica_raftstorage.go:514 ⋮ [T1,n3,s3,r211/7:‹/Table/117/1/200000698{2…-4…}›] 58  applied VIA_SNAPSHOT_QUEUE snapshot 3181fd7a from (n1,s1):4 at applied index 69 (total=7ms data=9.2 MiB ingestion=6@4ms)
[...]
F230621 06:43:42.993570 267 go.etcd.io/raft/v3/log.go:322 ⋮ [T1,n3,s3,r8411/4:{-}] 62  tocommit(684) is out of range [lastIndex(0)]. Was the raft log corrupted, truncated, or lost?

There are no further logs for r8411, since the logs have all rotated out. Could be an issue with snapshots.

erikgrinaker (Contributor) commented:

There's something else that's really weird here, though. In one of the RPC panics, the node saw two different IPs for n3 (which is what caused the panic):

E230621 06:48:43.531065 1399656 2@rpc/peer.go:611 ⋮ [T1,n2,rnode=3,raddr=‹10.142.1.36:26257›,class=system,rpc] 137415  failed connection attempt‹ (last connected 5m0.048s ago)›: grpc: ‹connection error: desc = "transport: error while dialing: dial tcp 10.142.1.36:26257: connect: connection refused"› [code 14/Unavailable]
E230621 06:48:43.531218 1399657 2@rpc/peer.go:611 ⋮ [T1,n2,rnode=3,raddr=‹10.142.1.36:26257›,class=default,rpc] 137416  failed connection attempt‹ (last connected 5m0.049s ago)›: grpc: ‹connection error: desc = "transport: error while dialing: dial tcp 10.142.1.36:26257: connect: connection refused"› [code 14/Unavailable]
E230621 06:48:43.646279 1011 2@rpc/peer.go:611 ⋮ [T1,n2,rnode=3,raddr=‹10.142.1.101:26257›,class=system,rpc] 137417  failed connection attempt‹ (last connected 5m0.098s ago)›: grpc: ‹connection error: desc = "transport: error while dialing: dial tcp 10.142.1.101:26257: connect: connection refused"› [code 14/Unavailable]
E230621 06:48:44.029010 1399700 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n2] 137418  a panic has occurred!
E230621 06:48:44.029010 1399700 1@util/log/logcrash/crash_reporting.go:188 ⋮ [T1,n2] 137418 +child [‹3› ‹10.142.1.101:26257› ‹default›] already exists

These IPs appear to belong to n3 from two different clusters. From roachprod_state/clusters, we find the following hosts:

  • 10.142.1.101: teamcity-10611064-1687326486-19-n10cpu8-0003 -- this test's cluster, created 06:38:25
  • 10.142.1.36: teamcity-10611064-1687326486-18-n10cpu8-0003 -- some other cluster, created 06:30:27

Crosstalk between clusters would definitely explain this, but why didn't the cluster ID checks trip?

erikgrinaker (Contributor) commented:

Wait a minute... These tests use volume snapshots to set up the test fixture. Does that mean that all clusters from the same fixtures have the same cluster IDs? Maybe this cluster got tangled up with the other cluster in #105261 that also used these snapshots, and also saw Raft panics?

erikgrinaker (Contributor) commented:

Sure enough, the volume snapshots use a static cluster ID 6d35ecfd-e28f-47b9-9ffc-f94f488fce8a, which will be shared by all tests using them. On the other cluster from #105261 we see its n1 briefly connect to n3 from this cluster:

E230621 06:43:42.961776 420 2@rpc/peer.go:590 ⋮ [T1,n1,rnode=3,raddr=‹10.142.1.36:26257›,class=system,rpc] 1059  disconnected (was healthy for 6m49.265s): grpc: ‹connection error: desc = "transport: error while dialing: connection interrupted (did the remote node shut down or are there networking issues?)"› [code 14/Unavailable]
W230621 06:43:43.123420 1168748 kv/kvserver/raft_transport.go:942 ⋮ [T1,n1] 1060  while processing outgoing Raft queue to node 3: recv msg error: grpc: ‹grpc: the client connection is closing› [code 1/Canceled]:
E230621 06:43:43.123467 1165763 2@rpc/peer.go:590 ⋮ [T1,n1,rnode=3,raddr=‹10.142.1.101:26257›,class=system,rpc] 1061  disconnected (was healthy for 2.128s): grpc: ‹node unavailable; try another peer› [code 2/Unknown]
E230621 06:43:43.154504 1165549 2@rpc/peer.go:590 ⋮ [T1,n1,rnode=3,raddr=‹10.142.1.101:26257›,class=default,rpc] 1062  disconnected (was healthy for 2.16s): grpc: ‹node unavailable; try another peer› [code 2/Unknown]

I don't know where it's getting these IPs from, but there's definite cross-talk here. I also see us contacting a bunch of other clusters, but these are rejected due to cluster ID mismatches:

E230621 06:43:36.782170 490 2@rpc/peer.go:611 ⋮ [T1,n1,rnode=?,raddr=‹10.142.0.124:26257›,class=system,rpc] 6  failed connection attempt‹ (last connected 0s ago)›: grpc: ‹client cluster ID "6d35ecfd-e28f-47b9-9ffc-f94f488fce8a" doesn't match server cluster ID "1d37197d-b1e3-4ac3-988a-236fefde9e1a"› [code 2/Unknown]
E230621 06:43:37.781932 564 2@rpc/peer.go:611 ⋮ [T1,n1,rnode=?,raddr=‹10.142.1.13:26257›,class=system,rpc] 7  failed connection attempt‹ (last connected 0s ago)›: grpc: ‹connection error: desc = "transport: error while dialing: dial tcp 10.142.1.13:26257: connect: connection refused"› [code 14/Unavailable]
E230621 06:43:38.782647 222 2@rpc/peer.go:611 ⋮ [T1,n1,rnode=?,raddr=‹10.142.0.234:26257›,class=system,rpc] 8  failed connection attempt‹ (last connected 0s ago)›: grpc: ‹client cluster ID "6d35ecfd-e28f-47b9-9ffc-f94f488fce8a" doesn't match server cluster ID "c27243fa-dcaf-41f7-9f7d-dbec053b0c32"› [code 2/Unknown]
E230621 06:43:39.783881 670 2@rpc/peer.go:611 ⋮ [T1,n1,rnode=?,raddr=‹10.142.0.113:26257›,class=system,rpc] 9  failed connection attempt‹ (last connected 0s ago)›: grpc: ‹client cluster ID "6d35ecfd-e28f-47b9-9ffc-f94f488fce8a" doesn't match server cluster ID "ada6081f-ba6b-4b99-ac00-beac9e872867"› [code 2/Unknown]
E230621 06:43:40.789069 685 2@rpc/peer.go:611 ⋮ [T1,n1,rnode=?,raddr=‹10.142.1.15:26257›,class=system,rpc] 10  failed connection attempt‹ (last connected 0s ago)›: grpc: ‹client cluster ID "6d35ecfd-e28f-47b9-9ffc-f94f488fce8a" doesn't match server cluster ID "352eaf36-f3cd-48ba-a0a5-63b1c351b04b"› [code 2/Unknown]
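
To spell out why the check didn't trip: the RPC handshake only rejects a peer whose cluster ID differs from ours, and two clusters restored from the same volume snapshot carry the same persisted cluster ID. A minimal sketch of that kind of validation (hypothetical, not the actual pkg/rpc handshake code):

```go
package main

import "fmt"

// checkClusterID is a toy version of the handshake validation: reject a peer
// whose cluster ID doesn't match ours. It cannot distinguish two physically
// separate clusters that were restored from the same volume snapshot, because
// both carry the same persisted cluster ID.
func checkClusterID(serverID, clientID string) error {
	if serverID != clientID {
		return fmt.Errorf("client cluster ID %q doesn't match server cluster ID %q",
			clientID, serverID)
	}
	return nil // identical IDs: accepted, even if the peer is another cluster
}

func main() {
	const fixtureID = "6d35ecfd-e28f-47b9-9ffc-f94f488fce8a" // baked into the snapshot

	// A peer from an unrelated cluster is correctly rejected.
	fmt.Println(checkClusterID(fixtureID, "1d37197d-b1e3-4ac3-988a-236fefde9e1a"))

	// A peer from another cluster restored from the same snapshot sails through.
	fmt.Println(checkClusterID(fixtureID, fixtureID))
}
```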

Not sure what the best solution here is. I'm going to hand this one over to @irfansharif and @cockroachdb/test-eng.

erikgrinaker added the C-bug, A-testing, and T-testeng labels and removed the release-blocker label on Jun 23, 2023
blathers-crl bot commented on Jun 23, 2023:

cc @cockroachdb/test-eng

irfansharif added a commit to irfansharif/cockroach that referenced this issue Jun 27, 2023
These two roachtests previously attempted to (opportunistically) share
disk snapshots. In cockroachdb#105260 we observed that when the two tests run
simultaneously, the two clusters end up using the same cluster ID, and
there is cross-talk via persisted gossip state, where we record IP
addresses. This commit prevents the snapshot re-use, giving each test
its own snapshot. While here, we reduce the cadence of these tests to
weekly runs, since they have otherwise been non-flaky.

Release note: None
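
The shape of the fix, roughly: derive a per-test snapshot prefix so concurrent tests stop restoring the same snapshot (and with it the same cluster ID). A hypothetical sketch of that idea, with a made-up helper name rather than the actual change in #105639:

```go
package main

import (
	"fmt"
	"strings"
)

// snapshotPrefix derives a volume-snapshot name prefix from the roachtest
// name, so tests running at the same time never restore each other's
// snapshots. Hypothetical helper; the real change lives in roachtest.
func snapshotPrefix(testName string) string {
	// Snapshot names typically can't contain slashes, so sanitize the test name.
	return strings.ReplaceAll(testName, "/", "-") + "-fixture"
}

func main() {
	fmt.Println(snapshotPrefix("admission-control/index-backfill"))
	// "admission-control-index-backfill-fixture": unique per test, so no other
	// test's cluster shares its baked-in cluster ID.
}
```
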
cockroach-teamcity (Member, Author) commented:

roachtest.admission-control/index-backfill failed with artifacts on master @ 4a614f89cea81bf94674d6072c3bbf30502244d4:

(assertions.go:333).Fail: 
	Error Trace:	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpce.go:159
	            				github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/admission_control_index_backfill.go:148
	            				main/pkg/cmd/roachtest/test_runner.go:1060
	            				GOROOT/src/runtime/asm_amd64.s:1594
	Error:      	Received unexpected error:
	            	cluster.Install: COMMAND_PROBLEM: exit status 100
	            	(1) attached stack trace
	            	  -- stack trace:
	            	  | main.(*clusterImpl).Install
	            	  | 	main/pkg/cmd/roachtest/cluster.go:2417
	            	  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.initTPCESpec
	            	  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpce.go:64
	            	  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.runTPCE
	            	  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/tpce.go:158
	            	  | github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests.registerIndexBackfill.func1
	            	  | 	github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/admission_control_index_backfill.go:148
	            	  | main.(*testRunner).runTest.func2
	            	  | 	main/pkg/cmd/roachtest/test_runner.go:1060
	            	  | runtime.goexit
	            	  | 	GOROOT/src/runtime/asm_amd64.s:1594
	            	Wraps: (2) cluster.Install
	            	Wraps: (3) Node 10. Command with error:
	            	  | ``````
	            	  | set -exuo pipefail;
	            	  | sudo apt-get update;
	            	  | sudo apt-get install  -y \
	            	  |     apt-transport-https \
	            	  |     ca-certificates \
	            	  |     curl \
	            	  |     software-properties-common;
	            	  | curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -;
	            	  | sudo add-apt-repository \
	            	  |    "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
	            	  |    $(lsb_release -cs) \
	            	  |    stable";
	            	  |
	            	  | sudo apt-get update;
	            	  | sudo apt-get install  -y docker-ce;
	            	  |
	            	  | ``````
	            	Wraps: (4) COMMAND_PROBLEM
	            	Wraps: (5) exit status 100
	            	Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *hintdetail.withDetail (4) errors.Cmd (5) *exec.ExitError
	Test:       	admission-control/index-backfill
(require.go:1360).NoError: FailNow called
test artifacts and logs in: /artifacts/admission-control/index-backfill/run_1

Parameters: ROACHTEST_arch=amd64, ROACHTEST_cloud=gce, ROACHTEST_cpu=8, ROACHTEST_encrypted=false, ROACHTEST_fs=ext4, ROACHTEST_localSSD=false, ROACHTEST_ssd=0

Help

See: roachtest README

See: How To Investigate (internal)

This test on roachdash

erikgrinaker (Contributor) commented:

The last failure is an infra flake:

E: Failed to fetch http://us-east1.gce.archive.ubuntu.com/ubuntu/pool/universe/p/pigz/pigz_2.4-1_amd64.deb  503  Service Unavailable [IP: 35.196.65.164 80]
E: Failed to fetch http://us-east1.gce.archive.ubuntu.com/ubuntu/pool/universe/s/slirp4netns/slirp4netns_0.4.3-1_amd64.deb  503  Service Unavailable [IP: 35.196.65.164 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

craig bot closed this as completed in 4104f66 on Jun 29, 2023
exalate-issue-sync bot removed the T-kv (KV Team) label on Jun 29, 2023
renatolabs (Contributor) commented:

cc #103316 (just to keep track of these occurrences).
