Make GHA Suite Tests Deterministic #382

Closed
njtran opened this issue Apr 26, 2021 · 2 comments

njtran commented Apr 26, 2021

Currently, our test suites pass at a high rate when run locally, but on GHA they hit race conditions that fail the tests. We need to investigate:

  • Which parts of our tests are creating the race conditions?
  • Is ginkgo creating these race conditions?
  • Is there something in GHA we can change to decrease this error rate?

njtran commented Jun 23, 2021

Sample output from a GHA test failure that doesn't occur locally (at least for me).

E0623 00:49:32.460681   22179 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 988 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x2aff400, 0x4885580)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xc9
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0xc2
panic(0x2aff400, 0x4885580)
	/opt/hostedtoolcache/go/1.16.3/x64/src/runtime/panic.go:971 +0x499
github.com/awslabs/karpenter/pkg/cloudprovider/aws.(*CloudProvider).Create(0xc0006854d0, 0x38dc858, 0xc000b00e70, 0xc000a2c380, 0xc0009de548, 0x1, 0x1, 0x1, 0x1, 0x1, ...)
	/home/runner/work/karpenter/karpenter/pkg/cloudprovider/aws/cloudprovider.go:143 +0xbd8
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/allocation.(*Controller).Reconcile(0xc0006b8140, 0x38dc858, 0xc000b00e70, 0x3903558, 0xc000a2c380, 0x38adb80, 0xc000a2c380, 0x2df5346, 0x6)
	/home/runner/work/karpenter/karpenter/pkg/controllers/provisioning/v1alpha1/allocation/controller.go:106 +0x97c
github.com/awslabs/karpenter/pkg/controllers.(*GenericController).Reconcile(0xc0006a61a0, 0x38dc858, 0xc000b00e70, 0xc000b9b859, 0x7, 0xc000b9b840, 0xc, 0xc000b00e70, 0x2cc8b29b30eda406, 0x0, ...)
	/home/runner/work/karpenter/karpenter/pkg/controllers/controller.go:61 +0x3c8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00014d540, 0x38dc7b0, 0xc000089e00, 0x2b9d420, 0xc000937400)
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263 +0x447
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00014d540, 0x38dc7b0, 0xc000089e00, 0xc00058af00)
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x369
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1(0x38dc7b0, 0xc000089e00)
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:198 +0x65
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0x4f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000406f50)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x76
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0008ebf50, 0x38a1740, 0xc000a1b110, 0xc0000ce201, 0xc0000230e0)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xbb
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000406f50, 0x3b9aca00, 0x0, 0xc000406f01, 0xc0000230e0)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x115
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x38dc7b0, 0xc000089e00, 0xc000957c20, 0x3b9aca00, 0x0, 0x1)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0xb4
k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x38dc7b0, 0xc000089e00, 0xc000957c20, 0x3b9aca00)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99 +0x65
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:195 +0x785
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x285cb38]

goroutine 988 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x16d
panic(0x2aff400, 0x4885580)
	/opt/hostedtoolcache/go/1.16.3/x64/src/runtime/panic.go:971 +0x499
github.com/awslabs/karpenter/pkg/cloudprovider/aws.(*CloudProvider).Create(0xc0006854d0, 0x38dc858, 0xc000b00e70, 0xc000a2c380, 0xc0009de548, 0x1, 0x1, 0x1, 0x1, 0x1, ...)
	/home/runner/work/karpenter/karpenter/pkg/cloudprovider/aws/cloudprovider.go:143 +0xbd8
github.com/awslabs/karpenter/pkg/controllers/provisioning/v1alpha1/allocation.(*Controller).Reconcile(0xc0006b8140, 0x38dc858, 0xc000b00e70, 0x3903558, 0xc000a2c380, 0x38adb80, 0xc000a2c380, 0x2df5346, 0x6)
	/home/runner/work/karpenter/karpenter/pkg/controllers/provisioning/v1alpha1/allocation/controller.go:106 +0x97c
github.com/awslabs/karpenter/pkg/controllers.(*GenericController).Reconcile(0xc0006a61a0, 0x38dc858, 0xc000b00e70, 0xc000b9b859, 0x7, 0xc000b9b840, 0xc, 0xc000b00e70, 0x2cc8b29b30eda406, 0x0, ...)
	/home/runner/work/karpenter/karpenter/pkg/controllers/controller.go:61 +0x3c8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc00014d540, 0x38dc7b0, 0xc000089e00, 0x2b9d420, 0xc000937400)
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:263 +0x447
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc00014d540, 0x38dc7b0, 0xc000089e00, 0xc00058af00)
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235 +0x369
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.1(0x38dc7b0, 0xc000089e00)
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:198 +0x65
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1()
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0x4f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000406f50)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:155 +0x76
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc0008ebf50, 0x38a1740, 0xc000a1b110, 0xc0000ce201, 0xc0000230e0)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:156 +0xbb
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000406f50, 0x3b9aca00, 0x0, 0xc000406f01, 0xc0000230e0)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:133 +0x115
k8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext(0x38dc7b0, 0xc000089e00, 0xc000957c20, 0x3b9aca00, 0x0, 0x1)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:185 +0xb4
k8s.io/apimachinery/pkg/util/wait.UntilWithContext(0x38dc7b0, 0xc000089e00, 0xc000957c20, 0x3b9aca00)
	/home/runner/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:99 +0x65
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:195 +0x785
path is /home/runner/work/karpenter/karpenter
Unable to read coverage file  to combine, open : no such file or directory

Ginkgo ran 2 suites in 2m18.968975386s
Test Suite Failed
make: *** [Makefile:29: battletest] Error 1
Error: Process completed with exit code 2.

njtran commented Jun 23, 2021

Lines in question:

cloudprovider.go#L143: node.Labels = packing.Constraints.Labels
allocation/controller.go#L106: packedNodes, err := c.cloudProvider.Create(ctx, provisioner, packings)
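
The trace does not say which value on line 143 was nil. It could be `node`, `packing`, or `packing.Constraints` if that field is a pointer. As a minimal sketch, assuming simplified stand-in types (`Node`, `Packing`, `Constraints` and `applyLabels` below are hypothetical and do not match Karpenter's real definitions), a guard like this would turn the panic into an error that names the missing piece:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the real types; the actual definitions differ.
type Constraints struct {
	Labels map[string]string
}

type Packing struct {
	Constraints *Constraints
}

type Node struct {
	Labels map[string]string
}

// applyLabels copies labels from a packing onto a node. It returns an error
// instead of panicking when any pointer in the chain is nil.
func applyLabels(node *Node, packing *Packing) error {
	if node == nil {
		return errors.New("node is nil")
	}
	if packing == nil {
		return errors.New("packing is nil")
	}
	if packing.Constraints == nil {
		return errors.New("packing.Constraints is nil")
	}
	node.Labels = packing.Constraints.Labels
	return nil
}

func main() {
	node := &Node{}

	// A packing with no constraints would panic if dereferenced blindly.
	if err := applyLabels(node, &Packing{}); err != nil {
		fmt.Println("skipped:", err)
	}

	good := &Packing{Constraints: &Constraints{Labels: map[string]string{"zone": "a"}}}
	if err := applyLabels(node, good); err == nil {
		fmt.Println("labels:", node.Labels)
	}
}
```

A guard only surfaces the symptom. If the nil value comes from a fake or fixture that another test resets concurrently, the underlying race still needs to be fixed.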

njtran closed this as completed Sep 7, 2021
gfcroft pushed a commit to gfcroft/karpenter-provider-aws that referenced this issue Nov 25, 2023