Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: stuck rollout when 2nd deployment happens before 1st finishes #3354

Merged
merged 1 commit into from
Feb 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion rollout/canary.go
Original file line number Diff line number Diff line change
Expand Up @@ -184,7 +184,7 @@ func (c *rolloutContext) scaleDownOldReplicaSetsForCanary(oldRSs []*appsv1.Repli

annotationedRSs := int32(0)
for _, targetRS := range oldRSs {
if c.isReplicaSetReferenced(targetRS) {
if c.rollout.Spec.Strategy.Canary.TrafficRouting != nil && c.isReplicaSetReferenced(targetRS) {
// We might get here if user interrupted an an update in order to move back to stable.
c.log.Infof("Skip scale down of older RS '%s': still referenced", targetRS.Name)
continue
Expand Down
17 changes: 17 additions & 0 deletions test/e2e/canary_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -565,6 +565,23 @@ func (s *CanarySuite) TestCanaryScaleDownOnAbortNoTrafficRouting() {
ExpectRevisionPodCount("2", 0)
}

func (s *CanarySuite) TestCanaryWithPausedRollout() {
(s.Given().
HealthyRollout(`@functional/rollout-canary-with-pause.yaml`).
When().
ApplyManifests().
MarkPodsReady("1", 3). // mark all 3 pods ready
WaitForRolloutStatus("Healthy").
UpdateSpec(). // update to revision 2
WaitForRolloutStatus("Paused").
UpdateSpec(). // update to revision 3
WaitForRolloutStatus("Paused").
Then().
ExpectRevisionPodCount("1", 3).
ExpectRevisionPodCount("2", 0).
ExpectRevisionPodCount("3", 1))
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashutosh16 I ran the e2e tests with the fix removed, and this test passed. So I don't think this is a valid test? (at least for avoiding regressions)

Copy link
Contributor Author

@ashutosh16 ashutosh16 Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right the e2e does not simulate the scenario correctly. I looked into e2e function UpdateSpec() and it seems it resume the rollout and then update to new revision which doesn't cover this scenario. I'll look into it further.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the testcase. It should fail without the fix.

func (s *CanarySuite) TestCanaryUnScaleDownOnAbort() {
s.Given().
HealthyRollout(`@functional/canary-unscaledownonabort.yaml`).
Expand Down
77 changes: 77 additions & 0 deletions test/e2e/functional/rollout-canary-with-pause.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
apiVersion: v1
kind: Service
metadata:
name: rollout-canary-with-pause-root
spec:
type: NodePort
ports:
- port: 80
targetPort: http
protocol: TCP
name: http
selector:
app: rollout-canary-with-pause
---
apiVersion: v1
kind: Service
metadata:
name: rollout-canary-with-pause-canary
spec:
type: NodePort
ports:
- port: 80
targetPort: http
protocol: TCP
name: http
selector:
app: rollout-canary-with-pause
---
apiVersion: v1
kind: Service
metadata:
name: rollout-canary-with-pause-stable
spec:
type: NodePort
ports:
- port: 80
targetPort: http
protocol: TCP
name: http
selector:
app: rollout-canary-with-pause
---
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: rollout-canary-with-pause
spec:
replicas: 3
revisionHistoryLimit: 3
progressDeadlineSeconds: 5
selector:
matchLabels:
app: rollout-canary-with-pause
template:
metadata:
labels:
app: rollout-canary-with-pause
spec:
containers:
- name: rollouts-demo
image: nginx:1.19-alpine
ports:
- containerPort: 80
readinessProbe:
initialDelaySeconds: 10
httpGet:
path: /
port: 80
periodSeconds: 30
strategy:
canary:
canaryService: rollout-canary-with-pause-canary
stableService: rollout-canary-with-pause-stable
steps:
- setWeight: 20
- pause: {}

Loading