This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Delete role assignments when deleting a VM #2934

Merged
jackfrancis merged 8 commits into Azure:master on May 23, 2018

Conversation

@croeck
Contributor

croeck commented May 13, 2018

What this PR does / why we need it:

See #2916 for a detailed description of the error.

In short: role assignments are not deleted when a VM is deleted. With managed identities, this causes an error during the rollout of a new node. This PR deletes all role assignments associated with the VM being deleted.

Implementation note:

I decided to pass the subscription ID down to where it is needed for the scope calculation. If you dislike this approach, I think we could also work without it; the listing of role assignments would then be based only on the principal ID of the VM.
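
The two options in this implementation note can be sketched as follows. This is a minimal, self-contained illustration rather than the code from this PR: the `roleAssignment` struct, the helper names, and the value of the scope template are assumptions (the PR references `AADRoleResourceGroupScopeTemplate` without showing its value), and an in-memory slice stands in for the Azure role assignment listing.

```go
package main

import "fmt"

// Assumed format string; the PR uses AADRoleResourceGroupScopeTemplate,
// but its value is not shown in this conversation.
const scopeTemplate = "/subscriptions/%s/resourceGroups/%s"

// roleAssignment is a hypothetical, simplified stand-in for the Azure type.
type roleAssignment struct {
	ID          string
	Scope       string
	PrincipalID string
}

// resourceGroupScope builds the scope string from the subscription ID
// and the resource group name.
func resourceGroupScope(subscriptionID, resourceGroup string) string {
	return fmt.Sprintf(scopeTemplate, subscriptionID, resourceGroup)
}

// assignmentsToDelete returns the assignments held by principalID.
// If scope is non-empty, only assignments at exactly that scope match;
// an empty scope means "filter by principal ID only".
func assignmentsToDelete(all []roleAssignment, principalID, scope string) []roleAssignment {
	var matched []roleAssignment
	for _, ra := range all {
		if ra.PrincipalID != principalID {
			continue
		}
		if scope != "" && ra.Scope != scope {
			continue
		}
		matched = append(matched, ra)
	}
	return matched
}

func main() {
	scope := resourceGroupScope("sub-1", "my-rg")
	all := []roleAssignment{
		{ID: "ra-1", Scope: scope, PrincipalID: "vm-principal"},
		{ID: "ra-2", Scope: "/subscriptions/sub-1", PrincipalID: "vm-principal"},
		{ID: "ra-3", Scope: scope, PrincipalID: "other-principal"},
	}

	// Scoped to the resource group: only ra-1 matches.
	fmt.Println(len(assignmentsToDelete(all, "vm-principal", scope))) // prints 1
	// Principal ID only: ra-1 and ra-2 match.
	fmt.Println(len(assignmentsToDelete(all, "vm-principal", ""))) // prints 2
}
```

In this sketch, filtering by principal ID alone is the broader option: it also matches assignments the identity holds outside the resource group.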

Which issue this PR fixes:

fixes #2916

Special notes for your reviewer:

Existing tests were adjusted to work with these changes, but no new tests were created.

If applicable:

  • documentation
  • unit tests
  • tested backward compatibility (i.e. deploy with previous version, upgrade with this branch)

@msftclas

msftclas commented May 13, 2018

CLA assistant check
All CLA requirements met.

@jackfrancis
Member

@dmitsh Can you take a quick look? I'll run upgrade/scale tests against it as well. Thanks!

dmitsh previously approved these changes May 14, 2018
// The role assignments should only be relevant if managed identities are used,
// but always cleaning them up is easier than adding rule based logic here and there.
scope := fmt.Sprintf(AADRoleResourceGroupScopeTemplate, subscriptionID, resourceGroup)
logger.Infof("fetching roleAssignments: %s with principal %s", scope, *vm.Identity.PrincipalID)
Member

Ran an upgrade test against this branch and got the following nil pointer panic:

time="2018-05-15T19:03:56Z" level=info msg="deleting managed disk: kubernetes-koreasouth-74251/k8s-master-15172440-0_OsDisk_1_dc7dab13d2354a4f875595aa83b99d3c"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x138a30f]

goroutine 1 [running]:
github.com/Azure/acs-engine/pkg/operations.CleanDeleteVirtualMachine(0x186eca0, 0xc4205f4900, 0xc42031c780, 0xc4208ade60, 0x24, 0x7fff98924da7, 0x1b, 0xc42034bc60, 0x15, 0x148c120, ...)
	/go/src/github.com/Azure/acs-engine/pkg/operations/deletevm.go:98 +0xa2f
github.com/Azure/acs-engine/pkg/operations/kubernetesupgrade.(*UpgradeMasterNode).DeleteNode(0xc4201622a0, 0xc42086eb20, 0xc42034bc00, 0x15, 0x0)
	/go/src/github.com/Azure/acs-engine/pkg/operations/kubernetesupgrade/upgrademasternode.go:39 +0x7f
github.com/Azure/acs-engine/pkg/operations/kubernetesupgrade.(*Upgrader).upgradeMasterNodes(0xc4208be210, 0x148c101, 0xc4208be210)
	/go/src/github.com/Azure/acs-engine/pkg/operations/kubernetesupgrade/upgrader.go:137 +0xa98
github.com/Azure/acs-engine/pkg/operations/kubernetesupgrade.(*Upgrader).RunUpgrade(0xc4208be210, 0x0, 0x0)
	/go/src/github.com/Azure/acs-engine/pkg/operations/kubernetesupgrade/upgrader.go:54 +0x2f
github.com/Azure/acs-engine/pkg/operations/kubernetesupgrade.(*UpgradeCluster).UpgradeCluster(0xc420a13c60, 0x804f1c7d6b541430, 0x6afe76994af22385, 0xc420b40800, 0x26be, 0x7fff98924da7, 0x1b, 0xc42073b6e0, 0xc420ac9940, 0x8, ...)
	/go/src/github.com/Azure/acs-engine/pkg/operations/kubernetesupgrade/upgradecluster.go:115 +0x660
github.com/Azure/acs-engine/cmd.(*upgradeCmd).run(0xc4200914a0, 0xc42016a900, 0xc420487d00, 0x0, 0x10, 0x0, 0x0)
	/go/src/github.com/Azure/acs-engine/cmd/upgrade.go:225 +0x455
github.com/Azure/acs-engine/cmd.newUpgradeCmd.func1(0xc42016a900, 0xc420487d00, 0x0, 0x10, 0x0, 0x0)
	/go/src/github.com/Azure/acs-engine/cmd/upgrade.go:58 +0x52
github.com/Azure/acs-engine/vendor/github.com/spf13/cobra.(*Command).execute(0xc42016a900, 0xc420487c00, 0x10, 0x10, 0xc42016a900, 0xc420487c00)
	/go/src/github.com/Azure/acs-engine/vendor/github.com/spf13/cobra/command.go:647 +0x3e4
github.com/Azure/acs-engine/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc42021b8c0, 0xc42016ab40, 0xc42016a900, 0xc42016a480)
	/go/src/github.com/Azure/acs-engine/vendor/github.com/spf13/cobra/command.go:726 +0x2d4
github.com/Azure/acs-engine/vendor/github.com/spf13/cobra.(*Command).Execute(0xc42021b8c0, 0xc42000e018, 0x13a4b83)
	/go/src/github.com/Azure/acs-engine/vendor/github.com/spf13/cobra/command.go:685 +0x2b
main.main()
	/go/src/github.com/Azure/acs-engine/main.go:12 +0x74

…le if no managed identity is returned by azure
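
The trace above points at `deletevm.go:98`, and the review comment is attached to the line that dereferences `*vm.Identity.PrincipalID`, so the panic is most likely caused by a VM for which Azure returned no managed identity. A minimal sketch of a guard for that case follows; the types and the `principalIDOf` helper are hypothetical stand-ins for illustration, not the code merged in this PR.

```go
package main

import "fmt"

// vmIdentity and virtualMachine are simplified, hypothetical stand-ins
// for the Azure SDK types, which use pointers for optional fields.
type vmIdentity struct {
	PrincipalID *string
}

type virtualMachine struct {
	Name     string
	Identity *vmIdentity
}

// principalIDOf returns the managed identity principal ID and true,
// or "" and false when the VM has no usable managed identity.
// Checking each pointer before dereferencing avoids the nil pointer panic.
func principalIDOf(vm virtualMachine) (string, bool) {
	if vm.Identity == nil || vm.Identity.PrincipalID == nil || *vm.Identity.PrincipalID == "" {
		return "", false
	}
	return *vm.Identity.PrincipalID, true
}

func main() {
	id := "11111111-2222-3333-4444-555555555555" // made-up example value
	vms := []virtualMachine{
		{Name: "with-identity", Identity: &vmIdentity{PrincipalID: &id}},
		{Name: "without-identity"}, // Identity is nil
	}

	for _, vm := range vms {
		principalID, ok := principalIDOf(vm)
		if !ok {
			fmt.Printf("%s: no managed identity, skipping role assignment cleanup\n", vm.Name)
			continue
		}
		fmt.Printf("%s: would clean up role assignments for principal %s\n", vm.Name, principalID)
	}
}
```

Skipping the cleanup when no identity is present keeps the VM deletion path working for clusters that do not use managed identities.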
@codecov

codecov bot commented May 19, 2018

Codecov Report

Merging #2934 into master will decrease coverage by 0.06%.
The diff coverage is 28.57%.

@@            Coverage Diff             @@
##           master    #2934      +/-   ##
==========================================
- Coverage   49.72%   49.66%   -0.07%     
==========================================
  Files          91       91              
  Lines       13998    14027      +29     
==========================================
+ Hits         6961     6967       +6     
- Misses       6394     6415      +21     
- Partials      643      645       +2

| Impacted Files | Coverage Δ |
|---|---|
| cmd/scale.go | 0% <0%> (ø) ⬆️ |
| pkg/armhelpers/graph.go | 0% <0%> (ø) ⬆️ |
| .../operations/kubernetesupgrade/upgrademasternode.go | 43.13% <0%> (ø) ⬆️ |
| pkg/armhelpers/mockclients.go | 15.29% <0%> (-0.54%) ⬇️ |
| pkg/operations/kubernetesupgrade/upgrader.go | 57.85% <100%> (+0.35%) ⬆️ |
| pkg/operations/scaledownagentpool.go | 100% <100%> (ø) ⬆️ |
| ...g/operations/kubernetesupgrade/upgradeagentnode.go | 53.84% <100%> (ø) ⬆️ |
| pkg/operations/kubernetesupgrade/upgradecluster.go | 47.22% <100%> (+0.36%) ⬆️ |
| pkg/operations/deletevm.go | 45.31% <30.76%> (-4.69%) ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5e0ff5b...47e03a6. Read the comment docs.

acs-bot added the size/L label May 19, 2018
@croeck
Contributor Author

croeck commented May 19, 2018

@jackfrancis I just added two test cases that should at least partly cover the issue you experienced. Could you please:

  • verify the added test cases
  • check again with your own test setup

@jackfrancis
Member

/approve
/lgtm

@acs-bot

acs-bot commented May 23, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

jackfrancis merged commit 055f2b5 into Azure:master May 23, 2018
Development

Successfully merging this pull request may close these issues.

acs-engine upgrade fails: RoleAssignmentUpdateNotPermitted
5 participants