Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AKS : Advanced Networking / Calico Network Policy #2987

Merged
merged 2 commits into from
Mar 21, 2019

Conversation

thatInfrastructureGuy
Copy link
Contributor

This PR aims to add Calico Network policy to AKS as described in MS Docs.

Added network_policy to kubernetes_cluster resource.

PR Inspiration: #1479

Signed-off-by: thatInfrastructureGuy <[email protected]>
Copy link
Collaborator

@katbyte katbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @thatInfrastructureGuy,

Thank you for this PR! i've left some mostly minor comments inline but overall this is looking pretty good. Once those few issues are addressed and tests pass I think this should be good to merge 🙂

azurerm/resource_arm_kubernetes_cluster.go Outdated Show resolved Hide resolved
azurerm/resource_arm_kubernetes_cluster.go Outdated Show resolved Hide resolved
azurerm/data_source_kubernetes_cluster.go Outdated Show resolved Hide resolved
azurerm/resource_arm_kubernetes_cluster_test.go Outdated Show resolved Hide resolved
azurerm/resource_arm_kubernetes_cluster_test.go Outdated Show resolved Hide resolved
website/docs/r/kubernetes_cluster.html.markdown Outdated Show resolved Hide resolved
…essary. Removed check for calico in CustomizeDiff block.

Signed-off-by: thatInfrastructureGuy <[email protected]>
Copy link
Collaborator

@katbyte katbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @thatInfrastructureGuy,

Thank you for the changes, this LGTM now. However the tests are failing:

------- Stdout: -------
=== RUN   TestAccAzureRMKubernetesCluster_advancedNetworkingAzureCalicoPolicy
=== PAUSE TestAccAzureRMKubernetesCluster_advancedNetworkingAzureCalicoPolicy
=== CONT  TestAccAzureRMKubernetesCluster_advancedNetworkingAzureCalicoPolicy
--- FAIL: TestAccAzureRMKubernetesCluster_advancedNetworkingAzureCalicoPolicy (218.11s)
    testing.go:538: Step 0 error: Error applying: 1 error occurred:
        	* azurerm_kubernetes_cluster.test: 1 error occurred:
        	* azurerm_kubernetes_cluster.test: Error creating/updating Managed Kubernetes Cluster "acctestaks190305190900741436" (Resource Group "acctestRG-190305190900741436"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="Parameter networkProfile.networkPolicy is not allowed."

Are there any steps we need to enable networkPolicy on our subscription?

Also getting many of these:

* azurerm_virtual_network.test: 1 error occurred:
        	* azurerm_virtual_network.test: Error waiting for completion of Virtual Network "acctestvirtnet190305190900434025" (Resource Group "acctestRG-190305190900434025"): Code="InternalServerError" Message="An error occurred." Details=[]

@thatInfrastructureGuy
Copy link
Contributor Author

@katbyte Thanks for reverting back. Yes. To enable network policy in your subscription please follow:

# To create an AKS with network policy, first enable a feature flag on your subscription. To register the EnableNetworkPolicy feature flag, use the az feature register command as shown in the following example:
az feature register --name EnableNetworkPolicy --namespace Microsoft.ContainerService

# It takes a few minutes for the status to show Registered. You can check on the registration status using the az feature list command:
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/EnableNetworkPolicy')].{Name:name,State:properties.state}"

# When ready, refresh the registration of the Microsoft.ContainerService resource provider using the az provider register command:
az provider register --namespace Microsoft.ContainerService

Source : https://docs.microsoft.com/en-us/azure/aks/use-network-policies#before-you-begin

@thatInfrastructureGuy
Copy link
Contributor Author

thatInfrastructureGuy commented Mar 5, 2019

Also getting many of these:

* azurerm_virtual_network.test: 1 error occurred:
        	* azurerm_virtual_network.test: Error waiting for completion of Virtual Network "acctestvirtnet190305190900434025" (Resource Group "acctestRG-190305190900434025"): Code="InternalServerError" Message="An error occurred." Details=[]

I believe this might be related to above comment as VNET creation is not getting completed because of "unknown parameter" networkProfile.networkPolicy.

@thatInfrastructureGuy
Copy link
Contributor Author

@katbyte Is there anything I can assist with? Can you please guide me for the documentation of labels. Eg: What does waiting-response stand for?

Thanks for helping me out!

@ghost ghost removed the waiting-response label Mar 5, 2019
@jfcoz
Copy link

jfcoz commented Mar 7, 2019

@katbyte , is there something more to change ?

@RichardFowles89
Copy link

Hi guys, do you have any idea when this will be available for general release? We want to implement it at work.

Thanks!

@dannydombrowski
Copy link

Hey @katbyte it has been 15 days since your last activity on this issue. Could you please let us know if there is anything else left before merging this issue?

@katbyte katbyte added this to the v1.24.0 milestone Mar 21, 2019
@katbyte
Copy link
Collaborator

katbyte commented Mar 21, 2019

Sorry for the delay here @dannydombrowski, @RichardFowles89, @jfcoz & @thatInfrastructureGuy our attention was elsewhere.

I have verified that the tests now pass for us and this will get into v1.24 🙂

@katbyte katbyte merged commit a7f35de into hashicorp:master Mar 21, 2019
katbyte added a commit that referenced this pull request Mar 21, 2019
@RichardFowles89
Copy link

That's great - thank you! I will keep my eye out.

@thatInfrastructureGuy thatInfrastructureGuy deleted the calicoPlugin branch March 29, 2019 16:58
@ghost
Copy link

ghost commented Apr 3, 2019

This has been released in version 1.24.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
	version = "~> 1.24.0"
}
# ... other configuration ...

@RichardFowles89
Copy link

Great! Thank you for letting me know

@soggychipsnz
Copy link

Has anyone had success with provisioning a network-policy enabled AKS cluster with the new provider?

When attempting to provision an AKS cluster with this enabled, my deployment eventually fails with:
azurerm_kubernetes_cluster.aks_cluster: Error waiting for completion of Managed Kubernetes Cluster "REMOVED" (Resource Group REMOVED"): Code="NodesNotReady" Message=""

The same state deploys fine without the network_policy enabled.
RBAC and the Azure network plugin are set, also using 1.24.0 of the azurerm provider.
The network_profile network policy is being set to the value of "calico"
If I diff the exported ARM template of the failed deployment with one that has been successfully provisioned via ARM template, nothing stands out as incorrect.

Any assistance on this one or confirmation that it is working for others would be appreciated.

@thatInfrastructureGuy
Copy link
Contributor Author

@soggychipsnz I am running into same issue. I redeployed (not restarted) the VMs and did a terraform apply again to fix it.

@soggychipsnz
Copy link

Thanks for confirming, I was wondering if I was missing something obvious.
How does one redeploy individual nodes from a managed container service? (assuming the nodes are the VMs you are referring to)

I have opened a case with MS to check out the failed instance as well just incase they have anything to add.

@thatInfrastructureGuy
Copy link
Contributor Author

thatInfrastructureGuy commented Apr 8, 2019

Just navigate to MC_${RESOURCE_GROUP}_${CLUSTER_NAME} resource group. On each VM; click the redeploy button available in the portal.

Screen Shot 2019-04-08 at 10 58 07 AM

@soggychipsnz
Copy link

Cheers for the feedback. I'll give that a go after I get closure from my case I have open with MS as they might be able to provide more understanding as to why this issue is happening.

@thatInfrastructureGuy
Copy link
Contributor Author

@soggychipsnz MicrosoftDocs/azure-docs#28567

Seems like azure backend issue which is currently being fixed. No ETA.

@feiskyer
Copy link

feiskyer commented Apr 9, 2019

Yep, this is bug for v1.12.x. versions below that are still working

@thatInfrastructureGuy
Copy link
Contributor Author

thatInfrastructureGuy commented Apr 9, 2019

@feiskyer Thanks for the update. Is there a recommended path forward for users? Eg:

  1. Using AKS 1.12 with azure-npm // OR
  2. AKS <= 1.11 with Calico.

Which network plugin is likely to graduate to GA?

@feiskyer
Copy link

feiskyer commented Apr 9, 2019

@thatInfrastructureGuy yep, both of those are working. it's up to users to decide which one is preferred.

@ghost
Copy link

ghost commented Apr 20, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Apr 20, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants