Skip to content
This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Don't present agent-pool-only validation results for full-cluster apimodels #1453

Merged
merged 1 commit into from
Sep 22, 2017

Conversation

itowlson
Copy link
Contributor

What this PR does / why we need it: If a user creates an apimodel for an ACS cluster, but omits a required field such as the service principal client ID, then acs-engine generate produces a spurious error message that the required DNS prefix is missing. This occurs because if an apimodel fails validation then it is retried as an agent-pool-only apimodel, and if it fails validation against that schema, then it is always the agent-pool-only validation error that is returned. Since the APO schema has dnsPrefix as its first required field, and a full cluster has dnsPrefix under masterProfile rather than at the top level, this validation fails on dnsPrefix and so that is the error that gets displayed.

(Alternatively, if the user has specified a stable apiVersion instead of vlabs, they get a spurious error that they are using an unrecognised apiVersion, because the full cluster API versions are not valid for APO.)

An ideal solution to this would be to use different API namespaces, or to add a specType field or something like that, so that the user could clearly indicate the intent of the apimodel JSON. This would be a breaking change for vlabs though. Therefore, this PR takes a heuristic approach: if the apimodel validates as a full cluster or APO, we use that, but if it validates as neither, then we try to infer intent based first on the API version and then on which properties are present in the apimodel JSON. Whichever intent we infer, we report only the error from the appropriate validation.

Which issue this PR fixes: fixes #1427

Special notes for your reviewer: None

Release note:

NONE

@msftclas
Copy link

@itowlson,
Thanks for your contribution as a Microsoft full-time employee or intern. You do not need to sign a CLA.
Thanks,
Microsoft Pull Request Bot

@jackfrancis
Copy link
Member

@itowlson Thanks for this! btw gofmt -s pkg/api/convertertoagentpoolonlyapi_test.go should fix the linting test failure.

@@ -264,3 +266,63 @@ func convertVLabsAgentPoolOnlyCertificateProfile(vlabs *vlabs.CertificateProfile
api.KubeConfigCertificate = vlabs.KubeConfigCertificate
api.KubeConfigPrivateKey = vlabs.KubeConfigPrivateKey
}

func isAgentPoolOnlyCluster(version string, contents []byte) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this function. This is not the right way to check if it's an agent pool only cluster or not.

agentPoolOnlyClusterCount++
}
}
return distinctivePropertyCounts{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed as well.

Safer/guaranteed way is:
If MasterProfile != nil or it contains property MasterProfile then it's not a agent pool only cluster. Relying on count could break the logic in future because e.g. extensionProfiles could be added to agentpoolonly cluster and accessProfiles could be added to full cluster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @amanohar for the advice and feedback. If we can use masterProfile alone then that's much nicer - it greatly simplifies the logic. I've made that change; please let me know if you would like to see further changes.

switch version {
case v20170831.APIVersion:
managedCluster := &v20170831.ManagedCluster{}
if e := json.Unmarshal(contents, &managedCluster); e != nil {
return nil, e
return nil, true, e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not impossible that ManagedCluster and ContainerService types could have same API versions so relying on API version to return true is error prone.

@@ -42,8 +42,15 @@ func (a *Apiloader) DeserializeContainerService(contents []byte, validate bool,
version := m.APIVersion
service, err := a.LoadContainerService(contents, version, validate, existingContainerService)
if service == nil || err != nil {
log.Infof("Error returned by LoadContainerService: %+v. Attempting to load container service using LoadContainerServiceForAgentPoolOnlyCluster", err)
service, err = a.LoadContainerServiceForAgentPoolOnlyCluster(contents, version, validate)
service, matchedAgentPoolOnly, agentPoolErr := a.LoadContainerServiceForAgentPoolOnlyCluster(contents, version, validate)
Copy link
Contributor

@amanohar amanohar Sep 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My recommendation would be to update isAgentPoolOnlyClusterFromProperties to check if MasterProfile is present. If it is not present call: LoadContainerServiceForAgentPoolOnlyCluster else call: LoadContainerService.

Regarding:

This occurs because if an apimodel fails validation then it is retried as an agent-pool-only apimodel, and if it fails validation against that schema, then it is always the agent-pool-only validation error that is returned.

This is not entirely correct from ACS Engine perspective since generate is a command, error is printed and not swallowed. But I understand that it could be confusing. ACS Engine cmds are commonly used only for ContainerService type (and not agent pool only type) so it makes sense to optimize for that.

@itowlson itowlson force-pushed the fix-spurious-dnsprefix-errors branch 2 times, most recently from 8e9fb6f to be62af4 Compare September 21, 2017 00:10
amanohar
amanohar previously approved these changes Sep 21, 2017
@jackfrancis
Copy link
Member

@itowlson Linter is complaining (!):

pkg/api/convertertoagentpoolonlyapi.go:270:6:warning: func isAgentPoolOnlyClusterJson should be isAgentPoolOnlyClusterJSON (golint)

Go idioms generally prefer upper-casing all acronyms like NASA in a camelCase setting (e.g., iLoveNASA) in favor of imposing camel-case upon an acronym (e.g., iLoveNasa).

@itowlson
Copy link
Contributor Author

@jackfrancis Aargh, sorry about that, too long in .NET-land! I ran gofmt but not make test-style - my bad. Fix pushed, will check CI results.

@itowlson
Copy link
Contributor Author

@amanohar @jackfrancis Does this need re-review since I pushed the linting fix?

Copy link
Member

@jackfrancis jackfrancis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jackfrancis
Copy link
Member

@itowlson You're good but github review process doesn't love commit squashing ;)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

acs-engine 0.6.0 generate fails on DNSPrefix and ServicePrincipalProfile validation
4 participants