Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CAFV-267] Clutser upgrade test automation #475

Merged
merged 6 commits into from
Jul 13, 2023
Merged

Conversation

ymo24
Copy link
Contributor

@ymo24 ymo24 commented Jul 10, 2023

Description

Please provide a brief description of the changes proposed in this Pull Request

Checklist

  • tested locally
  • updated any relevant dependencies
  • updated any relevant documentation or examples

API Changes

Are there API changes?

  • Yes
  • No

If yes, please fill in the below

  1. Updated conversions?
    • Yes
    • No
    • N/A
  2. Updated CRDs?
    • Yes
    • No
    • N/A
  3. Updated infrastructure-components.yaml?
    • Yes
    • No
    • N/A
  4. Updated ./examples/capi-quickstart.yaml?
    • Yes
    • No
    • N/A
  5. Updated necessary files under ./infrastructure-vcd/v1.0.0/?
    • Yes
    • No
    • N/A

Issue

If applicable, please reference the relevant issue

Fixes #


This change is Reviewable

ymo24 added 3 commits July 5, 2023 12:48
Signed-off-by: ymo24 <[email protected]>
Signed-off-by: ymo24 <[email protected]>
Copy link
Contributor

@rliang88 rliang88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 7 files at r1, all commit messages.
Reviewable status: 2 of 7 files reviewed, all discussions resolved (waiting on @Anirudh9794 and @sahithi)

Copy link
Contributor

@Anirudh9794 Anirudh9794 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @sahithi)

Copy link
Contributor

@Anirudh9794 Anirudh9794 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @sahithi and @ymo24)


tests/e2e/utils/cluster_upgrade_utils.go line 375 at r1 (raw file):

			break
		case VCDMachineTemplate:
			specMap, err := testingsdk.GetMapBySpecName(unstructuredObj.Object, "spec", "VCDMachineTemplate")

We should ideally create a new VCDMachineTemplate object and change the reference in the Machine Deployment and KCP.
But I am okay with the current changes.
I don't know if we are planning to use VCDMachineTemplate this way. Can you please add a TODO for this ?

Copy link
Collaborator

@sahithi sahithi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 7 files at r1, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @Anirudh9794 and @ymo24)


tests/e2e/e2e_suite_test.go line 13 at r1 (raw file):

	PathToMngmntClusterKubecfg    string
	PathToWorkloadClusterCapiYaml string
	TargetTKGVersion              string

It is not clear if these variables TargetTKGVersion and TargetK8SVersion are meant to be used for cluster creation (or) cluster upgrades.

Please rename them to TargetTKGVersionForUpgradeTest, TargetK8sVersionForUpgradeTest


tests/e2e/workload_cluster_upgrade_test.go line 32 at r1 (raw file):

var _ = Describe("Workload Cluster Life cycle management", func() {
	When("create, upgrade, and delete the workload cluster", func() {

For my benefit, what is the purpose of this "When"?


tests/e2e/utils/cluster_upgrade_utils.go line 68 at r1 (raw file):

// @param currTkgVersion - the cluster's current tkg version
// @param tkgMap - tkg_map.json, but marshalled into map[string]interface{}
// @return map[string]interface{} - map of all possible upgrade paths; it is a subset of tkgMap

Can you mention an example of return map? It is a good practice to identify key-value pairs by an example.


tests/e2e/utils/cluster_upgrade_utils.go line 150 at r1 (raw file):

		}

		tkgVersionIsGreater, err := isGreaterThanOrEqual(tkgVersionStr, currTkgVersion)

you can rename this to isTkgVersionCompatible.

TKG version need not be greater.


tests/e2e/utils/cluster_upgrade_utils.go line 177 at r1 (raw file):

func getRDEUpgradeResources(supportedUpgrades map[string]interface{}, allVappTemplates []*types.QueryResultVappTemplateType, targetK8sVersion, targetTkgVersion string) (string, *types.QueryResultVappTemplateType, error) {
	// Create a map to store the tkgOva data based on the k8s version
	tkgOvaMap := make(map[string]map[string]interface{})

Please provide examples on how these maps will look like.
for example, what is the key - tkg version? K8s version? ova name? and what is the value?

I had to read through the entire method and then tkgmap.json to find that out.


tests/e2e/utils/cluster_upgrade_utils.go line 252 at r1 (raw file):

// - string: The Kubernetes version that the cluster is being upgraded to.
// - error: nil if there is no error, otherwise the encountered error.
func getCapiYamlAfterUpgrade(ctx context.Context, r runtimeclient.Client, capiYaml string, tkgMapKey string, tkgMap map[string]interface{}, vappTemplate *types.QueryResultVappTemplateType) (string, error) {

Method name and return value do not match.

Please change it to ApplyK8sVersionUpgrade()


tests/e2e/utils/cluster_upgrade_utils.go line 375 at r1 (raw file):

Previously, Anirudh9794 wrote…

We should ideally create a new VCDMachineTemplate object and change the reference in the Machine Deployment and KCP.
But I am okay with the current changes.
I don't know if we are planning to use VCDMachineTemplate this way. Can you please add a TODO for this ?

It is fine if testing triggered the upgrades. Just ensure VCDMachineTemplate object is updated first before MachineDeployment


tests/e2e/utils/cluster_upgrade_utils.go line 425 at r1 (raw file):

	currTkgVersion, err := getTkgVersionFromTKGMap(currK8sVersion, tkgMap)
	if err != nil {
		return "", fmt.Errorf("error retrieving TKG version from capiyaml: [%v]", err)

error retrieving TKG version for a given K8s version [%s] from the tkgmap


tests/e2e/utils/cluster_upgrade_utils.go line 428 at r1 (raw file):

	}

	supportedUpgrades, err := getSupportedUpgrades(currK8sVersion, currTkgVersion, tkgMap)

I am unsure if we need to get the entire supported upgrade list, we could pass the target tkg version and target K8s version also into just one method, and the output could be tkgmapKey and vAppTemplate.

I don't see a need for two methods, "getSupportedUpgrades" and "getRdeUpgradeResources".

It is okay for now, please create a task to refactor this. Please don't hesitate to chat if you think I am missing something.

We just want one reusable method

  • getTKGInfoForAGivenK8sAndTKGVersion // get all the tkg details for a given K8s and tkg version
    // parses through entire tkg json and outputs the map (tkg details)

And then see if that ova exists in vApp templates.


tests/e2e/utils/cluster_upgrade_utils.go line 439 at r1 (raw file):

	updatedK8sVersion, err := getCapiYamlAfterUpgrade(ctx, r, capiYaml, tkgMapKey, tkgMap, vappTemplate)
	if err != nil {
		return "", fmt.Errorf("error constructing new capiyaml: [%v]", err)

error applying the modified objects of KCP, MD, VCDMachineTemplate for upgrade operation


tests/e2e/utils/cluster_upgrade_utils.go line 454 at r1 (raw file):

	testClient *testingsdk.TestClient, runtimeClient runtimeclient.Client, targetK8sVersion string,
) error {
	timeout := timeoutMinutes * time.Minute

what are the timeout and poll interval values? What is the value of timeoutMinutes?


tests/e2e/utils/cluster_upgrade_utils.go line 459 at r1 (raw file):

	return wait.PollImmediate(pollInterval, timeout, func() (bool, error) {
		// getting all nodes
		nodeList, err := testClient.GetNodes(context.Background())

Is there a need to get the node details?

I think we can just wait on the machines.
Machines also has K8 version as one of its columns.

if every machine has target k8s version and is in running state, that is good enough check.Please verify

ymo24 added 2 commits July 12, 2023 14:08
Signed-off-by: ymo24 <[email protected]>
Signed-off-by: ymo24 <[email protected]>
Copy link
Contributor Author

@ymo24 ymo24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 7 files reviewed, 12 unresolved discussions (waiting on @Anirudh9794, @rliang88, and @sahithi)


tests/e2e/e2e_suite_test.go line 13 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

It is not clear if these variables TargetTKGVersion and TargetK8SVersion are meant to be used for cluster creation (or) cluster upgrades.

Please rename them to TargetTKGVersionForUpgradeTest, TargetK8sVersionForUpgradeTest

Done.


tests/e2e/workload_cluster_upgrade_test.go line 32 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

For my benefit, what is the purpose of this "When"?

It's a specific explanation for Describe(). It will not conduct any functional effect on the CAPVCD test.


tests/e2e/utils/cluster_upgrade_utils.go line 68 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

Can you mention an example of return map? It is a good practice to identify key-value pairs by an example.

Done.


tests/e2e/utils/cluster_upgrade_utils.go line 150 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

you can rename this to isTkgVersionCompatible.

TKG version need not be greater.

Done.


tests/e2e/utils/cluster_upgrade_utils.go line 177 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

Please provide examples on how these maps will look like.
for example, what is the key - tkg version? K8s version? ova name? and what is the value?

I had to read through the entire method and then tkgmap.json to find that out.

Done.


tests/e2e/utils/cluster_upgrade_utils.go line 252 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

Method name and return value do not match.

Please change it to ApplyK8sVersionUpgrade()

Done.


tests/e2e/utils/cluster_upgrade_utils.go line 425 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

error retrieving TKG version for a given K8s version [%s] from the tkgmap

Done.


tests/e2e/utils/cluster_upgrade_utils.go line 428 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

I am unsure if we need to get the entire supported upgrade list, we could pass the target tkg version and target K8s version also into just one method, and the output could be tkgmapKey and vAppTemplate.

I don't see a need for two methods, "getSupportedUpgrades" and "getRdeUpgradeResources".

It is okay for now, please create a task to refactor this. Please don't hesitate to chat if you think I am missing something.

We just want one reusable method

  • getTKGInfoForAGivenK8sAndTKGVersion // get all the tkg details for a given K8s and tkg version
    // parses through entire tkg json and outputs the map (tkg details)

And then see if that ova exists in vApp templates.

Will create a task for all the refractor work post GA


tests/e2e/utils/cluster_upgrade_utils.go line 439 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

error applying the modified objects of KCP, MD, VCDMachineTemplate for upgrade operation

Done.


tests/e2e/utils/cluster_upgrade_utils.go line 454 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

what are the timeout and poll interval values? What is the value of timeoutMinutes?

all these params are from testingSdk common code. Already added a comment here

Code snippet:

// timeout: 40 minutes; pollInterval: 2 minutes

tests/e2e/utils/cluster_upgrade_utils.go line 459 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

Is there a need to get the node details?

I think we can just wait on the machines.
Machines also has K8 version as one of its columns.

if every machine has target k8s version and is in running state, that is good enough check.Please verify

yeah sure, Sahithi. I am using node check because it's already in the testing common code.
I will refract this function post GA

@ymo24 ymo24 requested a review from sahithi July 12, 2023 21:54
Copy link
Contributor Author

@ymo24 ymo24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 7 files reviewed, 12 unresolved discussions (waiting on @Anirudh9794, @rliang88, and @sahithi)


tests/e2e/utils/cluster_upgrade_utils.go line 375 at r1 (raw file):

Previously, sahithi (Sahithi Ayloo) wrote…

It is fine if testing triggered the upgrades. Just ensure VCDMachineTemplate object is updated first before MachineDeployment

Done.

Copy link
Collaborator

@sahithi sahithi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: after you create a jira task with two things a) refactor retrieving tkg info details b) remove node check details from the monitor code

Reviewable status: 4 of 7 files reviewed, 12 unresolved discussions (waiting on @Anirudh9794 and @rliang88)

@ymo24 ymo24 merged commit fd66032 into vmware:main Jul 13, 2023
@ymo24 ymo24 deleted the clutser-upgrade branch July 13, 2023 00:18
ymo24 added a commit to ymo24/cluster-api-provider-cloud-director that referenced this pull request Jul 13, 2023
* add files

Signed-off-by: ymo24 <[email protected]>

* add updates for cluster upgrade

Signed-off-by: ymo24 <[email protected]>

* add doc update

Signed-off-by: ymo24 <[email protected]>

* address comments

Signed-off-by: ymo24 <[email protected]>

* minor fix

Signed-off-by: ymo24 <[email protected]>

* add refrator jira task reminder

Signed-off-by: ymo24 <[email protected]>
(cherry picked from commit fd66032)
ymo24 added a commit that referenced this pull request Jul 13, 2023
* add files

Signed-off-by: ymo24 <[email protected]>

* add updates for cluster upgrade

Signed-off-by: ymo24 <[email protected]>

* add doc update

Signed-off-by: ymo24 <[email protected]>

* address comments

Signed-off-by: ymo24 <[email protected]>

* minor fix

Signed-off-by: ymo24 <[email protected]>

* add refrator jira task reminder

Signed-off-by: ymo24 <[email protected]>
(cherry picked from commit fd66032)
ymo24 added a commit to ymo24/cluster-api-provider-cloud-director that referenced this pull request Aug 28, 2023
* add files

Signed-off-by: ymo24 <[email protected]>

* add updates for cluster upgrade

Signed-off-by: ymo24 <[email protected]>

* add doc update

Signed-off-by: ymo24 <[email protected]>

* address comments

Signed-off-by: ymo24 <[email protected]>

* minor fix

Signed-off-by: ymo24 <[email protected]>

* add refrator jira task reminder

Signed-off-by: ymo24 <[email protected]>
lzichong pushed a commit to lzichong/cluster-api-provider-cloud-director that referenced this pull request Aug 28, 2023
* add files

Signed-off-by: ymo24 <[email protected]>

* add updates for cluster upgrade

Signed-off-by: ymo24 <[email protected]>

* add doc update

Signed-off-by: ymo24 <[email protected]>

* address comments

Signed-off-by: ymo24 <[email protected]>

* minor fix

Signed-off-by: ymo24 <[email protected]>

* add refrator jira task reminder

Signed-off-by: ymo24 <[email protected]>
ymo24 added a commit to ymo24/cluster-api-provider-cloud-director that referenced this pull request Aug 28, 2023
* add files

Signed-off-by: ymo24 <[email protected]>

* add updates for cluster upgrade

Signed-off-by: ymo24 <[email protected]>

* add doc update

Signed-off-by: ymo24 <[email protected]>

* address comments

Signed-off-by: ymo24 <[email protected]>

* minor fix

Signed-off-by: ymo24 <[email protected]>

* add refrator jira task reminder

Signed-off-by: ymo24 <[email protected]>
ymo24 added a commit that referenced this pull request Aug 28, 2023
* add files

Signed-off-by: ymo24 <[email protected]>

* add updates for cluster upgrade

Signed-off-by: ymo24 <[email protected]>

* add doc update

Signed-off-by: ymo24 <[email protected]>

* address comments

Signed-off-by: ymo24 <[email protected]>

* minor fix

Signed-off-by: ymo24 <[email protected]>

* add refrator jira task reminder

Signed-off-by: ymo24 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants