
test: script for testing operator performance #202

Merged: 1 commit merged into openshift:main on Jun 27, 2022

Conversation

iamniting (Member)

LVM operator performance tests (sc usage + lvmcluster creation) based on #69

Signed-off-by: Juan Miguel Olmo Martínez [email protected]
Signed-off-by: Nitin Goyal [email protected]


iamniting commented Jun 1, 2022

go run test/performance/operf.go --help
Usage of operf:
This script provides LVM operator metrics retrieval (for all the units in the operator):

- PVCs creation and usage:
	It creates <instances> of "busybox" pods using PVCs provisioned via the Storage class provided
	Example:
	# go run test/performance/operf.go -token sha256~cj81ClyUYu7g05y8K-uLWm2AbrKTbNEQ96hEJcWStQo -instances 16

Parameters:
	-instances	(default: 4)			Number of Pods/Pvcs (each Pod uses a different PVC) to create in the test
	-namespace	(default: lvm-operator-system)	Namespace where operator is deployed and the PVCs and test pods will be deployed/undeployed
	-pattern	(default: perfotest)		Pattern used to build the PVCs/Pods names
	-sc		(default: odf-lvm-vg1)		Name of the topolvm storage class that will be used in the PVCs
	-token		(default: )			Mandatory authentication token needed to connect with the Openshift cluster
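The flag list above maps directly onto Go's standard `flag` package. Below is a minimal sketch of how such a flag set could be declared; the `parseFlags` helper and its token validation are hypothetical illustrations, not the script's actual code, while the flag names and defaults are taken from the help text:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options mirrors the parameters shown in the --help output above.
type options struct {
	instances int
	namespace string
	pattern   string
	sc        string
	token     string
}

// parseFlags declares the operf-style flags on a standalone FlagSet so the
// parsing logic can be exercised without touching the process arguments.
func parseFlags(args []string) (*options, error) {
	fs := flag.NewFlagSet("operf", flag.ContinueOnError)
	o := &options{}
	fs.IntVar(&o.instances, "instances", 4, "Number of Pods/PVCs to create in the test")
	fs.StringVar(&o.namespace, "namespace", "lvm-operator-system", "Namespace where the operator is deployed")
	fs.StringVar(&o.pattern, "pattern", "perfotest", "Pattern used to build the PVC/Pod names")
	fs.StringVar(&o.sc, "sc", "odf-lvm-vg1", "Name of the TopoLVM storage class used in the PVCs")
	fs.StringVar(&o.token, "token", "", "Mandatory authentication token for the OpenShift cluster")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	// The help text marks -token as mandatory, so reject an empty value.
	if o.token == "" {
		return nil, fmt.Errorf("-token is mandatory")
	}
	return o, nil
}

func main() {
	o, err := parseFlags(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("would create %d pods/PVCs in namespace %s\n", o.instances, o.namespace)
}
```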

$ go run test/performance/operf.go -token sha256~rYqNcFya6xNib_ETudIO47p75cJ4nHGaOzMwoWQstXw -number 5
5 PVCs created
5 Pods created
Waiting for pods running...
Pod perfotest-0 is in phase Pending ...
Pod perfotest-0 is in phase Running ...
Pod perfotest-1 is in phase Running ...
Pod perfotest-2 is in phase Running ...
Pod perfotest-3 is in phase Running ...
Pod perfotest-4 is in phase Running ...
5 Pods running
test Pods deleted
test PVCs deleted
Waiting for PVCS clean 
Waiting for PVCS clean 
Waiting for PVCS clean 
Waiting for PVCS clean 
Times report
--------------------------------------------------------------------------------
Start test      : 2022-06-01 15:13:47 +0530 IST
PVCs created    : 2022-06-01 15:13:47 +0530 IST
PVCs utilization: 2022-06-01 15:13:48 +0530 IST
PVCs Free       : 2022-06-01 15:14:20 +0530 IST
PVCs deleted    : 2022-06-01 15:15:01 +0530 IST
End test        : 2022-06-01 15:15:01 +0530 IST
Usage report: 5 pods in namespace lvm-operator-system using 5 PVCs with Storage Class <odf-lvm-vg1>
--------------------------------------------------------------------------------
Report for vg-manager daemonset between 2022-06-01 15:13:47 +0530 IST and 2022-06-01 15:15:01 +0530 IST
      CPU (max|avg) seconds:     0.0007 |      0.0006
      RAM (max|avg) Mib    :    20.8945 |     20.8945
      FS  (max|avg) Mib    :     0.0117 |      0.0117
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-node daemonset between 2022-06-01 15:13:47 +0530 IST and 2022-06-01 15:15:01 +0530 IST
      CPU (max|avg) seconds:     0.0314 |      0.0198
      RAM (max|avg) Mib    :    92.1914 |     88.3362
      FS  (max|avg) Mib    :     0.5469 |      0.5469
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-controller deployment between 2022-06-01 15:13:47 +0530 IST and 2022-06-01 15:15:01 +0530 IST
      CPU (max|avg) seconds:     0.0119 |      0.0116
      RAM (max|avg) Mib    :   113.8242 |    113.1025
      FS  (max|avg) Mib    :     3.2070 |      3.2070
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for lvm-operator-controller-manager deployment between 2022-06-01 15:13:47 +0530 IST and 2022-06-01 15:15:01 +0530 IST
      CPU (max|avg) seconds:     0.0054 |      0.0052
      RAM (max|avg) Mib    :    38.8945 |     38.7990
      FS  (max|avg) Mib    :     0.0312 |      0.0312
--------------------------------------------------------------------------------
$ go run test/performance/operf.go -token sha256~rYqNcFya6xNib_ETudIO47p75cJ4nHGaOzMwoWQstXw -number 10
10 PVCs created
10 Pods created
Waiting for pods running...
Pod perfotest-0 is in phase Pending ...
Pod perfotest-0 is in phase Running ...
Pod perfotest-1 is in phase Running ...
Pod perfotest-2 is in phase Running ...
Pod perfotest-3 is in phase Running ...
Pod perfotest-4 is in phase Running ...
Pod perfotest-5 is in phase Running ...
Pod perfotest-6 is in phase Pending ...
Pod perfotest-6 is in phase Running ...
Pod perfotest-7 is in phase Running ...
Pod perfotest-8 is in phase Running ...
Pod perfotest-9 is in phase Running ...
10 Pods running
test Pods deleted
test PVCs deleted
Waiting for PVCS clean 
Waiting for PVCS clean 
Waiting for PVCS clean 
Waiting for PVCS clean 
Times report
--------------------------------------------------------------------------------
Start test      : 2022-06-01 15:15:29 +0530 IST
PVCs created    : 2022-06-01 15:15:30 +0530 IST
PVCs utilization: 2022-06-01 15:15:32 +0530 IST
PVCs Free       : 2022-06-01 15:16:35 +0530 IST
PVCs deleted    : 2022-06-01 15:17:16 +0530 IST
End test        : 2022-06-01 15:17:16 +0530 IST
Usage report: 10 pods in namespace lvm-operator-system using 10 PVCs with Storage Class <odf-lvm-vg1>
--------------------------------------------------------------------------------
Report for vg-manager daemonset between 2022-06-01 15:15:29 +0530 IST and 2022-06-01 15:17:16 +0530 IST
      CPU (max|avg) seconds:     0.0009 |      0.0007
      RAM (max|avg) Mib    :    20.9414 |     20.9089
      FS  (max|avg) Mib    :     0.0117 |      0.0117
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-node daemonset between 2022-06-01 15:15:29 +0530 IST and 2022-06-01 15:17:16 +0530 IST
      CPU (max|avg) seconds:     0.2651 |      0.1258
      RAM (max|avg) Mib    :   107.6055 |     98.5536
      FS  (max|avg) Mib    :     0.5547 |      0.4494
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-controller deployment between 2022-06-01 15:15:29 +0530 IST and 2022-06-01 15:17:16 +0530 IST
      CPU (max|avg) seconds:     0.0225 |      0.0171
      RAM (max|avg) Mib    :   118.0742 |    116.0549
      FS  (max|avg) Mib    :     3.3945 |      3.2869
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for lvm-operator-controller-manager deployment between 2022-06-01 15:15:29 +0530 IST and 2022-06-01 15:17:16 +0530 IST
      CPU (max|avg) seconds:     0.0091 |      0.0046
      RAM (max|avg) Mib    :    38.9961 |     38.5174
      FS  (max|avg) Mib    :     0.0312 |      0.0312
--------------------------------------------------------------------------------

@iamniting iamniting requested review from nbalacha, sp98 and Yuggupta27 June 1, 2022 09:59
@iamniting iamniting force-pushed the perf-script branch 2 times, most recently from ec59331 to 3681d32 Compare June 1, 2022 10:20
}
_, err := c.CoreV1().PersistentVolumeClaims(*testNamespace).Create(context.TODO(), &pvc, metav1.CreateOptions{})
if err != nil {
	fmt.Printf("Error creating PVC <%s-%d>: %s\n", PVCName, i, err)
Contributor:

I suggest that the error should be handled by failing the entire process if there is any error creating the pods. Since we will be evaluating the output based on the number of resources (pods) we are creating, we should ensure that the desired number of pods is running before printing the metrics.

}
_, err := c.CoreV1().Pods(*testNamespace).Create(context.TODO(), &pod, metav1.CreateOptions{})
if err != nil {
	fmt.Printf("Error creating Pod <%s-%d>: %s\n", PodName, i, err)
Contributor:

I suggest that the error should be handled by failing the entire process if there is any error creating the pods. Since we will be evaluating the output based on the number of resources (pods) we are creating, we should ensure that the desired number of pods is running before printing the metrics.

iamniting (Member Author):

This is a test script for metrics, not actual functional testing. IMO we should print such errors and continue with collecting the metrics.

for i := 0; i < len(values); i++ {
	v, err := strconv.ParseFloat(fmt.Sprintf("%s", values[i][1]), 64)
	if err != nil {
		fmt.Println("error converting ", values[i][1], " to number")
Contributor:

fmt.Errorf

Contributor:

This error should be returned and handled correctly.

iamniting (Member Author):

The script is designed to do its best and fetch whatever details it can. It panics only when it cannot proceed at all.
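As a sketch of the reviewer's suggestion, the conversion failure could be wrapped with fmt.Errorf and returned to the caller instead of printed, leaving the abort-or-continue decision to the call site. `parseSamples` is a hypothetical stand-in for the script's loop, not its actual code:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseSamples converts the second element of each Prometheus-style sample
// pair to a float64, returning a wrapped error on the first bad value
// instead of printing and continuing.
func parseSamples(values [][]interface{}) ([]float64, error) {
	out := make([]float64, 0, len(values))
	for i := range values {
		v, err := strconv.ParseFloat(fmt.Sprintf("%v", values[i][1]), 64)
		if err != nil {
			// %w keeps the underlying strconv error inspectable via errors.Is/As.
			return nil, fmt.Errorf("converting %v to number: %w", values[i][1], err)
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	vals, err := parseSamples([][]interface{}{{0, "1.5"}, {1, "2.25"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(vals)
}
```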

kubeconfig = &val

// use the current context in kubeconfig
config, err := clientcmd.BuildConfigFromFlags("", *kubeconfig)
Contributor:

There is a deploymanager created for the e2e tests. Can that be reused here?

iamniting (Member Author):

We can do that, but we would need to make sure that nothing changes in the future that breaks this script. Moreover, it is only about 15 lines of code; replacing it would hardly save anything.

sp98 (Contributor), Jun 6, 2022:

We can do that, But we need to make sure that we do not change anything in the future which will break this script.

Agreed. But this mostly looks like static code to get the client which I don't expect to change.

Moreover, it is 15 lines of code while replacing it we are hardly able to save a few lines of code.

The idea is to have a common utility that provides the client to connect with k8s via kubeconfig. If we need to create the client again in the future, this utility can be reused rather than writing the same logic again.

iamniting (Member Author):

I just tried this. The deploy manager uses the client from "sigs.k8s.io/controller-runtime/pkg/client" while this script uses the client from "k8s.io/client-go/kubernetes", so we cannot directly reuse the deploy manager client here.
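For illustration, the roughly-15-line block under discussion boils down to building a typed client-go clientset from a kubeconfig. This sketch needs the k8s.io/client-go module and a reachable cluster, so it is shown for reference only; the path handling and prints are illustrative, not the script's actual code:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Use the current context from the default kubeconfig location.
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	// kubernetes.NewForConfig returns the typed client-go clientset this
	// script uses; the controller-runtime client used by deploymanager is a
	// different type and cannot be substituted directly.
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	fmt.Printf("clientset ready: %T\n", clientset)
}
```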

@iamniting iamniting requested a review from nbalacha June 2, 2022 09:24
@iamniting iamniting force-pushed the perf-script branch 6 times, most recently from c254d41 to c15f38b Compare June 2, 2022 10:40
@iamniting iamniting force-pushed the perf-script branch 4 times, most recently from 640a177 to 22af454 Compare June 9, 2022 09:01
@iamniting iamniting requested review from jmolmo, sp98 and Yuggupta27 June 9, 2022 09:02
iamniting (Member Author):

$ go run test/performance/operf.go -token sha256~8h5kn54z-Tacv5HFvnbSMV-N-pvO1_Ecx6Dy6mJ0Rl4 -instances 10
10 PVCs created
10 Pods created
Waiting for pods running...
Pod perfotest-0 is in phase Pending ...
Pod perfotest-0 is in phase Running ...
Pod perfotest-1 is in phase Running ...
Pod perfotest-2 is in phase Running ...
Pod perfotest-3 is in phase Running ...
Pod perfotest-4 is in phase Running ...
Pod perfotest-5 is in phase Running ...
Pod perfotest-6 is in phase Pending ...
Pod perfotest-6 is in phase Running ...
Pod perfotest-7 is in phase Running ...
Pod perfotest-8 is in phase Running ...
Pod perfotest-9 is in phase Running ...
10 Pods running
test Pods deleted
test PVCs deleted
Waiting for PVCS clean 
Waiting for PVCS clean 
Waiting for PVCS clean 
Waiting for PVCS clean 
Waiting for PVCS clean 
Times report
--------------------------------------------------------------------------------
Start test      : 2022-06-09 14:39:50 +0530 IST
PVCs created    : 2022-06-09 14:39:51 +0530 IST
PVCs utilization: 2022-06-09 14:39:53 +0530 IST
PVCs Free       : 2022-06-09 14:40:56 +0530 IST
PVCs deleted    : 2022-06-09 14:41:47 +0530 IST
End test        : 2022-06-09 14:41:47 +0530 IST
Usage report: 10 pods in namespace lvm-operator-system using 10 PVCs with Storage Class <odf-lvm-vg1>
--------------------------------------------------------------------------------
Report for lvm-operator-controller-manager deployment's Pod lvm-operator-controller-manager-86847d97f6-2mdfs between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0041 |     0.0069 |     0.0054
	 MEM (min|max|avg) Mib    :    66.8828 |    67.6211 |    67.1406
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for lvm-operator-controller-manager deployment's Container kube-rbac-proxy between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0001 |     0.0008 |     0.0002
	 MEM (min|max|avg) Mib    :    17.0664 |    17.0664 |    17.0664
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for lvm-operator-controller-manager deployment's Container manager between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0040 |     0.0069 |     0.0052
	 MEM (min|max|avg) Mib    :    36.9688 |    37.7070 |    37.2266
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for lvm-operator-controller-manager deployment's Container metricsexporter between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0000 |     0.0000 |     0.0000
	 MEM (min|max|avg) Mib    :    12.8477 |    12.8477 |    12.8477
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-controller deployment's Pod topolvm-controller-575fdc69bd-2n7l4 between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0086 |     0.0254 |     0.0157
	 MEM (min|max|avg) Mib    :   170.8047 |   175.1250 |   173.6521
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-controller deployment's Container csi-snapshotter between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0001 |     0.0008 |     0.0003
	 MEM (min|max|avg) Mib    :    24.5742 |    24.5742 |    24.5742
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-controller deployment's Container liveness-probe between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0000 |     0.0010 |     0.0003
	 MEM (min|max|avg) Mib    :    19.8945 |    20.1836 |    19.9832
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-controller deployment's Container topolvm-controller between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0058 |     0.0109 |     0.0086
	 MEM (min|max|avg) Mib    :    51.0742 |    51.6055 |    51.3303
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-controller deployment's Container csi-provisioner between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0013 |     0.0083 |     0.0047
	 MEM (min|max|avg) Mib    :    35.6602 |    38.6133 |    37.7281
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-controller deployment's Container csi-resizer between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0003 |     0.0051 |     0.0019
	 MEM (min|max|avg) Mib    :    39.2617 |    40.4883 |    40.0363
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-node daemonset's Pod topolvm-node-zpk57 between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0008 |     0.1755 |     0.1068
	 MEM (min|max|avg) Mib    :   122.8398 |   147.9609 |   133.0986
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-node daemonset's Container csi-registrar between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0000 |     0.0006 |     0.0002
	 MEM (min|max|avg) Mib    :    15.9531 |    15.9531 |    15.9531
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-node daemonset's Container liveness-probe between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0000 |     0.0007 |     0.0002
	 MEM (min|max|avg) Mib    :    22.7500 |    22.8906 |    22.8078
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-node daemonset's Container lvmd between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0001 |     0.1507 |     0.0964
	 MEM (min|max|avg) Mib    :    40.4648 |    62.5508 |    48.3162
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for topolvm-node daemonset's Container topolvm-node between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0004 |     0.0281 |     0.0100
	 MEM (min|max|avg) Mib    :    43.6406 |    46.5664 |    46.0215
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for vg-manager daemonset's Pod vg-manager-fczqm between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0006 |     0.0016 |     0.0009
	 MEM (min|max|avg) Mib    :    29.0938 |    29.6484 |    29.4859
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Report for vg-manager daemonset's Container vg-manager between 2022-06-09 14:39:50 +0530 IST and 2022-06-09 14:41:47 +0530 IST
	 CPU (min|max|avg) seconds:     0.0006 |     0.0016 |     0.0009
	 MEM (min|max|avg) Mib    :    29.0938 |    29.6484 |    29.4859
--------------------------------------------------------------------------------

for _, u := range unit.ObjectMetrics {
	fmt.Println(strings.Repeat("-", 80))
	fmt.Printf("Report for %s %s's %s %s between %s and %s\n", unit.Name, unit.WorkloadType, u.Type, u.Name, unit.Start, unit.End)
	fmt.Printf("\t CPU (min|max|avg) seconds: % 10.4f | % 10.4f | % 10.4f\n", u.Cpu.Min, u.Cpu.Max, u.Cpu.Avg)
Contributor:

Not sure whether the unit of container_cpu_usage_seconds_total is actually seconds. The Prometheus metrics show a different unit in OpenShift.
Screenshot from 2022-06-13 15-00-25

iamniting (Member Author):

This is what I get after running the exact query:

Screenshot from 2022-06-13 15-21-10

nbalacha (Contributor) left a comment:

I'm still looking into the query. Most online resources seem to use "sum (rate (container_cpu_usage_seconds_total..."

I'll get back on this.

@nbalacha (Contributor):

Since the query matches the one used in the OpenShift dashboard, we can continue with this.
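For reference, the query style the thread refers to typically has this shape; the label selectors below are illustrative placeholders, not the exact query used by the script:

```promql
sum(rate(container_cpu_usage_seconds_total{namespace="lvm-operator-system", container!=""}[5m])) by (pod)
```

rate() over the counter yields per-second CPU usage, which is what dashboards usually chart; sampling the raw counter instead gives cumulative seconds, which explains why the two screenshots can show different magnitudes for the same metric.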

Signed-off-by: Juan Miguel Olmo Martínez <[email protected]>
Signed-off-by: Nitin Goyal <[email protected]>
iamniting (Member Author):

/test lvm-operator-bundle-e2e-aws


openshift-ci bot commented Jun 15, 2022

@iamniting: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/lvm-operator-bundle-e2e-aws | Commit: 22edccc | Required: false | Rerun command: /test lvm-operator-bundle-e2e-aws

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 27, 2022

openshift-ci bot commented Jun 27, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: iamniting, nbalacha

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 27, 2022
@openshift-ci openshift-ci bot merged commit af877a4 into openshift:main Jun 27, 2022