
e2e tests for Kepler, estimator, and model server #456

Closed
2 tasks
rootfs opened this issue Dec 9, 2022 · 7 comments
Labels
wontfix This will not be worked on

Comments

@rootfs
Contributor

rootfs commented Dec 9, 2022

Is your feature request related to a problem? Please describe.
Have all of the components e2e tested on bare metal and VMs (especially in CI).

Describe the solution you'd like
The tests should verify that:

  • all the components are configured correctly, up and running
  • the models (eBPF, cgroup, etc.) can be trained and updated online
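The first check above (components up and exporting) could be sketched as a scrape-and-verify step against Kepler's Prometheus endpoint. The helper below is a minimal illustration, not Kepler's actual test code, and the metric names in the sample are assumptions (real Kepler metric families may differ):

```python
# Sketch: verify expected metric families appear in Prometheus
# exposition text scraped from the Kepler exporter endpoint.
# Metric names here are illustrative, not authoritative.

def metric_families(exposition_text: str) -> set[str]:
    """Collect metric family names from Prometheus exposition text."""
    families = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        # A sample line looks like: name{labels} value [timestamp]
        name = line.split("{", 1)[0].split(" ", 1)[0]
        families.add(name)
    return families

def missing_metrics(exposition_text: str, expected: list[str]) -> list[str]:
    """Return the expected metric families that are absent."""
    present = metric_families(exposition_text)
    return [m for m in expected if m not in present]

# Hypothetical scrape output for demonstration.
sample = """\
# HELP kepler_container_joules_total Aggregated container energy
# TYPE kepler_container_joules_total counter
kepler_container_joules_total{pod="a"} 12.5
kepler_node_core_joules_total{node="n1"} 42.0
"""

print(missing_metrics(sample, [
    "kepler_container_joules_total",
    "kepler_node_platform_joules_total",
]))  # → ['kepler_node_platform_joules_total']
```

In CI the `sample` text would come from an HTTP GET of the exporter's metrics port rather than a literal string.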
@jichenjc
Collaborator

some proposal

@sunya-ch
Collaborator

> some proposal

Yes, they are mutually exclusive.
We can have the deployment scenarios as described here: https://sustainable-computing.io/design/power_estimation/#deployment-scenarios.

  • minimum (no sidecar, no model server) -> the Kepler local estimator uses LR model weights downloaded from the initial URL.
  • with sidecar only -> the Kepler exporter requests estimated power from the sidecar; the sidecar estimator uses the archived (currently GBR) model downloaded from the initial URL.
  • with model server only -> the Kepler local estimator requests LR model weights from the model server.
  • with both (full deployment) -> the Kepler exporter requests estimated power from the sidecar; the sidecar estimator requests the archived model from the model server.
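The four scenarios above amount to a two-flag decision table. A minimal sketch, assuming illustrative option names rather than Kepler's actual configuration keys:

```python
# Sketch: the deployment-scenario decision table from the comment above.
# The two booleans mirror the ESTIMATOR_SIDECAR_DEPLOY / MODEL_SERVER_DEPLOY
# options; return strings are descriptive, not real config values.

def model_source(sidecar: bool, model_server: bool) -> str:
    if sidecar and model_server:
        return "sidecar estimator, archived model fetched from model server"
    if sidecar:
        return "sidecar estimator, archived model from initial URL"
    if model_server:
        return "local estimator, LR weights fetched from model server"
    return "local estimator, LR weights from initial URL"

for sc, ms in [(False, False), (True, False), (False, True), (True, True)]:
    print(f"sidecar={sc}, model_server={ms} -> {model_source(sc, ms)}")
```

An e2e suite could iterate over these four combinations and assert the expected model source in each deployment.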

@rootfs
Contributor Author

rootfs commented Dec 12, 2022

@jichenjc Pod metrics are OK, but node metrics are currently all zero; we'll deploy a node power model in the CI.

@sunya-ch
Collaborator

sunya-ch commented Dec 13, 2022

Idea to create e2e tests for integration.

  • make build-manifest
  • make build-manifest OPTS="ESTIMATOR_SIDECAR_DEPLOY"
    To check whether the sidecar is properly set, per [estimator sidecar integration] model config not applied. #461, check for the log line:
    Model DynComponentPower initiated (true)
  • make build-manifest OPTS="MODEL_SERVER_DEPLOY"
    To check whether the model-server is connected, per [model server integration] model server not connected #463, check for the log line:
    LR Model (AbsComponentModelWeight): getWeightFromServer: map[...
  • make build-manifest OPTS="ESTIMATOR_SIDECAR_DEPLOY MODEL_SERVER_DEPLOY"
    To check that both are connected:
    1. check that the sidecar is properly set
    2. check that the model server is connected to the sidecar (TBD)

btw, the above log lines only appear at log verbosity level 3. We may add another OPT like DEBUG to patch the command with -v=3 or 5 in the manifest kustomize.
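The log-line checks above could be sketched as a small helper that takes captured pod logs and reports which expected lines are absent for the deployed options. This is only an illustration: the option names come from the comment, the expected substrings are the two log lines quoted there, and in CI the `logs` string would come from `kubectl logs`:

```python
# Sketch: map each deploy OPT to the log line(s) that should appear,
# then report unmet expectations given captured logs.
# Expected substrings are taken verbatim from the comment above.

EXPECTED_LOG_LINES = {
    "ESTIMATOR_SIDECAR_DEPLOY": [
        "Model DynComponentPower initiated (true)",
    ],
    "MODEL_SERVER_DEPLOY": [
        "LR Model (AbsComponentModelWeight): getWeightFromServer:",
    ],
}

def unmet_expectations(opts: list[str], logs: str) -> list[tuple[str, str]]:
    """Return (option, expected-line) pairs not found in the logs."""
    missing = []
    for opt in opts:
        for line in EXPECTED_LOG_LINES.get(opt, []):
            if line not in logs:
                missing.append((opt, line))
    return missing

# Hypothetical captured logs: sidecar line present, model-server line absent.
logs = "... Model DynComponentPower initiated (true) ..."
print(unmet_expectations(["ESTIMATOR_SIDECAR_DEPLOY", "MODEL_SERVER_DEPLOY"], logs))
```

The combined scenario (both OPTS) would simply pass both option names and expect an empty result.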

@jichenjc
Collaborator

> check log line:

Even though we can check the logs, my understanding is that an integration test mostly focuses on functions rather than logs, so the system should work like a black box. Instead of checking those logs, is there any way we can expose some endpoint (like a debug endpoint) to curl and see whether the functions work well?

@sunya-ch
Collaborator

> even though we can check the logs but seems my understanding is integration test is mostly a focus on functions instead of logs ,so the system should work like a black box.. so instead of checking those logs, is there anyway we can expose some endpoint (like debug endpoint) to curl and see whether the functions works well?

I think it is a good idea. We can expose the full status of Kepler: not only the success/failure of the connection and the model config, but also the available metrics from its discovery. Then we can let the operator use these endpoints to update the Kepler CR status.
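A status endpoint along those lines might return a JSON document covering connection state, model config, and discovered metrics. The payload shape below is entirely hypothetical (Kepler does not define these field names); it just sketches what a curl-able `/status` response could carry:

```python
# Sketch: a hypothetical /status debug payload exposing connection
# state, model state, and discovered metrics as JSON, so tests (or the
# operator) can curl it instead of inspecting logs. All field names are
# illustrative assumptions, not Kepler's actual API.
import json

def build_status(sidecar_connected: bool,
                 model_server_connected: bool,
                 models: dict[str, str],
                 metrics: set[str]) -> dict:
    return {
        "estimator_sidecar": {"connected": sidecar_connected},
        "model_server": {"connected": model_server_connected},
        "models": models,  # e.g. {"DynComponentPower": "initiated"}
        "discovered_metrics": sorted(metrics),
    }

status = build_status(
    sidecar_connected=True,
    model_server_connected=False,
    models={"DynComponentPower": "initiated"},
    metrics={"kepler_container_joules_total"},
)
print(json.dumps(status, indent=2))
```

Served over HTTP (e.g. alongside the metrics port), this would let the e2e suite assert on structured fields rather than grepping verbose logs, and would give the operator a source for the Kepler CR status.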

@stale

stale bot commented May 17, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label May 17, 2023
@stale stale bot closed this as completed May 24, 2023