Limit objects cached by DevWorkspace controller to reduce memory usage #652
Conversation
Modify cache used in controller in order to restrict cache to items with the devworkspace_id label. This has the downside of making all objects _without_ that label invisible to the controller, but has the benefit of reduced memory usage on large clusters. Signed-off-by: Angel Misevski <[email protected]>
Add labels `controller.devfile.io/watch-configmap` and `controller.devfile.io/watch-secret`, which must be set to "true" in order for the DevWorkspace Operator to see the corresponding secret/configmap. This is required (compare to the previous commit) because the controller is not only interested in secrets and configmaps it creates, but also in any configmap/secret on the cluster with e.g. the automount label attached. Since each type in the controller gets a single informer, we can only specify a single label selector for the objects we are interested in. This means we cannot express e.g. "has devworkspace_id label OR has mount-to-devworkspace label". Signed-off-by: Angel Misevski <[email protected]>
Restricting the cache to only configmaps with the new label results in existing workspaces failing to reconcile. This occurs because attempting to Get() the configmap from the cluster returns an IsNotFound error, whereas attempting to Create() the configmap returns an AlreadyExists error (Create interacts with the cluster, Get interacts with the cache). To avoid this, if we encounter an AlreadyExists error when attempting to create an object, we optimistically try to update the object (thus adding the required label). This resolves the issue above: once the object is updated, the subsequent Get() call will return the object as expected. Signed-off-by: Angel Misevski <[email protected]>
Restricting the controller-runtime cache to specific objects means that once-tracked objects can disappear from the controller's knowledge if the required label is removed. To work around this, it is necessary to update how we sync objects to specifically handle the case where:
- client.Get(object) returns IsNotFound
- client.Create(object) returns AlreadyExists

This occurs because we can't read objects that aren't in the cache, but attempting to create objects collides with the actual object on the cluster. Since the basic flow of Get -> Create/Update is repeated for each type we handle, this commit collects that repeated logic into one package (pkg/provision/sync), allowing object handling to be done in one place. Signed-off-by: Angel Misevski <[email protected]>
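A rough sketch of the Get/Create handling described above, with hypothetical names (this is not the actual pkg/provision/sync API, and the direct, non-caching read used to recover the resourceVersion before the optimistic update is an assumption about how the update can succeed):

```go
// Rough sketch only: hypothetical names, not the actual pkg/provision/sync API.
package sync

import (
	"context"

	k8serrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// syncObject reconciles a desired object against the cluster when the cache is
// label-restricted: Get() reads the cache, but Create()/Update() hit the API
// server, so an unlabeled pre-existing object is "not found" on Get yet
// "already exists" on Create.
func syncObject(ctx context.Context, cached, direct client.Client, spec client.Object) error {
	existing := spec.DeepCopyObject().(client.Object)
	err := cached.Get(ctx, client.ObjectKeyFromObject(spec), existing)
	switch {
	case err == nil:
		// Visible in the cache: update it towards the desired spec.
		spec.SetResourceVersion(existing.GetResourceVersion())
		return cached.Update(ctx, spec)
	case k8serrors.IsNotFound(err):
		createErr := cached.Create(ctx, spec)
		if !k8serrors.IsAlreadyExists(createErr) {
			return createErr // nil if the create succeeded
		}
		// Present on the cluster but invisible to the cache (e.g. the watch
		// label was removed). Read it directly from the API server and
		// optimistically update it so the required label is (re)applied and
		// subsequent cached Get() calls succeed.
		if getErr := direct.Get(ctx, client.ObjectKeyFromObject(spec), existing); getErr != nil {
			return getErr
		}
		spec.SetResourceVersion(existing.GetResourceVersion())
		return direct.Update(ctx, spec)
	default:
		return err
	}
}
```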
Adapt the metadata and storage cleanup tasks to use the new sync flow Signed-off-by: Angel Misevski <[email protected]>
/test v8-devworkspace-operator-e2e, v8-che-happy-path
Great job!
Regarding the changes -- nothing to comment, apart from the fact that I don't like ("don't like" here is personal opinion that can be ignored) that the sync func returns errors when an object is updated successfully.
Going to test it now.
@@ -26,6 +26,14 @@ const (
	// DevWorkspaceNameLabel is the label key to store workspace name
	DevWorkspaceNameLabel = "controller.devfile.io/devworkspace_name"

	// DevWorkspaceWatchConfigMapLabel marks a configmap so that it is watched by the controller. This label is required on all
	// configmaps that should be seen by the controller
	DevWorkspaceWatchConfigMapLabel = "controller.devfile.io/watch-configmap"
I wonder if we need a dedicated label for each object type?
WDYM? For most objects, the `devworkspace_id` label is sufficient, but I had to add separate ones for configmaps/secrets since those by default do not have one label they always use. We might run into a similar issue to work around for Deployments if we continue supporting async storage, but we could fudge that by using `devworkspace_id: all` or something.

We could use one label for both secrets and configmaps, but then we run into the issue of how to name it -- `watch-resource` may be unclear as it only applies to configmaps/secrets, and we use different labels for other objects.
I think you got my question )

> We could use one label for both secrets and configmaps, but then we run into the issue of how to name it -- watch-resource may be unclear as it only applies to configmaps/secrets, and we use different labels for other objects.

That's exactly what I have in mind, including the concern ) So I think it makes sense to leave it as is.
One tiny +1 to use a common watch annotation -- it will allow us to avoid having different articles/sections in some docs, like here https://docs.google.com/document/d/1IR78XlxO37VTWXOu-uE-2nKC93D1GNhZomGoRaN518o/edit?usp=sharing
So, we'd just provide a single patch command instead of two.

The concern:

> We could use one label for both secrets and configmaps, but then we run into the issue of how to name it -- watch-resource may be unclear as it only applies to configmaps/secrets, and we use different labels for other objects.

may be addressed by the following explanation:

> DevWorkspace operator watches objects owned by a DevWorkspace CR (ones which are additionally labeled with the workspace id) or standalone objects additionally labeled with `watch: true`
I assume that there is always at least some kind of label on the secret/configmap that DWO handles, even if such labels differ depending on the purpose.

I think (I have not tried this out) it should be possible to write an "OR" label selector -- if not using the existing code, then by implementing a custom `labels.Selector`.

I personally think requiring 2 labels on a single object for a single purpose is a little bit weird from the UX perspective.
an "or" selector is apparently impossible, so please ignore me :)
> I personally think requiring 2 labels on a single object for a single purpose is a little bit weird from the UX perspective.

The reality is that there are multiple labels that can get applied to configmaps or secrets, and they each serve a different purpose:

1. `controller.devfile.io/watch-[secret|configmap]`: mark this secret/configmap as "of interest" to the controller; necessary due to the caching change
2. `controller.devfile.io/mount-to-devworkspace`: mount this resource to the workspace; used by external tools/users to share info across multiple workspaces
3. `controller.devfile.io/git-credential`: mark a secret as holding git credentials, which is handled differently from the above
4. `controller.devfile.io/devworkspace_id`: associate this resource with the workspace with the specified workspace ID

Cases 2, 3 and 4 can exist independently of each other, e.g. a user-defined mounted configmap won't have the `devworkspace_id` label, and the metadata configmap we provision for workspaces won't have the `mount-to-devworkspace` label. As a result, there's no label selector we can use here, so we have to add the `watch` label to cover all use cases. Moreover, there will be cases when there are secrets/configmaps on the cluster that we're interested in that only have the `controller.devfile.io/watch-[secret|configmap]` label and no others.
> The concern [...] may be addressed by the following explanation:
> DevWorkspace operator watches objects owned by a DevWorkspace CR (ones which are additionally labeled with the workspace id) or standalone objects additionally labeled with `watch: true`

Potentially, but it's still somewhat unclear: we watch PVCs without the label applied, and a secret/configmap becomes invisible if the label is removed, even if it has the workspace ID label. I'm open to using one label for both, but I'm not sure it's a huge gain in terms of documentation burden.
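For illustration, a user-provided configmap that should be both visible to the controller and mounted into workspaces would carry labels 1 and 2 from the list above. This is only a sketch with made-up resource names, and the "true" value for the mount label is an assumption:

```go
// Illustration only: names other than the label keys are made up.
package main

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func exampleSharedConfigMap() *corev1.ConfigMap {
	return &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "shared-settings",
			Namespace: "user-namespace",
			Labels: map[string]string{
				// without this, the label-restricted cache never sees the object
				"controller.devfile.io/watch-configmap": "true",
				// asks DWO to mount the configmap into workspaces (value assumed)
				"controller.devfile.io/mount-to-devworkspace": "true",
			},
		},
		Data: map[string]string{"settings.xml": "<settings/>"},
	}
}
```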
I haven't tested this through a proper load-testing flow, but with usual testing it works fine.
Please, after it's merged, open a PR for the Che operator to update the DWO go dependency, or create an issue for them.
Is it possible to use a non-caching client (like here) instead of adding labels just to manage caches?
It would be possible, but goes against the intention of the controller design. With caching, we can avoid making API calls out to the cluster most of the time (only really hitting the API server on create/update calls), so I'd be concerned about performance/behavior if we went that route.

As a rough example, it takes around 10-15 reconciles to start a workspace, and all of these reconciles tend to happen within the first couple of seconds. If during that process we have to list/get secrets/configmaps multiple times (in the case of mounting git credentials, we might make two secrets, two configmaps, and read secrets + configmaps 2-3 times per reconcile), we're looking at potentially 50+ API requests per started workspace that are not present with caching. We do use a non-caching client in a few places, but those are reserved for startup or one-time tasks. In load testing, I've seen 200+ reconciles per second in the controller, which could result in thousands of get/list calls per second.

The other concern I'd have is in throughput, as making actual calls to the API will necessarily be slower than reading from the cache. This could directly impact startup time when under load, which we'd prefer to avoid.
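For reference, this is roughly how a non-caching client can be built alongside the manager's cached client in controller-runtime; the helper name is hypothetical and this is not DWO's actual wiring:

```go
// Sketch: a client that bypasses the manager's informer cache, suitable for
// startup or one-time reads where cache misses would otherwise be a problem.
package main

import (
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/manager"
)

func newNonCachingClient(mgr manager.Manager) (client.Client, error) {
	// client.New talks to the API server directly on every call, whereas
	// mgr.GetClient() serves reads from the shared informer cache.
	return client.New(mgr.GetConfig(), client.Options{
		Scheme: mgr.GetScheme(),
		Mapper: mgr.GetRESTMapper(),
	})
}
```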
Update sync methods in devworkspaceRouting controller to use updated sync package where appropriate. Note deleting out-of-date network objects must still be done in the controller, as it's not possible to iterate through a generic list of client.Objects. Signed-off-by: Angel Misevski <[email protected]>
Signed-off-by: Angel Misevski <[email protected]>
On Kubernetes, we can't restrict the cache for Routes since they are not a part of the included scheme. As a result we have to work around adding Routes to the cache only on OpenShift. Signed-off-by: Angel Misevski <[email protected]>
Signed-off-by: Angel Misevski <[email protected]>
Pass around the full clusterAPI struct to methods in automount package, to allow for shared Context/Logging. Signed-off-by: Angel Misevski <[email protected]>
Signed-off-by: Angel Misevski <[email protected]>
For most objects, we can client.Update() using the spec object without issue. However, for Services, updates are rejected if they try to unset spec.ClusterIP. This means we need to copy the ClusterIP from the cluster service before updating. This commit adds an extensible mechanism for specifying type-specific update functions that are called whenever we attempt to update a cluster object. Signed-off-by: Angel Misevski <[email protected]>
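A minimal sketch of such a type-specific update function (the name and signature are hypothetical, not the actual DWO helper):

```go
// Sketch: carry over Service fields the API server refuses to unset before updating.
package sync

import (
	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func prepareServiceUpdate(specObj, clusterObj client.Object) client.Object {
	spec, okSpec := specObj.(*corev1.Service)
	cluster, okCluster := clusterObj.(*corev1.Service)
	if !okSpec || !okCluster {
		return specObj // not a Service: no type-specific handling needed
	}
	updated := spec.DeepCopy()
	// Updates that would clear spec.clusterIP are rejected, so copy the value
	// currently assigned on the cluster into the desired object before updating.
	updated.Spec.ClusterIP = cluster.Spec.ClusterIP
	updated.ResourceVersion = cluster.ResourceVersion
	return updated
}
```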
Use diffOpts when printing spec vs cluster object diffs when updates are required. Signed-off-by: Angel Misevski <[email protected]>
Signed-off-by: Angel Misevski <[email protected]>
/test v8-devworkspace-operator-e2e, v8-che-happy-path
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: amisevsk, JPinkney, sleshchenko
The full list of commands accepted by this bot can be found here. The pull request process is described here.
/retest
@amisevsk: The following test failed, say
v0.10 was branched before this PR was merged; it'll be included in v0.11
What does this PR do?
Limits the internal controller cache for objects managed by the DevWorkspace Operator (support for this was introduced in controller-runtime `v0.9.0`; see doc). Since we can specify just one selector for limiting the cache, we use
- the `devworkspace_id` label for most objects (deployments, services, etc.)
- the labels `controller.devfile.io/watch-configmap` and `controller.devfile.io/watch-secret`, which must be applied to secrets/configmaps we use

The downside of doing this is that any objects that do not match the selector cannot be read in the controller. This has two impacts on our design:
- once-tracked objects become invisible to the controller if the required label is removed
- getting an `AlreadyExists` error when trying to create an object requires us to try and update that object

To address the second point above, rather than rewriting the reconcile logic in each place it's used, I consolidated all syncing (spec vs. cluster) into one package and reworked everywhere else to use this.
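As a rough illustration of the selector restriction described above, this is approximately what the per-type cache configuration looks like with controller-runtime's cache options. It is a sketch, not DWO's exact wiring, and the option names shown follow the v0.9.x-era API, which may differ in other versions:

```go
// Sketch only: restrict the manager's cache with per-type label selectors.
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/labels"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/cache"
)

func newManager() (ctrl.Manager, error) {
	return ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		NewCache: cache.BuilderWithOptions(cache.Options{
			// Only one selector can be set per object type, so configmaps and
			// secrets get dedicated "watch" labels, while most other types
			// would select on the devworkspace_id label instead.
			SelectorsByObject: cache.SelectorsByObject{
				&corev1.ConfigMap{}: {Label: labels.SelectorFromSet(labels.Set{
					"controller.devfile.io/watch-configmap": "true",
				})},
				&corev1.Secret{}: {Label: labels.SelectorFromSet(labels.Set{
					"controller.devfile.io/watch-secret": "true",
				})},
			},
		}),
	})
}
```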
The main benefit of this change is drastically reduced memory usage on large clusters: for a cluster with 1000 stopped devworkspaces, ~1000 deployments/services/routes, ~5000 configmaps, ~24000 secrets, and ~26000 rolebindings, we see
- `~1750Mi` memory usage without this change
- `~90Mi` memory usage with this change

This represents an approximate memory use reduction of 18-19x. In the specific case of the cluster I tested against, it appears (unsurprisingly) that secrets are the main culprit. Testing a variant image that only restricts the cache for secrets reduced memory usage to `~350Mi`.

I'm opting to restrict the cache for all objects, as otherwise memory usage of DWO depends on the objects that exist on the cluster. With the cache restriction, memory use should be governed mainly by how many DevWorkspaces exist on the cluster.
Additional info
Graph of memory usage for DWO while testing various different cases (each image is restarted 5 times as memory usage spikes on startup):
Note the numbers here don't match the ones listed above exactly, as these are internal metrics and the numbers above use podmetrics from the cluster. The cases being tested are:
- `devworkspace_id` label (easiest case)

Diagram for the new sync object flow:
What issues does this PR fix or reference?
Is it tested? How?
Testing might be tricky:
I've tried testing the above locally and haven't seen any issues.
PR Checklist
- E2E tests (comment `/test v8-devworkspace-operator-e2e, v8-che-happy-path` to trigger):
  - `v8-devworkspace-operator-e2e`: DevWorkspace e2e test
  - `v8-che-happy-path`: Happy path for verification of integration with Che