
Bring in newer cryptnono version #3569

Merged: 2 commits, Jan 4, 2024

Conversation

yuvipanda
Member

I've been upgrading cryptnono quite a bit over the last few months, bringing in new detectors that have been quite effective on mybinder.org. We automatically bump cryptnono on our clusters (#3482), but recent progress has included some breaking changes to the helm chart config.

This PR just brings in the new config changes, but does not change behavior in any real way. No new detectors are enabled.

I've re-measured resource usage for the individual daemonset container (rather than the initContainer), since that can now be set separately. This probably means we need to regenerate some of the resource allocation profiles, which I'll do once this is merged. However, it is an overall reduction in daemonset requests, so deploying this shouldn't result in any profile being undeployable.

Merging this should allow #3482 to move forward as well.

yuvipanda requested a review from a team as a code owner on January 4, 2024 at 01:57

github-actions bot commented Jan 4, 2024

Merging this PR will trigger the following deployment actions.

Support and Staging deployments

| Cloud Provider | Cluster Name | Upgrade Support? | Reason for Support Redeploy | Upgrade Staging? | Reason for Staging Redeploy |
| --- | --- | --- | --- | --- | --- |
| aws | gridsst | Yes | Support helm chart has been modified | No | |
| gcp | linked-earth | Yes | Support helm chart has been modified | No | |
| aws | nasa-esdis | Yes | Support helm chart has been modified | No | |
| gcp | hhmi | Yes | Support helm chart has been modified | No | |
| gcp | pangeo-hubs | Yes | Support helm chart has been modified | No | |
| aws | jupyter-meets-the-earth | Yes | Support helm chart has been modified | No | |
| aws | nasa-ghg | Yes | Support helm chart has been modified | No | |
| gcp | 2i2c-uk | Yes | Support helm chart has been modified | No | |
| gcp | awi-ciroh | Yes | Support helm chart has been modified | No | |
| gcp | meom-ige | Yes | Support helm chart has been modified | No | |
| kubeconfig | utoronto | Yes | Support helm chart has been modified | No | |
| aws | smithsonian | Yes | Support helm chart has been modified | No | |
| gcp | leap | Yes | Support helm chart has been modified | No | |
| gcp | qcl | Yes | Support helm chart has been modified | No | |
| aws | openscapes | Yes | Support helm chart has been modified | No | |
| aws | nasa-veda | Yes | Support helm chart has been modified | No | |
| gcp | catalystproject-latam | Yes | Support helm chart has been modified | No | |
| aws | ubc-eoas | Yes | Support helm chart has been modified | No | |
| aws | catalystproject-africa | Yes | Support helm chart has been modified | No | |
| aws | nasa-cryo | Yes | Support helm chart has been modified | No | |
| gcp | 2i2c | Yes | Support helm chart has been modified | No | |
| aws | victor | Yes | Support helm chart has been modified | No | |
| gcp | cloudbank | Yes | Support helm chart has been modified | Yes | Following prod hubs require redeploy: csulb |
| gcp | callysto | Yes | Support helm chart has been modified | No | |
| aws | 2i2c-aws-us | Yes | Support helm chart has been modified | No | |

Production deployments

| Cloud Provider | Cluster Name | Hub Name | Reason for Redeploy |
| --- | --- | --- | --- |
| gcp | cloudbank | csulb | Following helm chart values files were modified: csulb.values.yaml |

Review thread on the cryptnono resource requests in the diff:

        cpu: 0.005
      requests:
        memory: 64Mi
        cpu: 0.0001
consideRatio (Contributor) commented:

I think the lowest resolution is 1/1024 of a CPU, and I recall that 2m was the practical minimum value for dockerd and containerd. I'm not sure if there is any enforcement to prevent specifying 0.1m or extremely low values, but I'd be inclined not to optimize to this extreme and to go for a value of at least 1m to avoid issues.

yuvipanda (Member, Author) replied:

Good catch, @consideRatio. Based on the 'note' in https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu, I switched to using m units and specified 1m - 0.0001 is definitely not enforceable.
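For illustration, a minimal sketch of what the corrected request block could look like; the exact field path in the cryptnono chart values is assumed, and only the 64Mi memory value and the 1m CPU figure come from this thread:

    resources:
      requests:
        memory: 64Mi
        cpu: 1m  # 1m (0.001 CPU) is the finest granularity the Kubernetes API accepts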

consideRatio (Contributor) left a review comment:

I made a better-safe-than-sorry comment on a detail, but this lgtm!

yuvipanda merged commit 9a16b57 into 2i2c-org:master on Jan 4, 2024
32 checks passed

github-actions bot commented Jan 4, 2024

🎉🎉🎉🎉

Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/7411484364

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this pull request Jan 4, 2024
2i2c-org#3569 changed the cryptnono daemonset to have separate
resource requests for the init containers and the main container.
While working on 2i2c-org#3566, I noticed this was generating
wrong choices - the calculated overhead was too small.

We were intentionally ignoring init containers while calculating
overhead, but it turns out the scheduler and the autoscaler both
take them into consideration. The effective resource request
for a pod is the higher of the resource requests for the containers
*or* the init containers - this ensures that a pod with higher
requests for init containers than containers (like our cryptnono pod!)
will actually run. This is documented at
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#resource-sharing-within-containers,
and implemented in Kubernetes itself at
https://github.com/kubernetes/kubernetes/blob/9bd0ef5f173de3cc2d1d629a4aee499d53690aee/pkg/api/v1/resource/helpers.go#L50
(this is the library code that the cluster autoscaler uses).

This PR updates the two places we currently have that calculate
effective resource requests (I assume eventually these will be
merged into one - I haven't kept up with the team's work last
quarter here).

I've updated the node-capacity-info.json file, which is what seems
to be used by the generator script right now.
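For reference, a minimal Python sketch of the effective-request rule described above, following the Kubernetes documentation linked in the commit message; this is illustrative, not the deployer's actual code, and all names are made up:

    def effective_cpu_request_millicores(container_requests, init_container_requests):
        # Per the Kubernetes docs: a pod's effective request is the larger of
        # (a) the sum of the regular containers' requests and
        # (b) the largest single init container request.
        regular_total = sum(container_requests)
        init_max = max(init_container_requests, default=0)
        return max(regular_total, init_max)

    # Example: a cryptnono-like pod whose init container requests more CPU
    # than its long-running container (values in millicores, illustrative only).
    print(effective_cpu_request_millicores([1], [100]))  # -> 100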