Get rid of NFS dependency for in-cluster building #1798
Comments
Has there been any progress on this issue? We're still having quite a few operational headaches with NFS in our build flows. For example, we have to clear images, but then the cleanup command fails, which leads to us having to remove and remount volumes manually, along with a lot of minor problems that come up in between. If that dependency could be mitigated, I think it would help greatly.
We unfortunately had to defer this until post 0.12, but now that 0.12 is out, I'm going to see again about getting this done.
This issue has been automatically marked as stale because it hasn't had any activity in 60 days. It will be closed in 14 days if no further activity occurs (e.g. changing labels, comments, commits, etc.). Please feel free to tag a maintainer and ask them to remove the label if you think it doesn't apply. Thank you for submitting this issue and helping make Garden a better product!
This issue has been automatically marked as stale because it hasn't had any activity in 90 days. It will be closed in 14 days if no further activity occurs (e.g. changing labels, comments, commits, etc.). Please feel free to tag a maintainer and ask them to remove the label if you think it doesn't apply. Thank you for submitting this issue and helping make Garden a better product!
Background
One of the most frustrating issues with our remote building feature is its reliance on shareable (ReadWriteMany) volumes, which currently works through NFS, and optionally through other RWX-capable volume provisioners such as EFS. This has turned out to create a significant maintenance burden and operational issues for our users.
The reason we currently have this requirement is that, in order to sync code from the user's local build staging directory, we rsync to an in-cluster volume, which needs to be mountable by the in-cluster builder, whether that is an in-cluster Docker daemon or kaniko.
At the time we didn't see an alternative that would be reasonably performant. We also had a performance goal in mind that may simply not be as important anymore. The raw efficiency of rsync is appealing, but in many (even most) cases the sync of code is only a small part of the overall build time.
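To make the sync step concrete, here is a minimal sketch of how an rsync invocation from a local staging directory to an in-cluster rsync endpoint could be assembled. The function name, host, port, module name, and destination path are all hypothetical and are not taken from Garden's code; only the rsync flags (`-a`, `-z`, `--delete`) and the `rsync://HOST:PORT/MODULE/PATH` URL form are standard rsync behaviour.

```python
from typing import List


def rsync_args(staging_dir: str, host: str, port: int, module: str, dest: str) -> List[str]:
    """Build an rsync command line (hypothetical helper, not Garden's API)."""
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    src = staging_dir.rstrip("/") + "/"
    # rsync daemon URL: the first path component is the daemon module name.
    target = f"rsync://{host}:{port}/{module}/{dest.strip('/')}/"
    # -a: archive mode, -z: compress in transit,
    # --delete: remove files on the target that no longer exist locally.
    return ["rsync", "-az", "--delete", src, target]


if __name__ == "__main__":
    # Assumed values: a port-forwarded sync endpoint on localhost.
    print(" ".join(rsync_args("/tmp/build/api", "localhost", 8730, "volume", "user/api")))
```

Whatever the exact command looks like in practice, the point relevant to this issue is that the target volume on the receiving end is what the builder currently has to mount.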
Proposed solution
We can avoid this requirement altogether at the cost of some added code complexity in the actual build flows, which in turn removes the complexity of managing finicky storage providers.
We still rsync over to the cluster, but instead of directly mounting the sync volume, we modify the flow to allow using a simpler RWO volume.
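As an illustration of what the change means at the volume level, the sketch below builds a PersistentVolumeClaim manifest for the sync volume with either access mode. The `accessModes` values and manifest fields are standard Kubernetes; the helper name, the default size, and the use of `build-sync-v2` as the claim name are assumptions made for the example, not the actual chart contents.

```python
import json


def build_sync_pvc(name: str = "build-sync-v2", size: str = "10Gi", shared: bool = False) -> dict:
    """Return a PersistentVolumeClaim manifest as a plain dict (illustrative only)."""
    # ReadWriteMany (RWX) needs a provisioner such as NFS or EFS;
    # ReadWriteOnce (RWO) is supported by most default storage classes.
    access_mode = "ReadWriteMany" if shared else "ReadWriteOnce"
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "resources": {"requests": {"storage": size}},  # size is an assumed default
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_sync_pvc(), indent=2))
```

With an RWO claim, only the Pod that owns the sync volume mounts it, so no RWX-capable provisioner needs to be installed or maintained.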
Build flow
Depending on the build mode, we do the following:
Migration
We change the name of the `build-sync` service to `build-sync-v2`, and `docker-daemon` to `docker-daemon-v2` (and the helm release names accordingly). This is to avoid conflicts during rollout since they both function differently from the prior versions, and cannot cover both client versions simultaneously.
Users need to be instructed to remove the old `build-sync` and `docker-daemon` deployments and volumes manually, as well as the NFS provisioner, when their team has updated to the new version. Or uninstall and re-init completely, of course. We can print out a message to this effect in the cluster-init command.
The current `storage.sync` parameters still apply to the new `build-sync-v2` volume without any changes.
We still always install `build-sync-v2`, even though it isn't necessary when using the `cluster-docker` build mode, in order to avoid headaches around using cluster-docker and kaniko in different scenarios on the same cluster.
If it is installed, the `cleanup-cluster-registry` script now also executes through the rsync container in the docker daemon Pod, in addition to the `build-sync-v2` Pod.
Benefits
Drawbacks
Prioritization
I'd say sooner is better for this, since this causes annoying operational issues for users, and support issues for the Garden team by extension.