Customise Docker volume #1834

Closed
mausch opened this issue Sep 5, 2020 · 11 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@mausch

mausch commented Sep 5, 2020

It would be great if users could customise the docker volume created by kind on cluster creation.

Currently the volume is anonymous so it's hard to identify it in docker volume ls.
Also, because it grows large with all the images and other data, I'd like to put it on a different host partition instead of the default /var/lib/docker/volumes.

I guess the implementation for this would go in

func createAnonymousVolume(label string) (string, error) {
?
(or just write a different function)

@mausch mausch added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 5, 2020
@BenTheElder
Member

All docker volumes can grow arbitrarily large, is there a reason to not move all of them?

What particular volumes we create is an implementation detail subject to change at any time.
Making them configurable would prevent us from changing that.

@mausch
Author

mausch commented Sep 7, 2020

Hi @BenTheElder , thanks for the quick reply!
Like many people, my desktop machines have several disks/partitions with different properties (speed, capacity, filesystem, etc)
I know exactly what containers I'm running and which volumes will actually grow large and what they contain so I will schedule them accordingly on my disks.
For example, I have a machine with a primary 256GB SSD and a secondary 2TB HDD. At work I have a Postgres database that won't fit on the SSD so I put that volume on the HDD. I also run Elasticsearch but I know that the data I index there will fit on the SSD so I leave it there.
As for kind, its volume on the machine I'm testing it on is 6GB just from infrastructure images, and I haven't even started deploying my own applications (some of which are unfortunately pretty large; one in particular is 5GB!).

@BenTheElder
Member

I don't think docker provides a mechanism to place an anonymous volume on a different disk.

If there is a mechanism that I'm not aware of to place anonymous volumes on another disk, we could support an environment variable to point kind to a different anonymous volume location.

I don't think we should switch to named volumes for these as that guarantees users will expect to depend on them, and this is a highly internal detail that can and has changed in order to keep kind functioning well.

@mausch
Author

mausch commented Sep 8, 2020

I don't think docker provides a mechanism to place an anonymous volume on a different disk.

This works for me:

docker volume create --opt type=none --opt device=/path/to/some/other/disk --opt o=bind

If there is a mechanism that I'm not aware of to place anonymous volumes on another disk, we could support an environment variable to point kind to a different anonymous volume location.

That sounds great! 👍

I don't think we should switch to named volumes for these as that guarantees users will expect to depend on them, and this is a highly internal detail that can and has changed in order to keep kind functioning well.

Well, I'd argue it's not really internal since anyone can just docker volume ls and see and manipulate the volume, and not setting a name just obscures things unnecessarily. But fair enough, if need be people can get the volume from a docker inspect kind-control-plane or something, and kind doesn't need to change. Also just to be super clear, volume naming is just a minor quibble compared to being able to relocate the volume 🙂

Ultimately I think this issue isn't too different from what happened with the network customisation. People running kind on different environments with different capabilities will need to customise how kind will run.

@BenTheElder
Member

This works for me:

That's not an anonymous volume. Anonymous volumes sometimes even come from the image spec, and they are bound to the container lifecycle.

Obscured volume names that are bound to the lifecycle of the node do very much signal that the existence and contents of these are not to be relied on. We've changed them multiple times to fix things. We also avoid cleanup bugs by letting docker bind their lifecycle to the container (anonymous + RM option)

Well, I'd argue it's not really internal since anyone can just docker volume ls and see and manipulate the volume, and not setting a name just obscures things unnecessarily. But fair enough, if need be people can get the volume from a docker inspect kind-control-plane or something, and kind doesn't need to change. Also just to be super clear, volume naming is just a minor quibble compared to being able to relocate the volume 🙂

You can't really do anything here that you couldn't do more sanely with host paths inside the node, which is the expected approach. You can't change the volume itself in any way. You can use the contents, but then you risk depending on arbitrary volumes, which users probably won't do and which we explicitly will not support.

Ultimately I think this issue isn't too different from what happened with the network customisation. People running kind on different environments with different capabilities will need to customise how kind will run.

Which network customization are you referring to? We don't really have anything comparable to this. The network used is a single entity that is unavoidably public and we have intentionally documented it.

@BenTheElder
Member

I recommend moving your anonymous volumes storage under dockerd to the high capacity disk and opting specific known public volumes into using the fast disk instead. Docker already has a configurable data root etc.

I'm not keen on creating bind mounts for these (where users will then depend on the contents being present on the host) and it's not portable anyhow. Bind mounts have different meaning across the docker desktop platforms.

How docker internally implements the existing volumes is something users could try to work around but then they're depending on docker's internals which we're obviously not going to support and should be a red flag when thinking about doing this.

@mausch
Author

mausch commented Sep 9, 2020

That's not an anonymous volume.

Ah, you mean anonymous at container creation time, gotcha. 👍
Yeah, I see that's what happens in

.

In that case, this should do it:

docker run --name test --mount 'type=volume,target=/var,volume-opt=type=none,volume-opt=o=bind,volume-opt=device=/path/to/some/other/disk' ubuntu

That volume will be bound to the lifecycle of the container, though obviously not its contents. Also, granted it's not portable.
Then again, if you define something like that you should know what you're doing and IMHO that's out of the scope of kind.
All I'm proposing is that kind lets you pass such options to the creation of the volume through an optional and unsupported environment variable like you did with KIND_EXPERIMENTAL_DOCKER_NETWORK.
Hopefully the change in kind should be pretty small: firstly change

from using --volume to using --mount (which BTW is recommended according to the docs for some reason), then check for that variable and merge/append its contents.
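
The proposal in this comment could be sketched roughly as follows in Go. This is only an illustration under stated assumptions, not kind's actual code: the variable name KIND_EXPERIMENTAL_DOCKER_VOLUME_OPTS and the helper mountArg are hypothetical and do not exist in kind. The sketch builds the value of a `docker run --mount` flag for a volume at a given target and appends any comma-separated fields supplied by the user. With no extra options, the result has no source field, so Docker would create an anonymous volume.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// volumeOptsEnv is a hypothetical variable name used only in this sketch;
// kind does not define it.
const volumeOptsEnv = "KIND_EXPERIMENTAL_DOCKER_VOLUME_OPTS"

// mountArg builds the value for `docker run --mount` for a volume mounted at
// target. extraOpts is a comma-separated list of additional fields (for
// example "volume-opt=type=none,volume-opt=o=bind") that are appended
// unchanged. Empty fields are skipped.
func mountArg(target, extraOpts string) string {
	fields := []string{"type=volume", "target=" + target}
	for _, opt := range strings.Split(extraOpts, ",") {
		if opt = strings.TrimSpace(opt); opt != "" {
			fields = append(fields, opt)
		}
	}
	return strings.Join(fields, ",")
}

func main() {
	// With the variable unset this yields an anonymous volume at /var.
	fmt.Println("--mount", mountArg("/var", os.Getenv(volumeOptsEnv)))
}
```

The sketch does no validation, so a user-supplied field such as a second target= would be passed to Docker as-is, and option values that themselves contain commas would be split incorrectly.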

Which network customization are you referring to?

I meant this: #1538

@BenTheElder
Member

That volume will be bound to the lifecycle of the container, though obviously not its contents. Also, granted it's not portable.

The volume and its contents are supposed to be created and deleted with the container.

from using --volume to using --mount (which BTW is recommended according to the docs for some reason), then check for that variable and merge/append its contents.

AFAICT that recommendation does not apply to anonymous volumes, like we're using here. As I said before some of these volumes have even come from VOLUME in the image instead.

All I'm proposing is that kind lets you pass such options to the creation of the volume through an optional and unsupported environment variable like you did with KIND_EXPERIMENTAL_DOCKER_NETWORK.

This is just one object name, and I think we're going to remove this, it's not necessarily a good pattern, and we're mostly seeing users try to use networks that won't work anyhow.

I think a bit of additional context is still missing here ...

  • These volumes only exist because we need to avoid overlay on overlay, or for minor performance tweaks in the future. If possible we'd use no volumes and you'd need to schedule the container on another disk anyhow by changing the docker data root.

  • Which volumes exist has changed extremely recently and is likely to change again soon. The particular mount paths where we use volumes are undocumented for a reason: they only exist to keep kind functional and are not for consumption. We will likely split /var into a number of specific subsets of /var again in the future. We might move some in-memory. We may change paths that components use within the nodes.

The current PR design will be broken by these changes.

@BenTheElder
Member

If I set just the volume option for a device path like /dev/sda2 (not a bind mounted path) what happens?

If we have to switch to bind mounts that we generate ourselves instead of arbitrary customization of the volumes and trying to reconcile which volumes exist in the future, perhaps we could support a concept akin to DOCKER_DATA_ROOT.

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 9, 2020
@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 8, 2021
@BenTheElder BenTheElder removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 22, 2021
@kubernetes-sigs kubernetes-sigs deleted a comment from fejta-bot Jan 22, 2021
@kubernetes-sigs kubernetes-sigs deleted a comment from fejta-bot Jan 22, 2021
@BenTheElder
Member

I think I have a more elegant solution for the next revision: we can support node extraMount entries that clash with automatically provided volumes by simply taking the extraMount instead of the volume we would have used, while allowing volumes there #1966

Then you can create a docker volume as you see fit and mount it to /var without the awkward volume options env, and if kind changes the volumes used in the future you can continue to select portions of the filesystem to be redirected to your volumes which will be unioned with the volumes we actually create.
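
The clash-resolution idea described here might look something like the Go sketch below. It is an assumption-laden illustration, not the design in #1966: the Mount type is a simplified stand-in rather than kind's real config type, effectiveDefaults is a hypothetical helper, and the default path list is assumed from this thread. The idea shown is that any automatically provided volume whose container path matches a user-supplied mount is dropped in favor of the user's mount, while the remaining defaults are kept.

```go
package main

import (
	"fmt"
	"path"
)

// Mount is a simplified stand-in for a node mount in this sketch;
// it is not kind's actual config type.
type Mount struct {
	HostPath      string // host path or docker volume name
	ContainerPath string // path inside the node container
}

// effectiveDefaults returns the default volume paths that are not overridden
// by a user mount at the same (cleaned) container path.
func effectiveDefaults(defaults []string, extra []Mount) []string {
	overridden := make(map[string]bool, len(extra))
	for _, m := range extra {
		overridden[path.Clean(m.ContainerPath)] = true
	}
	var kept []string
	for _, d := range defaults {
		if !overridden[path.Clean(d)] {
			kept = append(kept, d)
		}
	}
	return kept
}

func main() {
	defaults := []string{"/var"} // assumed default for illustration; may change
	extra := []Mount{{HostPath: "kind-var", ContainerPath: "/var/"}}
	// The user's mount at /var replaces the automatic volume at /var.
	fmt.Println(effectiveDefaults(defaults, extra))
}
```

Only exact path matches are treated as clashes in this sketch. A user mount at a subpath such as /var/lib would sit alongside the default volume rather than replace it.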

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 22, 2021
@kubernetes-sigs kubernetes-sigs deleted a comment from fejta-bot Apr 22, 2021
@BenTheElder BenTheElder removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 22, 2021
@BenTheElder
Member

#1966 subsumes this.
