This repository has been archived by the owner on Jan 17, 2018. It is now read-only.

Share is mounted multiple times which deletes all data from the share #23

Closed
sascha-egerer opened this issue Feb 29, 2016 · 10 comments

Comments

@sascha-egerer (Author)

As described in #19, shares are sometimes mounted multiple times.
This is a big problem because the unmount function removes the mount path: the volume is only unmounted once and then the folder is removed recursively, which deletes everything from the share if it is still mounted.

I think there are these tasks:

  1. The folder should not be removed recursively, to prevent the deletion of data on the share
     (currently: if err := os.RemoveAll(path); err != nil {)
  2. The mount should only be done if it is not already mounted
  3. Ensure that the volume is correctly unmounted even if a problem occurs after the mount
$ mount
/dev/sda1 on / type ext4 (rw,discard)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/cgroup type tmpfs (rw)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
udev on /dev type devtmpfs (rw,mode=0755)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
none on /run/shm type tmpfs (rw,nosuid,nodev)
none on /run/user type tmpfs (rw,noexec,nosuid,nodev,size=104857600,mode=0755)
none on /sys/fs/pstore type pstore (rw)
none on /etc/network/interfaces.dynamic.d type tmpfs (rw,noexec,nosuid,nodev,size=64K)
/dev/sdb1 on /mnt type ext4 (rw)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)
//foo.file.core.windows.net/my-file-storage on /run/docker/volumedriver/azurefile/my_my-file-storage type cifs (rw)
//foo.file.core.windows.net/my-config on /run/docker/volumedriver/azurefile/my_my-config type cifs (rw)
//foo.file.core.windows.net/my-config on /run/docker/volumedriver/azurefile/my_my-config type cifs (rw)
//foo.file.core.windows.net/my-file-storage on /run/docker/volumedriver/azurefile/my_my-file-storage type cifs (rw)
//foo.file.core.windows.net/my-file-storage on /run/docker/volumedriver/azurefile/my_my-file-storage type cifs (rw)
@ahmetb (Contributor) commented Feb 29, 2016

Thanks for reporting, seems like a valid issue. I was not aware that mount would behave like this and create duplicate entries.

For (1), os.RemoveAll is only executed when umount succeeds, so I wonder whether Remove vs RemoveAll makes any difference at all. The gain of RemoveAll is that if, for some reason, the mountpoint path doesn't exist, we don't get a failure: RemoveAll ignores non-existent path errors. That case would probably occur when someone has manually unmounted and removed the path but then runs docker volume rm. If we use Remove (and not RemoveAll), we'd have to check whether the error we get is a non-existent path error (os.IsNotExist).
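
A minimal sketch of the Remove variant with that check (illustrative package and function names, not necessarily what the driver should ship):

    package volumedriver

    import (
        "fmt"
        "os"
    )

    // removeMountpoint deletes only the (empty) mountpoint directory.
    // A path that is already gone is not treated as an error; anything
    // else, such as "directory not empty", is surfaced instead of having
    // the directory's contents wiped the way os.RemoveAll would.
    func removeMountpoint(path string) error {
        if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
            return fmt.Errorf("error removing mountpoint %s: %v", path, err)
        }
        return nil
    }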

For (2), yes, this certainly sounds like a bug; I was not aware that mount would create duplicate entries.

For (3), this goes beyond my knowledge: can mount partially succeed and create an entry, so that we need to try to umount even when it fails? I see that we could remove the mountpoint in case of a mount error (although it's probably harmless to leave it around), but I am not sure that umount would be necessary for a failed mount.

@sascha-egerer (Author)

For (1), I've updated the pull request and added a !os.IsNotExist check.

For (2), I think we should give docker/docker/pkg/mount another try, as it already handles such cases.
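
A minimal sketch of that check with docker/pkg/mount's Mounted helper (names and options here are illustrative, not the driver's actual code):

    package volumedriver

    import "github.com/docker/docker/pkg/mount"

    // mountOnce mounts the CIFS share at mountpoint only if nothing is
    // mounted there yet, so repeated Mount calls from the engine do not
    // stack duplicate entries in the mount table. source and options are
    // placeholders for whatever the driver builds from the storage
    // account credentials.
    func mountOnce(source, mountpoint, options string) error {
        mounted, err := mount.Mounted(mountpoint)
        if err != nil {
            return err
        }
        if mounted {
            // Already mounted; do not add another entry.
            return nil
        }
        return mount.Mount(source, mountpoint, "cifs", options)
    }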

For (3), it was more about an error that happens in the driver implementation after the mount. But it looks like we don't need anything here, so forget about it.

@ahmetb (Contributor) commented Mar 1, 2016

@sascha-egerer while implementing mount I actually looked at the implementation in docker/pkg/mount, and it still appears that the Mount func works only if there is nothing mounted at the path. [godoc] I'm not clear how this package handles such cases.

Your pull request with !os.IsNotExist still will not solve the problem: if you mount the same volume into multiple containers, you'll end up with duplicate mount entries, and the first attempt to unmount (and thus the call to os.Remove) will fail because the directory is not empty, right?

Docker Engine doesn't keep track of mounted volumes, so it goes on and issues another Mount call to the driver when a container is created. We can be smart in the driver: we can see whether a path already has mounts and prevent further duplication. BUT when it comes to unmounting, we can't tell for sure whether any other containers are still running and using this mount.

The best solution I can think of is to let these duplicate mounts happen and delete the mountpoint only once there's finally nothing mounted anymore on that path. This assumes the umount command actually removes a single mount entry among the multiple duplicates and not all of them at once.

@sascha-egerer (Author)

@sascha-egerer while implementing mount I actually looked at the implementation in docker/pkg/mount, and it still appears that the Mount func works only if there is nothing mounted at the path. [godoc] I'm not clear how this package handles such cases.

Isn't that what the driver should do? I don't think that mounting multiple times is a good idea.
There is also a ForceMount, which mounts even if the path is already mounted. But I really dislike that.

Your pull request with !os.IsNotExist still will not solve the problem: if you mount the same volume into multiple containers, you'll end up with duplicate mount entries, and the first attempt to unmount (and thus the call to os.Remove) will fail because the directory is not empty, right?

It does solve the problem of data being accidentally deleted. If everything works correctly, the folder should always be empty when it is removed. If not, something went wrong, and that should be visible and not silently ignored.

Docker Engine doesn't keep track of mounted volumes, so it goes on and issues another Mount call to the driver when a container is created. We can be smart in the driver: we can see whether a path already has mounts and prevent further duplication. BUT when it comes to unmounting, we can't tell for sure whether any other containers are still running and using this mount.

Does the Docker Engine not use its own API? I have really no experience with Docker and Go development, which is why I can't say much about that. But if I look at the Docker Volume Plugin API, it looks good to me. There is the List command, which lists all volumes with their paths. So the path could be compared against this list, and if there is nothing mounted on the path we can remove it.

@ahmetb (Contributor) commented Mar 1, 2016

If everything works correctly, the folder should always be empty when it is removed.

This holds true only when you don't mount multiple times. It's easy to prevent duplicate mounts; however, it's hard (read: impossible?) to tell when to actually unmount. If two containers use the same volume and you stop one, Docker will issue an /Unmount call (even though the other container is still using it), and there's no way to find out whether the volume is still in use by the other running container.

Does the Docker Engine not use its own API?

It appears that it does not... 😦 I will look into this closely to find out whether Docker can do better here, and open an issue at docker/docker if that's the case. The /List API only returns created volumes on that machine (a volume can be "created" but not yet "mounted"), so /List will not really help us here.
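
For reference, the /VolumeDriver.List response is roughly this shape (a sketch with illustrative Go type names): it carries only volume names and mountpoints, nothing about active mounts.

    package volumedriver

    // Rough shape of the /VolumeDriver.List response in the volume plugin
    // protocol. It reports only created volumes and their mountpoint
    // paths; it says nothing about how many containers currently have a
    // path mounted, which is what Unmount would need to know.
    type listVolume struct {
        Name       string
        Mountpoint string
    }

    type listResponse struct {
        Volumes []listVolume
        Err     string
    }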

@ahmetb (Contributor) commented Mar 1, 2016

Indeed docker is being really dumb here.

So I have:

docker volume create --name=vol1 -d azurefile [...]
docker run -d -v vol1:/data --name=c1 [...]
docker run -d -v vol1:/data --name=c2 [...]

docker run c1:

DEBU[0110] request accepted                              name=vol1 operation=path
DEBU[0111] request accepted                              name=vol1 operation=mount

docker run c2:

DEBU[0123] request accepted                              name=vol1 operation=mount

so at this point, mount list has:

...
//k4portalvhdsrmq6gptnqhb5.file.core.windows.net/share on /run/docker/volumedriver/azurefile/vol1 type cifs (rw)
//k4portalvhdsrmq6gptnqhb5.file.core.windows.net/share on /run/docker/volumedriver/azurefile/vol1 type cifs (rw)

docker stop c1, plugin logs:

DEBU[0147] request accepted                              name=vol1 operation=unmount
ERRO[0147] error removing mountpoint: remove /var/run/docker/volumedriver/azurefile/vol1: device or resource busy  name=vol1 operation=unmount

But it stops the container regardless. docker.log has this line:

WARN[0427] 1b07b927c0d99948809009e11e441715a0f6f308aabbbb5b1d760e4af7c5ff70 cleanup: Failed to umount volumes: VolumeDriver.Unmount: error removing mountpoint: remove /var/run/docker/volumedriver/azurefile/vol1: device or resource busy

And the unmount operation did actually execute:

  • umount succeeded; one of the duplicate entries is gone
  • os.RemoveAll failed with “device or resource busy” because the path is still mounted (so your os.Remove would fail here too)

So at this /Unmount operation, apparently what I need to do is:

  • run umount regardless
  • check whether the mountpoint has active mounts (using docker/pkg/mount); if not, os.Remove(mountpoint), else no-op.

In this solution, I will let duplicate mounts happen; otherwise there is no way of telling whether it is okay to os.Remove the dir. These duplicate mount entries are actually helping us.
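
A rough sketch of that /Unmount flow, assuming docker/pkg/mount's Unmount and Mounted helpers (the package and function names are illustrative, not the driver's actual code):

    package volumedriver

    import (
        "os"

        "github.com/docker/docker/pkg/mount"
    )

    // unmountAndMaybeRemove unmounts once, then deletes the mountpoint
    // directory only when no mount entries remain for that path, i.e.
    // when no other container is still using the volume.
    func unmountAndMaybeRemove(mountpoint string) error {
        if err := mount.Unmount(mountpoint); err != nil {
            return err
        }
        stillMounted, err := mount.Mounted(mountpoint)
        if err != nil {
            return err
        }
        if stillMounted {
            // Another container still has this volume mounted; keep the dir.
            return nil
        }
        if err := os.Remove(mountpoint); err != nil && !os.IsNotExist(err) {
            return err
        }
        return nil
    }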

@ahmetb (Contributor) commented Mar 1, 2016

I just submitted a patch implementing the logic I described above: I don't prevent duplicate mount entries, as they're helpful in determining whether the path still has active mounts, and while unmounting I remove the mountpoint (with os.Remove) only if no active mounts remain for the path (determined by reading the mount table from /proc/self/mountinfo).
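
For illustration, counting the remaining mount entries for a path by reading /proc/self/mountinfo could look roughly like this (a sketch, not the merged patch):

    package volumedriver

    import (
        "bufio"
        "os"
        "strings"
    )

    // activeMounts counts the entries in /proc/self/mountinfo whose mount
    // point (the fifth whitespace-separated field) equals the given path.
    // With duplicate mounts allowed, the mountpoint directory is only safe
    // to remove once this count reaches zero.
    func activeMounts(mountpoint string) (int, error) {
        f, err := os.Open("/proc/self/mountinfo")
        if err != nil {
            return 0, err
        }
        defer f.Close()

        count := 0
        scanner := bufio.NewScanner(f)
        for scanner.Scan() {
            fields := strings.Fields(scanner.Text())
            if len(fields) >= 5 && fields[4] == mountpoint {
                count++
            }
        }
        return count, scanner.Err()
    }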

I think it solves your problem here, based on my testing. I will go ahead and merge that; please try downloading v0.2 in a minute.

@ahmetb closed this as completed in #25 on Mar 1, 2016
@ahmetb (Contributor) commented Mar 2, 2016

@sascha-egerer did you have a chance to try out v0.2? It should address this issue and the other problems caused by it.

@sascha-egerer (Author)

@ahmetalpbalkan I've now updated to v0.2 and could not find any problems so far. But I'm still very unhappy with the multiple mounts: if I start 100 containers, I have 100 active mounts, which really does not make sense. I would still try to prevent that...

@ahmetb (Contributor) commented Mar 3, 2016

@sascha-egerer as I said, apparently there's no way to tell whether it's okay to Unmount that one single entry, because other containers might still be using it. It's a docker-engine issue. I'll open an issue over there soon.
