Lost data after a docker upgrade #14461

Open
xmontero opened this issue Dec 2, 2024 · 4 comments

Comments

@xmontero

xmontero commented Dec 2, 2024

Description

This issue is closely related to #14453, but it turned out to be a different kind of problem, which is why I'm spinning it off from there.

After upgrading Docker Desktop in response to the usual "there's a new version" prompt, I (apparently) lost all of my data during the upgrade. At one point today I could not see any images, containers, networks or volumes.

The context

Running on Win11, with an underlying WSL2 layer. This configuration has worked perfectly for several years.

WSL Distro disappearing

For context, the version before upgrading was 4.35, but I installed Docker Desktop so long ago that I used to have 2 WSL distributions, docker-desktop and docker-desktop-data, both listed when I ran wsl -l.

As of now, I don't see docker-desktop-data anymore. But I can't tell for certain whether it disappeared today with this issue, or whether Docker Desktop had already merged it into the new format (the one that changed in 4.30) at some earlier point. I'm not completely sure when the distro vanished.

What I can tell is that until last night I had approximately a dozen containers permanently running on my laptop, all working perfectly with zero issues. Some of them had named volumes, and some networks connected them together. As for the images, I had both downloaded images and images I built myself.

All this was present yesterday and I was working with it normally.

So all this data was either still in the docker-desktop-data WSL distro or had already been merged; either way it was there, perfectly up and running.

What happened today

I'm copying this sequence here again for the reader's convenience, but it was originally posted here: #14453 (comment)

  1. Downloaded the dev version.
  2. Ran the installer.
  3. After the installer finished, I started Docker Desktop.
  4. A black window with many logs appeared and at some point Avast kept complaining. I thought that, since it was an unsigned dev version, Avast might not like it, so I disabled Avast.
  5. Finally the Docker Desktop GUI appeared and I could see the images, containers, volumes, etc. in it, so it seemed I had not lost anything.
  6. I rebooted the PC to check everything was in place.
  7. After rebooting the PC: no images, no containers, nothing...
  8. Running wsl --list I see Ubuntu-20.04 and docker-desktop but no docker-desktop-data.

Key points: at step 5 I could see the data; at step 8 I could not.
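
For anyone wanting to script the check from step 8, here is a minimal sketch. It assumes the list is read from inside a WSL distro via wsl.exe, whose output is normally UTF-16LE (hence the stripping of NUL bytes and carriage returns). The helper name has_distro is made up for this example.

```shell
# Succeed if the distribution named $1 appears, as a whole line,
# in the list of distribution names read from stdin.
# NUL bytes and carriage returns are removed first, because wsl.exe
# output is assumed to be UTF-16LE with CRLF line endings.
has_distro() {
    tr -d '\000\r' | grep -Fxq -- "$1"
}

# Example use from inside a WSL distro (not executed here):
#   wsl.exe --list --quiet | has_distro docker-desktop-data \
#       && echo "present" || echo "missing"
```

The -x flag makes grep match whole lines only, so docker-desktop is not mistaken for docker-desktop-data.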

As a side note, I double-checked: nothing is listed in the GUI, and docker ps -a, docker image ls and docker volume ls all return empty results on the CLI. Nothing at all.

Additional info: after I saw there was no data, I unchecked the "start when windows starts" box and closed Docker Desktop to avoid any further damage. I have not tried relaunching it over and over, to avoid any possible rewriting, re-deletion and whatnot.

Maybe if I just relaunch it the data will appear again... but I wanted to make a backup before going that way.

Virtual disk existence?

After that I checked whether docker_data.vhdx existed (comment here: #14453 (comment)).

This file is 160 GB in size, so I expect it holds my data rather than being an empty virtual disk of that size, but I have not actually checked its contents yet (see the "Backups [x]" section below).

Diagnostics

As requested, I sent the logs here #14453 (comment)

Diagnostics ID: BB5FCCBA-A57A-422E-A449-A09F8402C9FD/20241202115602 (uploaded)

Backups [x]

After apparently losing the data, I was afraid to start Docker Desktop again. I felt it could potentially destroy things, so I did not want to start it "just to try", at least not until I had made some backups.

Once I had located docker_data.vhdx and seen that it is 160 GB, I made a binary backup of the virtual disk on another computer and ran sha1sum to check that I have a good backup.
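
The check can be sketched roughly as follows. This is only an illustration: it assumes both the original and the copy are readable from the same Linux or WSL shell, and the function name and example paths are hypothetical. With the copy on another machine, the same comparison applies to digests computed separately on each side.

```shell
# Compare the SHA-1 digests of an original file ($1) and its backup ($2).
# Exit status: 0 if the digests match, 1 if they differ, 2 if a file is unreadable.
verify_backup() (
    original=$1
    copy=$2
    if [ ! -r "$original" ] || [ ! -r "$copy" ]; then
        echo "cannot read one of the files" >&2
        exit 2
    fi
    # sha1sum prints "<digest>  <filename>"; keep only the digest.
    sum_original=$(sha1sum "$original" | cut -d ' ' -f 1)
    sum_copy=$(sha1sum "$copy" | cut -d ' ' -f 1)
    if [ "$sum_original" = "$sum_copy" ]; then
        echo "OK: digests match ($sum_original)"
    else
        echo "MISMATCH: $sum_original != $sum_copy" >&2
        exit 1
    fi
)

# Example use (hypothetical paths, not executed here):
#   verify_backup /mnt/c/path/to/docker_data.vhdx /mnt/backup/docker_data.vhdx
```

Hashing a 160 GB file takes a while, so this is worth running only once per copy.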

What am I going to do now? I'm going to find out whether I can mount the vhdx read-only somehow, to look at the contents and check whether my data is in it.

  • If I can't mount the vhdx, maybe it's time to start Docker Desktop again and see what happens (now that I have a copy of the disk).
  • If I can mount the virtual disk read-only, then:
    • If I can see my data there, I'll take a deep breath and the only remaining thing will be to find out how to tell Docker Desktop to use it.
    • If I cannot see my data there, the next step will be to find out where the data could be (if it still exists).

Please help!

Reproduce

See description

Expected behavior

Install the "dev build" and, after a reboot, continue seeing my containers, volumes, etc.

docker version

See description

docker info

See description

Diagnostics ID

BB5FCCBA-A57A-422E-A449-A09F8402C9FD/20241202115602

Additional Info

No response

@xmontero
Author

xmontero commented Dec 3, 2024

OK, so... I don't understand anything... but it seems I can access the data now (at least it seems so).

Backup

Now that I have made a backup of docker_data.vhdx I feel safe playing with the original file. In the worst case I can restore the backup and be back where I am now.

Trying direct access to the virtual disk files: FAIL

First, I used Mount-VHD and Dismount-VHD to test if I could access the data directly.

More info here:

So the steps are:

  1. On Windows, Mount-VHD to make a file appear as a disk to Windows.
  2. On Windows, wsl --mount xxxx --bare to make that disk appear as a block device inside the WSL distro.
  3. In the WSL distro, a normal mount to mount the block device in the filesystem.

This is what happened:

In PowerShell as Administrator:

> Write-Output "\\.\PhysicalDrive$(( Mount-VHD -Path C:\Users\xavi\AppData\Local\Docker\wsl\disk\docker_data.vhdx -ReadOnly -PassThru | Get-Disk ).Number)"
\\.\PhysicalDrive1
> wsl --mount \\.\PHYSICALDRIVE1 --bare
The operation completed successfully

Then in WSL, as root, the disk showed up as /dev/sdd:

# lsblk -f /dev/sdd
NAME FSTYPE LABEL UUID FSAVAIL FSUSE% MOUNTPOINT
sdd
# mount /dev/sdd /mnt/xaviDisk/
mount: /mnt/xaviDisk: cannot mount /dev/sdd read-only.

It could not mount anything. So I unmounted it from Windows:

In PowerShell as Administrator:

> wsl --unmount \\.\PHYSICALDRIVE1
The operation completed successfully
> Dismount-VHD -Path C:\Users\xavi\AppData\Local\Docker\wsl\disk\docker_data.vhdx

By contrast, when I tried the same with ...wsl\main\ext4.vhdx instead of ...wsl\disk\docker_data.vhdx, I could see its contents within WSL.

Conclusions so far:

  • I could use Mount-VHD and Dismount-VHD to create a disk from the *.vhdx file and set it ReadOnly
  • I could properly read the contents from wsl\main\ext4.vhdx
  • I could not properly read the contents from wsl\disk\docker_data.vhdx (not mounted).

Docker Desktop again. Will it work?

So at this point I thought "OK, all is lost"... what would I lose if I started Docker Desktop again?

And... boom... the images, the volumes and the containers were (nearly*) all back again.

I swear I saw it empty this morning! I did!!

What?!?

So... this raises a question: where is my data (the data I am currently seeing in Docker Desktop) physically located? If there's no docker-desktop-data distro now, where's my real data?

I know, I know... I should back up the volumes from Docker itself... I'm just curious to understand which *.vhdx file actually contains my data, so I can test whether I could access it manually if I wanted to.

Confidence state

When something fails and then suddenly works, one feels like "hmm... why? And will it fail again?"

For example, I am scared to reboot again. Will I lose the data again after a reboot? Why did it not appear this morning if the data is there tonight? What's the reason? Will it happen again?

Why I said "nearly*" above

The volumes are there, and the containers that have volumes mounted started fine.

In contrast, all the containers that have bind mounts to the real filesystem are stuck.

A few examples:

Image

All of them were running normally yesterday; I just picked a few:

Some that were running yesterday and run tonight: appsmith, grist-prod, grist-test, mercuer, socks5, mysql-prod-5.7 or wordpress-multisite.

Some that were running yesterday and do not start today: devel-sf7, kibana, elasticsearch, followup-devel, devel-sf4, devel-php72

The ones that fail have one thing in common: they have filesystem-bound mount points.

For example I try to start devel-sf7 and it says:

xavi@msi-laptop:/files/repos$ docker start devel-sf7
Error response from daemon: invalid mount config for type "bind": bind source path does not exist: /run/desktop/mnt/host/wsl/docker-desktop-bind-mounts/Ubuntu-20.04/f4aa79e429af3c4da5700e4561167336f3392f73dd44f8c0a11980292eb71663
Error: failed to start containers: devel-sf7
xavi@msi-laptop:/files/repos$
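
To collect the missing paths for several containers, the error text can be filtered. This is a rough sketch that assumes the daemon message keeps the wording shown above; the helper name missing_bind_source is made up for the example.

```shell
# Read docker error output on stdin and print the path that follows
# "bind source path does not exist: ". Print nothing for other lines.
missing_bind_source() {
    sed -n 's/.*bind source path does not exist: //p'
}

# Example use (not executed here):
#   docker start devel-sf7 2>&1 | missing_bind_source
#
# The mounts a container is configured with can be listed with:
#   docker inspect --format \
#     '{{range .Mounts}}{{.Type}} {{.Source}} -> {{.Destination}}{{"\n"}}{{end}}' devel-sf7
```

Comparing the configured bind sources with the paths reported as missing should show whether every failing container points under the same docker-desktop-bind-mounts directory.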

Image

IDK if this is related to all the upgrade process and so on.

So... questions

  • I see the data now but... should we investigate why I did not see it this morning? (I have the backup of the *.vhdx if it's needed for exploring.)
  • Can I rely on it not disappearing again? Can I reboot my PC?
  • Where are the files for everything stored now that there's no docker-desktop-data VM/distro? How can I check?
  • Why are the bind mounts all broken? Should I throw away those containers and relaunch them? Or should we take advantage of the current state and look into it to uncover any underlying bug?

Thanks in advance!

@xmontero
Author

xmontero commented Dec 3, 2024

OK... I'm going crazy...

I re-checked the "Start docker when starting windows" box.

Rebooted... and... you know what? Lost again!!

Image

Image

I'm starting to think that:

  • When Windows starts, maybe a different version of Docker Desktop is launched. (The GUI says it's v4.36.)
  • When Windows starts, maybe the new devel build is launched before some files it needs are ready.

Then I clicked on the systray icon to stop it (it spent about 3 minutes stopping "nothing") and then opened it again from the Start menu...

Image

and... you know what??? they are back again!!!

Image

Also showing v4.36.

Partial conclusion

So it seems the installer did not delete the data files; rather, for some reason the new version sometimes picks up the data files and sometimes does not.

I'll be more than happy to help debug what's going on. Let me know what to do.

@andrea-reale

andrea-reale commented Dec 3, 2024

@xmontero Thanks for the detailed explanation and for furthering the investigation. Really interesting finds.
I am glad to see that your data is not lost! Now I'd also like to figure out why the data seems to disappear when auto-starting Docker Desktop on boot.

Some answers to your questions:

.. where is my data (the one that I am currently seeing in the docker desktop) physically located?

Your data is in docker_data.vhdx as you correctly identified. Docker Desktop doesn't need an extra docker-desktop-data distribution and manages its own VHDX, using a similar mechanism to what you've used to explore the virtual disk (wsl.exe --mount --bare <....>\docker_data.vhdx). Therefore, as you have now backed it up, you are safe to experiment with it.

Where are the files now stored for everything, now that there's no docker-desktop-data VM/distro? How can I check it out?

You can check the content of your virtual disk by manually mounting the disk in a WSL distribution, much like you did in your exploration above. For example wsl.exe --mount --vhd <....>\docker_data.vhdx --name mymountpoint.
Then you can access it, for example, from your ubuntu distribution with:

> wsl -d ubuntu
$ cd /mnt/mymountpoint

You'll see there's a data folder in there. That folder contains a number of subdirectories which contain the data used by the Linux Docker daemon. They are not straightforward to inspect manually, but they use exactly the same format that the Docker daemon would use on Linux.

Remember to run wsl.exe --unmount <...>\docker_data.vhdx before starting Docker Desktop again.


Now, moving on to the real question of why you don't see the data after a Windows reboot: I'd ask you to try the following steps if you are happy to help debug.

Please, make sure to have a backup copy of docker_data.vhdx before proceeding.

Check the Docker Desktop version running when containers are present and when they are not

I'd expect them to be the same, but let's make sure of it. You can check the exact version (also known as build number) by clicking on About Docker Desktop on the tray icon. The build number is next to the main version, for example: 4.36.0 (176956).

Ensure you are running the WSL2 backend in both cases.

The other backends (Hyper-V and Windows Containers) use a different storage drive, so that could explain why you don't see the data.

  • Confirm you are using WSL2: open Settings -> General. Make sure that Use the WSL2 based engine is selected.
  • Confirm you are not using Windows containers: right-click on the tray whale icon. There should be an entry "switch to windows containers": that's what I'd expect. If instead you see an entry saying "switch to linux containers" it means that for some reason you are now using the windows containers backend.

Gather diagnostics immediately after starting

I'd kindly ask you to gather two more diagnostics. One immediately after starting Docker Desktop when the containers are missing. One more immediately after starting Docker Desktop when the containers are present. This will help me compare the logs and possibly spot the differences.

You can gather the diagnostics from Docker Desktop by clicking on the question mark icon on the top of the dashboard ("Troubleshoot") and then Get support.

Try restarting the engine

After starting Docker Desktop with missing containers, could you try hitting "Restart" in the three dots menu on the bottom-left, next to the green "Engine running" label? Does that bring the containers back?

Manually explore the content of the /var/lib directory

Again this will help you manually compare the content of the docker folders when containers are present vs when containers are missing.

  1. Start Docker Desktop
  2. Enter the docker desktop distribution with wsl -d docker-desktop
  3. cd /tmp/docker-desktop-root/var/lib/docker
  4. Visually compare the content of this folder in the two cases.
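
If comparing by eye is error-prone, the listing from each case can be saved and diffed. This is a minimal sketch that assumes ls, sort and diff are available in the shell where it runs; the function name and output file names are placeholders, and the snapshot files need to live somewhere that survives a Docker Desktop restart.

```shell
# Print a sorted listing of the entries directly inside directory $1.
snapshot_dir() {
    ls -A1 "$1" | sort
}

# Example use (not executed here), once in each state:
#   snapshot_dir /tmp/docker-desktop-root/var/lib/docker > containers-present.txt
#   snapshot_dir /tmp/docker-desktop-root/var/lib/docker > containers-missing.txt
# Then show the entries that differ:
#   diff containers-present.txt containers-missing.txt
```

diff exits with status 0 when the two listings are identical and 1 when they differ.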

@xmontero
Author

xmontero commented Dec 3, 2024

Hi @andrea-reale, thanks for your guidance.

I will try all of that and report back. However, I run a travel agency and I need to make up for the time lost yesterday and today: I need to send quotes to clients, sort out travel documents, and all that kind of thing.

I am writing to let you know that I received your guidance and will act on it ASAP, but probably not today or tomorrow. If I don't sell trips, I don't eat :) you know!!

Just an ACK: all received and in the queue to be processed!
