Difficulties using backup to skip initial sync #174

Open
magks opened this issue May 30, 2023 · 10 comments

@magks

magks commented May 30, 2023

Has anyone had any luck using a backup or a more powerful machine to complete the initial sync for another machine?

I have a weaker machine with an HDD that was struggling to complete the initial sync (at around 400k blocks it slowed to between a couple hundred and 1k addrs/s and 2-5 blocks/s). So I installed Bitcoin Core and the latest Fulcrum (1.9.1) on a more powerful machine. It completed the initial sync, and I stopped Fulcrum to transfer the db/ files over to the weaker machine. However, after first seeing an error mentioning something about headers, I realized I had Fulcrum 1.8.1 on the weaker machine. After installing 1.9.1 and dropping the db files into place I still see an error:

FATAL: Caught exception: Error opening meta database: Corruption: unknown checksum type 4 from footer of /home/fulcrum/.fulcrum/db/meta/002785.sst, while checking block at offset 959 size 32
Not sure what other info is relevant, but the weaker machine is running Fulcrum in a Docker container built from the debian:bullseye-slim image, which downloads the 1.9.1 x86_64 release from GitHub, while the machine that did the sync is not using Docker and installed 1.9.1 via Arch Linux pacman.

Any help on how to properly do this kind of drop-in from machine A to machine B, or any info on how I might go about diagnosing what could be wrong, would be appreciated.

Cheers

@SuBPaR42

Moving a completed sync (even if not now current - i.e., a few weeks behind) to another machine works. I wasn't able to do as you are trying either.

The old days of E-X had what we called the "foundry", which included completed backups of the DB, as their process of creating the DB from genesis was rather time-consuming. Fulcrum is ~4x faster than E-X at DB creation in my experience (Ryzen5 3600, 32GB RAM, BitcoinD on HDD and Fulcrum/E-X on NVMe).

@magks
Author

magks commented May 31, 2023

Does anyone know of a trustworthy and recent source for a Fulcrum DB?

I may need to just try again to zip and send the DB -- possibly corruption occurred during transfer?
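
One way to rule transfer corruption in or out before re-sending everything is to checksum every file under db/ on both machines and compare the results. The following is only a minimal sketch: the helper names (`file_sha256`, `build_manifest`, `diff_manifests`) are made up for illustration, and it assumes Fulcrum is stopped on both machines while the files are being hashed.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash one file in chunks so large .sst files are not loaded into RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_manifests(source, target):
    """List (relative_path, problem) pairs where the target differs from the source."""
    problems = []
    for rel, digest in source.items():
        if rel not in target:
            problems.append((rel, "missing on target"))
        elif target[rel] != digest:
            problems.append((rel, "checksum mismatch"))
    for rel in target:
        if rel not in source:
            problems.append((rel, "extra on target"))
    return problems
```

Run `build_manifest` on the db directory of each machine (for example the `/home/fulcrum/.fulcrum/db` path from the error above), copy one manifest across, and pass both to `diff_manifests`. An empty list means the files arrived intact, so the problem lies elsewhere.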

@cculianu
Owner

cculianu commented Jun 1, 2023

I think there could be issues with different rocksdb versions. ./Fulcrum --version shows the version. I suspect if you go from newer rocksdb to older that's when the trouble happens. Perhaps.

@magks
Author

magks commented Jun 3, 2023

> I think there could be issues with different rocksdb versions. ./Fulcrum --version shows the version. I suspect if you go from newer rocksdb to older that's when the trouble happens. Perhaps.

Nice call, Calin. There is indeed a mismatch in the rocksdb versions. On the weaker computer's fulcrum docker container I see:

Fulcrum 1.9.1 (Release 713d2d7)
Protocol: version min: 1.4, version max: 1.5
compiled: gcc 8.4.0
jemalloc: version 5.2.1-0-gea6b3e9
Qt: version 5.15.6
rocksdb: version 6.14.6-ed43161
simdjson: version 0.6.0
ssl: OpenSSL 1.1.1  11 Sep 2018
zmq: libzmq version: 4.3.3, cppzmq version: 4.7.1

while the machine where I completed the initial sync reports:

Fulcrum 1.9.1 (Release a25ae92)
Protocol: version min: 1.4, version max: 1.5
compiled: gcc 13.110
jemalloc: version 5.3.0-0-g5eaed1
Qt: version 5.15.9
rocksdb: version 8.1.1-unk
simdjson: version 0.6.0
ssl: OpenSSL 3.0.9  30 May 2023
zmq: libzmq version: 4.3.4, cppzmq version: 4.7.1

I'm going to try to remake the Docker image to match the other system more closely; hopefully that will get the backup of the db to work as a drop-in. I see why the mismatch in rocksdb versions occurred: the machine that completed the initial sync had Fulcrum installed via Arch Linux pacman, which brought in the newest version of rocksdb as a dependency and then applied a small patch to the source code to work with rocksdb v8.

@cculianu
Owner

cculianu commented Jun 3, 2023

Yeah ... I really wish rocksdb maintained full backwards and forwards compatibility but it doesn't. :/ The compatibility is only backward, not forward, AFAIK: a newer rocksdb can read a database written by an older one, but not the other way around. I am pretty sure that if you reproduce the same rocksdb setup on the other machine, as you proposed, it should work OK.
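
A quick pre-flight check along these lines can be scripted by pulling the rocksdb version out of the `Fulcrum --version` output on each machine and comparing them. This is a rough sketch only: the function names are hypothetical, and the "target must be at least as new as source" rule is just the rule of thumb from this thread, not a guarantee made by rocksdb.

```python
import re


def rocksdb_version(banner):
    """Extract (major, minor, patch) from a 'rocksdb: version X.Y.Z-tag' line."""
    match = re.search(r"rocksdb:\s*version\s+(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no rocksdb version line found in banner")
    return tuple(int(part) for part in match.groups())


def likely_openable(source_banner, target_banner):
    """Heuristic: the machine opening the DB should run a rocksdb at least
    as new as the one on the machine that wrote it."""
    return rocksdb_version(target_banner) >= rocksdb_version(source_banner)


# Banner lines as posted earlier in this thread.
synced_on = "rocksdb: version 8.1.1-unk"
copied_to = "rocksdb: version 6.14.6-ed43161"

print(likely_openable(synced_on, copied_to))  # the newer-to-older move that failed
print(likely_openable(copied_to, synced_on))  # the older-to-newer direction
```

Versions are compared as integer tuples so that 6.14.6 sorts above 6.9.x, which a plain string comparison would get wrong.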

@ghost

ghost commented Aug 27, 2023

I synced a Fulcrum on my M1 Mac with these software versions:

Fulcrum 1.9.1 (Release b4a7a90-dirty)
Protocol: version min: 1.4, version max: 1.5.1
compiled: clang 14.0.3 (clang-1403.0.22.14.1)
jemalloc: unavailable
Qt: version 6.5.1
rocksdb: version 8.3.2-unk
simdjson: version 0.6.0
ssl: Secure Transport, macOS Ventura (13.5)
zmq: libzmq version: 4.3.4, cppzmq version: 4.7.1

which could NOT be loaded on Linux (same error as mentioned above):

Fulcrum 1.9.1 (Release 713d2d7)
Protocol: version min: 1.4, version max: 1.5
compiled: gcc 8.4.0
jemalloc: version 5.2.1-0-gea6b3e9
Qt: version 5.15.6
rocksdb: version 6.14.6-ed43161
simdjson: version 0.6.0
ssl: OpenSSL 1.1.1  11 Sep 2018
zmq: libzmq version: 4.3.3, cppzmq version: 4.7.1

I can use the database created on the Linux machine on my Mac though.

@cculianu
Owner

Yeah that may be because going backwards on rocksdb versions doesn’t work :/

@saradotramli

Same issue. I did the initial sync on a Windows machine and transferred the data folder to a Linux machine.
Fulcrum doesn't start, reporting that the database format is incompatible. However, the rocksdb version is the same, except for the part after the -.

Is there any possibility of fixing this without a resync?
Even if we use the same version of the Fulcrum binaries on Windows and Linux, is db compatibility not guaranteed?

From Windows box:

Fulcrum 1.10.0 (Release 4aa8f84)
Protocol: version min: 1.4, version max: 1.5.3
compiled: gcc 11.2.0
jemalloc: version 5.2.1-0-gea6b3e9
Qt: version 5.15.2
rocksdb: version 6.14.6-@@
simdjson: version 0.6.0
ssl: OpenSSL 3.0.1 14 Dec 2021
zmq: libzmq version: 4.3.3, cppzmq version: 4.7.1

From Linux box:

Fulcrum 1.10.0 (Release 4aa8f84)
Protocol: version min: 1.4, version max: 1.5.3
compiled: gcc 8.4.0
jemalloc: version 5.2.1-0-gea6b3e9
Qt: version 5.15.6
rocksdb: version 6.14.6-ed43161
simdjson: version 0.6.0
ssl: OpenSSL 1.1.1 11 Sep 2018
zmq: libzmq version: 4.3.3, cppzmq version: 4.7.1

Debug log:

[2024-03-28 09:10:46.321] (Debug) DB "meta" mem: 0.26 MiB
[2024-03-28 09:10:47.127] (Debug) DB "blkinfo" mem: 10.24 MiB
[2024-03-28 09:10:48.657] (Debug) DB "utxoset" mem: 128.00 MiB
[2024-03-28 09:10:50.917] (Debug) DB "scripthash_history" mem: 153.60 MiB
[2024-03-28 09:10:56.735] (Debug) DB "scripthash_unspent" mem: 128.00 MiB
[2024-03-28 09:11:00.755] (Debug) DB "undo" mem: 20.22 MiB
[2024-03-28 09:11:04.092] (Debug) DB "txhash2txnum" mem: 51.20 MiB
[2024-03-28 09:11:07.531] (Debug) DB "rpa" mem: 20.48 MiB
[2024-03-28 09:11:08.085] DB memory: 512.00 MiB
> [2024-03-28 09:11:08.143] FATAL: Caught exception: Incompatible database format -- delete the datadir and resynch.
[2024-03-28 09:10:46.273] (Debug) started with stack size: default
[2024-03-28 09:11:08.143] (Debug) void App::cleanup()
[2024-03-28 09:11:08.143] Stopping Controller ...
[2024-03-28 09:11:08.143] (Debug) Controller cleaned up 2 signal/slot connections
[2024-03-28 09:11:08.143] Closing storage ...

@ghost

ghost commented Mar 29, 2024

> Any possibility of fixing this without resynch? Even if we use the same versions of Fulcrum binaries on Windows and Linux, db compatibility is not guaranteed?

Calin really can't do anything about it; he's just using rocksdb and has no influence over how rocksdb stores its database internally.

Not sure if this works but it's probably worth a try: You could try running Fulcrum with a newer rocksdb version.

You are using version 6 of rocksdb; Fulcrum comes with it pre-built. I more or less accidentally compiled Fulcrum against version 8 of rocksdb on my Mac. I found Fulcrum with rocksdb v8 WAY more stable: I had a lot of issues with corrupted databases before, and those almost went away. Now I always compile my own version of Fulcrum to make sure it uses the right rocksdb (but I regularly make backups of the Fulcrum db!).

Opening your v6 database with v8 MIGHT work. I have different versions of rocksdb running and I CAN use my Linux db (v8.7.0) on my Mac (v8.11.3). Not sure if it works the other way around, as this would be a downgrade of the db version.

Linux:

Mar 29 13:01:00 rpi Fulcrum[48379]: jemalloc: version 5.2.1-0-gea6b3e9
Mar 29 13:01:00 rpi Fulcrum[48379]: Qt: version 5.15.3
Mar 29 13:01:00 rpi Fulcrum[48379]: rocksdb: version 8.7.0-unk
Mar 29 13:01:00 rpi Fulcrum[48379]: simdjson: version 0.6.0
Mar 29 13:01:00 rpi Fulcrum[48379]: ssl: OpenSSL 3.0.2 15 Mar 2022
Mar 29 13:01:00 rpi Fulcrum[48379]: zmq: libzmq version: 4.3.4, cppzmq version: 4.7.1
Mar 29 13:01:00 rpi Fulcrum[48379]: Fulcrum 1.10.0 (Release 4aa8f84) - Fri Mar 29, 2024 13:01:00.401 CET - starting up ...

Mac M1:

[2024-03-29 12:59:53.483] jemalloc: version 5.3.0-0-g54eaed1
[2024-03-29 12:59:53.483] Qt: version 6.6.2
[2024-03-29 12:59:53.483] rocksdb: version 8.11.3-unk
[2024-03-29 12:59:53.483] simdjson: version 0.6.0
[2024-03-29 12:59:53.483] ssl: OpenSSL 3.2.1 30 Jan 2024
[2024-03-29 12:59:53.483] zmq: libzmq version: 4.3.5, cppzmq version: 4.7.1
[2024-03-29 12:59:53.483] Fulcrum 1.9.8 (Release d4b3fa1-dirty) - Fri Mar 29, 2024 12:59:53.483 CET - starting up ...

@saradotramli

@martinneustein Thanks for your suggestion.

However my scenario is slightly different.

  1. If you look at RocksDb releases, there was only one release for v6.14.6:
    https://github.com/facebook/rocksdb/releases/tag/v6.14.6
  2. It is this version that Fulcrum 1.10.0 uses for both its Windows and Linux precompiled binaries
  3. You'll see 6.14.6 reported as the RocksDb version by both Fulcrum 1.10.0 binaries
  4. The only difference being the part that comes after the '-' in the version number
  5. This looks more like a 'tag' that was added when the RocksDb libraries were built
  6. On Linux, this 'tag' is ed43161 and actually a reference to the git commit for this RocksDb release
  7. On Windows, for some reason this 'tag' is @@
  8. So the question to @cculianu is: should the DB compatibility check (in Storage.cpp) really take into account the part of the version that comes after the '-', or just compare the major.minor.patch part alone?
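
To make the proposal concrete, here is a small illustration of comparing only the major.minor.patch part and ignoring whatever follows the '-'. It is written in Python purely as a sketch; it is not the actual check in Storage.cpp (I have not reproduced that code here), and `split_version` and `same_release` are hypothetical names.

```python
def split_version(version):
    """Split '6.14.6-ed43161' into ((6, 14, 6), 'ed43161')."""
    numeric, _, build_tag = version.partition("-")
    return tuple(int(part) for part in numeric.split(".")), build_tag


def same_release(a, b):
    """True when major.minor.patch match, regardless of the build tag."""
    return split_version(a)[0] == split_version(b)[0]


# Version strings reported by the two Fulcrum 1.10.0 binaries above.
windows = "6.14.6-@@"
linux = "6.14.6-ed43161"

print(windows == linux)              # strict string comparison
print(same_release(windows, linux))  # numeric part only
```

With a comparison like this, the Windows and Linux builds above would be treated as the same rocksdb release, while a 6.x versus 8.x mismatch would still be detected.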
