[Bug Report] panicked when importing snapshot #3539

Closed
jiangying000 opened this issue Jul 17, 2022 · 6 comments · Fixed by #3561

@jiangying000
Collaborator

jiangying000 commented Jul 17, 2022

Bug Report

Starcoin version:

current master HEAD
commit: 9349f09

Current behavior:
Run the following command (taken from the cookbook):

./import_snapshot.sh main ~/snapshot/ ~/.starcoin/main

The release build fails silently.
The debug build aborts with:

pure virtual method called
terminate called without an active exception
Aborted (core dumped)

Expected behavior:

snapshot imported

Steps to reproduce:

./import_snapshot.sh main ~/snapshot/ ~/.starcoin/main

Related code:

Logs show that the code runs up to this line before the panic:

mbar.join_and_clear()?;

Other information:

I ran it many times. It always panics 3-5 s after the acc_node_block import reaches 100%.

This is just a guess, but it may be related to:

  1. RocksDB objects can be live during static destruction, which is UB rust-rocksdb/rust-rocksdb#463
  2. integrity issue of snapshot file

env
Ubuntu 18.04 LTS x86_64

@jiangying000 jiangying000 added the bug Something isn't working label Jul 17, 2022
@nkysg
Collaborator

nkysg commented Jul 18, 2022

@jiangying000 could you show logs?

@jiangying000
Collaborator Author

jiangying000 commented Jul 18, 2022

image

text version of the image:

run RUST_BACKTRACE=full cargo run --release --bin starcoin_db_exporter apply-snapshot -i /home/jiangying/snapshot/snapshot -n main -o /home/jiangying/.starcoin/main_from_snapshot_master

logs:

   Compiling starcoin-config v1.11.11 (/home/jiangying/code/playground/starcoin/config)
   Compiling starcoin-vm-runtime v1.11.11 (/home/jiangying/code/playground/starcoin/vm/vm-runtime)
   Compiling starcoin-transaction-builder v1.11.11 (/home/jiangying/code/playground/starcoin/vm/transaction-builder)
   Compiling starcoin-storage v1.11.11 (/home/jiangying/code/playground/starcoin/storage)
   Compiling starcoin-dev v1.11.11 (/home/jiangying/code/playground/starcoin/vm/dev)
   Compiling starcoin-executor v1.11.11 (/home/jiangying/code/playground/starcoin/executor)
   Compiling starcoin-open-block v1.11.11 (/home/jiangying/code/playground/starcoin/chain/open-block)
   Compiling starcoin-chain v1.11.11 (/home/jiangying/code/playground/starcoin/chain)
   Compiling starcoin-genesis v1.11.11 (/home/jiangying/code/playground/starcoin/genesis)
   Compiling db-exporter v1.11.11 (/home/jiangying/code/playground/starcoin/cmd/db-exporter)
    Finished release [optimized] target(s) in 1m 03s
     Running `target/release/starcoin_db_exporter apply-snapshot -i /home/jiangying/snapshot/snapshot -n main -o /home/jiangying/.starcoin/main_from_snapshot_master`
[00:00:08] ███████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11% import acc_node_block 771
[00:01:25] ████████████████████████████████████████████████████████████████████████████████████████████████████ 100% import acc_node_block 6821
[00:01:25] ██████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 14% import block 966
[00:01:25] ████████████████████████████████████████████████████████████████████████████████████████████████████ 100% import acc_node_block 6823
[00:01:29] ███████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 15% import block 1034
[00:01:29] ████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 17% import block_info 1144
[00:00:01] ████████████████████████████████████████████████████████████████████████████████████████████████████ 100% import state_node index 22

Without the --release option, the following lines are additionally printed at the end of the log:

pure virtual method called
terminate called without an active exception
Aborted (core dumped)

@nkysg
Copy link
Collaborator

nkysg commented Jul 18, 2022

I tested with the main net snapshot at block height 6850080. It reports "acc_node_block hash not match root_hash 0xce60e948a82904cee4909edf30904b168ecf1585fa2c7b9807d42a13c05f56ab verify_hash 0xe8d68cee8d2d7e14a523993ca8a450c126dd1d8c28c2cd42cc1931e21fba11bd".
You can run starcoin_db_exporter apply-snapshot -i /home/jiangying/snapshot/snapshot -n main -o /home/jiangying/.starcoin/main_from_snapshot_master > log 2>&1 to see what it prints. It is not related to RocksDB. Check manifest.csv to see the block height of the snapshot.
It looks like this:

acc_node_block 6850080 0xe8d68cee8d2d7e14a523993ca8a450c126dd1d8c28c2cd42cc1931e21fba11bd
block 6850080 0x28d996652d9aff5af34969abe9763db10c9880d830c1d6666fea9f3f58ab95b6
block_info 6850080 0x28d996652d9aff5af34969abe9763db10c9880d830c1d6666fea9f3f58ab95b6
acc_node_transaction 7519546 0x6a8c1b7c366a933781c00ef59e955faf4c6a6124f44dcae44ba52096c3e6f36a
state_node 22305 0x88482f560534b3b31ee11b03f6d62bb4481a95760286101e3097a1eec49b9785

the block height is 6850080

@jiangying000
Collaborator Author

Thank you! I can see the error log after redirecting the output to a file. Maybe the progress bar overwrote the error log in the console?

snapshot_state hash match
acc_node_block hash not match root_hash 0x385f09e15b1bb0131a63b45a11564255c5521723dc8197b549fa0c2b9644077f verify_hash 0x10b5bc67d5fa8697744b4dff7c6f4aaed18f49774cc21574f58c9df030ab13d5

and my snapshot version is

acc_node_block 6815760 0x10b5bc67d5fa8697744b4dff7c6f4aaed18f49774cc21574f58c9df030ab13d5
block 6815760 0x2266af7af53b50dbcb3f8172b977d3bc9efd15c6d8119bd29744054731e4c3bc
block_info 6815760 0x2266af7af53b50dbcb3f8172b977d3bc9efd15c6d8119bd29744054731e4c3bc
acc_node_transaction 7482345 0x8f03e33833f9cbe7130f4421042d205915778fb660bd08605e20b6ed9c1bfab3
state_node 22294 0x1dc1ebe7ae7a0497aea81487136de335ce8eb57b7ef39e0ffb045b15746d7cc8

It seems the snapshot file is broken.
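
The redirect suggested above (`> log 2>&1`) can also be scripted, so that the mismatch message is not lost behind the progress bar. This is a minimal sketch, not part of the Starcoin tooling: `run_with_log` and `find_mismatches` are hypothetical helper names, and the only thing taken from the thread is the "hash not match" wording of the error line.

```python
import subprocess
import sys
from pathlib import Path


def run_with_log(cmd, log_path):
    """Run cmd, writing stdout and stderr to log_path; return the exit code."""
    with open(log_path, "w") as log:
        # stderr=subprocess.STDOUT merges both streams, like `> log 2>&1`.
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


def find_mismatches(log_path):
    """Return the log lines that report a hash mismatch."""
    text = Path(log_path).read_text(errors="replace")
    # splitlines() also splits on carriage returns, which progress bars
    # typically use to redraw a line in place.
    return [line for line in text.splitlines() if "hash not match" in line]


# Example usage (command line as posted in this thread):
# run_with_log(
#     ["starcoin_db_exporter", "apply-snapshot",
#      "-i", "/home/jiangying/snapshot/snapshot", "-n", "main",
#      "-o", "/home/jiangying/.starcoin/main_from_snapshot_master"],
#     "log",
# )
# print(find_mismatches("log"))
```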

@nkysg
Collaborator

nkysg commented Jul 18, 2022

I don't know the reason yet. I tested the barnard snapshot.tar.gz and it works well. I will look into the cause.

@nkysg
Collaborator

nkysg commented Jul 18, 2022

my snapshot version is

acc_node_block 6850080 0xe8d68cee8d2d7e14a523993ca8a450c126dd1d8c28c2cd42cc1931e21fba11bd
block 6850080 0x28d996652d9aff5af34969abe9763db10c9880d830c1d6666fea9f3f58ab95b6
block_info 6850080 0x28d996652d9aff5af34969abe9763db10c9880d830c1d6666fea9f3f58ab95b6
acc_node_transaction 7519546 0x6a8c1b7c366a933781c00ef59e955faf4c6a6124f44dcae44ba52096c3e6f36a
state_node 22305 0x88482f560534b3b31ee11b03f6d62bb4481a95760286101e3097a1eec49b9785

But when I run wc -l block, the output is:

 wc -l block
 6863080 block
wc -l acc_node_block
6858080

I found that main net block heights [6456001-6469000] are duplicated in the file block. So I suspect something went wrong when running 'https://github.com/starcoinorg/starcoin/blob/master/scripts/sync_block.py#L93': the incremental export appends to these files, and if it fails partway, it may append the same range twice. Import is not idempotent, so the duplicated lines break it.
I think this is why the acc_node_block verification fails.
I also think the incremental export in sync_block.py should be atomic.
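
As a quick arithmetic check of the duplicate-range explanation, using only the numbers quoted in this comment: the surplus of lines in `block` over the manifest count should equal the size of the duplicated height range.

```python
manifest_lines = 6_850_080   # lines_cnt for `block` in manifest.csv
actual_lines = 6_863_080     # reported by `wc -l block`
dup_first, dup_last = 6_456_001, 6_469_000  # duplicated height range

extra = actual_lines - manifest_lines   # surplus lines in the file
dup_count = dup_last - dup_first + 1    # heights in the range, inclusive

print(extra, dup_count, extra == dup_count)  # 13000 13000 True
```

The surplus matches the duplicated range exactly, which is consistent with that range having been appended twice.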

I think sync_block.py should first make a backup: mv /sc-data/snapshot to /sc-data/snapshot.bak, then run the incremental export. If it succeeds, run rm /sc-data/snapshot.bak. If it fails, run rm /sc-data/snapshot; mv /sc-data/snapshot.bak /sc-data/snapshot. To check whether it succeeded, check every line of manifest.csv: each line is a tuple (filename, lines_cnt, hash), so check that the number of lines in filename == lines_cnt.
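
The backup, verify, and roll back procedure proposed above could look roughly like the following. This is only a sketch under stated assumptions, not the actual sync_block.py code: `export_atomically`, `snapshot_is_consistent`, and the `run_export` callback are hypothetical names. The manifest is assumed to hold one `filename lines_cnt hash` entry per line, separated by whitespace or commas. The sketch copies the directory instead of moving it, on the assumption that an incremental export needs the existing files in place to append to.

```python
import shutil
from pathlib import Path


def count_lines(path):
    """Count lines in a file (a final line without a newline is counted too)."""
    with open(path, "rb") as f:
        return sum(1 for _ in f)


def read_manifest(manifest_path):
    """Return (filename, lines_cnt, hash) tuples; the delimiter is an assumption."""
    entries = []
    with open(manifest_path) as f:
        for line in f:
            fields = line.replace(",", " ").split()
            # Skip blank lines and anything that is not a 3-field data row.
            if len(fields) != 3 or not fields[1].isdigit():
                continue
            entries.append((fields[0], int(fields[1]), fields[2]))
    return entries


def snapshot_is_consistent(snapshot_dir):
    """Check that every file listed in manifest.csv has exactly lines_cnt lines."""
    snapshot_dir = Path(snapshot_dir)
    manifest = snapshot_dir / "manifest.csv"
    if not manifest.is_file():
        return False
    for name, cnt, _digest in read_manifest(manifest):
        data = snapshot_dir / name
        if not data.is_file() or count_lines(data) != cnt:
            return False
    return True


def export_atomically(snapshot_dir, run_export):
    """Back up, run the export, then keep the result or restore the backup."""
    snapshot_dir = Path(snapshot_dir)
    backup_dir = snapshot_dir.with_name(snapshot_dir.name + ".bak")
    if backup_dir.exists():
        shutil.rmtree(backup_dir)
    shutil.copytree(snapshot_dir, backup_dir)
    try:
        run_export(snapshot_dir)  # hypothetical hook for the incremental export
        ok = snapshot_is_consistent(snapshot_dir)
    except Exception:
        ok = False
    if ok:
        shutil.rmtree(backup_dir)
    else:
        # Discard the partial export and restore the last good snapshot.
        shutil.rmtree(snapshot_dir, ignore_errors=True)
        backup_dir.rename(snapshot_dir)
    return ok
```

Note that this check only compares line counts, as suggested in the comment. It would catch a range appended twice, but not corrupted content with the right number of lines; verifying the hash column would need the exporter's own hashing scheme.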
