BSC version 1.3.7 got panic when syncing using pbss snapshot geth-pbss-pebble-20231217.tar.lz4 #2131

Closed
Tronglx opened this issue Jan 3, 2024 · 19 comments

@Tronglx

Tronglx commented Jan 3, 2024

Panic:
github.com/ethereum/go-ethereum/core/rawdb.(*ResettableFreezer).AncientRange(0xcfee40?, {0x28e4c02?, 0xc0149c2828?}, 0xc0149c2928?, 0x248d3a0?, 0xc014d08060?)
	/home/runner/work/bsc/bsc/core/rawdb/freezer_resettable.go:126 +0x5c
github.com/ethereum/go-ethereum/core/rawdb.ReadStateHistoryMetaList(...)
	/home/runner/work/bsc/bsc/core/rawdb/accessors_state.go:180
github.com/ethereum/go-ethereum/trie/triedb/pathdb.checkHistories(0x0, 0x7e83553c7d1bf918?, 0x5028b97545f8df6c?, 0xc0154a7750)
	/home/runner/work/bsc/bsc/trie/triedb/pathdb/history.go:548 +0x85
github.com/ethereum/go-ethereum/trie/triedb/pathdb.(*Database).Recoverable(0xc0113db900, {0x70, 0x16, 0x98, 0xfd, 0xca, 0xe7, 0x2d, 0xa0, 0x18, ...})
	/home/runner/work/bsc/bsc/trie/triedb/pathdb/database.go:363 +0x205
github.com/ethereum/go-ethereum/trie.(*Database).Recoverable(0x7fdafcede2e8?, {0x70, 0x16, 0x98, 0xfd, 0xca, 0xe7, 0x2d, 0xa0, 0x18, ...})
	/home/runner/work/bsc/bsc/trie/database.go:320 +0x45
github.com/ethereum/go-ethereum/core.NewBlockChain({0x33776f8?, 0xc0014b4cc0}, 0x0?, 0x7ffffffe805afca8?, 0x0?, {0x3365fe0?, 0xc001392a00?}, {{0x0, 0x0}, 0x0, ...}, ...)
	/home/runner/work/bsc/bsc/core/blockchain.go:403 +0x14b0
github.com/ethereum/go-ethereum/eth.New(0xc0001860e0, 0xc001701000)
	/home/runner/work/bsc/bsc/eth/backend.go:252 +0x170f
github.com/ethereum/go-ethereum/cmd/utils.RegisterEthService(0x0?, 0xc001701000)
	/home/runner/work/bsc/bsc/cmd/utils/flags.go:2154 +0x167
main.makeFullNode(0xc00153fbf0?)
	/home/runner/work/bsc/bsc/cmd/geth/config.go:181 +0x255
main.geth(0xc001703380)
	/home/runner/work/bsc/bsc/cmd/geth/main.go:341 +0xf3
github.com/urfave/cli/v2.(*Command).Run(0xc0016bc2c0, 0xc001703380, {0xc00012e000, 0x10, 0x10})
	/home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/command.go:274 +0x9eb
github.com/urfave/cli/v2.(*App).RunContext(0xc0004e6d20, {0x334fb30?, 0xc000128010}, {0xc00012e000, 0x10, 0x10})
	/home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:332 +0x616
github.com/urfave/cli/v2.(*App).Run(...)
	/home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.25.7/app.go:309
main.main()
	/home/runner/work/bsc/bsc/cmd/geth/main.go:284 +0x47
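Each frame in a Go trace like the one above is a function line followed by an indented `file:line` line, with the innermost call listed first. As a reading aid only (it is not part of the original report), here is a minimal Python sketch that condenses such a trace into a call path. The helper name `summarize_trace` is hypothetical, and the parsing assumes exactly the two-line frame layout shown above.

```python
import os
import re

# An indented location line such as "\t/path/to/file.go:126 +0x5c".
LOCATION_RE = re.compile(r"^\s+(\S+\.go):(\d+)")


def summarize_trace(trace: str) -> list[tuple[str, str, int]]:
    """Return (function, file, line) per frame, innermost call first."""
    frames = []
    lines = trace.splitlines()
    for func_line, loc_line in zip(lines, lines[1:]):
        match = LOCATION_RE.match(loc_line)
        if match is None or "(" not in func_line or func_line[0].isspace():
            continue  # not a "function line + location line" pair
        # Drop the argument list, then the import path before the package name.
        qualified = func_line[: func_line.rfind("(")]
        function = qualified.rsplit("/", 1)[-1]
        frames.append(
            (function, os.path.basename(match.group(1)), int(match.group(2)))
        )
    return frames
```

Read from the bottom up, the trace in this report shows `eth.New` calling `core.NewBlockChain`, which calls `Recoverable`, then `checkHistories`, `ReadStateHistoryMetaList` and finally `AncientRange`. In other words, the panic is raised at startup while the state history in the freezer is being checked.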

Version:
Geth
Version: 1.3.7
Git Commit: f28b98a
Git Commit Date: 20231219
Architecture: amd64
Go Version: go1.20.12
Operating System: linux
GOPATH=
GOROOT=/opt/hostedtoolcache/go/1.20.12/x64

Command:
./geth_linux --config=config.toml --datadir=node --port=30304 --http --http.port 8575 --cache 8000 --rpc.allow-unprotected-txs --history.transactions=0 --syncmode=full --tries-verify-mode=local --pruneancient --db.engine=pebble --state.scheme=path

config.toml
[Eth]
NetworkId = 56
LightPeers = 100
TrieTimeout = 150000000000
StateScheme = "path"

[Eth.Miner]
GasCeil = 140000000
GasPrice = 3000000000
Recommit = 10000000000

[Eth.TxPool]
Locals = []
NoLocals = true
Journal = "transactions.rlp"
Rejournal = 3600000000000
PriceLimit = 3000000000
PriceBump = 10
AccountSlots = 200
GlobalSlots = 8000
AccountQueue = 200
GlobalQueue = 4000

[Eth.GPO]
Blocks = 20
Percentile = 60
OracleThreshold = 1000

[Node]
IPCPath = "geth.ipc"
HTTPHost = "localhost"
InsecureUnlockAllowed = false
HTTPPort = 8545
HTTPVirtualHosts = ["localhost"]
HTTPModules = ["eth", "net", "web3", "txpool", "parlia"]
WSPort = 8546
WSModules = ["net", "web3", "eth"]

[Node.P2P]
MaxPeers = 200
NoDiscovery = false
StaticNodes = []
ListenAddr = ":30311"
EnableMsgEvents = false

[Node.LogConfig]
FilePath = "bsc.log"
MaxBytesSize = 10485760
Level = "info"
FileRoot = ""

OS version:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal

Tronglx changed the title from "BSC version 1.3.7 full node panic when using pbss snapshot geth-pbss-pebble-20231217.tar.lz4" to "BSC version 1.3.7 got panic when syncing using pbss snapshot geth-pbss-pebble-20231217.tar.lz4" on Jan 3, 2024

@du5

du5 commented Jan 3, 2024

48Club/bsc-snapshots#132

@du5

du5 commented Jan 3, 2024

Judging from stdout, geth ran out of memory (OOM), and that was the root cause of the database corruption.

Unfortunately, I didn't capture the metrics.

[Screenshot: telegram-cloud-document-1-4945218610105682726]
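On Linux, an out-of-memory kill can usually be confirmed from the kernel log (for example the output of `dmesg` or `journalctl -k`) without separate metrics. The Python sketch below scans log lines for the OOM-killer message. It assumes the "Out of memory: Killed process <pid> (<name>)" wording used by recent kernels, and older kernels word it differently. The helper name is hypothetical.

```python
import re
from typing import Iterable

# Matches both global and cgroup OOM kills, e.g.
# "Out of memory: Killed process 4242 (geth) total-vm:..."
OOM_RE = re.compile(r"[Oo]ut of memory: Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines: Iterable[str]) -> list[tuple[int, str]]:
    """Return (pid, process name) for every OOM kill found in the log lines."""
    kills = []
    for line in log_lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

Feeding it the saved kernel log and looking for an entry naming the geth process would show whether the kernel killed the node, as opposed to geth exiting on its own.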

@Tronglx
Author

Tronglx commented Jan 3, 2024

@du5 is there any way to bypass it?

@du5

du5 commented Jan 3, 2024

> @du5 is there any way to bypass it?

https://github.com/48Club/bsc-snapshots#geth-full-node-with-pbss

Stay on version v1.3.6.

@Tronglx
Author

Tronglx commented Jan 3, 2024

> > @du5 is there any way to bypass it?
>
> https://github.com/48Club/bsc-snapshots#geth-full-node-with-pbss
>
> Stay on version v1.3.6.

I tried, but it didn't work. Do you mean I have to re-download the snapshot? I used the 48Club snapshot with both versions, 1.3.6 and 1.3.7, and it didn't work with either. I always try version 1.3.7 first.

@du5

du5 commented Jan 3, 2024

> I tried, but it didn't work. Do you mean I have to re-download the snapshot? I used the 48Club snapshot with both versions, 1.3.6 and 1.3.7, and it didn't work with either. I always try version 1.3.7 first.

It cannot repair a corrupted database, but it will work fine on a healthy one.

You need to download a snapshot first.

Because our snapshots are relatively small, it is recommended to try them first.

@Tronglx
Author

Tronglx commented Jan 3, 2024

> > I tried, but it didn't work. Do you mean I have to re-download the snapshot? I used the 48Club snapshot with both versions, 1.3.6 and 1.3.7, and it didn't work with either. I always try version 1.3.7 first.
>
> It cannot repair a corrupted database, but it will work fine on a healthy one.
>
> You need to download a snapshot first.
>
> Because our snapshots are relatively small, it is recommended to try them first.

OK, let me try.

@Tronglx
Author

Tronglx commented Jan 3, 2024

But I think the root cause needs to be fixed.

@du5

du5 commented Jan 3, 2024

> But I think the root cause needs to be fixed.

lol, this needs to be solved by the BSC team.

@Tronglx
Author

Tronglx commented Jan 4, 2024

It didn't work. I'm switching to the hash-based storage scheme.

@du5

du5 commented Jan 4, 2024

> It didn't work. I'm switching to the hash-based storage scheme.

Weird, v1.3.6 works fine for me.

@sysvm
Contributor

sysvm commented Jan 5, 2024

@Tronglx Does the panic happen immediately, or only after geth has been running for some time? Which snapshot are you using? Is the config you provided above complete?

@du5

du5 commented Jan 6, 2024

> @Tronglx Does the panic happen immediately, or only after geth has been running for some time? Which snapshot are you using? Is the config you provided above complete?

#2131 (comment) This is a screenshot taken when the OOM occurred; there were no error or panic logs.

48Club/bsc-snapshots#132 (comment) This is the complete log from starting geth after the OOM occurred.

@sysvm
Contributor

sysvm commented Jan 7, 2024

> > @Tronglx Does the panic happen immediately, or only after geth has been running for some time? Which snapshot are you using? Is the config you provided above complete?
>
> #2131 (comment) This is a screenshot taken when the OOM occurred; there were no error or panic logs.
>
> 48Club/bsc-snapshots#132 (comment) This is the complete log from starting geth after the OOM occurred.

Why do you think it is an OOM? Can you reproduce this?

@Tronglx
Author

Tronglx commented Jan 8, 2024

> @Tronglx Does the panic happen immediately, or only after geth has been running for some time? Which snapshot are you using? Is the config you provided above complete?

It occurs after the node has been syncing for some time.

@Tronglx
Author

Tronglx commented Jan 8, 2024

> @Tronglx Does the panic happen immediately, or only after geth has been running for some time? Which snapshot are you using? Is the config you provided above complete?

I provided the complete config.

@sysvm
Contributor

sysvm commented Jan 8, 2024

> > @Tronglx Does the panic happen immediately, or only after geth has been running for some time? Which snapshot are you using? Is the config you provided above complete?
>
> I provided the complete config.

Thanks, I'm working on it.

@sysvm
Contributor

sysvm commented Jan 11, 2024

@Tronglx Does this panic happen after you restart geth?

@Tronglx
Author

Tronglx commented Jan 12, 2024

> @Tronglx Does this panic happen after you restart geth?

No, I have no reason to restart it. If an error occurs, the supervisor restarts geth automatically, but an error has to occur first.
