Unexpected Panic Error When Fetching Height #1037

Open
gfanton opened this issue Aug 9, 2023 · 1 comment
Labels
🐞 bug Something isn't working gnopher-hole

gfanton commented Aug 9, 2023

Unexpected Panic Error When Fetching Height

Description

While restarting the node after a long run using gnoland start, I hit a panic (twice) while it was fetching height 5063, which does not seem to be present in the WAL file. The issue appears to occur after waking my laptop from sleep following an extended period of inactivity.

According to @ajnavarro's answer on Slack, the problem might occur when height data is saved to both the key/value database and the WAL file. Flush calls are suspected to be missing in various parts of the code.

Your environment

  • Apple M1, Ventura 13.4.1
  • master (9045a8e)

Steps to reproduce

  • Encountered the issue during a regular run of the node
  • I am not able to reproduce the issue intentionally yet
  • I was able to back up the corrupted testdir: testdir.corrupted.zip

Expected behavior

The application should fetch height information without causing a panic error.

Actual behavior

A panic occurs with the message "panic: should not happen".

Logs

09:19:29 gno/gno.land  $ ./build/gnoland start
.level 1 .msg Starting multiAppConn [impl multiAppConn [module proxy]]
.level 1 .msg Starting localClient [[impl localClient [module abci-client connection query]] [module proxy]]
.level 1 .msg Starting localClient [[impl localClient [module abci-client connection mempool]] [module proxy]]
.level 1 .msg Starting localClient [[impl localClient [module abci-client connection consensus]] [module proxy]]
.level 1 .msg Starting IndexerService [impl IndexerService [module txindex]]
.level 1 .msg ABCI Handshake App Info [height 7610 hash 9EB20EC963DB59891297A978FF3E71F2CE6E67B49C30C792B8ECF42C79C1536E abci-version  app-version  [module consensus]]
.level 1 .msg ABCI Replay Blocks [appHeight 7610 storeHeight 7610 stateHeight 7610 [module consensus]]
.level 1 .msg Completed ABCI Handshake - Tendermint and App are synced [appHeight 7610 appHash 9EB20EC963DB59891297A978FF3E71F2CE6E67B49C30C792B8ECF42C79C1536E [module consensus]]
.level 1 .msg Version info version v1.0.0-rc.0
.level 1 .msg This node is a validator [addr g1r6kdta4wyxk5fss27srvzz63ynhvv8kruend6l pubKey gpub1pggj7ard9eg82cjtv4u52epjx56nzwgjyg9zqxuxs8ufrsqtw2tjwksfqrksfj4q08gusccqf9kvlhd9te7dg3mjsqa4rk [module consensus]]
.level 1 .msg P2P Node ID [ID g1jqn90xmvwnxpqcfyzg3xm6tpzjwvcwj0dxycq0 file testdir/config/node_key.json [module p2p]]
.level 1 .msg Adding persistent peers [addrs [] [module p2p]]
Node created.
.level 1 .msg Starting Node impl Node
.level 1 .msg Starting RPC HTTP server on 127.0.0.1:26657 [[module rpc-server]]
.level 1 .msg Starting P2P Switch [impl P2P Switch [module p2p]]
.level 1 .msg Starting Reactor [impl Reactor [module mempool]]
.level 1 .msg Starting BlockchainReactor [impl BlockchainReactor [module blockchain]]
.level 1 .msg Starting ConsensusReactor [impl ConsensusReactor [module consensus]]
.level 1 .msg ConsensusReactor  [fastSync false [module consensus]]
.level 1 .msg Starting ConsensusState [impl ConsensusState [module consensus]]
.level 1 .msg Starting baseWAL [[impl baseWAL [wal testdir/data/cs.wal/wal]] [module consensus]]
.level 1 .msg Starting Group [[impl Group [wal testdir/data/cs.wal/wal]] [module consensus]]
.level 1 .msg Starting TimeoutTicker [impl TimeoutTicker [module consensus]]
.level 1 .msg Searching for height [[height 7612 min 0 max 1 [wal testdir/data/cs.wal/wal]] [module consensus]]
.level 0 .msg Starting timeout routine [[module consensus]]
panic: should not happen

goroutine 1 [running]:
github.com/gnolang/gno/tm2/pkg/bft/wal.(*baseWAL).SearchForHeight(0x14006702690, 0x1dbc, 0x14005c4c160)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/bft/wal/wal.go:316 +0xcec
github.com/gnolang/gno/tm2/pkg/bft/consensus.(*ConsensusState).catchupReplay(0x14005e1e300, 0x1dbb)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/bft/consensus/replay.go:110 +0xa0
github.com/gnolang/gno/tm2/pkg/bft/consensus.(*ConsensusState).OnStart(0x14005e1e300)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/bft/consensus/state.go:319 +0x218
github.com/gnolang/gno/tm2/pkg/service.(*BaseService).Start(0x14005e1e300)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/service/service.go:139 +0x27c
github.com/gnolang/gno/tm2/pkg/bft/consensus.(*ConsensusReactor).OnStart(0x1400610fc80)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/bft/consensus/reactor.go:76 +0x10c
github.com/gnolang/gno/tm2/pkg/service.(*BaseService).Start(0x1400610fc80)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/service/service.go:139 +0x27c
github.com/gnolang/gno/tm2/pkg/p2p.(*Switch).OnStart(0x14005382780)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/p2p/switch.go:201 +0x7c
github.com/gnolang/gno/tm2/pkg/service.(*BaseService).Start(0x14005382780)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/service/service.go:139 +0x27c
github.com/gnolang/gno/tm2/pkg/bft/node.(*Node).OnStart(0x1400550f500)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/bft/node/node.go:607 +0x304
github.com/gnolang/gno/tm2/pkg/service.(*BaseService).Start(0x1400550f500)
	/Users/gfanton/code/gnolang/gno/tm2/pkg/service/service.go:139 +0x27c
main.execStart(0x140002ef5e0, {0x104a74640?, 0x0?, 0x0?}, 0x140000950e0)
	/Users/gfanton/code/gnolang/gno/gno.land/cmd/gnoland/start.go:169 +0x378
main.newStartCmd.func1({0x0?, 0x0?}, {0x10598f7c8?, 0x14000291538?, 0x0?})
	/Users/gfanton/code/gnolang/gno/gno.land/cmd/gnoland/start.go:51 +0x38
github.com/gnolang/gno/tm2/pkg/commands.(*Command).Run(0x0?, {0x105434448?, 0x1400011a010?})
	/Users/gfanton/code/gnolang/gno/tm2/pkg/commands/command.go:233 +0x17c
github.com/gnolang/gno/tm2/pkg/commands.(*Command).Run(0x140002913f0?, {0x105434448?, 0x1400011a010?})
	/Users/gfanton/code/gnolang/gno/tm2/pkg/commands/command.go:237 +0x12c
github.com/gnolang/gno/tm2/pkg/commands.(*Command).ParseAndRun(0x1400010c000?, {0x105434448, 0x1400011a010}, {0x14000136010?, 0x60?, 0x0?})
	/Users/gfanton/code/gnolang/gno/tm2/pkg/commands/command.go:118 +0x4c
main.main()
	/Users/gfanton/code/gnolang/gno/gno.land/cmd/gnoland/root.go:17 +0x74

moul commented Aug 9, 2023

  • @jaekwon will review the tarball
  • would be nice to write a fuzzing/testing tool to try generating similar issues
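One possible starting point for such a tool is a truncation test: build a log, cut it at every byte offset to simulate an interrupted write, and check that the reader reports what it found instead of panicking. The sketch below uses a toy format (one 8-byte big-endian height per record). This is an assumption for illustration only and is not the tm2 WAL encoding; `encodeHeights` and `lastCompleteHeight` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// recordSize is the size of one toy record: a single uint64 height.
const recordSize = 8

// encodeHeights builds a toy log holding the heights first..last in order.
func encodeHeights(first, last uint64) []byte {
	buf := make([]byte, 0, int(last-first+1)*recordSize)
	for h := first; h <= last; h++ {
		buf = binary.BigEndian.AppendUint64(buf, h)
	}
	return buf
}

// lastCompleteHeight returns the last fully written height in log.
// ok is false when no complete record exists; torn is true when the log
// ends with a partial record, as it would after an interrupted write.
func lastCompleteHeight(log []byte) (height uint64, ok bool, torn bool) {
	n := len(log) / recordSize
	torn = len(log)%recordSize != 0
	if n == 0 {
		return 0, false, torn
	}
	return binary.BigEndian.Uint64(log[(n-1)*recordSize:]), true, torn
}

func main() {
	full := encodeHeights(1, 5)
	// Simulate a crash at every possible byte offset of the log.
	for cut := 0; cut <= len(full); cut++ {
		h, ok, torn := lastCompleteHeight(full[:cut])
		fmt.Printf("cut=%2d ok=%v last=%d torn=%v\n", cut, ok, h, torn)
	}
}
```

A real tool would run the same loop against the actual WAL encoder and `SearchForHeight`, and could choose cut offsets randomly once the log is too large to cut at every byte.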

@moul moul moved this to 🏆 Needed for Launch in 🚀 The Launch [DEPRECATED] Sep 5, 2023
@moul moul added this to the 🚀 main.gno.land milestone Sep 6, 2023
@Kouteki Kouteki removed this from the 🚀 Mainnet launch milestone Oct 16, 2024