Node crash on Windows server 2012R2 , RocksDB V23.0DB2 #3491

Closed
zhyatt opened this issue Oct 5, 2021 · 5 comments · Fixed by #3568

@zhyatt
Collaborator

zhyatt commented Oct 5, 2021

Summary

Originally reported by Ricki in Discord. The node crashes when receiving incoming blocks. Reproducing the crash may require the RocksDB backend to be enabled.

Node version

V23.0DB2

Build details

Binary build for Windows: https://s3.us-east-2.amazonaws.com/repo.nano.org/beta/binaries/nano-node-V23.0DB2-win64.exe

OS and version

Windows server 2012R2

Steps to reproduce the behavior

  1. Start Windows node on beta
  2. Use rocksdb
  3. Use this simple config file:
       [node]
       receive_minimum = "1"
       [node.rocksdb]
       enable = true
  4. Load an account that needs to receive transactions (I have one that should work, if needed ping me)
  5. Run nano_wallet.exe

There may also be a related bug when setting work_watcher_period = 90 in the config: adding it to the config above produces exception code 0xc0000005 (access violation) instead.

Expected behavior

Node would receive all transactions and function normally on the network.

Actual behavior

Node crashes with output:

Faulting application name: nano_wallet.exe, version: 0.0.0.0, time stamp: 0x615b2283
Faulting module name: ucrtbase.DLL, version: 10.0.14393.2990, time stamp: 0x5caeb96f
Exception code: 0xc0000409
Fault offset: 0x000000000006e00e
Faulting process id: 0xb90
Faulting application start time: 0x01d7ba07c4f9dbbd
Faulting application path: C:\Program Files\nanocurrency-beta\nano_wallet.exe
Faulting module path: C:\Windows\SYSTEM32\ucrtbase.DLL
Report Id: 04ff3276-25fb-11ec-8418-0a7fdc2bfc9a
Faulting package full name: 
Faulting package-relative application ID:

No helpful details in nano logs.
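
As a reading aid for the event log entry above: 0xc0000409 is the NTSTATUS value STATUS_STACK_BUFFER_OVERRUN, which Windows also reports for fail-fast terminations, and 0xc0000005 is STATUS_ACCESS_VIOLATION. A fail-fast raised inside ucrtbase.DLL is likely (though not confirmed by the logs here) the C runtime aborting, for example after an unhandled C++ exception. The helper below is a hypothetical sketch, not part of the node, that maps only the two codes seen in this report.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: maps the two NTSTATUS codes seen in this report
// to their symbolic names. Any other code is reported as "unknown".
std::string describe_exception_code (std::uint32_t code)
{
    switch (code)
    {
        case 0xC0000005u:
            return "STATUS_ACCESS_VIOLATION";
        case 0xC0000409u:
            // Also used by Windows for fail-fast terminations.
            return "STATUS_STACK_BUFFER_OVERRUN";
        default:
            return "unknown";
    }
}
```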

Possible solution

No response

Supporting files

No response

@cryptocode
Contributor

If anyone wanna take a stab at a PR: Looks like do_wallet_actions can get called (from a separate thread) before the node object is fully constructed. This causes the distributed_work constructor's call to node_a.shared () to fail with an invalid weak_ptr.
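
The failure mode described here can be reproduced in isolation. The sketch below assumes node_a.shared () boils down to std::enable_shared_from_this::shared_from_this (); toy_node is a hypothetical stand-in, not the real node class. It calls shared_from_this () from inside the constructor, where no shared_ptr owns the object yet, and records the resulting std::bad_weak_ptr (guaranteed since C++17).

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the node class; not the real nano::node.
struct toy_node : std::enable_shared_from_this<toy_node>
{
    bool failed_in_constructor{ false };

    toy_node ()
    {
        try
        {
            // No shared_ptr owns *this yet, so the internal weak_ptr is
            // still empty and shared_from_this () throws (C++17).
            auto self = shared_from_this ();
            (void)self;
        }
        catch (std::bad_weak_ptr const &)
        {
            failed_in_constructor = true;
        }
    }

    // Works only once the object is owned by a shared_ptr.
    std::shared_ptr<toy_node> shared ()
    {
        return shared_from_this ();
    }
};
```

If the same exception escapes on a background thread with no handler, std::terminate is called, which would be consistent with the abort seen in the crash report.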

Maybe using node_initialized_latch is an idea, before the call to do_wallet_actions. That said, I wonder if node_initialized_latch is currently racy because it's counted down at the end of the node constructor. Should likely instead be counted down in node::start since then the node object is fully constructed, but not sure if that refactoring has consequences.
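
A minimal sketch of the latch idea, under the assumption that the latch is counted down from a start () call rather than at the end of the constructor. simple_latch, gated_owner and run_gated_demo are hypothetical names for illustration, and the latch is built from a mutex and condition variable so it does not depend on any particular latch library.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot latch: wait () blocks until the count reaches zero.
class simple_latch
{
public:
    explicit simple_latch (unsigned count_a) : count (count_a) {}
    void count_down ()
    {
        std::lock_guard<std::mutex> guard (mutex);
        if (count > 0 && --count == 0)
        {
            condition.notify_all ();
        }
    }
    void wait ()
    {
        std::unique_lock<std::mutex> lock (mutex);
        condition.wait (lock, [this] { return count == 0; });
    }

private:
    std::mutex mutex;
    std::condition_variable condition;
    unsigned count;
};

// Hypothetical owner whose worker thread is created during construction
// but does nothing until start () releases the latch.
struct gated_owner
{
    simple_latch initialized_latch{ 1 };
    bool fully_constructed{ false };
    std::atomic<bool> worker_saw_constructed{ false };
    std::thread worker; // declared last so the members above exist first

    gated_owner () :
        worker ([this] {
            initialized_latch.wait ();
            worker_saw_constructed = fully_constructed;
        })
    {
        fully_constructed = true; // construction is complete here
    }
    void start ()
    {
        initialized_latch.count_down ();
    }
    ~gated_owner ()
    {
        initialized_latch.count_down (); // release the worker if never started
        if (worker.joinable ())
        {
            worker.join ();
        }
    }
};

// Returns true if the worker only ran after construction had finished.
bool run_gated_demo ()
{
    gated_owner owner;
    owner.start ();
    owner.worker.join ();
    return owner.worker_saw_constructed;
}
```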

zhyatt added this to the V23.0 milestone Oct 6, 2021
@clemahieu
Contributor

> If anyone wanna take a stab at a PR: Looks like do_wallet_actions can get called (from a separate thread) before the node object is fully constructed. This causes the distributed_work constructor's call to node_a.shared () to fail with an invalid weak_ptr.
>
> Maybe using node_initialized_latch is an idea, before the call to do_wallet_actions. That said, I wonder if node_initialized_latch is currently racy because it's counted down at the end of the node constructor. Should likely instead be counted down in node::start since then the node object is fully constructed, but not sure if that refactoring has consequences.

I'm in favor of adding a wallets::start pattern. Since getting block work from distributed_work doesn't work before the node constructor completes, the wallet actions thread needs to start after the constructor.
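
One way such a start pattern could look, as a rough sketch only: the constructor creates no threads, and the worker is launched from an explicit start () that the owner calls once a shared_ptr holds the node. staged_node and staged_wallets are hypothetical names and do not reflect the actual classes or the eventual fix.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

struct staged_node;

// Hypothetical wallets component: construction starts no threads.
struct staged_wallets
{
    explicit staged_wallets (staged_node & node_a) : node (node_a) {}
    void start (); // launches the worker; defined after staged_node
    void stop ()
    {
        if (worker.joinable ())
        {
            worker.join ();
        }
    }
    ~staged_wallets ()
    {
        stop ();
    }
    std::atomic<bool> got_shared_node{ false };

private:
    staged_node & node;
    std::thread worker;
};

// Hypothetical node: start () must be called only after make_shared.
struct staged_node : std::enable_shared_from_this<staged_node>
{
    staged_node () : wallets (*this) {}
    void start ()
    {
        wallets.start ();
    }
    void stop ()
    {
        wallets.stop ();
    }
    staged_wallets wallets;
};

inline void staged_wallets::start ()
{
    worker = std::thread ([this] {
        // Safe here: a shared_ptr already owns the node when start () runs.
        got_shared_node = (node.shared_from_this () != nullptr);
    });
}
```

With this shape the caller writes `auto node = std::make_shared<staged_node> (); node->start ();`, so the worker can never observe a partially constructed node.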

@zhyatt
Collaborator Author

zhyatt commented Nov 2, 2021

@argakiig Ricki found the Stack Overflow question linked below, which references the same faulting module path (C:\Windows\SYSTEM32\ucrtbase.DLL) and suggests turning off "Sign the ClickOnce Manifest" in VS for the build. Is this an option we have set? If so, could we generate a new DB3 build without it to see if the issue can still be replicated? https://stackoverflow.com/questions/55386391/qbfc-faulting-module-name-ucrtbase-dll-exception-code-0xc0000409

@argakiig
Contributor

argakiig commented Nov 2, 2021

Do you know what the default option is? We don't have access to the UI on the GitHub runners, so we would need to research a programmatic way to disable this if it is enabled by default.

@zhyatt
Collaborator Author

zhyatt commented Nov 3, 2021
