Stop panicking when pruning unused queued blocks #2842

Merged
merged 2 commits into from
Oct 7, 2021

@teor2345 (Contributor) commented Oct 7, 2021

Motivation

  1. When Zebra prunes queued blocks, it drops their senders without sending a response. This caused a panic in one Zebra testnet node, shortly after NU5 activation.

  2. When Zebra prunes queued blocks, it doesn't remove their known UTXOs.

This is unexpected work in Sprint 20.

Logs

The panic happens when the CommitBlock future checks for a response on its oneshot receiver.

Oct 06 22:10:10.960  INFO {zebrad="339fefb" net="Test"}:crawl_and_dial{crawl_new_peer_interval=60s}: zebra_network::peer_set::candidate_set: timeout waiting for the peer service to become ready
Oct 06 22:10:28.037  INFO {zebrad="339fefb" net="Test"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 13 peers to answer requests address_metrics=AddressMetrics { responded: 13, never_attempted_gossiped: 0, never_attempted_alternate: 0, failed: 546, attempt_pending: 49, recently_live: 13, recently_stopped_responding: 0 }
Oct 06 22:10:41.680  INFO {zebrad="339fefb" net="Test"}:crawl_and_dial{crawl_new_peer_interval=60s}: zebra_network::peer_set::candidate_set: timeout waiting for the peer service to become ready
The application panicked (crashed).
Message:  sender is not dropped: RecvError(())
Location: zebra-state/src/service.rs:694

Metadata:
version: 1.0.0-alpha.17+39.g339fefb
Zcash network: Testnet
state version: 10
branch: main
git commit: 339fefb
commit timestamp: 2021-10-06T01:08:41+00:00
target triple: x86_64-unknown-linux-gnu
build profile: release
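
The failure mode in the log can be reproduced in miniature. The sketch below is not Zebra's code: it uses a standard-library channel as a stand-in for the oneshot channel, and the function name and the `Result<u32, String>` response type are hypothetical. It only shows that a receiver whose sender was dropped without a response gets a receive error, which is what an `expect` on the receiving side would turn into a panic.

```rust
use std::sync::mpsc::{channel, RecvError};

/// Hypothetical stand-in for a queued block's response channel.
/// The `u32` plays the role of a committed block height.
fn drop_without_response() -> Result<Result<u32, String>, RecvError> {
    let (rsp_tx, rsp_rx) = channel::<Result<u32, String>>();

    // Pruning the queued block drops its sender without sending anything.
    drop(rsp_tx);

    // The waiting side gets a receive error instead of a commit result.
    rsp_rx.recv()
}

fn main() {
    let outcome = drop_without_response();

    // Unwrapping this with `.expect("sender is not dropped")` would panic.
    assert!(outcome.is_err());
    println!("{:?}", outcome);
}
```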

Solution

  • Send an error on pruned queued blocks
  • Remove known UTXOs from pruned queued blocks
    • Add tests for UTXO removal
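
A minimal sketch of the two fixes listed above, under stated assumptions: every type and function name here (`Queue`, `QueuedBlock`, `prune_at_or_below`, the tuple `Outpoint`) is hypothetical, a standard-library channel stands in for the oneshot channel, and pruning by a height cutoff is assumed for illustration. It is not the code merged in this PR.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::mpsc::{channel, Receiver, Sender};

// All names below are hypothetical; they are not Zebra's types.
type Height = u32;
type Outpoint = (u64, u32); // (transaction id stand-in, output index)
type CommitResult = Result<Height, String>;

struct QueuedBlock {
    height: Height,
    new_outpoints: Vec<Outpoint>,
    rsp_tx: Sender<CommitResult>,
}

#[derive(Default)]
struct Queue {
    blocks: HashMap<u64, QueuedBlock>,
    known_utxos: HashSet<Outpoint>,
}

impl Queue {
    /// Queue a block, record the UTXOs it creates, and return its receiver.
    fn queue(&mut self, id: u64, height: Height, new_outpoints: Vec<Outpoint>) -> Receiver<CommitResult> {
        let (rsp_tx, rsp_rx) = channel();
        self.known_utxos.extend(new_outpoints.iter().copied());
        self.blocks.insert(id, QueuedBlock { height, new_outpoints, rsp_tx });
        rsp_rx
    }

    /// Prune blocks at or below `cutoff` (an assumed pruning rule).
    fn prune_at_or_below(&mut self, cutoff: Height) {
        let stale: Vec<u64> = self
            .blocks
            .iter()
            .filter(|(_, block)| block.height <= cutoff)
            .map(|(id, _)| *id)
            .collect();

        for id in stale {
            if let Some(block) = self.blocks.remove(&id) {
                // Fix 2: forget the UTXOs that only the pruned block created.
                for outpoint in &block.new_outpoints {
                    self.known_utxos.remove(outpoint);
                }
                // Fix 1: send an error instead of silently dropping the sender.
                // The send result is ignored because the receiver may be gone.
                let _ = block.rsp_tx.send(Err("queued block was pruned".to_string()));
            }
        }
    }
}

fn main() {
    let mut queue = Queue::default();
    let stale_rx = queue.queue(1, 10, vec![(1, 0)]);
    let live_rx = queue.queue(2, 20, vec![(2, 0)]);

    queue.prune_at_or_below(10);

    // The pruned block's waiter receives an explicit error response.
    assert!(matches!(stale_rx.recv(), Ok(Err(_))));
    // Its UTXO is gone; the remaining block's UTXO is kept.
    assert!(!queue.known_utxos.contains(&(1, 0)));
    assert!(queue.known_utxos.contains(&(2, 0)));
    // The remaining block has no response yet.
    assert!(live_rx.try_recv().is_err());
}
```

With this shape, the waiting side always sees either a commit result or an explicit error, so it never has to treat a dropped sender as an expected case.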

Review

Anyone can review this PR.

Reviewer Checklist

  • Code implements Specs and Designs
  • Tests for Expected Behaviour
  • Tests for Errors

@teor2345 added the C-bug (Category: This is a bug), A-rust (Area: Updates to Rust code), P-Medium, I-panic (Zebra panics with an internal error message), and I-heavy (Problems with excessive memory, disk, or CPU usage) labels Oct 7, 2021
@teor2345 added this to the 2021 Sprint 20 milestone Oct 7, 2021
@teor2345 requested a review from upbqdn October 7, 2021 02:19
@teor2345 self-assigned this Oct 7, 2021
@teor2345 changed the title from "Stop" to "Stop panicking when pruning unused queued blocks" Oct 7, 2021
This causes a rare panic, because Zebra expects every queued sender
to send a response.
@dconnolly merged commit 0b82298 into main Oct 7, 2021
@dconnolly deleted the state-queue-panic branch October 7, 2021 04:12