Add log for ungraceful shutdown on startup #2215

Merged on Oct 25, 2023 (7 commits)

Conversation

joshua-kim (Contributor) commented Oct 25, 2023

Why this should be merged

Adds a warning log on startup when the node's previous shutdown was ungraceful.

How this works

On startup, the node writes a flag to the database. On graceful shutdown, it deletes this flag. If the flag is already present at startup, the previous run did not shut down cleanly, and the node logs a warning for the operator.
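
For readers who want to see the idea in isolation, here is a minimal sketch of the flag scheme described above. It is not the code from this PR: the `kvStore` interface, the in-memory `memStore`, and the `markStartup`/`markShutdown` helpers are hypothetical names introduced for illustration, and only the `sessionKey` value is taken from the diff in this conversation.

```go
package main

import "fmt"

// sessionKey is the flag key; the value matches the one shown in this PR's diff.
var sessionKey = []byte("session")

// kvStore is a hypothetical, minimal stand-in for the node's database.
type kvStore interface {
	Has(key []byte) (bool, error)
	Put(key, value []byte) error
	Delete(key []byte) error
}

// memStore is an in-memory kvStore used only for this illustration.
type memStore map[string][]byte

func (m memStore) Has(key []byte) (bool, error) { _, ok := m[string(key)]; return ok, nil }
func (m memStore) Put(key, value []byte) error  { m[string(key)] = value; return nil }
func (m memStore) Delete(key []byte) error      { delete(m, string(key)); return nil }

// markStartup reports whether a flag from a previous run is still present,
// then writes the flag for the current run.
func markStartup(db kvStore) (bool, error) {
	ungraceful, err := db.Has(sessionKey)
	if err != nil {
		return false, err
	}
	if err := db.Put(sessionKey, nil); err != nil {
		return false, err
	}
	return ungraceful, nil
}

// markShutdown removes the flag; it only runs on a graceful shutdown.
func markShutdown(db kvStore) error {
	return db.Delete(sessionKey)
}

func main() {
	db := memStore{}

	// First start: no flag exists yet.
	prev, _ := markStartup(db)
	fmt.Println("ungraceful:", prev)

	// Simulate a crash: the process dies without calling markShutdown.
	prev, _ = markStartup(db)
	fmt.Println("ungraceful:", prev)

	// Graceful shutdown clears the flag, so the next start is clean.
	_ = markShutdown(db)
	prev, _ = markStartup(db)
	fmt.Println("ungraceful:", prev)
}
```

A real node would log the warning where this sketch prints, and would need the flag write to be durable before proceeding for the detection to be reliable.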

How this was tested

Tested manually by killing a node, restarting it, and verifying that the warning is logged on startup:

[10-25|12:43:55.259] WARN node/node.go:568 detected previous ungraceful shutdown

joshua-kim self-assigned this Oct 25, 2023
joshua-kim added the "incident response" and "monitoring" (This primarily focuses on logs, metrics, and/or tracing) labels Oct 25, 2023
joshua-kim added this to the v1.10.14 milestone Oct 25, 2023
Signed-off-by: Joshua Kim <[email protected]>
joshua-kim marked this pull request as ready for review October 25, 2023 15:46
node/node.go Outdated
@@ -89,13 +89,19 @@ import (
)

var (
genesisHashKey = []byte("genesisID")
genesisHashKey = []byte("genesisID")
sessionKey = []byte("session")
Contributor Author

We could also consider adding a node-level prefixdb to namespace these keys, but I didn't do this since we already write genesisID directly to the database.
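
As a rough illustration of the namespacing idea in this comment, the sketch below puts node-level keys under a shared prefix. It does not use the real prefixdb package and does not claim to match how prefixdb derives its keys; `nodePrefix`, the `/` separator, and `prefixedKey` are assumptions made up for this example.

```go
package main

import "fmt"

// nodePrefix is a hypothetical namespace for keys owned by the node itself.
var nodePrefix = []byte("node")

// prefixedKey returns prefix + "/" + key in a freshly allocated slice,
// so neither input slice is modified.
func prefixedKey(prefix, key []byte) []byte {
	out := make([]byte, 0, len(prefix)+1+len(key))
	out = append(out, prefix...)
	out = append(out, '/')
	out = append(out, key...)
	return out
}

func main() {
	// Both node-level keys from the diff end up under the same namespace.
	fmt.Printf("%s\n", prefixedKey(nodePrefix, []byte("genesisID")))
	fmt.Printf("%s\n", prefixedKey(nodePrefix, []byte("session")))
}
```

The trade-off raised in the comment still applies: moving genesisID under a prefix would change where an already-written key lives, which is presumably why the keys were left at the top level.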

Signed-off-by: Joshua Kim <[email protected]>
Signed-off-by: Joshua Kim <[email protected]>
StephenButtolph added this pull request to the merge queue Oct 25, 2023
Merged via the queue into dev with commit de168b1 Oct 25, 2023
16 checks passed
StephenButtolph deleted the graceful-shutdown branch October 25, 2023 20:10
joshua-kim added a commit that referenced this pull request Oct 27, 2023
commit a4cee60
Author: Dan Laine <[email protected]>
Date:   Fri Oct 27 11:37:05 2023 -0400

    `merkledb` -- don't pass `BranchFactor` to `encodeDBNode` (#2217)

    Signed-off-by: Dan Laine <[email protected]>

commit cacbb9b
Author: Dhruba Basu <[email protected]>
Date:   Thu Oct 26 16:57:56 2023 -0400

    Move all blst function usage to `bls` pkg (#2222)

    Signed-off-by: Dan Laine <[email protected]>
    Co-authored-by: Dan Laine <[email protected]>

commit 3b213fc
Author: Dhruba Basu <[email protected]>
Date:   Thu Oct 26 16:53:26 2023 -0400

    Add json marshal tests to existing serialization tests in `platformvm/txs` pkg (#2227)

commit a6448ac
Author: Dhruba Basu <[email protected]>
Date:   Thu Oct 26 16:48:00 2023 -0400

    Update `golangci-lint` to `v1.55.1` (#2228)

commit fe74ed1
Author: Dan Laine <[email protected]>
Date:   Thu Oct 26 12:54:34 2023 -0400

    `merkledb` -- shift nit (#2218)

    Signed-off-by: Dan Laine <[email protected]>

commit e933587
Author: David Boehm <[email protected]>
Date:   Thu Oct 26 11:40:57 2023 -0400

    Reduce allocations on insert and remove (#2201)

    Signed-off-by: David Boehm <[email protected]>
    Signed-off-by: Dan Laine <[email protected]>
    Co-authored-by: Darioush Jalali <[email protected]>
    Co-authored-by: Dan Laine <[email protected]>

commit 638000c
Author: Stephen Buttolph <[email protected]>
Date:   Thu Oct 26 02:08:05 2023 -0400

    Update versions for v1.10.14 (#2225)

commit 8463690
Author: Stephen Buttolph <[email protected]>
Date:   Thu Oct 26 00:12:05 2023 -0400

    Improve logging for block verification failure (#2224)

commit 787f0b6
Author: Stephen Buttolph <[email protected]>
Date:   Wed Oct 25 17:35:17 2023 -0400

    Fix unexpected unlock (#2221)

commit 1f779af
Author: Stephen Buttolph <[email protected]>
Date:   Wed Oct 25 17:13:25 2023 -0400

    Revert networking AllowConnection change (#2219)

commit f020a05
Author: Dhruba Basu <[email protected]>
Date:   Wed Oct 25 16:51:24 2023 -0400

    Add `TransferSubnetOwnershipTx` (#2178)

    Signed-off-by: Dhruba Basu <[email protected]>

commit 150ffae
Author: Dan Laine <[email protected]>
Date:   Wed Oct 25 16:39:10 2023 -0400

    Add pebble database implementation (#1999)

    Co-authored-by: Dhruba Basu <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit de168b1
Author: Joshua Kim <[email protected]>
Date:   Wed Oct 25 15:47:48 2023 -0400

    Add log for ungraceful shutdown on startup (#2215)

    Signed-off-by: Joshua Kim <[email protected]>

commit cd77a1e
Author: Dhruba Basu <[email protected]>
Date:   Wed Oct 25 13:27:43 2023 -0400

    Remove `aggregate` struct (#2213)

commit 128757d
Author: Ceyhun Onur <[email protected]>
Date:   Wed Oct 25 02:03:45 2023 +0300

    Move the overridden manager into the node (#2199)

    Signed-off-by: Ceyhun Onur <[email protected]>
    Co-authored-by: Alberto Benegiamo <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit 7903676
Author: Ceyhun Onur <[email protected]>
Date:   Tue Oct 24 20:44:01 2023 +0300

    Remove contains from validator manager interface (#2198)

    Signed-off-by: Ceyhun Onur <[email protected]>
    Signed-off-by: Stephen Buttolph <[email protected]>
    Co-authored-by: Alberto Benegiamo <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit 963aeb0
Author: Joshua Kim <[email protected]>
Date:   Tue Oct 24 12:54:31 2023 -0400

    Update TestDialContext to use ManuallyTrack (#2209)

    Signed-off-by: Joshua Kim <[email protected]>

commit e337dda
Author: Stephen Buttolph <[email protected]>
Date:   Tue Oct 24 12:42:05 2023 -0400

    Remove duplicate networking check (#2204)

    Signed-off-by: Dan Laine <[email protected]>
    Co-authored-by: Dan Laine <[email protected]>

commit 020e802
Author: Stephen Buttolph <[email protected]>
Date:   Mon Oct 23 17:37:14 2023 -0400

    Add RSA max key length test (#2205)

commit d3287dd
Author: Alberto Benegiamo <[email protected]>
Date:   Mon Oct 23 12:09:10 2023 -0700

    Use custom codec for validator metadata (#1510)

commit 3a1dcca
Author: Stephen Buttolph <[email protected]>
Date:   Mon Oct 23 13:47:37 2023 -0400

    Update local network readme (#2203)

commit 7b7931b
Author: Ceyhun Onur <[email protected]>
Date:   Mon Oct 23 20:36:47 2023 +0300

    Redesign validator set management to enable tracking all subnets (#1857)

    Signed-off-by: Ceyhun Onur <[email protected]>
    Co-authored-by: Alberto Benegiamo <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit 4cd7051
Author: Alberto Benegiamo <[email protected]>
Date:   Mon Oct 23 08:48:56 2023 -0700

    Shutdown TimeoutManager during node Shutdown (#1707)

    Signed-off-by: Alberto Benegiamo <[email protected]>
    Co-authored-by: Dan Laine <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>
    Co-authored-by: Joshua Kim <[email protected]>

commit 6624270
Author: Joshua Kim <[email protected]>
Date:   Mon Oct 23 11:37:03 2023 -0400

    Add Heap Set (#2136)

    Signed-off-by: Joshua Kim <[email protected]>
    Signed-off-by: Stephen Buttolph <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>
    Co-authored-by: Dhruba Basu <[email protected]>
    Co-authored-by: dboehm-avalabs <[email protected]>
    Co-authored-by: David Boehm <[email protected]>
    Co-authored-by: Alberto Benegiamo <[email protected]>

commit 804f45b
Author: Stephen Buttolph <[email protected]>
Date:   Fri Oct 20 14:16:44 2023 -0400

    Move selectStartGear lock from Handler into Engines (#2182)

commit 7ed450b
Author: Joshua Kim <[email protected]>
Date:   Fri Oct 20 09:55:40 2023 -0400

    Implement Heap Map (#2137)

    Signed-off-by: Joshua Kim <[email protected]>
    Co-authored-by: Alberto Benegiamo <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit a21d0cf
Author: Stephen Buttolph <[email protected]>
Date:   Thu Oct 19 18:13:46 2023 -0400

    Move HealthCheck lock from Handler into Engines (#2173)

commit 26b1505
Author: Stephen Buttolph <[email protected]>
Date:   Thu Oct 19 17:48:22 2023 -0400

    Move Shutdown lock from Handler into Engines (#2179)

commit 8c6b9d3
Author: Dhruba Basu <[email protected]>
Date:   Thu Oct 19 15:36:31 2023 -0400

    Add tests for BanffBlock serialization (#2194)

commit d9460de
Author: Dan Laine <[email protected]>
Date:   Thu Oct 19 15:26:16 2023 -0400

    Deprecate keystore config (#2195)

commit a9c260b
Author: David Boehm <[email protected]>
Date:   Thu Oct 19 14:44:51 2023 -0400

    Merkle db Make Paths only refer to lists of nodes (#2143)

    Signed-off-by: David Boehm <[email protected]>
    Co-authored-by: Darioush Jalali <[email protected]>

commit 0faab95
Author: Joshua Kim <[email protected]>
Date:   Thu Oct 19 14:43:29 2023 -0400

    Update P2P proto docs (#2181)

    Signed-off-by: Joshua Kim <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit 927c23d
Author: Dan Laine <[email protected]>
Date:   Thu Oct 19 13:07:46 2023 -0400

    Deprecate IPC configs (#2168)

    Co-authored-by: Stephen Buttolph <[email protected]>

commit 3b843a3
Author: Stephen Buttolph <[email protected]>
Date:   Thu Oct 19 12:14:58 2023 -0400

    Update cgo usage (#2184)

Signed-off-by: Joshua Kim <[email protected]>
Labels
incident response monitoring This primarily focuses on logs, metrics, and/or tracing

4 participants