refactor(zetaclient): streamline process initialization #3291

swift1337 · 2024-12-12T19:54:34Z

This PR streamlines the start.go file and introduces the graceful package to facilitate a graceful shutdown. It also includes various other miscellaneous improvements.

However, the diff is quite challenging to navigate. I suggest checking out the branch to see the changes in start.go and pkg/graceful (I also like Github Dev https://github.dev/zeta-chain/node/pull/3291)

Summary by CodeRabbit

Release Notes

New Features
- Added a new telemetry endpoint /systemtime in the ZetaClient.
- Introduced a function to resolve the database path for the chain observer.
Bug Fixes
- Fixed issues with unsupported transaction versions for Solana.
- Improved error handling for various operations, including configuration loading and logger creation.
Improvements
- Enhanced initialization process for the ZetaClient with graceful shutdown capabilities.
- Streamlined logging setup and error reporting across multiple components.
Tests
- Updated test suite for the graceful shutdown process, ensuring robust validation of service behaviors.
Documentation
- Updated changelog to reflect recent changes and improvements across features and fixes.

gitguardian · 2024-12-12T19:54:39Z

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request

GitGuardian id	GitGuardian status	Secret	Commit	Filename
14567965	Triggered	Generic Password	`19c089f`	cmd/zetaclientd/start.go	View secret

🛠 Guidelines to remediate hardcoded secrets

Understand the implications of revoking this secret by investigating where it is used in your code.
Replace and store your secret safely. Learn here the best practices.
Revoke and rotate this secret.
If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider

following these best practices for managing and storing secrets including API keys and other credentials
install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.

^{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}

coderabbitai · 2024-12-12T19:54:42Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This pull request introduces significant updates across various components of the zetaclient. Key changes include the addition of a new telemetry endpoint /systemtime, restructuring of the initialization process to enhance error handling and logging, and modifications to the test organization. The TSS package has been revamped, and several fixes have been implemented to improve transaction handling, particularly for Solana. Additionally, a new graceful package has been introduced to manage service shutdowns, along with comprehensive tests to ensure robust functionality.

Changes

File Path	Change Summary
changelog.md	Updated to reflect changes across features, tests, refactoring, and fixes; added `/systemtime` telemetry endpoint; modified test organization; revamped TSS package; implemented various fixes.
cmd/zetaclientd/start.go	Restructured `Start` function; introduced `promptPasswords` function; improved error handling; integrated telemetry server initialization; modified shutdown handling.
cmd/zetaclientd/utils.go	Removed `createAuthzSigner`; integrated its logic into `createZetacoreClient`; replaced `waitForZetaCore` with `waitForBlocks`; added `prepareZetacoreClient`, `isObserverNode`, and `resolveObserverPubKeyBech32`.
pkg/graceful/graceful.go	Introduced a new package for graceful shutdowns; defined `Process` struct and `Service` interface; added methods for service management and shutdown handling.
pkg/graceful/graceful_test.go	Added comprehensive test suite for `Process` functionality; validated service shutdown scenarios and error handling.
pkg/os/console.go	Corrected typographical error in `readPassword` function; renamed `delimitor` to `delimiter`.
zetaclient/chains/base/logger.go	Renamed `ComplianceLogFile` to `complianceLogFile`; renamed `InitLogger` to `NewLogger`; refactored logging initialization logic.
zetaclient/chains/base/logger_test.go	Updated test function to reflect renaming of logger initialization function from `InitLogger` to `NewLogger`.
zetaclient/chains/interfaces/interfaces.go	Removed `GetLogger()` method from `ZetacoreClient` interface.
zetaclient/config/config.go	Added `ResolveDBPath` function to resolve the path to the chain observer database.
zetaclient/metrics/metrics.go	Updated `Start` and `Stop` methods for improved error handling; changed method signatures to include `context.Context`.
zetaclient/metrics/metrics_test.go	Modified `SetUpSuite` to start `m.Start()` in a new goroutine with `context.Background()`.
zetaclient/metrics/telemetry.go	Updated `Start` and `Stop` methods for better control flow and error handling; reduced shutdown timeout from 10 seconds to 5 seconds.
zetaclient/zetacore/client.go	Removed `GetLogger` method from `Client` struct.

Possibly related PRs

chore: upgrade btcd (and tss deps) #2750: Upgrades dependencies related to TSS, relevant to the TSS package revamp.
feat: implementation of Solana restricted address #2795: Adds functionality for restricted addresses in Solana, related to transaction handling updates.
ci: add rpcimportable test #2817: Enhances CI workflow, connecting to improvements in testing processes.
fix(zetaclient): use on-chain chain struct #2834: Refactors on-chain structures, relevant to initialization and configuration changes.
chore: sync v20 changelog #2872: Syncs changelog for version 20, providing context for the main PR changes.
chore: allow full zetaclient config overlay #2945: Allows full configuration overlay for the zetaclient, aligning with configuration improvements.
refactor: add error field in cctx #2952: Adds error field in CCTX, directly related to error handling enhancements.
fix: skip solana unsupported transaction version #3206: Fixes unsupported transaction versions for Solana, directly relevant to transaction handling updates.

Suggested labels

no-changelog, UPGRADE_LIGHT_TESTS

Suggested reviewers

fbac
kingpinXD
ws4charlie
brewmaster012
skosito
lumtis

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Experiment)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

github-actions · 2024-12-12T19:55:06Z

!!!WARNING!!!
nosec detected in the following files: cmd/zetaclientd/start.go

Be very careful about using #nosec in code. It can be a quick way to suppress security warnings and move forward with development, it should be employed with caution. Suppressing warnings with #nosec can hide potentially serious vulnerabilities. Only use #nosec when you're absolutely certain that the security issue is either a false positive or has been mitigated in another way.

Only suppress a single rule (or a specific set of rules) within a section of code, while continuing to scan for other problems. To do this, you can list the rule(s) to be suppressed within the #nosec annotation, e.g: /* #nosec G401 */ or //#nosec G201 G202 G203
Broad #nosec annotations should be avoided, as they can hide other vulnerabilities. The CI will block you from merging this PR until you remove #nosec annotations that do not target specific rules.

Pay extra attention to the way #nosec is being used in the files listed above.

pkg/graceful/graceful.go

codecov · 2024-12-12T20:03:30Z

Codecov Report

Attention: Patch coverage is 46.66667% with 120 lines in your changes missing coverage. Please review.

Project coverage is 61.68%. Comparing base (462fcea) to head (e3c3587).

Files with missing lines	Patch %	Lines
zetaclient/zetacore/client_start.go	0.00%	83 Missing ⚠️
pkg/graceful/graceful.go	86.31%	12 Missing and 1 partial ⚠️
zetaclient/metrics/metrics.go	16.66%	9 Missing and 1 partial ⚠️
zetaclient/config/config.go	0.00%	8 Missing ⚠️
zetaclient/metrics/telemetry.go	0.00%	6 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3291      +/-   ##
===========================================
- Coverage    61.79%   61.68%   -0.12%     
===========================================
  Files          429      431       +2     
  Lines        30816    30999     +183     
===========================================
+ Hits         19043    19121      +78     
- Misses       10914    11018     +104     
- Partials       859      860       +1

Files with missing lines	Coverage Δ
pkg/os/console.go	`76.92% <100.00%> (ø)`
zetaclient/chains/base/logger.go	`94.00% <100.00%> (-0.45%)`	⬇️
zetaclient/zetacore/client.go	`58.57% <ø> (+0.68%)`	⬆️
zetaclient/metrics/telemetry.go	`0.00% <0.00%> (ø)`
zetaclient/config/config.go	`0.00% <0.00%> (ø)`
zetaclient/metrics/metrics.go	`69.64% <16.66%> (-5.36%)`	⬇️
pkg/graceful/graceful.go	`86.31% <86.31%> (ø)`
zetaclient/zetacore/client_start.go	`0.00% <0.00%> (ø)`

coderabbitai

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (9)

cmd/zetaclientd/utils.go (2)
14-15: Use descriptive aliases for imported authz packages

The alias authz2 for the imported github.com/zeta-chain/node/pkg/authz package may cause confusion with github.com/zeta-chain/node/zetaclient/authz. Consider using more descriptive aliases, such as coreauthz and clientauthz, to enhance code clarity.

Line range hint 69-85: Log error details when retrying block height retrieval

In the waitForBlocks function, when zc.GetBlockHeight(ctx) fails or the block height is insufficient, the error details are not logged. Including the error information in the log message will aid in diagnosing connection issues or other problems.

Proposed change:
- logger.Info().Msgf("Failed to get block number, retry: %d/%d", retryCount, attempts)
+ logger.Info().Err(err).Msgf("Failed to get block number, retry: %d/%d", retryCount, attempts)
pkg/graceful/graceful_test.go (1)
145-147: Remove unreachable code following a panic

In the Panic handling test, the return nil statement after panic("oopsie") is unreachable. Removing it enhances code clarity and avoids confusion.

Proposed change:
 func (ctx context.Context) error {
     panic("oopsie")
-    return nil
 }
zetaclient/chains/base/logger.go (2)
67-72: Consider adding format validation

While the writer configuration is cleaner, consider validating the log format explicitly.
+const (
+    logFormatJSON = "json"
+    logFormatText = "text"
+)
+
 func NewLogger(cfg config.Config) (Logger, error) {
     // ...
+    switch cfg.LogFormat {
+    case logFormatJSON:
+        stdWriter = os.Stdout
+    case logFormatText:
         stdWriter = zerolog.ConsoleWriter{
             Out:        os.Stdout,
             TimeFormat: time.RFC3339,
         }
+    default:
+        return Logger{}, fmt.Errorf("unsupported log format: %s", cfg.LogFormat)
     }
75-77: Consider adding sampling configuration validation

The sampling configuration could benefit from validation of the sampling rate.
+const defaultSampleRate = 5
+
 func NewLogger(cfg config.Config) (Logger, error) {
     // ...
     if cfg.LogSampler {
+        rate := defaultSampleRate
+        if cfg.LogSampleRate > 0 {
+            rate = cfg.LogSampleRate
+        }
-        std = std.Sample(&zerolog.BasicSampler{N: 5})
+        std = std.Sample(&zerolog.BasicSampler{N: rate})
     }
zetaclient/config/config.go (1)
120-130: Consider making the database path configurable

While the implementation is correct, having the database path hardcoded as .zetaclient/chainobserver reduces flexibility. Consider making it configurable through environment variables or configuration files.
 // ResolveDBPath resolves the path to chain observer database
 func ResolveDBPath() (string, error) {
-    const dbpath = ".zetaclient/chainobserver"
+    dbpath := os.Getenv("ZETACLIENT_DB_PATH")
+    if dbpath == "" {
+        dbpath = ".zetaclient/chainobserver"
+    }
 
     userDir, err := os.UserHomeDir()
     if err != nil {
         return "", errors.Wrap(err, "unable to resolve user home directory")
     }
 
     return filepath.Join(userDir, dbpath), nil
 }
zetaclient/metrics/telemetry.go (1)
Line range hint 201-213: Enhance error handling and utilize context

The implementation could be improved by utilizing the context parameter and enhancing error handling.
-func (t *TelemetryServer) Start(_ context.Context) error {
+func (t *TelemetryServer) Start(ctx context.Context) error {
     if t.s == nil {
         return errors.New("invalid http server instance")
     }
 
-    if err := t.s.ListenAndServe(); err != nil {
-        if !errors.Is(err, http.ErrServerClosed) {
-            return fmt.Errorf("fail to start http server: %w", err)
-        }
-    }
+    errCh := make(chan error, 1)
+    go func() {
+        if err := t.s.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
+            errCh <- fmt.Errorf("fail to start http server: %w", err)
+        }
+    }()
+
+    select {
+    case err := <-errCh:
+        return err
+    case <-ctx.Done():
+        return ctx.Err()
+    }
 
     return nil
 }
changelog.md (2)

18-18: Add description for PR #3291

The changelog entry should include a brief description of the changes made in PR #3291 for better clarity and documentation.

Line range hint 1-1589: Standardize bullet point formatting

For better readability and consistency, consider:

Using a single bullet point style throughout (either - or *)

Maintaining consistent spacing after bullet points

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 462fcea and 9f78be0.

📒 Files selected for processing (14)

changelog.md (1 hunks)
cmd/zetaclientd/start.go (5 hunks)
cmd/zetaclientd/utils.go (6 hunks)
pkg/graceful/graceful.go (1 hunks)
pkg/graceful/graceful_test.go (1 hunks)
pkg/os/console.go (1 hunks)
zetaclient/chains/base/logger.go (4 hunks)
zetaclient/chains/base/logger_test.go (1 hunks)
zetaclient/chains/interfaces/interfaces.go (0 hunks)
zetaclient/config/config.go (2 hunks)
zetaclient/metrics/metrics.go (2 hunks)
zetaclient/metrics/metrics_test.go (2 hunks)
zetaclient/metrics/telemetry.go (2 hunks)
zetaclient/zetacore/client.go (0 hunks)

💤 Files with no reviewable changes (2)

zetaclient/chains/interfaces/interfaces.go
zetaclient/zetacore/client.go

✅ Files skipped from review due to trivial changes (1)

pkg/os/console.go

🧰 Additional context used

📓 Path-based instructions (10)

zetaclient/chains/base/logger_test.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/metrics/metrics_test.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/config/config.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/metrics/telemetry.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/chains/base/logger.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

cmd/zetaclientd/start.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/metrics/metrics.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

pkg/graceful/graceful_test.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

cmd/zetaclientd/utils.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

pkg/graceful/graceful.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

🔇 Additional comments (8)

pkg/graceful/graceful.go (2)

51-55: AddService method effectively integrates service lifecycle management

The AddService method elegantly combines AddStarter and AddStopper, providing a clean and modular way to manage services that implement the Service interface. This enhances code readability and maintainability.

104-145: ShutdownNow method robustly handles graceful termination with timeout

The ShutdownNow method correctly initiates the shutdown sequence, stopping services in reverse order and safely recovering from any panics within stoppers. The implementation of a timeout prevents indefinite hanging during shutdown.

pkg/graceful/graceful_test.go (1)

1-273: Test suite comprehensively validates graceful shutdown behavior

The test cases effectively cover various scenarios, including synchronous and asynchronous service operations, error and panic handling, manual starters and stoppers, and shutdown timeouts. This thorough testing ensures the reliability and robustness of the graceful package.

zetaclient/chains/base/logger_test.go (1)

82-82: LGTM: Function rename properly reflected in test

The test correctly uses the renamed NewLogger function, maintaining consistency with the implementation changes.

zetaclient/chains/base/logger.go (1)

60-64: Well-structured helper function extraction

Good refactoring to extract common logger configuration into augmentLogger. This improves code organization and reduces duplication.

zetaclient/metrics/metrics.go (1)

201-207: LGTM: Clean implementation of graceful shutdown

The implementation correctly handles graceful shutdown with a reasonable timeout.

zetaclient/metrics/telemetry.go (1)

216-222: LGTM: Clean implementation of graceful shutdown

The implementation correctly handles graceful shutdown with a reasonable timeout.

changelog.md (1)

Line range hint 1-1589: Version organization looks good!

The changelog follows semantic versioning and maintains a clear chronological order from newest to oldest versions.

pkg/graceful/graceful.go

cmd/zetaclientd/utils.go

cmd/zetaclientd/start.go

zetaclient/metrics/metrics_test.go

zetaclient/metrics/metrics.go

coderabbitai

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (12)

cmd/zetaclientd/start.go (2)

85-85: Ensure consistent use of logger across the codebase

For uniformity and maintainability, it's important to use the same logger instance throughout the application. Replace direct calls to the global log package with the configured logger instance.

Apply this diff to use logger.Std:
- log.Info().Msgf("Config is updated from zetacore\n %s", cfg.StringMasked())
+ logger.Std.Info().Msgf("Config is updated from zetacore\n %s", cfg.StringMasked())
185-201: Use the configured logger instead of the global logger

In the startTelemetry function, relying on the global log package can lead to inconsistent logging behavior. Opt for the injected logger instance to maintain consistency and respect the configured logging settings.

Apply this diff to pass and use logger.Std:
func startTelemetry(ctx context.Context, cfg config.Config) (*metrics.TelemetryServer, error) {
+	func startTelemetry(ctx context.Context, cfg config.Config, logger zerolog.Logger) (*metrics.TelemetryServer, error) {

	// 1. Init pprof http server
	pprofServer := func(_ context.Context) error {
		addr := os.Getenv(envPprofAddr)
		if addr == "" {
			addr = "localhost:6061"
		}

-		log.Info().Str("addr", addr).Msg("starting pprof http server")
+		logger.Info().Str("addr", addr).Msg("starting pprof http server")

		// #nosec G114 -- timeouts unneeded
		err := http.ListenAndServe(addr, nil)
		if err != nil {
-			log.Error().Err(err).Msg("pprof http server error")
+			logger.Error().Err(err).Msg("pprof http server error")
		}

		return nil
	}
	...
}

+// Update the call site:
-telemetry, err := startTelemetry(ctx, cfg)
+telemetry, err := startTelemetry(ctx, cfg, logger.Std)

zetaclient/chains/base/logger.go (1)

52-57: Consider adding cleanup mechanism for compliance file

While the error handling for file opening is appropriate, there's no mechanism to ensure the file is properly closed when the logger is no longer needed.

Consider adding a Close method to the Logger struct:
type Logger struct {
	Std        zerolog.Logger
	Compliance zerolog.Logger
+	complianceFile io.Closer
}

+func (l *Logger) Close() error {
+	if l.complianceFile != nil {
+		return l.complianceFile.Close()
+	}
+	return nil
+}

zetaclient/config/config.go (1)

120-130: Enhance ResolveDBPath for better configurability and testability

While the current implementation is functional, it could be improved for better flexibility and testing.

Consider making the path configurable and adding test support:
+const DefaultDBPath = ".zetaclient/chainobserver"

-func ResolveDBPath() (string, error) {
+func ResolveDBPath(customPath string) (string, error) {
+	if customPath != "" {
+		return filepath.Clean(customPath), nil
+	}

	userDir, err := os.UserHomeDir()
	if err != nil {
		return "", errors.Wrap(err, "unable to resolve user home directory")
	}

-	return filepath.Join(userDir, dbpath), nil
+	return filepath.Join(userDir, DefaultDBPath), nil
}

pkg/graceful/graceful_test.go (2)

16-46: Consider using table-driven tests for better organization

The test cases follow a similar pattern and could be refactored into a table-driven test structure to reduce code duplication and improve maintainability.

Example refactor:

func TestProcess(t *testing.T) {
+    testCases := []struct {
+        name     string
+        timeout  time.Duration
+        async    bool
+        setup    func(*testSuite)
+        validate func(*testing.T, *testSuite, time.Time)
+    }{
+        {
+            name:    "Service sync",
+            timeout: defaultTimeout,
+            async:   false,
+            setup: func(ts *testSuite) {
+                go func() {
+                    time.Sleep(time.Second)
+                    ts.mockSignal <- os.Interrupt
+                }()
+            },
+            validate: func(t *testing.T, ts *testSuite, start time.Time) {
+                assert.Less(t, time.Since(start), defaultTimeout)
+                assert.Contains(t, ts.logBuffer.String(), "Shutdown completed")
+                assert.Contains(t, ts.logBuffer.String(), "mock is running in blocking mode")
+            },
+        },
+        // Add other test cases here
+    }
+
+    for _, tc := range testCases {
+        tc := tc
+        t.Run(tc.name, func(t *testing.T) {
+            t.Parallel()
+            ts := newTestSuite(t, tc.timeout, tc.async)
+            ts.process.AddService(ctx, ts.mockService)
+            start := time.Now()
+            tc.setup(ts)
+            ts.process.WaitForShutdown()
+            tc.validate(t, ts, start)
+        })
+    }
}

173-197: Enhance timeout test precision

The test could be more precise by measuring the actual duration of the shutdown operation.

Apply this diff to improve the test:

     t.Run("Shutdown timeout", func(t *testing.T) {
         t.Parallel()
         ts := newTestSuite(t, defaultTimeout, false)
+        start := time.Now()
         const workDuration = defaultTimeout + 5*time.Second
         ts.process.AddStopper(func() {
             ts.logger.Info().Msg("Stopping something")
             time.Sleep(workDuration)
             ts.logger.Info().Msg("Stopped something")
         })
         ts.process.ShutdownNow()
+        duration := time.Since(start)
+        assert.Greater(t, duration, defaultTimeout)
+        assert.Less(t, duration, workDuration)
         assert.Contains(t, ts.logBuffer.String(), "Stopping something")
         assert.Contains(t, ts.logBuffer.String(), "Shutdown interrupted by timeout")
         assert.NotContains(t, ts.logBuffer.String(), "Stopped something")
     })

zetaclient/metrics/metrics.go (1)

201-207: Consider adding cleanup operations in Stop method

The Stop method could benefit from additional cleanup operations, such as closing any open metric collectors or resetting gauges.

Apply this diff to enhance the shutdown process:
 func (m *Metrics) Stop() {
+    // Reset all gauges to avoid stale metrics
+    PendingTxsPerChain.Reset()
+    RelayerKeyBalance.Reset()
+    LastScannedBlockNumber.Reset()
+    RPCInProgress.Reset()
+
     ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
     defer cancel()
 
     if err := m.s.Shutdown(ctx); err != nil {
         log.Error().Err(err).Msg("failed to shutdown metrics server")
     }
 }

zetaclient/metrics/telemetry.go (2)

Line range hint 201-213: Improve context handling in Start method

The context parameter should be used to control the server's lifecycle and ensure proper cleanup.

Apply this diff to enhance context handling:

 func (t *TelemetryServer) Start(ctx context.Context) error {
     if t.s == nil {
         return errors.New("invalid http server instance")
     }
 
+    // Use base context for the server
+    t.s.BaseContext = func(net.Listener) context.Context {
+        return ctx
+    }
+
+    errCh := make(chan error, 1)
+    go func() {
+        if err := t.s.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
+            errCh <- fmt.Errorf("fail to start http server: %w", err)
+        }
+        close(errCh)
+    }()
 
-    if err := t.s.ListenAndServe(); err != nil {
-        if !errors.Is(err, http.ErrServerClosed) {
-            return fmt.Errorf("fail to start http server: %w", err)
-        }
+    select {
+    case err := <-errCh:
+        return err
+    case <-ctx.Done():
+        return ctx.Err()
     }
-
-    return nil
 }

216-222: Enhance cleanup operations in Stop method

The Stop method should ensure proper cleanup of all resources and provide more detailed error logging.

Apply this diff to improve the shutdown process:

 func (t *TelemetryServer) Stop() {
+    t.mu.Lock()
+    // Clear internal state
+    t.lastScannedBlockNumber = make(map[int64]uint64)
+    t.rtt = make(map[peer.ID]int64)
+    t.mu.Unlock()
+
     c, cancel := context.WithTimeout(context.Background(), 5*time.Second)
     defer cancel()
 
     if err := t.s.Shutdown(c); err != nil {
-        log.Error().Err(err).Msg("Failed to shutdown the TelemetryServer")
+        log.Error().
+            Err(err).
+            Str("timeout", "5s").
+            Msg("Failed to gracefully shutdown the TelemetryServer")
     }
 }

changelog.md (3)

Line range hint 1-1: Add version number to the main header

The main header should include the project name and current version number for better tracking.
-# CHANGELOG
+# CHANGELOG (v23.x)
18-18: Fix indentation in refactor section

The bullet point is not properly aligned with other entries in the section.
-* [3291](https://github.com/zeta-chain/node/pull/3291) - revamp zetaclient initialization (+ graceful shutdown)
+* [3291](https://github.com/zeta-chain/node/pull/3291) - revamp zetaclient initialization (+ graceful shutdown)
Line range hint 3-24: Consider adding migration notes for breaking changes

The unreleased section contains significant changes to the zetaclient initialization and TSS package. Consider adding a "Breaking Changes" section to document any migration steps required.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 462fcea and 9f78be0.

📒 Files selected for processing (14)

changelog.md (1 hunks)
cmd/zetaclientd/start.go (5 hunks)
cmd/zetaclientd/utils.go (6 hunks)
pkg/graceful/graceful.go (1 hunks)
pkg/graceful/graceful_test.go (1 hunks)
pkg/os/console.go (1 hunks)
zetaclient/chains/base/logger.go (4 hunks)
zetaclient/chains/base/logger_test.go (1 hunks)
zetaclient/chains/interfaces/interfaces.go (0 hunks)
zetaclient/config/config.go (2 hunks)
zetaclient/metrics/metrics.go (2 hunks)
zetaclient/metrics/metrics_test.go (2 hunks)
zetaclient/metrics/telemetry.go (2 hunks)
zetaclient/zetacore/client.go (0 hunks)

💤 Files with no reviewable changes (2)

zetaclient/chains/interfaces/interfaces.go
zetaclient/zetacore/client.go

✅ Files skipped from review due to trivial changes (1)

pkg/os/console.go

🧰 Additional context used

📓 Path-based instructions (10)

zetaclient/config/config.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/chains/base/logger_test.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/metrics/metrics_test.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/metrics/telemetry.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/metrics/metrics.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

zetaclient/chains/base/logger.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

cmd/zetaclientd/start.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

cmd/zetaclientd/utils.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

pkg/graceful/graceful_test.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

pkg/graceful/graceful.go (1)

Pattern **/*.go: Review the Go code, point out issues relative to principles of clean code, expressiveness, and performance.

🔇 Additional comments (7)

pkg/graceful/graceful.go (2)

61-66: 🛠️ Refactor suggestion

Include stack trace when logging recovered panics for better debugging

When recovering from a panic in the service starter, including the stack trace provides valuable context for diagnosing issues. Consider using debug.Stack() to capture and log the stack trace.

Apply this diff to include the stack trace:

+import (
+	"runtime/debug"
+	...
+)

...

func (p *Process) AddStarter(ctx context.Context, fn func(ctx context.Context) error) {
	go func() {
		defer func() {
			if r := recover(); r != nil {
				p.logger.Error().
					Interface("panic", r).
+					Bytes("stack", debug.Stack()).
					Msg("panic in service")
				p.ShutdownNow()
			}
		}()

		if err := fn(ctx); err != nil {
			p.logger.Error().Err(err).Msg("failed to start service")
			p.ShutdownNow()
		}
	}()
}

125-130: 🛠️ Refactor suggestion

Include stack trace when logging recovered panics during shutdown

Including the stack trace when recovering from a panic during shutdown enhances the ability to troubleshoot and resolve issues efficiently. Utilize debug.Stack() to log the stack trace.

Apply this diff to include the stack trace:

+import (
+	"runtime/debug"
+	...
+)

...

func (p *Process) ShutdownNow() {
	...
	go func() {
		defer func() {
			if r := recover(); r != nil {
				p.logger.Error().
					Interface("panic", r).
+					Bytes("stack", debug.Stack()).
					Msg("panic during shutdown")
			}

			// complete shutdown
			close(done)
		}()

		// stop services in the reverse order
		for i := len(p.stopStack) - 1; i >= 0; i-- {
			p.stopStack[i]()
		}
	}()
	...
}

zetaclient/chains/base/logger_test.go (1)

82-82: Logger initialization updated correctly to base.NewLogger

The test now properly uses base.NewLogger for logger initialization, aligning with the updated function signature.

zetaclient/chains/base/logger.go (2)

60-64: LGTM: Well-structured logger augmentation

The augmentLogger helper function provides a clean abstraction for logger configuration.

82-83: Consider potential race conditions with global logger

Setting the global logger instance could lead to race conditions in concurrent scenarios.

Consider:

Avoiding global state by passing logger instances explicitly
If global logger is necessary, initialize it once during application startup

✅ Verification successful

Let me gather more information about the logger initialization and usage patterns.

Let me gather more information about concurrent access patterns and logger initialization.

Let me check the logger initialization context and its usage pattern.

Based on the analysis of the codebase, I can now provide a final response:

Global logger modification is safe in this context

The global logger is set only once during initialization in NewLogger() function, which is called during application startup. The codebase shows:

Logger initialization is centralized in zetaclient/chains/base/logger.go
The global logger is set after all configuration is complete
Subsequent logger usage is read-only through method calls
No evidence of concurrent modification of the logger instance

The current implementation follows a common pattern where the global logger is configured once at startup and then used in a read-only manner throughout the application's lifecycle.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for concurrent access to the global logger
rg -A 2 "log\.Logger\."

Length of output: 493

Script:

#!/bin/bash
# Search for logger initialization and configuration
rg -A 5 "log\.(Init|New|Config)" 

# Search for concurrent operations or goroutines near logger usage
rg -B 2 -A 2 "go\s+func" zetaclient/

# Find all logger imports to understand the logging package
rg -l "\".*log\""

Length of output: 25607

Script:

#!/bin/bash
# Search for goroutines or concurrent operations near logger usage
rg -B 3 -A 3 "log\.Logger\." zetaclient/

# Check logger initialization in main or setup functions
ast-grep --pattern 'func $_($$) {
  $$$
  log.Logger = $_
  $$$
}'

# Look for any mutex or sync primitives around logger
rg -B 2 -A 2 "sync\.(Mutex|RWMutex)" zetaclient/chains/base/

Length of output: 2443

Script:

#!/bin/bash
# Check the logger initialization file content
cat zetaclient/chains/base/logger.go

# Look for any concurrent logger usage patterns
rg -B 3 -A 3 "go\s+func.*\{.*log\." zetaclient/

# Check if logger is modified after initialization
rg -B 2 -A 2 "log\.(Set|Init|Configure)" zetaclient/

Length of output: 2561

zetaclient/config/config.go (1)

124-127: LGTM: Proper error handling with wrapped context

The error handling using errors.Wrap provides good context for debugging.

changelog.md (1)

Line range hint 26-1589: LGTM! Historical changelog entries are well-documented

The historical changelog entries are well-organized, properly categorized, and provide clear descriptions of changes with appropriate PR references.

cmd/zetaclientd/utils.go

zetaclient/metrics/metrics_test.go

pkg/graceful/graceful_test.go

zetaclient/metrics/metrics.go

cmd/zetaclientd/utils.go

…TSS ceremony

swift1337 added 3 commits December 12, 2024 13:51

Streamline start.go

19c089f

Implement graceful package

6df29a3

Apply graceful package to start.go

6f43ea0

swift1337 self-assigned this Dec 12, 2024

Merge branch 'develop' into refactor/improve-start-cmd

ef9fb5a

github-actions bot added the breaking:cli label Dec 12, 2024

github-actions bot added the nosec label Dec 12, 2024

Update changelog

9f78be0

swift1337 marked this pull request as ready for review December 12, 2024 19:56

swift1337 requested a review from a team as a code owner December 12, 2024 19:56

swift1337 changed the title ~~refactor(zetaclient): streamline start.go~~ refactor(zetaclient): streamline process initialization Dec 12, 2024

gartnera reviewed Dec 12, 2024

View reviewed changes

pkg/graceful/graceful.go Outdated Show resolved Hide resolved

coderabbitai bot reviewed Dec 12, 2024

View reviewed changes

cmd/zetaclientd/utils.go Outdated Show resolved Hide resolved

zetaclient/metrics/metrics_test.go Show resolved Hide resolved

pkg/graceful/graceful_test.go Show resolved Hide resolved

zetaclient/metrics/metrics.go Show resolved Hide resolved

kingpinXD reviewed Dec 12, 2024

View reviewed changes

cmd/zetaclientd/utils.go Outdated Show resolved Hide resolved

kingpinXD reviewed Dec 12, 2024

View reviewed changes

cmd/zetaclientd/utils.go Outdated Show resolved Hide resolved

kingpinXD reviewed Dec 12, 2024

View reviewed changes

cmd/zetaclientd/utils.go Outdated Show resolved Hide resolved

swift1337 added 2 commits December 13, 2024 13:56

Move zetaclient bootstrapping to client_start. Move isObserver after …

6a8c135

…TSS ceremony

Locate panic line

e3c3587

kingpinXD approved these changes Dec 13, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(zetaclient): streamline process initialization #3291

refactor(zetaclient): streamline process initialization #3291

swift1337 commented Dec 12, 2024 •

edited by coderabbitai bot

Loading

gitguardian bot commented Dec 12, 2024 •

edited

Loading

coderabbitai bot commented Dec 12, 2024 •

edited

Loading

Review skipped

Walkthrough

Changes

Possibly related PRs

Suggested labels

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

github-actions bot commented Dec 12, 2024

codecov bot commented Dec 12, 2024 •

edited

Loading

coderabbitai bot left a comment

coderabbitai bot left a comment

refactor(zetaclient): streamline process initialization #3291

Are you sure you want to change the base?

refactor(zetaclient): streamline process initialization #3291

Conversation

swift1337 commented Dec 12, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

Release Notes

gitguardian bot commented Dec 12, 2024 • edited Loading

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

coderabbitai bot commented Dec 12, 2024 • edited Loading

Review skipped

Walkthrough

Changes

Possibly related PRs

Suggested labels

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

github-actions bot commented Dec 12, 2024

codecov bot commented Dec 12, 2024 • edited Loading

Codecov Report

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

swift1337 commented Dec 12, 2024 •

edited by coderabbitai bot

Loading

gitguardian bot commented Dec 12, 2024 •

edited

Loading

coderabbitai bot commented Dec 12, 2024 •

edited

Loading

codecov bot commented Dec 12, 2024 •

edited

Loading