Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: cleanup server logic #15041

Merged
merged 19 commits into from
Feb 22, 2023
Merged

refactor: cleanup server logic #15041

merged 19 commits into from
Feb 22, 2023

Conversation

alexanderbez
Copy link
Contributor

@alexanderbez alexanderbez commented Feb 15, 2023

Description

Closes: #14429

This PR addresses the duct tape that was used during in-process application start. Specifically, we refactor the in-process logic to correctly start the gRPC and API servers, with correct graceful shutdown and signal handling. No more random clunky sleeps.

I took the approach here from the approach I've used for both Umee and Numia, i.e. use Context and ErrGroup as communication mechanisms for blocking processes, effectively forcing the caller to ensure graceful shutdown and starting via the Context.

Changelog

  • Ensure processes that are responsible for starting a "thing" that should block, is indeed blocking and is provided a Context that will gracefully indicate shutdown
  • Remove clunky sleeps
  • Cleanup gRPC and API server creation blocks (the if statements made no sense)

Tested

  • in-process w/o CPU-profiling
  • in-process w/ CPU-profiling
  • in-process w/o API and gRPC servers w/o CPU-profiling
  • in-process w/o API and gRPC servers w/ CPU-profiling
  • out-of-process w/o CPU-profiling
  • out-of-process w CPU-profiling

Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • added ! to the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • followed the guidelines for building modules
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • included comments for documenting Go code
  • updated the relevant documentation or specification
  • reviewed "Files changed" and left comments if necessary
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic
  • reviewed API design and naming
  • reviewed documentation is accurate
  • reviewed tests and test coverage
  • manually tested (if applicable)

server/start.go Fixed Show fixed Hide fixed
server/start.go Fixed Show fixed Hide fixed
server/start.go Fixed Show fixed Hide fixed
@github-actions github-actions bot added C:Confix Issues and PR related to Confix C:Cosmovisor Issues and PR related to Cosmovisor C:Rosetta Issues and PR related to Rosetta C:x/evidence C:x/feegrant C:x/nft C:x/upgrade labels Feb 16, 2023
@alexanderbez alexanderbez added the backport/v0.47.x PR scheduled for inclusion in the v0.47's next stable release label Feb 16, 2023
Comment on lines +146 to +155
go func(enableUnsafeCORS bool) {
s.logger.Info("starting API server...", "address", cfg.API.Address)

return tmrpcserver.Serve(s.listener, s.Router, s.logger, cmtCfg)
if enableUnsafeCORS {
allowAllCORS := handlers.CORS(handlers.AllowedHeaders([]string{"Content-Type"}))
errCh <- tmrpcserver.Serve(s.listener, allowAllCORS(s.Router), s.logger, cmtCfg)
} else {
errCh <- tmrpcserver.Serve(s.listener, s.Router, s.logger, cmtCfg)
}
}(cfg.API.EnableUnsafeCORS)

Check notice

Code scanning / CodeQL

Spawning a Go routine

Spawning a Go routine may be a possible source of non-determinism
@alexanderbez alexanderbez marked this pull request as ready for review February 16, 2023 01:33
@alexanderbez alexanderbez requested a review from a team as a code owner February 16, 2023 01:33
@github-prbot github-prbot requested review from a team, JeancarloBarrios and testinginprod and removed request for likhita-809 and a team February 16, 2023 01:33
server/start.go Outdated Show resolved Hide resolved
server/grpc/server.go Outdated Show resolved Hide resolved
server/grpc/server.go Outdated Show resolved Hide resolved
}
})

return g.Wait()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is unreachable, and the loop above is unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is it un-reachable? I've tested it and it indeed gets gracefully exits.

server/api/server.go Outdated Show resolved Hide resolved
server/start.go Outdated
Comment on lines 419 to 421
g.Go(func() error {
return servergrpc.StartGRPCServer(ctx, svrCtx.Logger.With("module", "grpc-server"), config.GRPC, grpcSrv)
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So starting a server requires 2 routines long-running routines? One inside the call and the one for the caller. This can be greatly simplified

Copy link
Contributor Author

@alexanderbez alexanderbez Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure I follow?

There is the main parent process, server.go, i.e. the application process. Then there are 2 explicitly spawned goroutines, one for the API server (if enabled) and one for the gRPC server (if enabled). Both of these servers are blocking processes. Starting them in a goroutine is pretty idiomatic IMO.

What is wrong with starting the gRPC and API servers in separate goroutines? What do you see that can be cleaned up?

server/grpc/server.go Outdated Show resolved Hide resolved
server/start.go Show resolved Hide resolved
@Wondertan
Copy link
Contributor

Is there any logic that handles the errors coming from Serve and other service errors reporting from eg.Wait?

@alexanderbez alexanderbez merged commit 7f99ad5 into main Feb 22, 2023
@alexanderbez alexanderbez deleted the bez/14429-in-process-cleanup branch February 22, 2023 16:09
mergify bot pushed a commit that referenced this pull request Feb 22, 2023
(cherry picked from commit 7f99ad5)

# Conflicts:
#	CHANGELOG.md
#	go.mod
#	go.sum
#	server/api/server.go
#	server/start.go
#	server/types/app.go
#	server/util.go
#	simapp/go.sum
#	tests/go.sum
#	testutil/network/network.go
#	testutil/network/util.go
#	tools/confix/go.mod
#	tools/confix/go.sum
#	x/evidence/go.mod
#	x/evidence/go.sum
#	x/feegrant/go.mod
#	x/feegrant/go.sum
#	x/nft/go.mod
#	x/nft/go.sum
#	x/upgrade/go.mod
#	x/upgrade/go.sum
@tac0turtle
Copy link
Member

@Mergifyio backport release/v0.47.x

@mergify
Copy link
Contributor

mergify bot commented Mar 20, 2023

backport release/v0.47.x

✅ Backports have been created

yihuang pushed a commit to yihuang/cosmos-sdk that referenced this pull request May 18, 2023
yihuang pushed a commit that referenced this pull request May 18, 2023
yihuang pushed a commit to yihuang/cosmos-sdk that referenced this pull request May 18, 2023
larry0x pushed a commit to larry0x/cosmos-sdk that referenced this pull request May 22, 2023
lcwik added a commit to dydxprotocol/cosmos-sdk that referenced this pull request Oct 25, 2023
…e configurable, and add support for domain sockets for gRPC server and gRPC web server.

ctx.Done() support and was added in 0.50 with cosmos#15041
server start up time was removed in 0.50 with cosmos#15041
lcwik added a commit to dydxprotocol/cosmos-sdk that referenced this pull request Oct 26, 2023
…e configurable, and add support for domain sockets for gRPC server and gRPC web server.

ctx.Done() support and was added in 0.50 with cosmos#15041
server start up time was removed in 0.50 with cosmos#15041
lcwik added a commit to dydxprotocol/cosmos-sdk that referenced this pull request Dec 6, 2023
ctx.Done() support was added in 0.50 with cosmos#15041
server start up time was removed in 0.50 with cosmos#15041
lcwik added a commit to dydxprotocol/cosmos-sdk that referenced this pull request Dec 6, 2023
ctx.Done() support was added in 0.50 with cosmos#15041
server start up time was removed in 0.50 with cosmos#15041
lcwik added a commit to dydxprotocol/cosmos-sdk that referenced this pull request Jan 17, 2024
ctx.Done() support was added in 0.50 with cosmos#15041
server start up time was removed in 0.50 with cosmos#15041
lcwik added a commit to dydxprotocol/cosmos-sdk that referenced this pull request Jan 22, 2024
ctx.Done() support was added in 0.50 with cosmos#15041
server start up time was removed in 0.50 with cosmos#15041
roy-dydx pushed a commit to dydxprotocol/cosmos-sdk that referenced this pull request Feb 20, 2024
ctx.Done() support was added in 0.50 with cosmos#15041
server start up time was removed in 0.50 with cosmos#15041
roy-dydx pushed a commit to dydxprotocol/cosmos-sdk that referenced this pull request Mar 26, 2024
ctx.Done() support was added in 0.50 with cosmos#15041
server start up time was removed in 0.50 with cosmos#15041
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/v0.47.x PR scheduled for inclusion in the v0.47's next stable release C:Confix Issues and PR related to Confix C:x/evidence C:x/feegrant C:x/nft C:x/upgrade
Projects
None yet
Development

Successfully merging this pull request may close these issues.

grpc.StartGRPCServer takes 5 seconds to start
6 participants