Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce BackEnd gRPC server #565

Merged
merged 27 commits into from
Feb 25, 2022
Merged

Introduce BackEnd gRPC server #565

merged 27 commits into from
Feb 25, 2022

Conversation

canepat
Copy link
Member

@canepat canepat commented Feb 13, 2022

This PR adds:

  • base RPC server infrastructure
  • base unary RPC support
  • ETHBACKEND server skeleton implementation [Etherbase/NetVersion empty services]

Key points:

  • the RPC server is asynchronous
  • the threading model based on gRPC async model uses N completion queues (CQs) and 2 * N threads: for each CQ one scheduler thread (implemented using asio::io_context) is dedicated to process all the asynchronous notifications and handle RPC operations, another one (completion runner) waits indefinitely for the next completion tag from CQ and disptches the tag processor to the scheduler
  • usage of C++ templates is to avoid the boilerplate code required for handling RPC state machines
  • just unary call implemeted but support for streaming both unidirectional and bidirectional (required for KV server) should fit nicely

Details:

  • trace logging is still in place because I need to play with this server infrastructure under load/stress tests and spot any issue

This is a high-level class diagram (partial):

classDiagram
    class Server~AsyncService~{
      <<abstract>>
      +run()
      +shutdown()
      #next_context()
      #request_calls()*
    }
    class BackEndServer~ETHBACKEND::AsyncService~{
      #request_calls()
    }
    class RpcService~Rpc~{
      <<abstract>>
      #add_request(Rpc)
      #remove_request(Rpc)
    }
    class BaseRpc{
      <<abstract>>
      +process_done()
      #cleanup()*
    }
    class UnaryRpc~Service, Request, Response~{
      +send_response(Response)
      -process_read()
      -process_finish()
      -cleanup()
    }
    class EtherbaseRpc{
      <<typedef>>
    }
    class NetVersionRpc{
      <<typedef>>
    }
    class EtherbaseService{
      +create_rpc
      +process_rpc
      +cleanup_rpc
    }
    class NetVersionService{
      +create_rpc
      +process_rpc
      +cleanup_rpc
    }
    class ServerContext{
    }
    class asio_io_context{
    }
    class grpc_ServerCompletionQueue{
    }
    class CompletionRunner{
    }
    class ServerContextPool{
      +add_context()
      +run()
      +stop()
      +next_context()
    }
    Server <|-- BackEndServer
    BaseRpc <|-- UnaryRpc
    EtherbaseRpc ..> UnaryRpc : refine
    NetVersionRpc ..> UnaryRpc : refine
    EtherbaseService ..> EtherbaseRpc : create
    NetVersionService ..> NetVersionRpc : create
    RpcService <|-- EtherbaseService : refine
    RpcService <|-- NetVersionService : refine
    BackEndServer *-- "1" EtherbaseService
    BackEndServer *-- "1" NetVersionService
    ServerContext *-- "1" asio_io_context
    ServerContext *-- "1" grpc_ServerCompletionQueue
    ServerContext *-- "1" CompletionRunner
    ServerContextPool *-- "*" ServerContext
    Server *-- "1" ServerContextPool
Loading

@codecov
Copy link

codecov bot commented Feb 16, 2022

Codecov Report

Merging #565 (1f301ad) into master (5348b56) will increase coverage by 0.37%.
The diff coverage is 93.75%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #565      +/-   ##
==========================================
+ Coverage   81.16%   81.53%   +0.37%     
==========================================
  Files         154      165      +11     
  Lines       13221    13624     +403     
==========================================
+ Hits        10731    11109     +378     
- Misses       2490     2515      +25     
Impacted Files Coverage Δ
node/silkworm/rpc/util.hpp 30.76% <30.76%> (ø)
node/silkworm/rpc/server_config.cpp 72.72% <72.72%> (ø)
node/silkworm/rpc/server_context_pool.cpp 89.13% <89.13%> (ø)
node/silkworm/rpc/call.hpp 96.20% <96.20%> (ø)
node/silkworm/rpc/backend_server.cpp 100.00% <100.00%> (ø)
node/silkworm/rpc/backend_server.hpp 100.00% <100.00%> (ø)
node/silkworm/rpc/completion_runner.cpp 100.00% <100.00%> (ø)
node/silkworm/rpc/completion_runner.hpp 100.00% <100.00%> (ø)
node/silkworm/rpc/server.hpp 100.00% <100.00%> (ø)
node/silkworm/rpc/server_config.hpp 100.00% <100.00%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5348b56...1f301ad. Read the comment docs.

@canepat canepat marked this pull request as ready for review February 18, 2022 12:57
Copy link
Member

@yperbasis yperbasis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running node_test on this branch under AddressSanitizer fails on my MacBook:
address_sanitizer.log

On the master branch 7dc531d node_test works fine for me under ASAN.

That doesn't necessarily mean that this change introduces any memory issues, but I'd hate to lose the ability to run tests under sanitizers. Also perhaps we should prioritize and implement Issue #295. IMHO it's very important, but I have been postponing implementing it, mea culpa.

@canepat
Copy link
Member Author

canepat commented Feb 21, 2022

Running node_test on this branch under AddressSanitizer fails on my MacBook: address_sanitizer.log

It seems the same problem reported in google/sanitizers#1317 and grpc/grpc#21838. I'm going to reproduce and try the suggested solutions (hope at least one works).

That doesn't necessarily mean that this change introduces any memory issues, but I'd hate to lose the ability to run tests under sanitizers.

I agree we need to find a solution to avoid losing such ability.

@canepat
Copy link
Member Author

canepat commented Feb 22, 2022

I've just tried to reproduce the issue on my Linux machine running Clang using libc++ with both ASAN and UB enabled, but stumbled upon this llvm/llvm-project#16778. Removing UB I've got a different error:
AddressSanitizerDEADLYSIGNAL.log

@canepat
Copy link
Member Author

canepat commented Feb 23, 2022

I've added ASAN/TSAN suppression definitions in our CMake to avoid grpc/grpc#19224 without need to rebuild gRPC libraries as suggested in grpc/grpc#22325

@canepat canepat changed the title [WIP] Introduce BackEnd gRPC server Introduce BackEnd gRPC server Feb 23, 2022

namespace silkworm::rpc {

inline types::H128* new_H128_from_bytes(const uint8_t* bytes) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const uint8_t* is fine here because this function is only used internally. However, were we to make it public, I'd use gsl::span<const uint8_t, 16>.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fair enough. There are also other Silkworm public APIs where such change should be considered, e.g. silkworm::endian

@canepat canepat merged commit b2b036c into master Feb 25, 2022
@canepat canepat deleted the backend_grpc_server branch February 25, 2022 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants