-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add a shared p2p rpc client session pool to reduce rpc setup time #3152
feat: add a shared p2p rpc client session pool to reduce rpc setup time #3152
Conversation
a0b082b
to
2c4e82c
Compare
/// Creates a new RpcClientPool that can be shared between tasks. The client pool will lazily establish up to | ||
/// `max_sessions` sessions and provides client session that is least used. | ||
#[cfg(feature = "rpc")] | ||
pub fn create_rpc_client_pool<T>(&self, max_sessions: usize) -> RpcClientPool<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please help me understand this:
I would have thought that calls to pub async fn connect_rpc
and pub async fn connect_rpc_using_builder
would call pub fn create_rpc_client_pool
if a pool does not exist end then return connections from the pool only? Is that not the purpose of the pool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the thinking but unfortunately it's not that easy to do given that all clients have a different type.
Remember a pool session is per peer per protocol.
A pool of sessions has to be typed to an rpc protocol i.e. RpcClientPool<WalletBaseNodeClient>
. So storing a generic list containing different types requires RpcClientPool<Box<dyn RpcPoolClient>>
which actually requires setting up a BoxedRpcClientPool type. So basically, making this generic is not straight forward - I also don't think it's a particularly good solution, because:
- RPC is designed to for single concerns - one concern needs one kind of rpc service. Services are small and focused. You shouldn't need to open more than one/two sessions to a particular protocol i.e. you should design your software to not need more than this (though in reality you have quite a margin for abuse as we've seen).
- it is MUCH easier for the caller to get a pool and manage it, then to make a boxed dynamic list/hashmap. The domain knows about the pool it needs.
- Only the wallet needs a pool currently.
- How big of a pool do you need? Is one size fits all a good decision?
- It has a memory and perf cost which is simply not necessary for most components to pay
- Other minor concerns which add undue code complexity - When can we close the pooled sessions? Do we assume that we are never done using the pool? Do we add another call(s) to manage this pool state for everyone?
So, the wallet will get a pool, and then use it to get the connections it needs. In future, other components may need a pool so they have that available. It really depends on the lifetime of your session and requirements so it's probably a mistake to try and implement something that covers all possible domain requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, thanks, and agreed. The caller/client process can/should manage the sessions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Is the plan that the domain level clients will maintain one of these for a given peer connection and recreate it if the peer connection drops? So they will handle the re-dialing if required. I can see that the maintenance of the peer connection could become a bit tricky.
@philipr-za Yeah, this PR takes care of the pooled management of a particular RPC session. I am implementing a domain-level component that manages the connection to a base node, which should be a matter of dialling it again if it drops and creating the shared pool. Callers can call on this to get a session from the pool without being exposed to managing the connection or needing to know about base node changes etc. Maybe that component can be made generic, but that is not the goal at the moment |
Your response to @hansieodendaal answered this question to some extent. The wallet will need to manage these pools as they are required by tasks so it can keep a client open for tasks that need them and once no more tasks need them close them down. |
2c4e82c
to
4c7544d
Compare
Just pushed up another test that I wanted in here |
4c7544d
to
2cfdf7a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Just some notes for implementing this further. We may want a domain level client that creates an RPC connection pool to have options to specify:
- if RPC, in general, should take care of reconnecting dropped client connections, furthermore if the client should be exclusive or if it may be selected from a peer list;
- clearing out/dropping a percentage of PRC sessions if not used for some time (e.g. pool size = 100, never drop sessions to below 10).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I would ideally want the wallet to have something a bit different but I think this can be extended into this:
In order of preference I would like:
Option 1. The wallet to have only one connection pool that each protocol/service can use as a dependency, and it asks it for a connection of type T (e.g. BaseNodeRpc) to an address. i.e.
pool.get_client<TRpcClient>(address)
. If the rpc dance has been done already, return a client, otherwise negotiate
Option 2. The wallet have one connection pool per rpc client type. A protocol/service that needs both basenoderpc and basenodewalletrpc would have one dependency per client type it needs
Option 3. This looks like what you have, although I might be misunderstanding, which is that you have to have a pool per address and rpc type. This is useful, but I think the other two options are going to be more ergonomic
Happy to merge this, and future improvements can be built upon it
@mikethetike Thanks for the review - looks like we're on the same page. This pool construct forms the basis for what you have outlined. Option 1) This has the same issues as raised here #3152 (comment) - we'll see that this is something we can do without or build non-generically (types are known) in the domain layer where a pool is needed (currently only the wallet) - PR incoming |
PR queued successfully. Your position in queue is: 3 |
PR is on top of the queue now |
PR failed to merge with reason: Some CI status(es) failed |
PR queued successfully. Your position in queue is: 8 |
PR is on top of the queue now |
PR failed to merge with reason: Some CI status(es) failed |
PR queued successfully. Your position in queue is: 1 |
2133424
to
6573ff3
Compare
5b5a0c7
to
bae7360
Compare
5ae87d9
to
0f1dc29
Compare
Adds an RPC client session pool to efficiently maintain multiple shared RPC sessions on the p2p network.
Potentially fixes GRPC connection issues we've been having
0f1dc29
to
765d884
Compare
PR queued successfully. Your position in queue is: 2 |
PR is on top of the queue now |
<!--- Provide a general summary of your changes in the Title above --> ## Description <!--- Describe your changes in detail --> Adds a service responsible for wallet connectivity. This service is responsible for and abstracts any complexity in the management of the base node connections and RPC session management. This PR makes use of this service in the base node montoring service but does not "plumb" the WalletConenctivityService into the protocols. This is left as a TODO, but we should expect this to remove many lines of code and greaty simplify these protocols by removing the budren of connection management in the various wallet components. A number of simplifications on the peer connection and substream code, debatably reducing places that bugs could hide. ## Motivation and Context <!--- Why is this change required? What problem does it solve? --> <!--- If it fixes an open issue, please link to the issue here. --> Follow on from #3152, making use of the pool in the wallet base node monitoring. The rest of the protocols should follow. ## How Has This Been Tested? <!--- Please describe in detail how you tested your changes. --> <!--- Include details of your testing environment, and the tests you ran to --> <!--- see how your change affects other areas of the code, etc. --> Extensive service unit tests Existing tests pass Washing machine test on local wallet, with the broadcast protocol modified to use the wallet connectivity Base node RPC sessions max out at 10 each (2 wallets connected to same base node) even when pushing through 426 transactions. ![image](https://user-images.githubusercontent.com/1057902/128175630-52617736-f71f-4865-a915-6d6c9b00e8d5.png) Manually tested wallet connectivity states (Connecting -> Online | Connecting -> Offline | Connecting -> Online -> Offline -> Connecting -> Online) ## Checklist: <!--- Go over all the following points, and put an `x` in all the boxes that apply. --> * [x] I'm merging against the `development` branch. * [x] I have squashed my commits into a single commit.
Description
Adds an RPC client session pool to efficiently maintain multiple shared
RPC sessions on the p2p network.
Motivation and Context
Wallet currently establishes one or more RPC sessions per transaction. At times, during stress tests these
sessions are opened and closed many times. Each time an RPC sessions is established, protocol negotiation
and RPC handshakes must occur. These components can make use of the shared RPC client pool
to more efficiently and easily manage multiple sessions without creating situations where the target peer is
thrashed with 10s of thousands of sessions in a stress test situation.
Implementing this in the wallet is TODO.
Some RPC test utilities were moved around to make them useful to other tests, which has added some LOC
How Has This Been Tested?
Extensive unit tests
Checklist:
development
branch.