-
Notifications
You must be signed in to change notification settings - Fork 566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose acquire function to user for custom connection routing #772
Comments
Are you proposing changing how the pool works or allowing a specific connection to be used? Selecting a connection from the idle pool should be cheap - you can always increase the size of this in heavy workloads. Connections are only created if the idle pool is empty. How would you prefer? fixed pool size? |
I'm proposing changing how the pool works actually. But it's good idea to give people to use a specific group of Clickhouse machines by implementing some kind of workgroup or priority based model maybe? Since Clickhouse Cloud is serverless solution at the moment, maybe it's not a big thing for CH customers. But for on-prem users, they might want to dedicate some of the Clickhouse servers just for read purposes. Our another scenario might be having 2 different regions of group of Clickhouse servers. And trying first region of servers and if there is no available server in that region, redirecting the request to other region. I'm just brainstorming. Maybe these are too much work for this driver. But load balancing might work as an external plugin (like ch-go) and this plugin would only supply server for the request. Users can implement their own load balancing or contribute to other one directly. Still just brain storming.. :) Thanks. |
There was a request to expose how connections are selected from the pool. Maybe this would solve your requirement? i.e. allow the acquire function to be custom? |
@gingerwizard I want |
Agree we need to do this @yuzhichang |
We want to do this. However, currently, acquire returns
a connect struct has access to the internals of the client and manages the pools. I'm not keen on exposing these. We could possibly expose the dial function. This currently does the actual connection i.e.
We could allow the user to override this. I would change the signature so it received a function The client would continue to use the connection pool unless dial was specified - in which case the custom function would always be called. @yusufozturk @yuzhichang does this solve your requirements? Feedback please. |
@gingerwizard It looks complicated for me to switch connection shard-aware based on your callback-flavor proposal. I'd prefer code both shard info and shard-aware retry strategy into dsn or options. |
I’m trying to avoid coding in strategies which are app specific - and avoid the need to constantly add new approaches. |
The challenge I see with this is whilst you’re controlling the server you’re connecting to, there would be no pool. Connections wouldn’t be reused. Maybe we could require the user return a lower level type representing an actual connection |
@yuzhichang we actually have this already https://github.com/ClickHouse/clickhouse-go/blob/main/examples/clickhouse_api/connect_settings.go#L40-L46 via The current proposal is to add a function that allowed the user to return the next address to use (specified in options e.g. However, this is incomplete. The missing piece is the above only lets you control the dial side when a new connection is needed. i.e. client would still manage the idle pool and determine when a new connection was opened. This is tricky to change since the connect object has buffers and readers associated with it that are created https://github.com/ClickHouse/clickhouse-go/blob/main/conn.go#L103-L120 - we can't really expose this as its fundamental to how the client works and is tightly coupled to other behaviour. Although the net.Conn can be controlled through |
@gingerwizard I think being able to use a custom dial would be perfectly okay in my situation. Thanks! |
As mentioned by @gingerwizard we will not add any use case-specific requirements for now. |
@jkaflik what do you mean by use case specific? During every query execution, there is a load balancing mechanism which is unnecessary cost and locking. This issue was opened to ask an ability to improve the process by allowing users to override load balancing functionality, not a use case request. Is this not going to be released in v2.5? Suggested implementation was very good to allow these scenarios and not too much change in the code. Thanks. |
I've made a proposal in #855 for this change, but it has a major disadvantage: it requires exposing the internal I'd say, it's fine as a temporary solution until we have a better architecture in v3. Then, definitely, this will become more modular. (I like the approach of |
I came up with a bit better proposal. Still hacky. Details in a PR description: #855 |
@yusufozturk FYI #855 has been merged. It's a hacky solution, but hope it makes sense for your use case. |
As I see in the current V2 code, every time before there is an operation, this library runs "ch.acquire" to get an active clickhouse-server. I believe we can gain some performance in latency if we move this process to outside of the queries.
So basically a dedicated goroutine, first adds all servers into a pool, and then monitors the every endpoints and if there is an issue with a server, so it can remove from the pool.
In this case, ch.acquire only gets one of the active servers from the pool and executes the operation to that machine. We are doing a similar operation in our load balanced API gateway and it handles thousands of requests per seconds.
The text was updated successfully, but these errors were encountered: