-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dns_cache for server addresses as in pgbouncer #249
Conversation
557c496
to
8bed231
Compare
8bed231
to
a7890c9
Compare
Hey there, Thanks for the PR, I really like the way it's implemented. I've seen that feature in PgBouncer, but I've never used it personally or really understood why it's there. Could you explain to me the use case for it? It seems like it's overriding the TTL on the DNS records. |
We use DNS as our single source of truth for database servers. Our services use server names instead of Ip addresses for connecting to databases. This way, when we do cutovers of replicas (because of hw maintenance, etc), we simply have to update the server record, in this situation, pgbouncer detects the change and starts sending traffic to the new server. Apart from that, we also have services that connect directly to the database, and usually, when they loose connection to Db, they reconnect (so they get the new ip). We want this feature because it simplifies the instrumentation needed when dealing with our DB infrastructure. As we have 16 shards plus 1 common db, each one with a (at least) 1 read only replica, hw maintenance and cutovers are frequent. That said, I saw in the code (doc is outated BTW) that pool conf can be changed without restarting, that could be used to do cutovers, but it is a bit more impactful because the entire pool needs to be reconnected, also the dns feature ensures "compatibility" with the rest of the services that do not use db poolers for connecting to databases. I actually saw the possibility of reloading pools on the fly, while reading the code for implementing this feature. Still, I found this "complementary" to "pool reloading" and fault tolerance features. |
Thank you for explaining your use case to me, I appreciate it! Based on my understanding, you don't need a DNS cache feature. We already support using DNS in the config, for example You can change the
PgCat uses It's my understanding that PgBouncer DNS cache is used to override whatever the TTL is on the DNS records that are coming from the DNS server. I think it helps in cases when the DNS server is temporarily unavailable (broken) or is reporting incorrect TTLs for some reason. This certainly happens in cloud environments and on prem, for sure, e.g. Route53 has intermittent outages sometimes, but they are exceedingly rare. So overall, I'm still not sure why you'd want this feature. Thanks! |
As you mention, without this feature, new connections to servers (provided TTLs has expired), will end up in the new server, but the whole point of the feature is not new connections but current ones. From PGbouncer doc:
And this is the whole point, by just changing the DNS record, you have a way of redirecting traffic to a new server (usually a replica) in a zero disruptive manner. Old connections will be freed as the transactions/sessions finishes and new transactions/sessions will use the new ip. Maybe the naming here is a bit misleading I use |
That makes way more sense to me now.
Very interesting approach. I've never used DNS changes to trigger a failover like that. The way I always did it in the past was by changing the config to point to new DNS records for the new primary/replica - this way, I control the exact timing of the failover and can immediately check it's effect, without worrying about DNS cache & propagation. Since config reloads are zero downtime, this was a good enough approach for me. I've mostly used Pgb in AWS with RDS, where we had no control over the DNS of the databases, so I think that's the main reason I never had to deal with this before. Please allow me a few days to think about your proposal. Since I've never used this PgBouncer feature and I need to consider it. At this moment, I'm thinking this is a good feature to have, but perhaps it should be better named. One issue I've always had with Pgb documentation and features is they were powerful yet sometimes had unforeseen interactions and consequences, for example: pgbouncer/pgbouncer#647 My goal with pgcat is to have the best way to run Postgres at scale, implemented in code. So while having many configuration options and features is often seen as sign of maturity and completeness of a project, they can lead to user confusion and production incidents, when used incorrectly. |
96cf5c5
to
dfe1d0e
Compare
a8b3ad8
to
7a2d5e4
Compare
5026387
to
99871c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just make sure CI passes before merging.
3e0fbbd
to
9a4eac1
Compare
… changes. Adds a module to deal with dns_cache feature. It's main struct is CachedResolver, which is a simple thread safe hostname <-> Ips cache with the ability to refresh resolutions every `dns_max_ttl` seconds. This way, a client can check whether its ip address has changed.
Hello!
We are considering using PgCat as an alternative to PgBouncer in our infrastructure. One thing that we noticed is that it lacks support for dns cache, a thing that we rely on.
In Pgbouncer (from its docu):
Unfortunately this is not the case in PgCat so we decided to implement it ourselves.
This change:
Server
struct that saves the value of the address resolution when the connection is first established.Server
to check whether the underlying address has changed and if so, returns true.TODO:
Reloading dns config is not (yet) possible.