
feat(clustering) atomic exports of declarative configs with postgres #8586

Merged
1 commit merged into master from feat/pg-atomic-exports on Apr 21, 2022

Conversation

bungle
Member

@bungle bungle commented Mar 24, 2022

Summary

Adds support for atomic exports of declarative configs with Postgres.

Member

@dndx dndx left a comment


Hmm, I wonder if this is really safe. Since :each might yield, would this cause some potential inconsistencies with, say, concurrent Admin API calls?

For example, worker 1 is exporting, it begins the transaction and yields, worker 2 inserts a new Service, then user issues a request to insert a Route for this new Service and got forwarded to worker 1, this would fail because worker 1 is still inside a transaction and can not see the new Service yet.

Still, I trust this makes the situation better, but I wonder if there is something else we can also do to mitigate this potential race. Maybe use a dedicated DB connection just for db_export and not share it with the rest of the queries?

@bungle bungle force-pushed the feat/pg-atomic-exports branch from b0725a3 to 634633e Compare March 24, 2022 13:04
@bungle
Member Author

bungle commented Mar 25, 2022

@lukego conducted a performance test with this branch, and it seems to be working as intended.

@lukego
Contributor

lukego commented Mar 25, 2022

@lukego conducted a performance test with this branch, and it seems to be working as intended.

Note: I have only checked the error logs so far. This change made a big improvement there. I will now test for effects on memory, performance, request errors, and ...

@hbagdi
Member

hbagdi commented Mar 28, 2022

For example, worker 1 is exporting, it begins the transaction and yields, worker 2 inserts a new Service, then user issues a request to insert a Route for this new Service and got forwarded to worker 1, this would fail because worker 1 is still inside a transaction and can not see the new Service yet.

There are multiple TCP connections to the database and I don't think (I'm not 100% sure) there are concurrent queries on the same TCP connection. If a Service was created previously, a subsequent insert query for a Route will see the previously created Service; otherwise that would violate serialization.

The transaction that is doing the export on worker 1 will not see the Route; it may see the Service if that Service was created before the transaction began.

Historically, Kong has been considered eventually consistent. This change seems to challenge that in some regard (not all).
Another future item worth mentioning is that Kong will be supporting incremental syncs to the data plane. In that world, do we think we will be able to have something of this sort?

@dndx
Member

dndx commented Mar 30, 2022

@hbagdi I expressed the same concern to @bungle offline regarding mixing up the connection. Another potential issue is that after this change the CP memory usage showed what appears to be a huge increase, so we are holding off on this merge for now pending further investigation.

I personally think mixing the export connection, which requires the transaction, with those that don't is a slightly dangerous idea for my taste. I would like to see the export happen on its own dedicated connection instead.

@dndx dndx added the pr/discussion label (This PR is being debated. Probably just a few details.) Mar 30, 2022
@bungle
Member Author

bungle commented Apr 4, 2022

CP memory usage showed what appears to be a huge increase

I am pretty sure this has nothing to do with this change. If memory increases anywhere, it would be on the Postgres side.

@bungle
Member Author

bungle commented Apr 4, 2022

There are multiple TCP connections to the database and I don't think (I'm not 100% sure) there are concurrent queries on the same TCP connection

This is a bit opaque because of the design of :connect, e.g. whether a connection obtained through :connect may be shared within a single worker across multiple threads: when the exporting thread yields, can the stored connection be used by another thread? The design of :connect is a bit fuzzy in that it stores the connection in ngx.ctx, but that is also the key here. Because it uses ngx.ctx, the connection will not be shared with other threads, except threads running within the same ngx.ctx, and in this case there should be a single export process per ngx.ctx. Yes, it would be nice if :connect just returned a connection that could be passed to the DAOs, but our DAOs don't support that. Yes, we could use raw SQL to export and then run process_auto_fields and whatever transforms are needed, but that is way too much hassle for this fix.

Historically, Kong has been considered eventually consistent. This change seems to challenge that in some regard (not all). Another future item worth mentioning is that Kong will be supporting incremental syncs to the data plane. In that world, do we think we will be able to have something of this sort?

TL;DR: our goal has never been "eventual" when that just does not work, as is the case here (on big databases, huge exports that cannot always be imported, which can lead to a situation where things are not even fixed eventually).

This has nothing to do with eventual vs. non-eventual. This is about making exports that can be imported. Right now, in environments where there are simultaneous changes (e.g. Admin API calls) and exports (as is the case with hybrid exports, and to some extent with kong config db_export), there is a chance that we export data that can never be imported: the validation on import will fail because different tables are exported against different states of the database, which leads to foreign key violations and some other, smaller issues as well.

What this change does is:

  1. It creates a single connection for the whole export process (the same connection is used to export all the data from all the tables). Previously it may have made one query and then put the connection back into the pool, and with a paged :each it may have done that multiple times even for a single table. The change makes exporting much faster as it does not constantly return a connection to the pool and retrieve a new one.
  2. It wraps the reads in a "snapshot", aka repeatable read, which means that all data is exported from the same snapshot of the database.

And that's it.
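To make points 1 and 2 concrete, here is a minimal sketch of the idea (not the PR's actual code): the connector method names follow Kong's DB connector, while export_table() is a hypothetical stand-in for the existing per-entity paged :each() export.

```lua
-- Minimal sketch, assuming Kong's DB connector API (connect/query/setkeepalive);
-- export_table() is a hypothetical stand-in for the per-entity paged :each() export.
local connector = kong.db.connector

-- One connection for the whole export; :connect() keeps it in ngx.ctx so the
-- paged DAO queries below reuse it instead of cycling through the pool.
assert(connector:connect())

-- Every read from here on sees the same snapshot of the database.
assert(connector:query("BEGIN ISOLATION LEVEL REPEATABLE READ"))

for name, dao in pairs(kong.db.daos) do
  export_table(name, dao)  -- pages through the table with :each() on the same connection
end

assert(connector:query("COMMIT"))

-- Only now is the connection explicitly returned to the pool.
connector:setkeepalive()
```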

@bungle
Member Author

bungle commented Apr 4, 2022

I personally think mixing the export connection, which requires the transaction, with those that don't is a slightly dangerous idea for my taste. I would like to see the export happen on its own dedicated connection instead.

@dndx, it is not shared. We store it in ngx.ctx for the export context. It is not shared with other contexts. We can even :close the connection after use, if you believe something might be leaking when it is put back into the pool.

@bungle bungle force-pushed the feat/pg-atomic-exports branch from 634633e to 154a76e Compare April 4, 2022 08:27
@bungle
Member Author

bungle commented Apr 4, 2022

Historically, Kong has been considered eventually consistent.

@hbagdi, eventual consistency has just been a tool to avoid bigger issues. If there were no related issues (e.g. huge latencies in the proxy or other operations), nothing would have been designed to be eventual, and everything would share the same state all the time, in real time. But because that is not possible, or at least very hard, we have decided to do some tasks eventually.

Let's take the CP-DP config export scenario:

  1. CP1 receives an Admin API call and posts a cluster event and, I think, a worker event for itself.
  2. After (by default) 0-5 secs, CP2 receives the cluster event and posts worker events.
  3. After (by default) 0-1 secs, CP2 receives its worker events and, in the case of an export, creates another worker event.
  4. After (by default) 0-1 secs, CP2 receives that worker event and starts exporting the config for its connected DPs.
  5. CP2 spends some time exporting data from all database tables into a Lua table.
  6. The config is then compressed and sent to the connected DPs.
  7. Connected DPs load the data into a shared dict, or more recently into LMDB, and fire a worker event.
  8. Connected DPs rebuild the router and the plugins iterator, reinitialize the balancer, and clear the caches.

Now everything here can be seen as eventual. The only thing this PR touches is step 5, where it creates that Lua table using a single TCP connection to the Postgres database and runs the queries in snapshot isolation (it does not lock writes nor other reads). There is nothing eventual about this function; the only thing it produces is a Lua table, which had better be something the DPs can import. This is all handled by Postgres itself.

It is worth mentioning that it is not necessary to run this through a single connection. Other Postgres connections can also join the same snapshot and make queries at the same time (with the proper SQL statements executed to do so, including some Postgres-only functions). This way you could even parallelise the config export. It is a more involved design, though, as more queries need to be executed and we need a master connection that is closed when everything is done. But it is possible. I didn't want to go that route because I am not sure whether parallelising the export is a good idea, and until that is decided, it is better to just run it through a single connection, as it is basically just DAO queries wrapped with BEGIN and COMMIT (at a certain isolation level).
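For reference, the snapshot sharing mentioned above is a stock Postgres feature (pg_export_snapshot() plus SET TRANSACTION SNAPSHOT). A rough sketch of how a parallelised export could use it is below; "master" and run_on_other_connection() are hypothetical placeholders for connection plumbing Kong does not have today.

```lua
-- Rough sketch only. pg_export_snapshot() and SET TRANSACTION SNAPSHOT are
-- stock Postgres features; "master" and run_on_other_connection() are
-- hypothetical placeholders for connection plumbing.

-- Master connection: open the repeatable-read transaction and publish its snapshot id.
assert(master:query("BEGIN ISOLATION LEVEL REPEATABLE READ"))
local rows = assert(master:query("SELECT pg_export_snapshot() AS snap"))
local snapshot_id = rows[1].snap  -- a short server-generated token, e.g. '00000003-1'

-- Any other connection can attach to the very same snapshot...
run_on_other_connection(function(conn)
  assert(conn:query("BEGIN ISOLATION LEVEL REPEATABLE READ"))
  assert(conn:query("SET TRANSACTION SNAPSHOT '" .. snapshot_id .. "'"))
  -- ...and export a subset of the tables from an identical view of the data.
end)

-- The master transaction must stay open until every joining transaction has
-- run SET TRANSACTION SNAPSHOT, and it commits last.
assert(master:query("COMMIT"))
```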

@bungle
Member Author

bungle commented Apr 4, 2022

@hbagdi I expressed the same concern to @bungle offline regarding mixing up the connection. Another potential issue is that after this change the CP memory usage showed what appears to be a huge increase, so we are holding off on this merge for now pending further investigation.

I personally think mixing the export connection, which requires the transaction, with those that don't is a slightly dangerous idea for my taste. I would like to see the export happen on its own dedicated connection instead.

It is not mixed, because:

  1. This is only ever run in a couple of cases:
  • the CP exports the config from Postgres
  • kong config db_export is executed on traditional mode with Postgres
  2. In those cases we connect with :connect, which then:
  • stores the connection in ngx.ctx (and does not return it to the pool)
  • starts the transaction and executes queries in the same ngx.ctx (and as the connection is found in the ctx, it is not returned to the pool)
  • commits the transaction and returns the connection to the pool (explicitly)

ngx.ctx is used to share state (that connection) between the different things doing DB queries:

  • raw db.connector:query (to start and commit the transaction)
  • DAOs executing a paged :each on different tables

Admin API calls will not share the same ngx.ctx. And on hybrid mode it is always guaranteed that only a single export is in progress at a time on a single worker.
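For readers unfamiliar with the connector, a very rough sketch of that ngx.ctx sharing is below; the method names match Kong's connector, but the internals are heavily simplified and open_new_pg_connection() is a hypothetical helper.

```lua
-- Very rough sketch of the ngx.ctx sharing described above; Kong's real
-- connector has more branches and phase checks. open_new_pg_connection()
-- is a hypothetical helper (e.g. a fresh pgmoon connection).
function Connector:connect()
  local conn = ngx.ctx.connection
  if not conn then
    conn = open_new_pg_connection()
    ngx.ctx.connection = conn  -- cached for this request/export context only
  end
  return conn
end

function Connector:query(sql)
  -- Both the raw BEGIN/COMMIT queries and the paged DAO :each() calls end up
  -- here, so within one export they all ride the same cached connection.
  local conn = assert(self:connect())
  return conn:query(sql)
end

function Connector:setkeepalive()
  -- Only this explicit call detaches the connection from ngx.ctx and puts it
  -- back into the pool; the export does this once, after COMMIT.
  local conn = ngx.ctx.connection
  ngx.ctx.connection = nil
  if conn then
    return conn:keepalive()
  end
end
```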

@bungle
Member Author

bungle commented Apr 4, 2022

I was thinking about making this a fix, as that is what it is if you consider the most common error in @lukego's performance tests. But as this can also be seen as a behaviour change, I decided to go with the safer feat. This has been asked for by the community as well, but we have hesitated to implement it because it is difficult to do with Cassandra. But as Cassandra is moving to a deprecated state with 3.0, I think that is a signal that we can start optimising better for Postgres only.

@bungle
Member Author

bungle commented Apr 4, 2022

@hbagdi and @dndx,

While I don't disagree with what you are saying, I hope the above answers assure you that this is fine. The mentioned 40x memory increase in @lukego's test should be reproduced, as I can hardly believe this change causes any memory pressure on Kong itself. If anything it should reduce it, as it basically uses fewer calls, but it is always possible that there is a bug somewhere in pgmoon, or something that leaks memory, is not garbage collected properly, or causes spikes.

When I say I don't disagree, it is about the design of Kong's connector and its :connect and :setkeepalive functions, which seem somewhat magic with their ngx.ctx backing store in certain Nginx phases. The intent of this PR is not to refactor that design, and I'd say that would also be quite hard. I'd rather refactor the export process at some point, perhaps when we get rid of Cassandra. That's why I don't see it as relevant here to comment on the design of the connector and the DAOs and how Kong makes it possible to share a connection between them (arbitrary queries and DAO queries); that is just how it works with Kong. We may get to develop a simpler, alternative design later, but I think the better time for that is when Cassandra is removed.

Member

@dndx dndx left a comment


I took another look at the logic stated here and inside the connector parent class, and it should be fine. Since the connection is stored in ngx.ctx, and we do not call db.connector:setkeepalive() until the export finishes, there should be no chance for the export to be interrupted. Nice work @bungle!

Just left some minor comments, but otherwise LGTM.

Review comments on kong/db/declarative/init.lua (outdated, resolved)
@bungle bungle force-pushed the feat/pg-atomic-exports branch from 154a76e to 94aab9c Compare April 20, 2022 18:19
@bungle bungle requested a review from a team as a code owner April 20, 2022 18:19
@pull-request-size pull-request-size bot added size/M and removed size/S labels Apr 20, 2022
@dndx dndx merged commit 39dd728 into master Apr 21, 2022
@dndx dndx deleted the feat/pg-atomic-exports branch April 21, 2022 10:32
Labels: core/db, pr/discussion, size/M