RFC: add watch API endpoints #61
Signed-off-by: Aidan Oldershaw <[email protected]>
`?watch=true` query parameter to API endpoints
I think there are two parts to this proposal that can be teased apart:
I think they can be treated somewhat separately. In particular, controlling client demand requires us to provide backpressure. The ATC needs to signal to clients when it is ready to serve an endpoint, rather than relying on hardcoded client settings. That way it can, for example, select a subset of the clients it knows about to signal, signal less frequently when under heavy load, etc.
(And if you go down the backpressure route ... RSocket is nifty)
@jchesterpivotal thanks for the pointer on the term "Change Data Capture". I was able to find some pretty helpful information on the topic - notably, that it's typically more performant to do CDC by parsing the transaction log (seems to be
Could you reiterate what you mean here? Do you mean the client might ask the ATC to start watching events, and the ATC says "not yet, but try again in 15 seconds"? Or do you mean the client is already watching events, but the ATC is overloaded, so the ATC decides not to forward events right away? Or neither of those things?
Approximately: that the ATC tells the client when to make another request.

Imagine that Concourse is a pipeline (heh) which manufactures responses to a client request. Right now, work is pushed through that system. A request arrives and is pushed forward through the ATC, to the database, then back through the ATC, back to the client. The problem is that each stage of the pipeline has no way to control how much work it receives and relies on a mix of overprovisioning, upstream load balancing and plain luck to deal with variance in demand. Worse still: a slowdown anywhere in the pipeline causes queues to form further upstream.

The idea of "reactive backpressure" is that the downstream explicitly signals the upstream that it has capacity to do work. So the database signals the ATC "I can serve 20 queries", and the ATC only sends 20 queries. In turn it only serves 20 clients at once.

In manufacturing this is called "pull-based" flow. The best-known example is Kanban, where cards circulate between each pair of stations to govern demand between them. Another example is CONWIP, where there is a fixed supply of cards for the entire manufacturing line, circulating from beginning to end and back around.

The reason for backpressure is that the easiest place to stabilise load is further upstream from where it's felt. What we're seeing in this case is that the furthest upstream demand is generated by web clients. We control the design of the client, so it would be possible to have it follow ATC instructions on when to send requests. RSocket has some nice ways of supporting that and can run over websockets, but we'd still be better off even with a simple long-polling / SSE method that says "Now!".
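To make the "I can serve 20 queries" idea concrete, here is a minimal sketch of a credit-based gate in Go. It is not taken from the ATC: `CreditGate`, `Admit`, and the numbers used are hypothetical names and values chosen for illustration. The downstream hands out a fixed number of credits, and the upstream only starts work when it holds one.

```go
package main

import "fmt"

// CreditGate is a hypothetical helper, not part of Concourse. The downstream
// stage owns a fixed pool of credits; the upstream may only start a unit of
// work while it holds one of them.
type CreditGate struct {
	credits chan struct{}
}

// NewCreditGate creates a gate with n credits available.
func NewCreditGate(n int) *CreditGate {
	g := &CreditGate{credits: make(chan struct{}, n)}
	for i := 0; i < n; i++ {
		g.credits <- struct{}{}
	}
	return g
}

// TryAcquire takes a credit without blocking. A false result is the
// downstream saying "not now, try again later".
func (g *CreditGate) TryAcquire() bool {
	select {
	case <-g.credits:
		return true
	default:
		return false
	}
}

// Release hands a credit back once the work is done. Extra releases beyond
// the gate's capacity are ignored rather than blocking.
func (g *CreditGate) Release() {
	select {
	case g.credits <- struct{}{}:
	default:
	}
}

// Admit tries to start n requests and reports how many received a credit
// and how many were deferred.
func Admit(g *CreditGate, n int) (accepted, deferred int) {
	for i := 0; i < n; i++ {
		if g.TryAcquire() {
			accepted++
		} else {
			deferred++
		}
	}
	return accepted, deferred
}

func main() {
	gate := NewCreditGate(20) // "I can serve 20 queries"
	accepted, deferred := Admit(gate, 25)
	fmt.Println(accepted, deferred) // prints "20 5"

	gate.Release()                 // one query finished
	fmt.Println(gate.TryAcquire()) // prints "true"
}
```

In a Kanban-style arrangement there would be one such gate between each pair of stages (client and ATC, ATC and database); a CONWIP-style arrangement would share a single gate across the whole path.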
@jchesterpivotal Neat, thanks for taking the time to explain it! The reactive backpressure approach could definitely be a good solution to some of the problems around I wonder if you think using RSocket has much merit in conjunction with the proposed
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching changes to the ListAllJobs endpoint.

The `ListAllJobsWatcher` is pretty much as described in the linked RFC: it uses Postgres TRIGGERs + NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using the primary keys from these tables, it then performs the ListAllJobs query scoped to only the affected jobs. It will then invoke a callback with the updated job(s) for every subscriber that has access to them. Currently, we aren't taking into account public pipelines for access control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for changes is kind of arbitrary, and it's possible it won't capture every change to jobs. I initially started by monitoring changes to any field from any table that is referenced in the ListAllJobs queries. This would work, but would also result in duplicate events. It seems like most changes to jobs flow through the jobs table at some point (often via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
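The duplicate-event concern can be illustrated with a small sketch. It assumes each notification carries the ID of the job it affects; the `Notification` type, the `AffectedJobs` function, and the table names are hypothetical and not taken from the commit. When several monitored tables fire for the same job, the IDs are collapsed before the scoped query is run.

```go
package main

import "fmt"

// Notification is a hypothetical shape for one trigger event: the table that
// changed and the ID of the job the change affects.
type Notification struct {
	Table string
	JobID int
}

// AffectedJobs collapses a batch of notifications into the distinct job IDs,
// in first-seen order, so each job is re-queried only once.
func AffectedJobs(batch []Notification) []int {
	seen := make(map[int]bool, len(batch))
	ids := make([]int, 0, len(batch))
	for _, n := range batch {
		if seen[n.JobID] {
			continue
		}
		seen[n.JobID] = true
		ids = append(ids, n.JobID)
	}
	return ids
}

func main() {
	// Table names here are illustrative only.
	batch := []Notification{
		{Table: "jobs", JobID: 7},
		{Table: "builds", JobID: 7},
		{Table: "jobs", JobID: 9},
	}
	fmt.Println(AffectedJobs(batch)) // prints "[7 9]"
}
```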
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching changes to the ListAllJobs endpoint.

The `ListAllJobsWatcher` is roughly as described in the linked RFC, in that it uses Postgres TRIGGERs + NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using the primary keys from these tables, it then performs the ListAllJobs query scoped to only the affected jobs. It will then send the updated job(s) to each subscriber that has access to them. Currently, we aren't taking into account public pipelines for access control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for changes is kind of arbitrary, and it's possible it won't capture every change to jobs. I initially started by monitoring changes to any field from any table that is referenced in the ListAllJobs queries. This would work, but would also result in duplicate events. It seems like most changes to jobs flow through the jobs table at some point (often via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
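As a rough sketch of the "send the updated job(s) to each subscriber that has access to them" step: the code below assumes access is decided by team membership and that each subscriber has a buffered channel. `Job`, `Subscriber`, and `Dispatch` are hypothetical names, and public pipelines are ignored here, as they are in the commit.

```go
package main

import "fmt"

// Job is a hypothetical, trimmed-down job record.
type Job struct {
	ID   int
	Team string
}

// Subscriber is a hypothetical watcher: the teams it may see and a channel
// on which updated jobs are delivered.
type Subscriber struct {
	Teams  map[string]bool
	Events chan []Job
}

// Dispatch sends each subscriber only the updated jobs it has access to.
// Subscribers with nothing visible receive nothing.
func Dispatch(subs []*Subscriber, updated []Job) {
	for _, s := range subs {
		var visible []Job
		for _, j := range updated {
			if s.Teams[j.Team] {
				visible = append(visible, j)
			}
		}
		if len(visible) == 0 {
			continue
		}
		select {
		case s.Events <- visible:
		default:
			// The subscriber is not keeping up. This sketch drops the
			// event; a real watcher would need a deliberate policy here.
		}
	}
}

func main() {
	a := &Subscriber{Teams: map[string]bool{"main": true}, Events: make(chan []Job, 1)}
	b := &Subscriber{Teams: map[string]bool{"other": true}, Events: make(chan []Job, 1)}

	Dispatch([]*Subscriber{a, b}, []Job{{ID: 1, Team: "main"}, {ID: 2, Team: "main"}})

	fmt.Println(len(<-a.Events)) // prints "2"
	fmt.Println(len(b.Events))   // prints "0"
}
```

The non-blocking send keeps one slow subscriber from stalling delivery to the others, at the cost of dropped events, which ties back to the backpressure discussion earlier in the thread.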
This would still be a great optimization to have. I don't think it needed to be an RFC, since it's iterating on existing features. Anyone is free to take the ideas in this RFC, open an issue in the main repo, and start a PR implementing this stuff.
Rendered
Experimental PR: concourse/concourse#5802 - adds watching for the `ListAllJobs` endpoint (this was a pretty big change, but subsequent endpoints won't be as heavy-weight to implement).

Signed-off-by: Aidan Oldershaw <[email protected]>