Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: add watch API endpoints #61

Closed
wants to merge 3 commits into from

Conversation

aoldershaw
Copy link

@aoldershaw aoldershaw commented Jun 16, 2020

Rendered

Experimental PR: concourse/concourse#5802 - adds watching for the ListAllJobs endpoint (this was a pretty big change, but subsequent endpoints won't be as heavy-weight to implement).

Signed-off-by: Aidan Oldershaw [email protected]

@aoldershaw aoldershaw changed the title RFC: add a ?watch=true query parameter to API endpoints RFC: add a "watch" query parameter to API endpoints Jun 16, 2020
@jchesterpivotal
Copy link

I think there are two parts to this proposal that can be teased apart:

  1. Notifying-on-change (usually called "Change Data Capture" in database / ETL land). This is where triggers, notify/listen are useful.
  2. Controlling demand from clients.

I think they can be treated somewhat separately. In particular, controlling client demand requires us to provide backpressure. The ATC needs to signal to clients when it is ready to serve an endpoint, rather than relying on hardcoded client settings. That way it can, for example, select a subset of clients that it knows about to signal this to, signal less frequently when under heavy load etc.

@jchesterpivotal
Copy link

(And if you go down the backpressure route ... RSocket is nifty)

@aoldershaw
Copy link
Author

aoldershaw commented Jun 17, 2020

@jchesterpivotal thanks for the pointer on the term "Change Data Capture". I was able to find some pretty helpful information on the topic - notably, that it's typically more performant to do CDC by parsing the transaction log (seems to be pg_logical_slot_get_changes in Postgres) - though this requires setting "wal_level to logical and max_replication_slots to at least 1" (https://www.postgresql.org/docs/9.4/logicaldecoding-example.html). Not sure what other implications those settings have (though I would guess it would require a fair bit more disk).

The ATC needs to signal to clients when it is ready to serve an endpoint, rather than relying on hardcoded client settings

Could you reiterate what you mean here? Do you mean the client might ask the ATC to start watching events, and the ATC says "not yet, but try again in 15 seconds"? Or do you mean the client is already watching events, but the ATC is overloaded, so the ATC decides not to forward events right away? Or neither of those things?

@jchesterpivotal
Copy link

Approximately: that the ATC tells the client when to make another GET. RSocket makes this a little more explicit, in that it allows each end of the connection to signal "I can now receive 10 widgets" or "I am open for 3 widgets any time in the next 30 seconds".

Imagine that Concourse is a pipeline (heh) which manufactures responses to a client request. Right now, work is pushed through that system. A request arrives and is pushed forward through the ATC, to the database, then back through the ATC, back to the client. The problem is that each stage of the pipeline has no way to control how much work it receives and relies on a mix of overprovisioning, upstream load balancing and plain luck to deal with variance in demand. Worse still: a slowdown anywhere in the pipeline causes queues to form further upstream.

The idea of "reactive backpressure" is that the downstream explicitly the upstream that it has capacity to do work. So the database signals the ATC "I can serve 20 queries", the ATC only sends 20 queries. In turn it only serves 20 clients at once.

In manufacturing this is called "pull-based" flow. The best-known example is Kanban, where cards circulate between each pair of stations to govern demand between them. Another example is CONWIP, where this is a fixed supply of cards for the entire manufacturing line, circulating from the beginning to end and back around.

The reason for backpressure is that the easiest place to stabilise load is further upstream from where it's felt. What we're seeing in this case is that the furthest upstream demand is generated by web clients. We control the design of the client, so it would be possible to have it follow ATC instructions on when to send requests. RSocket has some nice ways of supporting that and can run over websockets, but we'd still be better off even with a simple long-polling / SSE method that says "Now!".

@aoldershaw
Copy link
Author

@jchesterpivotal Neat, thanks for taking the time to explain it! The reactive backpressure approach could definitely be a good solution to some of the problems around ListAllJobs that we've experienced in the past. In fact, we implemented concourse/concourse#5429 to prevent too many clients from hitting the endpoint at once, where the ATC would respond with a 503 Service Unavailable if the limit was exceeded, and the client would retry until it was accepted by the ATC. In hindsight, an approach like using RSocket would have probably been a better way to go.

I wonder if you think using RSocket has much merit in conjunction with the proposed ?watch parameter - rather than the ATC just telling the clients "you can ask for new data now", it'll be telling the clients "here's your new data" (which could be on the ATC's own terms). The intention of the proposed design is that there won't be any more burden on the DB whether there's 1 client or 100 clients listening for changes (albeit with an upfront "constant" cost for observing changes, and the cost of the initial query), and there probably won't be (too) much more burden on the ATC either (it's just a matter of checking permissions and forwarding events - though I may be underestimating how costly this will be).

aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 23, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is pretty
much as described in the linked RFC - it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then invoke a callback
with the updated job(s) for every subscriber that has access to them.

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 23, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is pretty
much as described in the linked RFC - it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then invoke a callback
with the updated job(s) for every subscriber that has access to them.

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 23, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 23, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 23, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 23, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 23, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 23, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 25, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 25, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 27, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Jun 27, 2020
@aoldershaw aoldershaw changed the title RFC: add a "watch" query parameter to API endpoints RFC: add watch API endpoints Jul 3, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Jul 3, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Jul 3, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Jul 9, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Jul 9, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 10, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 10, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 14, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 14, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 19, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 19, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 27, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Aug 27, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Sep 8, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Sep 8, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Sep 28, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Sep 28, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Oct 2, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Oct 2, 2020
aoldershaw added a commit to concourse/concourse that referenced this pull request Oct 16, 2020
concourse/rfcs#61

This commit enables monitoring the relevant database tables for watching
changes to the ListAllJobs endpoint. The `ListAllJobsWatcher` is roughly
as described in the linked RFC, in that it uses Postgres TRIGGERs +
NOTIFY/LISTEN to capture/observe changes to the relevant tables. Using
the primary keys from these tables, it then performs the ListAllJobs
query scoped to only the affected jobs. It will then send the updated
job(s) to each subscriber that has access to them

Currently, we aren't taking into account public pipelines for access
control - this'll be addressed in a later commit.

One thing to note is that the list of tables/columns being monitored for
changes is kind of arbitrary, and it's possible it won't capture every
change to jobs. I initially started by monitoring changes to any field
from any table that is referenced in the ListAllJobs queries. This would
work, but would also result in duplicate events. It seems like most
changes to jobs seem to flow through the jobs table at some point (often
via the `config`).

Signed-off-by: Aidan Oldershaw <[email protected]>
aoldershaw added a commit to concourse/concourse that referenced this pull request Oct 16, 2020
@clarafu clarafu removed their assignment May 16, 2022
@taylorsilva
Copy link
Member

This would still be a great optimization to have. I don't think it needed to be an RFC since it's iterating on existing features. Anyone is free to take the ideas in this RFC, open an issue in the main repo and start a PR implementing this stuff.

@taylorsilva taylorsilva closed this Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants