
Explicit backoff logic for aggregation jobs #557

Closed
cjpatton opened this issue May 1, 2024 · 17 comments · Fixed by #564

Comments

@cjpatton
Collaborator

cjpatton commented May 1, 2024

As mentioned in #556, replay protection implies a relatively stringent operational requirement for the Helper: many aggregation jobs might make concurrent transactions on the same database, which can easily overwhelm the database if not sufficiently provisioned.

To mitigate this problem, the Helper can cancel an aggregation job and ask the Leader to try again later. One option is to respond with HTTP status 429 and a "Retry-After" header indicating when it should be safe to retry.

I'd like to suggest that we spell this out explicitly in the draft.
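For illustration, here is a minimal sketch of how a Leader might honor such a response. This is not from the draft: the function name, retry policy, and fallback delay are all assumptions, and it assumes Retry-After carries a delay in seconds rather than an HTTP date.

```python
import random
import time
import urllib.error
import urllib.request

def put_aggregation_job(url: str, body: bytes, max_attempts: int = 5) -> bytes:
    """Hypothetical Leader-side request with 429/Retry-After handling."""
    for attempt in range(max_attempts):
        req = urllib.request.Request(url, data=body, method="PUT")
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429:
                raise
            # Honor the Helper's hint if present; otherwise fall back to
            # jittered exponential backoff (an assumption, not spec behavior).
            retry_after = err.headers.get("Retry-After")
            time.sleep(float(retry_after) if retry_after else 2**attempt + random.random())
    raise RuntimeError("Helper kept responding 429; giving up")
```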

@branlwyd
Collaborator

branlwyd commented May 1, 2024

One thing we touched on today is: what is the "unit" of backoff, i.e. what is the domain to which the Retry-After header applies?

Plausible ideas include (see the sketch after this list):

  • Only that specific aggregation job
  • Aggregation jobs sharing a batch with the backoff'ed aggregation job (this might be tricky for time-interval, where an aggregation job can touch an arbitrary number of different batches)
  • All aggregation jobs for the same task as the backoff'ed aggregation job
  • All aggregation jobs
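To make the trade-off concrete, here is a hypothetical sketch of the key(s) under which a Leader might record a Retry-After deadline, one variant per option above. Everything here is illustrative, not from the draft; note the batch-level option needs a list of keys because a time-interval job can touch many batches.

```python
from enum import Enum

class BackoffDomain(Enum):
    AGGREGATION_JOB = "aggregation_job"
    BATCH = "batch"
    TASK = "task"
    GLOBAL = "global"

def backoff_keys(domain: BackoffDomain, task_id: str, job_id: str,
                 batch_ids: list[str]) -> list[str]:
    """Keys the Leader would mark as 'backed off until <deadline>'."""
    if domain is BackoffDomain.AGGREGATION_JOB:
        return [f"{task_id}/jobs/{job_id}"]
    if domain is BackoffDomain.BATCH:
        # Time-interval jobs may touch many batches, hence a list of keys.
        return [f"{task_id}/batches/{b}" for b in batch_ids]
    if domain is BackoffDomain.TASK:
        return [task_id]
    return ["*"]  # all aggregation jobs, regardless of task
```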

@cjpatton
Collaborator Author

cjpatton commented May 1, 2024

For Daphne, it's for jobs sharing a batch "bucket" (sketched below):

  1. For time-interval tasks, each bucket is a time window
  2. For fixed-size queries, there is just one bucket, which is identified by the batch ID.
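A hypothetical rendering of that bucketing rule. The window arithmetic assumes DAP's usual truncation by the task's time_precision; treat this as an illustration of the comment, not Daphne's actual code.

```python
def backoff_bucket(query_type: str, report_time: int, time_precision: int,
                   batch_id: bytes | None):
    """Bucket over which a Helper's 'retry later' signal would apply."""
    if query_type == "time_interval":
        # One bucket per time window: truncate the report timestamp.
        return ("window", report_time - (report_time % time_precision))
    # Fixed-size: a single bucket, identified by the batch ID.
    return ("batch", batch_id)
```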

@cjpatton
Collaborator Author

cjpatton commented Jun 4, 2024

Just wanted to quickly document here an alternative mentioned by @divergentdave on a call: allow aggregation jobs to be asynchronous. In response to an aggregation job initialization request, the helper is allowed to respond with 201 Created without yet producing a response body. The leader would then poll the helper to see if the response is ready, similar to how the collector polls the leader on collection jobs.

This gets a little complicated for multi-round VDAFs, but I think we can figure this out.

cc/ @Noah-Kennedy
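A minimal sketch of the Leader-side polling loop this would imply, assuming the GET returns 200 OK with the AggregationJobResp once ready and some other success status while the Helper is still working; the interval and timeout are assumptions.

```python
import time
import urllib.request

def poll_aggregation_job(url: str, interval: float = 1.0,
                         timeout: float = 300.0) -> bytes:
    """Poll the aggregation job resource until the Helper's response is ready."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(url) as resp:  # GET
            if resp.status == 200:
                return resp.read()  # the serialized AggregationJobResp
        time.sleep(interval)
    raise TimeoutError("Helper never finished preparing the aggregation job")
```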

@Noah-Kennedy

> Just wanted to quickly document here an alternative mentioned by @divergentdave on a call: allow aggregation jobs to be asynchronous. In response to an aggregation job initialization request, the helper is allowed to respond with 201 Created without yet producing a response body. The leader would then poll the helper to see if the response is ready, similar to how the collector polls the leader on collection jobs.
>
> This gets a little complicated for multi-round VDAFs, but I think we can figure this out.
>
> cc/ @Noah-Kennedy

I quite like this approach, as it gives a lot more flexibility to implementations than the current status quo.

This makes scaling up the protocol a lot easier.

@tgeoghegan
Collaborator

This shouldn't take too much protocol text to achieve: we'd need to clarify that it's OK for the helper to respond to PUT /tasks/{task_id}/aggregation_jobs/{aggregation_job_id} with 201 Created, and then spell out that the leader should poll GET /tasks/{task_id}/aggregation_jobs/{aggregation_job_id} until they get 200 OK and an AggregationJobResp (basically the same semantics as for collection jobs).

@branlwyd
Collaborator

branlwyd commented Jun 6, 2024

Tim's comment raises a good question: if we take this, should we allow both asynchronous & synchronous aggregation job behavior, or only asynchronous?

IMO, it would be nice if we could specify only one of asynchronous/synchronous behavior: requiring implementations to support both modes would be added complexity; specifying both modes but allowing implementations to only implement one mode could lead to interoperability problems between aggregator implementations. But maybe that is too hopeful?

(In general, I think asynchronous aggregation jobs are a good idea since they decouple expensive computation from synchronous HTTP requests. But they do increase the communication cost of each aggregation job -- every aggregation job will now require at least two network round trips, or more precisely two round trips per aggregation step required by the VDAF. I think this means that implementations would want to tune for fewer, larger aggregation jobs to amortize this cost per-report.)
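To make the amortization concrete with illustrative numbers: a 2-step VDAF would cost four round trips per aggregation job under this model, so a 100-report job pays one round trip per 25 reports, while a 10,000-report job pays one per 2,500.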

@branlwyd
Collaborator

branlwyd commented Jun 7, 2024

I do think asynchronous aggregation jobs lead to increased complexity in the "aggregation job state machine".

With synchronous aggregation jobs, the state machine goes from step 1 -> step 2 ... until the aggregation job is complete.

With asynchronous aggregation jobs, an additional "computing" step will be added to each step above, so that the state machine goes computing step 1 -> step 1 -> computing step 2 -> step 2 -> ...

This is not a fatal flaw, just something to consider when weighing synchronous vs asynchronous behaviors.
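A hypothetical sketch of that doubled state machine, with steps numbered from 1 and the terminal state omitted:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AggJobState:
    step: int
    computing: bool  # True while the Helper is still preparing this step

def advance(state: AggJobState) -> AggJobState:
    """computing step N -> step N -> computing step N+1 -> step N+1 -> ..."""
    if state.computing:
        return AggJobState(state.step, computing=False)
    return AggJobState(state.step + 1, computing=True)
```

Under the synchronous model only the `computing=False` states exist, which is the extra complexity being weighed here.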

@erks

erks commented Jun 11, 2024

We can make both asynchronous & synchronous aggregations co-exist by doing something like this (see the sketch after this list):

  1. Each round/call to the Helper starts with the regular PUT (for init) / POST (for continue/collect) call.
  2. If the Helper responds with the corresponding 201 (for init) / 200 (for continue/collect), then the call is done and the Leader can move on to the next step.
  3. If the Helper responds with 202 Accepted, the Leader starts to poll with a GET version of the call until receiving 200/201.
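A sketch of the Leader side of that hybrid, assuming the polling GET reuses the same resource URL. The status-code handling follows the list above; everything else (names, the one-second poll interval, the unbounded loop) is illustrative, and a real implementation would bound the polling.

```python
import time
import urllib.request

def aggregation_step(url: str, body: bytes, method: str) -> bytes:
    """One init (PUT) or continue (POST) call under the hybrid sync/async flow."""
    req = urllib.request.Request(url, data=body, method=method)
    with urllib.request.urlopen(req) as resp:
        if resp.status in (200, 201):  # synchronous Helper: answer is inline
            return resp.read()
        assert resp.status == 202      # asynchronous Helper: poll for it
    while True:
        with urllib.request.urlopen(url) as resp:  # GET
            if resp.status in (200, 201):
                return resp.read()
        time.sleep(1.0)
```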

@cjpatton
Collaborator Author

That sounds to me like it would work just fine. It's up to the Helper whether they want to implement both, or one or the other.

@Noah-Kennedy

> Tim's comment raises a good question: if we take this, should we allow both asynchronous & synchronous aggregation job behavior, or only asynchronous?
>
> IMO, it would be nice if we could specify only one of asynchronous/synchronous behavior: requiring implementations to support both modes would be added complexity; specifying both modes but allowing implementations to only implement one mode could lead to interoperability problems between aggregator implementations. But maybe that is too hopeful?
>
> (In general, I think asynchronous aggregation jobs are a good idea since they decouple expensive computation from synchronous HTTP requests. But they do increase the communication cost of each aggregation job -- every aggregation job will now require at least two network round trips, or more precisely two round trips per aggregation step required by the VDAF. I think this means that implementations would want to tune for fewer, larger aggregation jobs to amortize this cost per-report.)

As far as I am concerned, the additional compute is pretty negligible compared with the additional scalability we would be getting.

@cjpatton
Collaborator Author

Per the 2024/6/12 DAP sync: @erks, would you be alright with making each step asynchronous and not allowing the synchronous option? We just had a call on this, and we expect that the Helper will make this asynchronous most of the time anyway.

@erks

erks commented Jun 13, 2024

I think, from the spec perspective, it's probably okay to move towards the asynchronous option exclusively. I'm just worried about the migration of the existing implementations to the new async model, as there could be periods where Leader and Helper have to support both at the same time.

@cjpatton
Collaborator Author

> I think, from the spec perspective, it's probably okay to move towards the asynchronous option exclusively. I'm just worried about the migration of the existing implementations to the new async model, as there could be periods where Leader and Helper have to support both at the same time.

Supporting multiple drafts at once adds complexity, but it's complexity that many have learned to manage. (I believe at one point, Cloudflare had deployed up to three drafts of TLS 1.3 (RFC8446) simultaneously.) We already have versioning infrastructure in daphne for this very reason.

inahga added a commit to inahga/draft-ietf-ppm-dap that referenced this issue Jun 25, 2024
Closes ietf-wg-ppm#557.

Allows helper-side aggregate job preparation to be handled asynchronously,
similarly to how collection jobs work. Briefly, after calls with
`AggregationJobInitReq` and `AggregationJobContinueReq`, the leader polls
the helper at `GET /tasks/{task-id}/aggregation_jobs/{aggregation-job-id}`
until receiving an `AggregationJobResp`.

In discussion in ietf-wg-ppm#557 we had soft consensus on making this the only option,
i.e. for spec simplicity we don't need to also allow the old synchronous
behavior.
inahga added a commit to inahga/draft-ietf-ppm-dap that referenced this issue Jul 24, 2024
inahga added a commit to inahga/draft-ietf-ppm-dap that referenced this issue Jul 24, 2024
@cjpatton
Collaborator Author

cjpatton commented Jul 26, 2024

IETF 120: There was no objection to merging #564, but @bemasc and @martinthomson both expressed interest in revisiting the design space. On the other hand, @Noah-Kennedy and myself expressed some urgency in making some change here soon: right now we're having to do way too much work (storage consistency and computation) on the hot path of HTTP requests.

We have considered other alternatives, including @martinthomson's suggestion of having the Helper tell the Leader later on which reports are replayed and thus need to be reviewed. As @branlwyd and others observed previously, this creates the requirement to store individual report shares until the batch is collected. We would rather not impose this requirement.

To @bemasc's point at the mic: if there is a different HTTP API pattern we should be following to resolve this, please let us know what it is. The folks working on this probably have far less expertise than you do on building applications on HTTP.

I think we should give this issue a little more time to see if someone has a better idea. That said, we can't wait forever to fix this problem, so I think we should lean on the deployment experience of folks who have implemented DAP. If there is no discussion on this topic in the next couple of weeks, I'm going to merge #564 as-is.

Thank you again @inahga for wrangling the PR!

@martinthomson
Contributor

My suggestion was not limited to a batch. The leader would send reports to the helper and store them until the helper gives an OK or a REJECT. In most cases, that batch could be bounded in time or number such that the total work on hand is manageable.

However, I think that our current exploration of consistent hashing and other techniques is more likely to produce a good outcome here.

I would like to avoid the need for asynchronous processing. It creates a burden for the leader that is almost exactly the same as what you were concerned about; it just hides it. So, if we can get the processing cost down -- in particular, to remove any reliance on consistent global state -- I would like to see the asynchronous stuff removed. It adds complication where the protocol should be simplified.

@cjpatton
Collaborator Author

cjpatton commented Jul 31, 2024

> However, I think that our current exploration of consistent hashing and other techniques is more likely to produce a good outcome here.

Consistent hashing is already used in Daphne to shard replay protection across multiple actors. This is the main tool we have for reducing the latency of resolving storage transactions across aggregation jobs.
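For readers unfamiliar with the technique, here is a generic consistent-hash-ring sketch of that sharding idea. This is an illustration only, not Daphne's implementation; the class name, virtual-node count, and choice of hash are arbitrary.

```python
import hashlib
from bisect import bisect_right

class ReplayShards:
    """Route each report ID to the single shard that owns its replay check."""

    def __init__(self, shards: list[str], vnodes: int = 64):
        self._ring = sorted(
            (self._point(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]

    @staticmethod
    def _point(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def shard_for(self, report_id: bytes) -> str:
        # Two aggregation jobs carrying the same report contend only on this
        # one shard, rather than on a single global replay table.
        idx = bisect_right(self._points, self._point(report_id.hex()))
        return self._ring[idx % len(self._ring)][1]
```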

> I would like to avoid the need for asynchronous processing. It creates a burden for the leader that is almost exactly the same as what you were concerned about; it just hides it. So, if we can get the processing cost down -- in particular, to remove any reliance on consistent global state -- I would like to see the asynchronous stuff removed. It adds complication where the protocol should be simplified.

What consistent global state are you thinking of here?

By all means, let's explore ways to mitigate costs, but we need to start turning our intuition about what might be possible into concrete proposals on paper.

@bemasc's proposal on the list is a good first step, but there are some concerns it doesn't fully address. I'll reply with details on the list in a moment, but at a high level, there is a certain amount of computation both aggregators have to do ($O(m \cdot n \cdot \log n)$, where $m$ is the batch size and $n$ is the size of a measurement); and unlike the leader, the helper has to carry out that computation on the hot path of HTTP requests.
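For scale (illustrative numbers only): a batch of $m = 10^5$ reports with measurements of size $n = 10^3$ puts $m \cdot n \cdot \log n$ on the order of $10^9$ field operations.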

Given the operational requirements currently in the draft, our expectation is that whatever burden exists for coordinating the computation is meant to be borne by the leader, not the helper. So if there is an irreducible cost that some aggregator has to pay, it should be paid by the leader.

@cjpatton
Collaborator Author

cjpatton commented Sep 19, 2024

inahga added a commit to inahga/draft-ietf-ppm-dap that referenced this issue Sep 23, 2024
Closes ietf-wg-ppm#557.

Allows helper-side aggregate job preparation to optionally be handled
asynchronously, similarly to how collection jobs work. Briefly, after
calls with `AggregationJobInitReq` and `AggregationJobContinueReq`, the
leader polls the helper at
`GET /tasks/{task-id}/aggregation_jobs/{aggregation-job-id}?step={step}`
until receiving an `AggregationJobResp`.
inahga added a commit to inahga/draft-ietf-ppm-dap that referenced this issue Sep 23, 2024
inahga added a commit to inahga/draft-ietf-ppm-dap that referenced this issue Sep 25, 2024