Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align collection job with new aggregation job semantics #596

Merged
merged 1 commit into from
Oct 9, 2024

Conversation

inahga
Copy link
Contributor

@inahga inahga commented Sep 27, 2024

Closes #588.

This does the following:

  • Communicate job status through field in the body, while always returning 200 OK for polls.
  • Polling endpoint is communicated through Location header.
  • Shuffling around of paragraphs for readability.
  • Align structure names and media types to look like aggregation job ones.

This is a fair bit more invasive than I would have hoped, LMK if anything can be stripped out.

@inahga
Copy link
Contributor Author

inahga commented Sep 27, 2024

It raises the question of whether we should support both sync/async like with aggregation. (n.b. I preserved async-only in this PR).

The only technical rationale I have for it is that since most implementations are doing VDAFs without aggregation parameters, and thus can eagerly aggregate, collection jobs can be fulfilled remarkably quickly. We find that a lot of delay in processing collection jobs happens to just be with polling. If the collection job could be fulfilled instantly, that would be nice.

I don't feel strongly about this though, nor do I think collection jobs are a particularly expensive part of the system, so I'm content to leave it be.

The Leader then begins working with the Helper to aggregate the reports
satisfying the query (or continues this process, depending on the VDAF) as
described in {{aggregate-flow}}.

Changing a collection job's parameters is illegal, so further requests to
`PUT /tasks/{tasks}/collection_jobs/{collection-job-id}` for the same
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`PUT /tasks/{tasks}/collection_jobs/{collection-job-id}` for the same
`PUT /tasks/{task-id}/collection_jobs/{collection-job-id}` for the same

(not caused by this PR -- aligns with other uses of parameterized URL path fragments)

with HTTP status 201 Created with a body containing a `CollectionJobResp`. The
`status` field is set to `processing`, and the response contains a Location
header containing the relative reference
`/tasks/{task-id}/collection_jobs/{collection-job-id}`. The Leader SHOULD
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q. What's the justification for including a Location field, here or in the aggregation interaction? In both cases, the URI included in the Location header field is entirely determined by the specification, and the HTTP client has enough information to construct the URI themselves.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure we have parity with aggregation jobs, but in both cases I agree we probably don't need a Location header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale is (unfortunately) subtle.

https://www.rfc-editor.org/rfc/rfc9110#name-201-created

The 201 (Created) status code indicates that the request has been fulfilled and has resulted in one or more new resources being created. The primary resource created by the request is identified by either a Location header field in the response or, if no Location header field is received, by the target URI.

The 201 response content typically describes and links to the resource(s) created. Any validator fields (Section 8.8) sent in the response convey the current validators for a new representation created by the request. Note that the PUT method (Section 9.3.4) has additional requirements that might preclude sending such validators.

Consider aggregation: we PUT at {helper}/tasks/{task-id}/aggregation_jobs/{aggregation-job-id}, BUT we expect the Leader to poll the job at {helper}/tasks/{task-id}/aggregation_jobs/{aggregation-job-id}?step=0. That is, the resource we're polling is at a different URI than the one that was provided in the PUT.

However, this is not true for collection. So I think we can (and should) eschew the Location header for collection. I will edit.

Comment on lines 2157 to 2230
After receiving the response to its `CollectionJobReq`, the Collector makes an
HTTP GET request to the aforementioned Location to check on the status of the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
After receiving the response to its `CollectionJobReq`, the Collector makes an
HTTP GET request to the aforementioned Location to check on the status of the
After receiving the response to its `CollectionJobReq`, the Collector periodically makes
HTTP GET requests to the aforementioned Location to check on the status of the

I think we should highlight that this is a repeated polling operation -- as currently written, the start of this sentence reads as if only a single HTTP GET is sent.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the response might have had status "finished", in which case there's no need to poll?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the response might have had status "finished", in which case there's no need to poll?

I added a blurb to the paragraph Once both aggregate shares are successfully obtained. Let me know if this reads clearer.

@branlwyd
Copy link
Collaborator

It raises the question of whether we should support both sync/async like with aggregation. (n.b. I preserved async-only in this PR).

The only technical rationale I have for it is that since most implementations are doing VDAFs without aggregation parameters, and thus can eagerly aggregate, collection jobs can be fulfilled remarkably quickly. We find that a lot of delay in processing collection jobs happens to just be with polling. If the collection job could be fulfilled instantly, that would be nice.

I don't feel strongly about this though, nor do I think collection jobs are a particularly expensive part of the system, so I'm content to leave it be.

IMO, we shouldn't consider specifying sync collection unless someone wants to implement it. It would be easy enough to specify, but would lead to further implementation complexity, similarly to the complexity induced by supporting both sync & async aggregation.

Copy link
Collaborator

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we shouldn't consider specifying sync collection unless someone wants to implement it. It would be easy enough to specify, but would lead to further implementation complexity, similarly to the complexity induced by supporting both sync & async aggregation.

When we discussed this yesterday (2024/10/2), I was in agreement here, but after reading this PR, I actually think the symmetry with the aggregation job section does a lot to improve readability. I don't have a strong opinion either way, but if we think this won't add too much complexity to the specification, then I think we should keep it.

draft-ietf-ppm-dap.md Show resolved Hide resolved
with HTTP status 201 Created with a body containing a `CollectionJobResp`. The
`status` field is set to `processing`, and the response contains a Location
header containing the relative reference
`/tasks/{task-id}/collection_jobs/{collection-job-id}`. The Leader SHOULD
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure we have parity with aggregation jobs, but in both cases I agree we probably don't need a Location header.

draft-ietf-ppm-dap.md Outdated Show resolved Hide resolved
Comment on lines 2157 to 2230
After receiving the response to its `CollectionJobReq`, the Collector makes an
HTTP GET request to the aforementioned Location to check on the status of the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the response might have had status "finished", in which case there's no need to poll?

draft-ietf-ppm-dap.md Show resolved Hide resolved
@inahga inahga force-pushed the inahga/align-collection branch from dc5b067 to d2590ed Compare October 4, 2024 15:09
@branlwyd
Copy link
Collaborator

branlwyd commented Oct 4, 2024

When we discussed this yesterday (2024/10/2), I was in agreement here, but after reading this PR, I actually think the symmetry with the aggregation job section does a lot to improve readability. I don't have a strong opinion either way, but if we think this won't add too much complexity to the specification, then I think we should keep it.

I've been thinking about this some more.

I still think async-only is the right choice for collection, as (a) no one has expressed a desire to implement sync collection & (b) processing a collection job may require aggregating an arbitrary number of reports, with the number of reports to aggregate at least partially out of control of the aggregators -- so sync collection is unlikely to work well in many cases.

If we want to resolve the difference between the aggregation interaction with the collection interaction, I think we should try to attain enough implementation experience with async aggregation to show that it is at least as good as sync aggregation for all use cases, then remove sync aggregation. This would simplify the protocol as well as resolve the disparity between aggregation and collection.

That said, I don't feel strongly enough about this to insist on it. If this argument is not convincing, we can specify both -- there is nothing stopping us from dropping sync aggregation/collection later, if implementation experience shows this is possible.

draft-ietf-ppm-dap.md Show resolved Hide resolved
draft-ietf-ppm-dap.md Outdated Show resolved Hide resolved
draft-ietf-ppm-dap.md Outdated Show resolved Hide resolved
draft-ietf-ppm-dap.md Outdated Show resolved Hide resolved
draft-ietf-ppm-dap.md Outdated Show resolved Hide resolved
@inahga inahga force-pushed the inahga/align-collection branch 2 times, most recently from 514e6be to 363df15 Compare October 9, 2024 15:14
This does the following:
- Communicate job status through field in the body, while always returning 200
  OK for polls.
- Polling endpoint is communicated through Location header.
- Shuffling around of paragraphs for readability.
- Align structure names and media types to look like aggregation job ones.
@inahga inahga force-pushed the inahga/align-collection branch from d2619c5 to 69e63b9 Compare October 9, 2024 16:24
@branlwyd branlwyd merged commit d41cbde into ietf-wg-ppm:main Oct 9, 2024
1 check passed
@branlwyd branlwyd mentioned this pull request Oct 11, 2024
20 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Align collection job polling semantics with aggregation job changes from #564
3 participants