Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support VDAFs with an aggregation parameter that isn't () #225

Open
tgeoghegan opened this issue Jun 9, 2022 · 5 comments
Open

Support VDAFs with an aggregation parameter that isn't () #225

tgeoghegan opened this issue Jun 9, 2022 · 5 comments
Assignees

Comments

@tgeoghegan
Copy link
Contributor

Once we have a Poplar1 specification (cfrg/draft-irtf-cfrg-vdaf#84) and implementation (divviup/libprio-rs#141), we need janus to support it. Of course, a DAP implementation should be generic over VDAFs, so we shouldn't need any Poplar1 specific code, but what we will need to do is teach Janus to deal with VDAFs that actually have an aggregation parameter, where aggregation jobs cannot be created or driven until the collector hits [leader]/collect.

@tgeoghegan tgeoghegan added the milestone umbrella issues for roadmap items label Jun 9, 2022
@cjpatton
Copy link
Contributor

cjpatton commented Jun 9, 2022

You could probably already do Poplar with ToyIdpf. Should be enough to exercise this, though you'd have to keep the input length to a few bits.

@branlwyd
Copy link
Contributor

As part of this, we will probably want to change the collect acquire operation since it races with the creation of aggregation jobs--it should also check that there are no reports not yet included in an aggregation job over the relevant time period. This is included in this issue since current VDAFs (which use () as an aggregation parameter) support incremental aggregation, and thus are much less affected.

@tgeoghegan
Copy link
Contributor Author

#1508 deleted some test-only code that supported the non-trivial-aggregation-parameter case, which we might want to look at when we get around to reintroducing that support,

@tgeoghegan tgeoghegan removed the milestone umbrella issues for roadmap items label Jan 30, 2024
@tgeoghegan tgeoghegan self-assigned this Feb 14, 2024
tgeoghegan added a commit that referenced this issue Feb 16, 2024
...and remove the dummy VDAF we had in Janus itself. This is a step
toward implementing support for VDAFs that use an aggregation parameter
(#225). We could use Poplar1, of course, but what's _really_ interesting
is Mastic, which besides using an aggregation parameter, is a
single-round VDAF. I want to be able to contrast the implementation
complexity of VDAFs with an aggregation parameter and ones that also
take multiple rounds, and the dummy VDAF is most expedient way to wire
up the former.
tgeoghegan added a commit that referenced this issue Feb 16, 2024
...and remove the dummy VDAF we had in Janus itself. This is a step
toward implementing support for VDAFs that use an aggregation parameter
(#225). We could use Poplar1, of course, but what's _really_ interesting
is Mastic, which besides using an aggregation parameter, is a
single-round VDAF. I want to be able to contrast the implementation
complexity of VDAFs with an aggregation parameter and ones that also
take multiple rounds, and the dummy VDAF is most expedient way to wire
up the former.
tgeoghegan added a commit that referenced this issue Feb 23, 2024
...and remove the dummy VDAF we had in Janus itself. This is a step
toward implementing support for VDAFs that use an aggregation parameter
(#225). We could use Poplar1, of course, but what's _really_ interesting
is Mastic, which besides using an aggregation parameter, is a
single-round VDAF. I want to be able to contrast the implementation
complexity of VDAFs with an aggregation parameter and ones that also
take multiple rounds, and the dummy VDAF is most expedient way to wire
up the former.
tgeoghegan added a commit that referenced this issue Feb 23, 2024
...and remove the dummy VDAF we had in Janus itself. This is a step
toward implementing support for VDAFs that use an aggregation parameter
(#225). We could use Poplar1, of course, but what's _really_ interesting
is Mastic, which besides using an aggregation parameter, is a
single-round VDAF. I want to be able to contrast the implementation
complexity of VDAFs with an aggregation parameter and ones that also
take multiple rounds, and the dummy VDAF is most expedient way to wire
up the former.

Also adopts `prio` 0.16.1, to pick up some bug fixes to the dummy VDAF.
tgeoghegan added a commit that referenced this issue Mar 12, 2024
Enable Janus to run the dummy VDAF from crate `prio` with arbitrary
rounds (by extending the representation of `VdafInstance::Fake`) and
with multiple aggregation parameters chosen by the test.

To that end, this resurrects some functionality previously deleted in
PR #1508. At the time, we had that code gated on `#[cfg(test)]`, with
some cautions about its likely performance problems. I'm bringing it
back in the release configuration of Janus. I don't think we need a
feature flag for this: the various `VdafInstance::Fake` variants that
would use these codepaths are not currently compiled into release Janus
builds. Even if they were, deployments can avoid this risky code by
simply not using those VDAFs.

This wires up some tests for both time interval and fixed size tasks,
running multiple collections against a single batch with varying
aggregation parameters. However, a few tests are disabled because of
stuff that doesn't work yet:

- creating fixed size aggregation jobs w/ agg param
- using `Vdaf.is_valid` to check whether subsequent collections against a
  single batch are valid

Part of #225
tgeoghegan added a commit that referenced this issue Mar 14, 2024
Enable Janus to run the dummy VDAF from crate `prio` with arbitrary
rounds (by extending the representation of `VdafInstance::Fake`) and
with multiple aggregation parameters chosen by the test.

To that end, this resurrects some functionality previously deleted in
PR #1508. At the time, we had that code gated on `#[cfg(test)]`, with
some cautions about its likely performance problems. I'm bringing it
back in the release configuration of Janus. I don't think we need a
feature flag for this: the various `VdafInstance::Fake` variants that
would use these codepaths are not currently compiled into release Janus
builds. Even if they were, deployments can avoid this risky code by
simply not using those VDAFs.

This wires up some tests for both time interval and fixed size tasks,
running multiple collections against a single batch with varying
aggregation parameters. However, a few tests are disabled because of
stuff that doesn't work yet:

- creating fixed size aggregation jobs w/ agg param
- using `Vdaf.is_valid` to check whether subsequent collections against a
  single batch are valid

Part of #225
tgeoghegan added a commit that referenced this issue Mar 19, 2024
Enable Janus to run the dummy VDAF from crate `prio` with arbitrary
rounds (by extending the representation of `VdafInstance::Fake`) and
with multiple aggregation parameters chosen by the test.

To that end, this resurrects some functionality previously deleted in
PR #1508. At the time, we had that code gated on `#[cfg(test)]`, with
some cautions about its likely performance problems. I'm bringing it
back in the release configuration of Janus. I don't think we need a
feature flag for this: the various `VdafInstance::Fake` variants that
would use these codepaths are not currently compiled into release Janus
builds. Even if they were, deployments can avoid this risky code by
simply not using those VDAFs.

This wires up some tests for both time interval and fixed size tasks,
running multiple collections against a single batch with varying
aggregation parameters. However, a few tests are disabled because of
stuff that doesn't work yet:

- creating fixed size aggregation jobs w/ agg param
- using `Vdaf.is_valid` to check whether subsequent collections against a
  single batch are valid

Part of #225
tgeoghegan added a commit that referenced this issue Mar 19, 2024
Enable Janus to run the dummy VDAF from crate `prio` with arbitrary
rounds (by extending the representation of `VdafInstance::Fake`) and
with multiple aggregation parameters chosen by the test.

To that end, this resurrects some functionality previously deleted in
PR #1508, though it remains gated on `#[cfg(test)]`.

This wires up some tests for both time interval and fixed size tasks,
running multiple collections against a single batch with varying
aggregation parameters. However, a few tests are disabled because of
stuff that doesn't work yet:

- creating fixed size aggregation jobs w/ agg param
- using `Vdaf.is_valid` to check whether subsequent collections against a
  single batch are valid

Part of #225
tgeoghegan added a commit that referenced this issue Apr 12, 2024
Re-enables the `janus_in_process_one_round_with_agg_param_time_interval`
integration test, and makes fixes necessary to get it passing.

 - `Transaction::get_unaggregated_client_report_ids_by_collect_for_task`
   is renamed to `get_aggregatable_reports_for_time_interval_task` to
   reflect that it doesn't (yet?) work for fixed size tasks, and that
   while the reports it returns can be aggregated, they aren't
   necessarily unaggregated.
 - The query in that method is modernized to use `self.task_info_for`
   and to set `updated_at`, `updated_by` when reports are marked as
   having started aggregation.
 - Added some notes and TODOs throughout aggregation job management to
   indicate where we should or should not be scrubbing reports. Report
   scrubbing is a valuable optimization for Prio-style VDAFs without
   multi-collection, but we need to keep input shares, etc.,
   indefinitely in order to collect reports multiple times.
 - `Transaction::interval_has_unaggregated_reports` now considers the
   aggregation parameter and not just the `aggregation_started` column
   on `client_reports`. See comments inline for discussion of the fast
   and slow paths.

Part of #225
@tgeoghegan
Copy link
Contributor Author

Note from review on #3003 (comment): in the case of the trivial aggregation parameter (Prio3), we eagerly scrub reports (i.e., discard input shares and public share) once they are prepared. We can't do that quite as easily when running Poplar1 or Mastic because we might need to aggregate the report again. However, if the aggregators notice they're at the last level of the tree, then they can scrub. We might need some libprio-rs plumbing so that aggregators can determine whether an aggregation parameter is for the leaves.

@branlwyd
Copy link
Contributor

Revisiting this: we need to implement, at least, creation of aggregation jobs w/ an aggregation parameter. There is currently test-only code which handles this for time-interval only.

Now that multi-collection of batches is gone, this may be all that is needed. But we should test to make sure that this is the only place that needs to be updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants