In-band task configuration and transparency #290

Closed
wangshan opened this issue Jul 21, 2022 · 19 comments

@wangshan
Contributor

wangshan commented Jul 21, 2022

Task configuration between aggregators is currently handled out-of-band, which requires the leader and helper to agree on a secure method for exchanging task configuration prior to upload.

An alternative is to create tasks in-band and on demand from the Report or ReportShare received from clients. Assume the server sends a unique task_id and all the parameters required by the task configuration to the clients out-of-band (this is defined in the client capability in the current spec). Clients can then include these fields in the report's extension and upload them to the servers along with their reports. This is similar to an idea mentioned in #271.

Upon receiving a Report (leader) or ReportShare (helper) with an unseen tuple of (task_id, extension), the aggregators create the task on demand, then proceed with the aggregation flow as usual.
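For concreteness, a minimal sketch (not part of the spec; the TaskParams fields are hypothetical) of how an aggregator could key its state on that tuple and create tasks on demand:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskParams:
    """Hypothetical task parameters carried in the report extension."""
    vdaf: str
    min_batch_size: int
    max_batch_lifetime: int

# Pending report shares, keyed by (task_id, params); an unseen key triggers
# on-demand task creation.
tasks = {}

def handle_report_share(task_id: bytes, params: TaskParams, report_share: bytes) -> None:
    key = (task_id, params)
    if key not in tasks:
        tasks[key] = []              # create the task on demand
    tasks[key].append(report_share)  # then proceed with the usual aggregation flow
```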

This can be optimised to avoid checking the full tuple of (task_id, extension) by letting the client derive task_id from the extension. For example, the client can take some text that defines the use case, communicated to it out-of-band, and compute task_id as hash("shared text" || extension). This is particularly useful if the "shared text" or part of the extension is only available on the client side.
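A sketch of that derivation, assuming SHA-256 and a simple byte encoding of the extension (the actual hash function and encoding would need to be pinned down):

```python
import hashlib

def derive_task_id(shared_text: bytes, extension: bytes) -> bytes:
    """Derive a 32-byte task ID that commits to the out-of-band shared text
    and the task parameters carried in the extension."""
    return hashlib.sha256(shared_text + extension).digest()

# Both the client and the aggregators can recompute this, so a report whose
# task_id does not match the parameters it carries can be rejected.
task_id = derive_task_id(b"example-use-case-v1", b"min_batch_size=1000;vdaf=Prio3Sum")
```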

There are some nice features in this method:

  • By sending all task parameters to clients and having them returned to the server, we add transparency to the task that clients participate in. The client can see what data is being collected and with what parameters (like min_batch_size, max_batch_lifetime, or any differential privacy parameters, if that's the privacy guarantee used). For some clients these task parameters can even be hardcoded on the client side (e.g. on a mobile device) to prevent any tampering by the server.
  • Because the extension is used in the HPKE AAD, a malicious leader cannot change the task parameters, e.g. by reducing min_batch_size (a minimal AEAD sketch after this list illustrates the binding).
  • A "task" now means the same task_id and the same task parameters (including the parameters for choosing the VDAF), so we can guarantee that reports from one task will only be aggregated with one VDAF. For the same reason, if a malicious client changes the task_id or task parameters, its report will be aggregated in a different task, together with other poisoned reports from the same attack; the "good" reports are not polluted.
  • It avoids any out-of-band task orchestration between the leader and helper, which isn't defined by the spec yet.
  • The on-demand creation is easy to implement with a streaming framework that has a groupBy operator. In fact, the task doesn't have to exist as an object in the aggregators; it mainly becomes an identifier that groups aggregations together.
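To illustrate the AAD point in the list above: this is not DAP's HPKE construction, just a generic AEAD sketch (using the `cryptography` package) showing that a share bound to the extension as associated data cannot be opened if the parameters are altered in transit:

```python
import os
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
nonce = os.urandom(12)
extension = b"min_batch_size=1000"  # task parameters bound as associated data

ciphertext = AESGCM(key).encrypt(nonce, b"input share", extension)

try:
    # A leader that forwards tampered parameters cannot produce a share the
    # helper will accept: decryption fails when the associated data changes.
    AESGCM(key).decrypt(nonce, ciphertext, b"min_batch_size=2")
except InvalidTag:
    print("tampered task parameters detected")
```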

The disadvantage is the added overhead in Report and ReportShare. Some optimisation is possible by moving Extension out of ReportShare and into AggregateInitReq.

Note that parameters that are not necessarily tied to a task may still need to be exchanged out-of-band between the aggregators, like the collector HPKE config (or even the aggregator's HPKE config, see #289). Secrets that should not be known by clients, like vdaf_verify_key, must still be exchanged out-of-band.

wangshan changed the title from "In-band task configuration" to "In-band task configuration and transparency" on Jul 21, 2022
@simon-friedberger
Contributor

Sounds workable. The clients might have to tell the leader who the helper is supposed to be.
And the aggregators would probably need some kind of restriction about which expensive computations to perform depending on client authentication. Otherwise malicious clients could invent new tasks which use expensive VDAFs and submit data for them.

@chris-wood
Collaborator

The crux of the issue here seems to be that DAP currently lacks task configuration transparency. It's possible for clients to be configured with a task ID and related parameters that they think correspond to some reasonable level of privacy (e.g., with a large enough min_batch_size), when in actuality the privacy guarantees are weak. Would you agree with that summary, @wangshan?

I don't know if in-band configuration is the best solution to this problem. @simon-friedberger points out a number of considerations that would complicate things. I need to think on this more.

Also, for what it's worth, in some previous version of this draft the task ID was derived from all relevant parameters in a manner similar to your proposal, i.e., task_id = hash("shared text" || extension), but we removed that because it imposed too much structure on the out of band configuration.

@wangshan
Contributor Author

wangshan commented Jul 22, 2022

Sounds workable. The clients might have to tell the leader who the helper is supposed to be.

This is definitely doable; the current task configuration parameters already include aggregator_endpoints.

And the aggregators would probably need some kind of restriction about which expensive computations to perform depending
on client authentication. Otherwise malicious clients could invent new tasks which use expensive VDAFs and submit data for them.

True, but without client authentication, this attack is possible anyway, without knowing the cost of computation.

@simon-friedberger
Contributor

True, but without client authentication, this attack is possible anyway, without knowing the cost of computation.

Well, if the tasks are defined out-of-band nobody can trigger a Poplar1 task if you only want to run Prio3 tasks. But I totally agree that it's a corner case and it has a trivial fix.

@wangshan
Contributor Author

The crux of the issue here seems to be that DAP currently lacks task configuration transparency. It's possible for clients to be configured with a task ID and related parameters that they think correspond to some reasonable level of privacy (e.g., with a large enough min_batch_size), when in actuality the privacy guarantees are weak. Would you agree with that summary, @wangshan?

I don't know if in-band configuration is the best solution to this problem. @simon-friedberger points out a number of considerations that would complicate things. I need to think on this more.

Your summary about transparency is correct, but I'd say it's only half of the concern. The current spec does not specify how the leader and helper can exchange task configuration safely. If task creation happens out-of-band, I can imagine that at task initialisation a rogue engineer from the leader configures a task with a weak privacy guarantee and syncs it with the helper, without anyone detecting it. With in-band configuration, the client can detect such behaviour. Furthermore, the client can implement sanity checks to verify that the parameters indeed provide a strong guarantee. One advantage of addressing this with in-band configuration is that it doesn't introduce a new API, and therefore no new attack surface. I'm happy to discuss this more.

The problems @simon-friedberger mentioned are definitely worth addressing, but I'd argue that without client authentication it's hard to prevent a malicious client from sending garbage data to the server, regardless of how the task is configured.

Also, for what it's worth, in some previous version of this draft the task ID was derived from all relevant parameters in a manner similar to your proposal, i.e., task_id = hash("shared text" || extension), but we removed that because it imposed too much structure on the out of band configuration.

Interesting, is there a commit/issue I can read about this decision?

@chris-wood
Collaborator

Your summary about transparency is correct, but I'd say it's only half of the concern. The current spec does not specify how the leader and helper can exchange task configuration safely. If task creation happens out-of-band, I can imagine that at task initialisation a rogue engineer from the leader configures a task with a weak privacy guarantee and syncs it with the helper, without anyone detecting it. With in-band configuration, the client can detect such behaviour.

I don't think in-band configuration is necessary for this. Rather, in-band enforcement seems sufficient. In particular, if the client is given "weak parameters," it could just opt to not use them. Instead, if the client is given "good parameters" and uses them, in-band enforcement would require the helper to check these parameters, and would fail if they didn't match what was provided by the rogue leader.

What I'm getting at is that in-band configuration seems separate from in-band enforcement. Would you agree or disagree with that claim?
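As an illustration of in-band enforcement (a hedged sketch; the field names are hypothetical): the helper compares the parameters that arrived in-band against whatever it has configured for the task and aborts the aggregation job on any mismatch.

```python
class TaskParamMismatch(Exception):
    """Raised when in-band parameters don't match the helper's configured task."""

def enforce_task_params(configured: dict, received: dict) -> None:
    # Hypothetical helper-side check: every parameter the client reported
    # in-band must match what this aggregator has on record for the task.
    for name in ("vdaf", "min_batch_size", "max_batch_lifetime"):
        if configured.get(name) != received.get(name):
            raise TaskParamMismatch(
                f"{name}: configured {configured.get(name)!r}, received {received.get(name)!r}"
            )
```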

Interesting, is there a commit/issue I can read about this decision?

I don't recall if this was captured in an issue. Let me know if you really need it and I can look back in the history.

@wangshan
Contributor Author

To make things clear, there are two issues we are discussing:

  1. in-band enforcement and transparency, which is the topic of "Consider enforcing min_batch_size check in aggregator" (#271)
  2. automatic on-demand task configuration.

@simon-friedberger
Contributor

Note that this removes the part of the protocol where leader, helpers and collector agree on parameters.

So, a collector - as the author of a client, which is probably the default case - can trivially create new tasks with e.g. min_batch_size = 2 which will probably be too low for most collected data.

Of course, as the author of a client, they could also just choose to ignore DAP entirely and send the value in the clear to a different telemetry server. So I don't think this fundamentally changes the attack model, but it might still be bad for the "trustability" of DAP.

@branlwyd
Collaborator

branlwyd commented Aug 3, 2022

A few thoughts:

  • Overall I like this idea. In addition to concerns about the trustworthiness of the leader, a generally-available DAP deployment will need to agree on task parameters with its co-aggregators. Having a standardized solution to this problem would make deploying DAP much easier in practice.

  • That said, I think the client is not well-positioned to specify some task parameters. IMO it would be unfortunate if some parameters are determined in-band and some parameters are determined out-of-band. Maybe we can find a solution to these issues? Specifically:

    • vdaf_verify_key is private between the aggregators. (I think the leader can be trusted to generate this value & communicate it to the helper when a new task is introduced?)
    • The leader-helper & leader-collector bearer tokens used for auth are private between the leader/helper & the leader/collector, respectively. (I think the leader could generate & inform the helper of the leader-helper auth token on task creation. The collector-leader token is trickier, since currently all leader-collector communication is initiated by the collector -- there is a chicken & egg problem of confirming who the correct collector is. Also, the bearer token is considered to be a temporary solution--the authentication method may change, which may change the considerations here.)
    • collector_config is not private, but it will eventually need to be able to change (it is a public key & therefore must eventually be rotatable). This introduces complications not only because we want to identify tasks by (task_id, task_parameters), but also because client updates are not atomic -- during rotation, for some time some set of (updated) clients will be advertising the rotated keyset, while another set of (non-updated) clients will be advertising the non-rotated keyset.
  • I have a slight preference that tasks continue to be identified by task_id rather than (task_id, task_parameters). First, I don't think there is any real benefit to allowing multiple tasks to share the same ID (as long as the task ID space is large enough that accidental collision is negligible). Having a standard identifier shared between the aggregators will help practical debugging quite a lot. And I think at least some task parameters will eventually want to change (i.e. collector_config); not including the task parameters as part of the task identifier will make this easier. OTOH, I do think it is valuable to check & fail if the provided task parameters don't match what is configured for an existing task.

  • I am also somewhat dubious about the practical protections this provides to clients -- since the entity operating the collector & providing the client software will almost always be the same, I agree with Simon that they could just ignore DAP entirely and send the value in the clear. (Maybe I am mistaken about the deployment configuration that this is meant to protect?)

  • We are early-on enough in specifying DAP that these parameters could be communicated in an (optional?) field in the relevant messages, rather than as an extension. OTOH, this eases moving the task parameters to the aggregate initialization request--moving extensions is tricky because not every report for the same task is going to have the same set of extensions, nor will the extension ordering necessarily be the same.

  • I think DAP deployments will definitely want to include more information than just the raw task parameters. I imagine things like a collector identity, client authentication, additional task metadata (e.g. human-readable name/description), deployment-specific task parameters, etc. I would be OK implementing these as additional extensions on the client report.

@wangshan
Contributor Author

So, a collector - as the author of a client, which is probably the default case - can trivially create new tasks with e.g. min_batch_size = 2 which will probably be too low for most collected data.

This threat exists even without on-demand task configuration: the author of the client can create a task with min_batch_size=2 out-of-band, and the client would not know about it. But as long as we send such parameters to the client side, such behaviour becomes much more detectable.

In general, like you said, DAP cannot prevent a collector/leader organisation from deliberately misusing or bypassing the DAP channel, but doing so requires some change on the client, which is harder to hide than server-side work.

@wangshan
Contributor Author

We are early-on enough in specifying DAP that these parameters could be communicated in an (optional?) field in the relevant messages, rather than as an extension.

I think the extension is designed to do this, as an optional field of the report? cc @chris-wood

OTOH, this eases moving the task parameters to the aggregate initialization request--moving extensions is tricky because not every report for the same task is going to have the same set of extensions, nor will the extension ordering necessarily be the same.

I'm curious why the extension can be different for the same task (other than key rotation in the collector HPKE config), and why the ordering may be different?

@wangshan
Contributor Author

@branlwyd I agree collector_config and vdaf_verify_key shouldn't go to the device; for the latter, I think there were concerns about whether it is free of privacy leaks if the key is chosen by the leader. I don't see how we can deliver these parameters other than out-of-band; perhaps an independent authority that the leader and helper can query?

First, I don't think there is any real benefit to allowing multiple tasks to share the same ID (as long as the task ID space is large enough that accidental collision is negligible).

I agree with this too. If we can exclude parameters that may change during one task (like collector_config), then a task ID being shared by more than one task is most likely due to a task creation error or a malicious attack.

@simon-friedberger
Contributor

So, a collector - as the author of a client, which is probably the default case - can trivially create new tasks with e.g. min_batch_size = 2 which will probably be too low for most collected data.

This threat exists even without on-demand task configuration: the author of the client can create a task with min_batch_size=2 out-of-band, and the client would not know about it. But as long as we send such parameters to the client side, such behaviour becomes much more detectable.

But with an out-of-band agreement this can be handled during the agreement. If the collector says we want to collect user age with min_batch_size = 2 and compute the median, the aggregators can decline because it is a bad idea. If we are specifying that this will be handled automatically by software, such a check cannot be done.

@wangshan
Contributor Author

If the collector says we want to collect user age with min_batch_size = 2 and compute the median, the aggregators can decline because it is a bad idea. If we are specifying that this will be handled automatically by software, such a check cannot be done.

Can we not implement the same check in aggregators when a task is being created on demand?
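A sketch of what such an automated check might look like at on-demand creation time, with purely hypothetical policy values (nothing here is prescribed by the spec):

```python
MIN_ALLOWED_BATCH_SIZE = 100                 # hypothetical deployment policy floor
ALLOWED_VDAFS = {"Prio3Count", "Prio3Sum"}   # e.g. refuse more expensive VDAFs

def accept_on_demand_task(params: dict) -> bool:
    """Apply deployment policy before creating a task on demand."""
    if params.get("min_batch_size", 0) < MIN_ALLOWED_BATCH_SIZE:
        return False
    if params.get("vdaf") not in ALLOWED_VDAFS:
        return False
    return True
```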

@cjpatton
Collaborator

cjpatton commented Aug 11, 2022

Overall I like the direction we're going with this thread. Like others, I'm most interested in the potential for streamlining task on-boarding. That said, lots of good points have been brought up about the potential for things to go wrong. I think my top concerns are:

  • In-band enforcement (I think @branlwyd and @chris-wood raised this previously): What should we do if one party accepts a task config, but the other does not? In particular I'm thinking of the case where the Leader starts an aggregation job, but the Helper aborts due to the task config being unacceptable (for whatever reason). In practice, a couple of engineers from two different orgs will have to hop on a call and start debugging and we'll declare an incident. Meanwhile, a backlog of reports will start building up. Out-of-band task configuration reduces the chance of these sorts of surprises coming up.
  • We need to ensure that co-Aggregators can still do capacity planning. I think this means that there will have to be at least some "global" parameters that are picked out of band, e.g., min_batch_duration for time-interval tasks.

I think this issue is ready for an initial draft PR. It would be helpful to see the details spelled out to see if there's anything not workable here.

@simon-friedberger
Contributor

Can we not implement the same check in aggregators when a task is being created on demand?

That depends. I don't think this kind of check can be codified, because the aggregation system doesn't know the privacy implications of the data being processed. App usage time is probably less sensitive than monthly income, but both are just numbers to be averaged.

I am fine if we want to ignore this but suggest that we clearly describe it as out of scope then.

@cjpatton
Collaborator

+1 @simon-friedberger

@cjpatton
Collaborator

cjpatton commented Aug 24, 2022

Hi folks, just wanted to update people here that @wangshan will be working on an extension (DAP's first!) to specify a mechanism for in-band task provisioning. I think the goal will be to get folks' feedback in time for IETF 115 so that we can consider adopting the draft as a WG item.

@cjpatton
Collaborator

Closing this issue as resolved. Here is the repo where we're working on this, if anyone is curious: https://github.com/wangshan/draft-wang-ppm-dap-taskprov

Tentative goal is to present this at IETF 115.

cjpatton closed this as not planned on Sep 13, 2022