In-band task configuration and transparency #290
Comments
Sounds workable. The clients might have to tell the leader who the helper is supposed to be. |
The crux of the issue here seems to be that DAP currently lacks task configuration transparency. It's possible for clients to be configured with a task ID and related parameters that they think correspond to some reasonable level of privacy (e.g., with a large enough I don't know if in-band configuration is the best solution to this problem. @simon-friedberger points out a number of considerations that would complicate things. I need to think on this more. Also, for what it's worth, in some previous version of this draft the task ID was derived from all relevant parameters in a manner similar to your proposal, i.e., |
This is definitely doable, the current task configuration parameters already include
True, but without client authentication, this attack is possible anyway, without knowing the cost of computation. |
Well, if the tasks are defined out-of-band nobody can trigger a Poplar1 task if you only want to run Prio3 tasks. But I totally agree that it's a corner case and it has a trivial fix. |
Your summary about transparency is correct, but I'd say it's only half of the concern. The current spec does not specify how leader and helper can exchange task configuration safely. If task creation happens out-of-band, I can imagine that at task initialisation a rogue engineer at the leader configures a task with a weak privacy guarantee and syncs it with the helper, without anyone detecting it. With in-band configuration, the client can detect such behaviour. Furthermore, the client can implement sanity checks to verify that the parameters indeed provide a strong guarantee. One advantage of addressing this with in-band configuration is that it doesn't introduce a new API, and therefore no new attack surface. I'm happy to discuss this more. The problems @simon-friedberger mentioned are definitely worth addressing, but I'd argue that without client authentication, it's hard to protect against a malicious client sending garbage data to the server regardless of how the task is configured.
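As a rough illustration of the client-side sanity check mentioned above, here is a minimal sketch. The `TaskParams` type, the `params_acceptable` function and the threshold value are all hypothetical and not part of DAP; the only assumption taken from the thread is that a larger `min_batch_size` gives a stronger guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskParams:
    """Hypothetical container for task parameters the client received."""
    min_batch_size: int


def params_acceptable(params: TaskParams, min_allowed_batch_size: int) -> bool:
    """Return True only if the batch size meets the client's own floor.

    The floor is chosen by the client (it could be hardcoded), so a task
    configured with a weaker value is refused before any report is sent.
    """
    return params.min_batch_size >= min_allowed_batch_size


# Hypothetical policy: refuse to upload for tasks with fewer than 100 reports per batch.
CLIENT_FLOOR = 100
weak = TaskParams(min_batch_size=2)
strong = TaskParams(min_batch_size=1000)
print(params_acceptable(weak, CLIENT_FLOOR), params_acceptable(strong, CLIENT_FLOOR))
```

A client that gets `False` here would simply not upload, which makes a weakly configured task visible on the client side.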
Interesting, is there a commit/issue I can read about this decision? |
I don't think in-band configuration is necessary for this. Rather, in-band enforcement seems sufficient. In particular, if the client is given "weak parameters," it could just opt not to use them. If instead the client is given "good parameters" and uses them, in-band enforcement would require the helper to check these parameters, and the check would fail if they didn't match those provided by the rogue leader. What I'm getting at is that in-band configuration seems separate from in-band enforcement. Would you agree or disagree with that claim?
I don't recall if this was captured in an issue. Let me know if you really need it and I can look back in the history. |
To make things clear, there are two issues we are discussing:
|
Note that this removes the part of the protocol where leader, helpers and collector agree on parameters. So, a collector - as the author of a client, which is probably the default case - can trivially create new tasks with e.g. Of course, as the author of a client, they could also just choose to ignore DAP entirely and send the value in the clear to a different telemetry server. So I don't think this fundamentally changes the attack model but it might still be bad for the "trustability" of DAP. |
A few thoughts:
|
This threat exists even without on-demand task configuration: the author of the client can create a task with min_batch_size=2 out of band, and the client would not know about it. But as long as we send such parameters to the client side, such behaviour becomes much more detectable. In general, like you said, DAP cannot prevent a collector/leader organisation from deliberately misusing or bypassing the DAP channel, but doing so requires some change on the client, which is harder to hide than server-side work. |
I think extension is designed to do this: as an optional field for report? cc @chris-wood
I'm curious why the extension can be different for the same task (other than key rotation in the collector HPKE config), and why the ordering may be different? |
@branlwyd I agree
I agree with this too, if we can exclude parameters that may change during one task (like |
But with an out-of-band agreement this can be handled during the agreement. If the collector says we want to collect user age with |
Can we not implement the same check in aggregators when a task is being created on demand? |
Overall I like the direction we're going with this thread. Like others, I'm most interested in the potential for streamlining task on-boarding. That said, lots of good points have been brought up about the potential for things to go wrong. I think my top concerns are:
I think this issue is ready for an initial draft PR. It would be helpful to see the details spelled out to see if there's anything not workable here. |
That depends, I don't think this kind of check can be codified because the aggregation system doesn't know the privacy implication of the data being processed. App usage time is probably less important than monthly income but both are just numbers to be averaged. I am fine if we want to ignore this but suggest that we clearly describe it as out of scope then. |
Hi folks, just wanted to update people here that @wangshan will be working on an extension (DAP's first!) to specify a mechanism for in-band task provisioning. I think the goal will be to get folks' feedback in time for IETF 115 so that we can consider adopting the draft as a WG item. |
Closing this issue as resolved. Here is the repo where we're working on this, if anyone is curious: https://github.com/wangshan/draft-wang-ppm-dap-taskprov Tentative goal is to present this at IETF 115. |
Task configuration between aggregators is currently handled out-of-band; this requires the leader and helper to agree on some secure method to exchange task configuration prior to upload.
An alternative is to create tasks in-band and on-demand from the `Report` or `ReportShare` received from clients. Assume the server sends a unique task_id and all the parameters required by the task configuration to the clients out-of-band (this is defined in client capability in the current spec). Clients can then include these fields in the report's `extension` and upload them to the servers along with the reports. This is similar to an idea mentioned in #271.
Upon receiving a Report (leader) or ReportShare (helper) with an unseen tuple of (task_id, extension), the aggregators create the task on-demand, then proceed to the aggregate flow as usual.
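The on-demand creation step could look roughly like the sketch below. It is only an illustration: the `Task` and `Aggregator` classes and the `handle_report` method are hypothetical names, the extension is treated as opaque bytes, and parsing, validation and the actual aggregate flow are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical in-memory task record, created lazily."""
    task_id: bytes
    extension: bytes
    reports: list = field(default_factory=list)


class Aggregator:
    """Toy aggregator that provisions tasks from incoming reports."""

    def __init__(self) -> None:
        # Tasks are keyed by the (task_id, extension) tuple described above.
        self.tasks: dict[tuple[bytes, bytes], Task] = {}

    def handle_report(self, task_id: bytes, extension: bytes, payload: bytes) -> Task:
        key = (task_id, extension)
        task = self.tasks.get(key)
        if task is None:
            # Unseen tuple: create the task on demand.
            task = Task(task_id=task_id, extension=extension)
            self.tasks[key] = task
        # Stand-in for handing the report to the usual aggregate flow.
        task.reports.append(payload)
        return task


agg = Aggregator()
agg.handle_report(b"task-1", b"params-a", b"report-1")
agg.handle_report(b"task-1", b"params-a", b"report-2")
agg.handle_report(b"task-1", b"params-b", b"report-3")
print(len(agg.tasks))
```

In this sketch the same task_id with a different extension becomes a separate task, which is what keying on the full tuple implies.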
This can be optimised to avoid checking the tuple of (task_id, extension) by letting the client create `task_id` based on `extension`. For example, the clients can share some text that defines the use case, communicated to them out-of-band, then create `task_id` using `hash("shared text" || extension)`. This can be particularly useful if the "shared text" or part of the extension is only available on the client side.
There are some nice features in this method:
- […] `min_batch_size`, `max_batch_lifetime`, or any differential privacy parameters if that's the privacy guarantee used). For some clients these task parameters can even be hardcoded on the client side (e.g. on a mobile device) to avoid any tampering by the server.
- […] `extension` is used in HPKE's AAD, a malicious leader cannot change the task parameters, e.g. by reducing `min_batch_size`.
- […] `groupBy` operator. In fact, a task as an object doesn't have to exist in the aggregators; it mainly becomes an identifier to group aggregations together.
The disadvantage is added overhead in `Report` and `ReportShare`. Some optimisation can be done by moving Extension out of `ReportShare` and into `AggregateInitReq`.
Note that parameters that are not necessarily tied to a task may still need to be exchanged out-of-band between aggregators, like the collector HPKE config (or even the aggregators' HPKE config, see #289). Secrets that should not be known by clients, like `vdaf_verify_key`, must still be exchanged out-of-band.
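For the `hash("shared text" || extension)` optimisation described in this issue, a minimal sketch follows. The issue does not name a hash function, so SHA-256 is an assumption here, as are the function names `derive_task_id` and `task_id_matches` and the example byte strings.

```python
import hashlib
import hmac


def derive_task_id(shared_text: bytes, extension: bytes) -> bytes:
    """Compute hash(shared_text || extension); SHA-256 is an assumed choice."""
    return hashlib.sha256(shared_text + extension).digest()


def task_id_matches(task_id: bytes, shared_text: bytes, extension: bytes) -> bool:
    """Check that a received task_id was derived from the given inputs.

    Only a party that also knows the shared text can run this check.
    """
    return hmac.compare_digest(task_id, derive_task_id(shared_text, extension))


# Hypothetical inputs for illustration only.
shared_text = b"example use case"
extension = b"min_batch_size=100"
task_id = derive_task_id(shared_text, extension)
print(len(task_id), task_id_matches(task_id, shared_text, extension))
```

Because the identifier depends on the extension bytes, changing any parameter yields a different `task_id`, so in this sketch the aggregator can key tasks on `task_id` alone. Plain concatenation is ambiguous when both inputs vary in length; a real encoding would likely need length prefixes, which this sketch omits to stay close to the formula in the issue.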