I think it would be helpful to explicitly list what sort of synchronization guarantees the aggregators need to uphold. Some of these are implicit in the text elsewhere, and they would be important to the architecture of a distributed aggregator. Here's what I have so far:
Leader
The leader has to perform anti-replay checks between receiving a report and sending it in an aggregation job (i.e. deduplicating by ReportMetadata). This check is amenable to approaches that provide only eventual consistency.
The leader needs some synchronization between aggregate share requests and aggregation job requests to make sure it doesn't aggregate any new reports into a batch that has already been collected. This requirement differs significantly between time interval queries, where the client metadata determines the batch, and fixed size queries, where the leader has full control of batches.
The leader needs to synchronize between sending aggregation job requests and sending aggregate share requests, to ensure that it never has both an aggregate share request collecting a batch and an aggregation job that affects the same batch outstanding at the same time. Note that with time interval queries, there is a many-to-many mapping between aggregation jobs and batches, while with fixed size queries, each aggregation job impacts only one batch.
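To make the three leader requirements above concrete, here is a minimal single-process sketch. Everything in it is hypothetical and for illustration only: `LeaderBatchGuard`, its method names, and the use of plain string IDs are assumptions, not anything from the draft or an existing implementation. A single in-memory lock stands in for whatever transaction or coordination mechanism a distributed leader would actually need.

```python
import threading


class LeaderBatchGuard:
    """Hypothetical in-memory guard for the leader-side rules listed above."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen_reports = set()   # report IDs already accepted (anti-replay)
        self._outstanding = {}       # batch ID -> number of in-flight aggregation jobs
        self._collecting = set()     # batches whose collection has started or finished

    def accept_report(self, report_id):
        """Return True only the first time a report ID is seen."""
        with self._lock:
            if report_id in self._seen_reports:
                return False
            self._seen_reports.add(report_id)
            return True

    def start_aggregation_job(self, batch_ids):
        """Refuse a job that touches any batch already being collected.

        A job may touch several batches (time interval queries) or exactly
        one (fixed size queries), so the argument is a list.
        """
        with self._lock:
            if any(b in self._collecting for b in batch_ids):
                return False
            for b in batch_ids:
                self._outstanding[b] = self._outstanding.get(b, 0) + 1
            return True

    def finish_aggregation_job(self, batch_ids):
        """Record that a previously started job is no longer outstanding."""
        with self._lock:
            for b in batch_ids:
                self._outstanding[b] -= 1

    def start_collection(self, batch_id):
        """Refuse to collect a batch while a job affecting it is outstanding."""
        with self._lock:
            if self._outstanding.get(batch_id, 0) > 0:
                return False
            self._collecting.add(batch_id)
            return True
```

For example, after `start_aggregation_job(["b1", "b2"])`, a call to `start_collection("b1")` is refused until `finish_aggregation_job(["b1", "b2"])` runs, and once `b1` is being collected, any later job touching `b1` is refused.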
Helper
The helper needs to perform duplicate report detection across aggregation job requests.
The helper needs strong consistency between aggregation job requests and subsequent aggregate share requests, so that it includes every eligible output share in its aggregate share.
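The two helper requirements can be sketched the same way. This is an illustrative assumption, not the draft's wire behavior: `HelperState` and its methods are hypothetical, output shares are modeled as plain integers rather than vectors of field elements, and duplicate reports are silently skipped instead of being answered with a per-report error. The point is only that duplicate detection, accumulation, and the aggregate share read all happen under one lock, which is the strong-consistency property described above.

```python
import threading


class HelperState:
    """Hypothetical in-memory model of the helper-side rules listed above."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()       # report IDs seen in any aggregation job
        self._sums = {}          # batch ID -> running sum of output shares
        self._counts = {}        # batch ID -> number of reports aggregated
        self._collected = set()  # batches already released as an aggregate share

    def handle_aggregation_job(self, batch_id, reports):
        """Aggregate (report_id, output_share) pairs; return accepted report IDs."""
        accepted = []
        with self._lock:
            if batch_id in self._collected:
                return accepted  # batch already collected: accept nothing
            for report_id, share in reports:
                if report_id in self._seen:
                    continue     # duplicate across aggregation jobs
                self._seen.add(report_id)
                self._sums[batch_id] = self._sums.get(batch_id, 0) + share
                self._counts[batch_id] = self._counts.get(batch_id, 0) + 1
                accepted.append(report_id)
        return accepted

    def handle_aggregate_share(self, batch_id):
        """Seal the batch and return (aggregate share, report count)."""
        with self._lock:
            self._collected.add(batch_id)
            return self._sums.get(batch_id, 0), self._counts.get(batch_id, 0)
```

Because sealing the batch and reading its sum happen in the same critical section, every output share accepted before the aggregate share request is included, and none can be added afterwards.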
Right now, all of these require explicit "transactional"/"serializable" synchronization between the relevant components of the system (except for report uploads, as noted).
Reducing these to something requiring only eventual consistency would be valuable even for an implementation using a monolithic database. For example, Postgres transactions at isolation levels lower than SERIALIZABLE can still exhibit distributed-systems-like inconsistencies unless the implementation takes care that the relevant transactions always encounter a write conflict.
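One way to picture "ensuring the transactions encounter a write conflict" is to have both the collection path and the aggregation path update the same per-batch row, so that whichever commits second is forced to fail and retry. The sketch below simulates that with a version counter (optimistic concurrency). It is a single-threaded illustration with hypothetical names (`BatchRow`, `commit`, `VersionConflict`), not Postgres code and not how any particular implementation does it.

```python
class VersionConflict(Exception):
    """Raised when a row changed between a transaction's read and its commit."""


class BatchRow:
    """Hypothetical per-batch row that both code paths must write to."""

    def __init__(self):
        self.version = 0
        self.collected = False
        self.report_count = 0


def commit(row, read_version, update):
    """Apply `update` only if the row is unchanged since `read_version`."""
    if row.version != read_version:
        raise VersionConflict("batch row changed since it was read")
    update(row)
    row.version += 1


def mark_collected(row):
    row.collected = True


def add_report(row):
    row.report_count += 1


# Two simulated transactions read the row at the same version.
row = BatchRow()
collect_read = row.version
aggregate_read = row.version

# The collection commits first.
commit(row, collect_read, mark_collected)

# The aggregation job now conflicts instead of silently adding a report
# to a batch that has already been collected.
try:
    commit(row, aggregate_read, add_report)
    conflicted = False
except VersionConflict:
    conflicted = True
```

Without the shared row, the two transactions would write disjoint data and neither would notice the other, which is the kind of anomaly that lower isolation levels permit.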