Aggregation job creator: copy report data directly from client_reports to report_aggregations, rather than reading full reports #2689

Closed
branlwyd opened this issue Feb 16, 2024 · 1 comment · Fixed by #2727

@branlwyd
Contributor

Currently, the aggregation job creator reads a large number of reports in full (5000, at time of writing) in order to create aggregation jobs. For VDAFs with large reports, this can require a significant amount of memory.

Instead, the aggregation job creator could read only the report IDs and other (small) metadata required to create aggregation jobs, and use a SQL query which causes Postgres to copy the data directly from the relevant client_reports row into the relevant report_aggregations row. This would decouple the memory usage of the aggregation job creator from the report size of the relevant VDAFs.
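
A rough sketch of the idea, in case it helps discussion. This is not the actual schema or the actual query: the column names (`client_timestamp`, `leader_input_share`, `ord`), the helper `create_report_aggregations`, and the use of SQLite as a stand-in for Postgres are all assumptions made so the example is self-contained. The point is only that an `INSERT ... SELECT` moves the large payload from row to row inside the database, while the client handles nothing but report IDs.

```python
import sqlite3

# Hypothetical, heavily simplified schema. SQLite stands in for Postgres here;
# the real tables have more columns and different types.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client_reports (
        report_id BLOB PRIMARY KEY,
        client_timestamp INTEGER NOT NULL,
        leader_input_share BLOB NOT NULL
    );
    CREATE TABLE report_aggregations (
        aggregation_job_id INTEGER NOT NULL,
        ord INTEGER NOT NULL,
        report_id BLOB NOT NULL,
        client_timestamp INTEGER NOT NULL,
        leader_input_share BLOB NOT NULL
    );
""")

# Three reports with 1 KiB payloads, standing in for large VDAF reports.
for i in range(3):
    conn.execute(
        "INSERT INTO client_reports VALUES (?, ?, ?)",
        (bytes([i]) * 16, 1000 + i, bytes([i]) * 1024),
    )


def create_report_aggregations(conn, aggregation_job_id, report_ids):
    """Copy report data row to row inside the database.

    Only the job ID, the ordinal, and the report ID are sent as parameters;
    the payload column is never read into this process.
    """
    for ord_, report_id in enumerate(report_ids):
        conn.execute(
            """
            INSERT INTO report_aggregations
                (aggregation_job_id, ord, report_id,
                 client_timestamp, leader_input_share)
            SELECT ?, ?, report_id, client_timestamp, leader_input_share
            FROM client_reports
            WHERE report_id = ?
            """,
            (aggregation_job_id, ord_, report_id),
        )


# The job creator reads only small metadata (here, just the report IDs).
report_ids = [
    row[0]
    for row in conn.execute(
        "SELECT report_id FROM client_reports ORDER BY client_timestamp"
    )
]
create_report_aggregations(conn, 1, report_ids)

# Row count and total payload bytes that ended up in report_aggregations.
copied = conn.execute(
    "SELECT COUNT(*), SUM(LENGTH(leader_input_share)) FROM report_aggregations"
).fetchone()
```

With a query shaped like this, the memory the job creator needs per report depends on the size of the ID and metadata rather than on the size of the report itself.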

@divergentdave
Collaborator

The aggregation job creator and its SQL proxy are currently outliers in CPU consumption. Doing this copying within the database will improve performance, and give us more headroom before we have to start sharding the aggregation job creator.
