
Clients can receive unwanted duplicate progress when opening a separate session #2817

Open
sipsma opened this issue Apr 20, 2022 · 2 comments


sipsma (Collaborator) commented Apr 20, 2022

Background: Dagger supports exporting results in a DAG at arbitrary points of execution, which is a bit different from Buildkit's underlying model, which currently only allows a single export configuration specified at the beginning of a Build request. Dagger works around this by opening separate sessions from the main Solve to export results.

The problem: The approach of separate sessions works perfectly fine for now, but does have a weird side effect in that Buildkit "replays" all progress events, including ExecOp logs, for every vertex in the LLB DAG referenced by the new export session. I believe I tracked this behavior down to these lines. I'd presume this exists so that if one client is doing a build and another totally separate client connects to Buildkit with an overlapping build, that new client gets all the progress so far backfilled rather than only getting the newest updates. That makes sense in general, but for Dagger's use case it results in the client receiving progress updates it already received. Currently, that means that progress logs get printed multiple times, which is very confusing for users.

Dagger wants to deduplicate these logs, but right now the only way I'm seeing would be to hash the relevant data in each structure, which could get extremely expensive for progress logs at a certain scale. EDIT: commented below with a better stopgap approach.

Other options that could work better:

  1. Buildkit sends each progress item with a unique ID that clients can use as a hash key to check whether they have already received that progress or not.
  2. Buildkit supports a solve request option that disables replaying of progress events from earlier in the build and only sends new ones. This would also save the network transfer of logs that will be ignored anyway.
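With option 1, client-side deduplication would become a trivial set lookup. A minimal sketch of what that could look like; the `ID` field and the `progressEvent` shape are hypothetical and not part of Buildkit's current API:

```go
package main

import "fmt"

// progressEvent is a hypothetical event shape that assumes Buildkit
// attached a unique ID to each progress item (option 1 above).
type progressEvent struct {
	ID  string
	Log string
}

// dedup drops any event whose ID has been seen before, so progress
// replayed by a second session is filtered out in O(1) per event,
// with no need to hash the event payload itself.
func dedup(events []progressEvent) []progressEvent {
	seen := map[string]bool{}
	var out []progressEvent
	for _, ev := range events {
		if seen[ev.ID] {
			continue
		}
		seen[ev.ID] = true
		out = append(out, ev)
	}
	return out
}

func main() {
	events := []progressEvent{
		{ID: "1", Log: "pulling image"},
		{ID: "2", Log: "running step"},
		{ID: "1", Log: "pulling image"}, // replayed by the export session
	}
	for _, ev := range dedup(events) {
		fmt.Println(ev.Log)
	}
}
```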

@tonistiigi Let me know if this makes sense or if you think there's another way to approach this. Support for Export in the Gateway API is obviously a better long-term solution, but I'm guessing that's a much bigger project and we'd prefer to have a simpler fix in the meantime.

sipsma (Collaborator, Author) commented Apr 20, 2022

On second thought, there is a stopgap solution where the client maintains state on each vertex digest received and doesn't print logs for vertices that have already completed. That's not as ideal as checking a unique ID or disabling the replay entirely, but it isn't as expensive as hashing all the data. I'd still like to see if one of the other solutions would work, but marking this as just an enhancement for now.
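That stopgap could look roughly like this. This is only a sketch, not Dagger's actual code; `vertexStatus` is a simplified stand-in for the relevant fields of Buildkit's progress stream:

```go
package main

import "fmt"

// vertexStatus mirrors the subset of a Buildkit progress event we care
// about here; the field names are illustrative, not Buildkit's actual API.
type vertexStatus struct {
	Digest    string
	Completed bool
	Log       string
}

// progressFilter tracks which vertex digests the client has already seen
// complete, so a replayed status stream from a new export session does
// not reprint logs for finished vertices.
type progressFilter struct {
	completed map[string]bool
}

func newProgressFilter() *progressFilter {
	return &progressFilter{completed: map[string]bool{}}
}

// Keep reports whether the event should be shown. Events for a vertex
// previously marked completed are suppressed as replays. Note the known
// limitation of this approach: duplicates arriving before the vertex
// completes are still shown.
func (f *progressFilter) Keep(ev vertexStatus) bool {
	if f.completed[ev.Digest] {
		return false
	}
	if ev.Completed {
		f.completed[ev.Digest] = true
	}
	return true
}

func main() {
	f := newProgressFilter()
	events := []vertexStatus{
		{Digest: "sha256:aaa", Log: "step 1 running"},
		{Digest: "sha256:aaa", Completed: true, Log: "step 1 done"},
		// Replayed on a second session: suppressed below.
		{Digest: "sha256:aaa", Log: "step 1 running"},
		{Digest: "sha256:aaa", Completed: true, Log: "step 1 done"},
	}
	for _, ev := range events {
		if f.Keep(ev) {
			fmt.Println(ev.Log)
		}
	}
}
```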

tonistiigi (Member) commented

This is what we do in buildx bake docker/buildx#977. It's far from ideal, but I don't have very good solutions in mind at the moment. We would somehow need to track that a vertex is already part of the session group, or only make one status request for multiple builds.
