
Large aggregated result #7527

Closed
gvos94 opened this issue Jan 7, 2022 · 11 comments
Labels
area/executor solution/superseded This PR or issue has been superseded by another one (slightly different from a duplicate) solution/workaround There's a workaround, might not be great, but exists type/bug type/regression Regression from previous behavior (a specific type of bug)

Comments

@gvos94

gvos94 commented Jan 7, 2022

Summary

What happened/what you expected to happen?
Hello. I’m running a workflow that fans out quite a few parallel tasks, each of which outputs a parameter; the next step then aggregates those parameters with {{tasks.task-name.outputs.parameters.output-name}}. However, the wait and main containers seem to be crashing with the following error: standard_init_linux.go:228: exec user process caused: argument list too long. It looks like this happens because the aggregated result is too large (136 KB) to fit into the ARGO_TEMPLATE environment variable. Any idea how I can fix this? Thanks!
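For illustration, a minimal sketch of the fan-out/aggregate pattern described above (template and parameter names here are hypothetical, not taken from the real workflow):

```yaml
# Sketch of a fan-out followed by parameter aggregation; illustrative names only.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: fanout-aggregate-
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: fan-out
            template: produce
            withItems: [1, 2, 3]            # in practice, many more items
            arguments:
              parameters:
                - name: item
                  value: "{{item}}"
          - name: aggregate
            template: consume
            dependencies: [fan-out]
            arguments:
              parameters:
                - name: all-results
                  # JSON list aggregating the 'result' output of every fan-out task;
                  # this is the value that grows too large for ARGO_TEMPLATE
                  value: "{{tasks.fan-out.outputs.parameters.result}}"
    - name: produce
      inputs:
        parameters:
          - name: item
      container:
        image: alpine:3
        command: [sh, -c]
        args: ['echo "output for item {{inputs.parameters.item}}" > /tmp/result.txt']
      outputs:
        parameters:
          - name: result
            valueFrom:
              path: /tmp/result.txt
    - name: consume
      inputs:
        parameters:
          - name: all-results
      container:
        image: alpine:3
        command: [sh, -c]
        args: ['echo "{{inputs.parameters.all-results}}" | wc -c']
```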

What version of Argo Workflows are you running?
v3.2.4 with the PNS executor


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@sarabala1979
Member

@gandeevan you can store the aggregated output as an artifact and pass it to downstream steps.
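As a rough illustration of that suggestion (names are illustrative, and the generate-large-output command is a placeholder): the aggregated data is written to a file and handed over as an output artifact rather than a parameter, so it never has to fit into ARGO_TEMPLATE.

```yaml
# Template-level sketch of artifact passing between two steps; illustrative names only.
- name: collect
  container:
    image: alpine:3
    command: [sh, -c]
    args: ['generate-large-output > /tmp/aggregated.json']   # hypothetical command
  outputs:
    artifacts:
      - name: aggregated
        path: /tmp/aggregated.json
- name: consume
  inputs:
    artifacts:
      - name: aggregated
        path: /tmp/aggregated.json
  container:
    image: alpine:3
    command: [sh, -c]
    args: ['wc -c /tmp/aggregated.json']
```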

@gvos94
Author

gvos94 commented Jan 10, 2022

I did try storing the aggregated output as an artifact for the downstream step using the raw field. But that causes the same issue, since the entire template, including the value of the raw field, is populated into the ARGO_TEMPLATE environment variable.

To the best of my knowledge there's no workflow variable to aggregate the artifacts of the previous step (which uses withParam) - the artifact equivalent of steps.<STEPNAME>.outputs.parameters. Is that correct?

@sarabala1979
Member

Yes, there is one more workaround: you can use a PVC for all the fan-out steps to store their output, and the downstream step can read from the same PVC.
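A rough sketch of that PVC workaround, assuming a storage class that lets all the fan-out pods mount the claim (names are illustrative):

```yaml
# Every fan-out step writes its result to a shared volume; the downstream step
# reads all of them from the same volume. Names and storage class are illustrative.
spec:
  entrypoint: main
  volumeClaimTemplates:
    - metadata:
        name: shared-output
      spec:
        accessModes: ["ReadWriteMany"]      # requires a storage class that supports RWX
        resources:
          requests:
            storage: 1Gi
  templates:
    - name: produce
      inputs:
        parameters:
          - name: item
      container:
        image: alpine:3
        command: [sh, -c]
        args: ['echo "output for {{inputs.parameters.item}}" > /mnt/out/{{inputs.parameters.item}}.txt']
        volumeMounts:
          - name: shared-output
            mountPath: /mnt/out
    - name: consume
      container:
        image: alpine:3
        command: [sh, -c]
        args: ['cat /mnt/out/*.txt | wc -c']
        volumeMounts:
          - name: shared-output
            mountPath: /mnt/out
```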

@gvos94
Author

gvos94 commented Jan 10, 2022

We use GKE, which doesn't support mounting a volume on multiple nodes simultaneously in read-write mode. Using a PVC to store the output would thus serialize the fanned-out steps.

@gvos94
Author

gvos94 commented Jan 10, 2022

Just thinking out loud here - would it be a good idea to add support for exposing the aggregated result as an artifact, to cover the case where the aggregated result is too big to fit into a parameter? I can probably take a shot at this.

@sarabala1979
Member

sarabala1979 commented Jan 10, 2022

@gandeevan good idea. I will change the label to enhancement.
Please update the title and add the use case.

@sarabala1979 sarabala1979 added type/feature Feature request and removed type/bug triage labels Jan 10, 2022
@evax

evax commented Jan 19, 2022

We've started running into the same problem after upgrading from 3.1.10 to 3.2.6.

It seems to have been caused by the switch from debian slim to alpine in #6006:

```
docker run -it docker.io/library/debian:10.7-slim /bin/sh -c "getconf ARG_MAX"
2097152
docker run -it alpine:3 /bin/sh -c "getconf ARG_MAX"
131072
```

@tomkennes

tomkennes commented Jan 19, 2022

Running into similar issues here. We are using Argo Workflows to consume events from several webhooks. Sometimes the payload, which is passed as an argument, seems to be too large for the workflow to be able to start:
standard_init_linux.go:228: exec user process caused: argument list too long

@tomkennes

tomkennes commented Jan 20, 2022

@gandeevan, @evax, we just managed to come up with a solution. Maybe this will work for you as well.

You should enable archiving of workflows. Basically, all workflows are then written to a database for later retrieval: https://argoproj.github.io/argo-workflows/workflow-archive/. This feature creates a couple of tables in a SQL database, including argo_archived_workflows, which contains the actual Kubernetes manifest for every workflow.

Fetch the dynamic content from that field and rerun whatever you intended to run!
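For reference, the archive is enabled through the persistence section of the workflow-controller-configmap; a rough sketch with placeholder host, database, and secret names (see the linked docs for the authoritative format):

```yaml
# Excerpt of workflow-controller-configmap data; host, database and secret
# names below are placeholders.
persistence:
  archive: true
  postgresql:
    host: postgres
    port: 5432
    database: postgres
    tableName: argo_workflows
    userNameSecret:
      name: argo-postgres-config
      key: username
    passwordSecret:
      name: argo-postgres-config
      key: password
```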

@alexec alexec added the wontfix label Feb 4, 2022
@alexec
Contributor

alexec commented Feb 4, 2022

It is not possible to fix this in the current architecture due to limitations on the size of arguments.

@alexec
Contributor

alexec commented Feb 21, 2022

This is actually a regression and should not have been closed. Rather than re-open, let's track it in #7586.

@agilgur5 agilgur5 added type/bug type/regression Regression from previous behavior (a specific type of bug) solution/duplicate This issue or PR is a duplicate of an existing one and removed type/feature Feature request labels Sep 17, 2023
@agilgur5 agilgur5 added the solution/workaround There's a workaround, might not be great, but exists label Nov 12, 2023
@agilgur5 agilgur5 added solution/superseded This PR or issue has been superseded by another one (slightly different from a duplicate) area/executor and removed solution/duplicate This issue or PR is a duplicate of an existing one labels Feb 21, 2024