Orchestration - Flakiness in small samples when using PNS executor #5285

Closed
Bobgy opened this issue Mar 11, 2021 · 7 comments
Labels: lifecycle/stale, upstream_issue

Comments


Bobgy commented Mar 11, 2021

In #5273, I switched to the PNS executor by default. Since then, the lightweight component sample seems to fail more frequently than before.

(It failed 3 times consecutively in an example I found, but it seems to me that the last two failures should be fixed by #5284. We can observe the actual flakiness rate after that change.)

Symptom: pipeline components that finish too quickly fail with:

failed to save outputs: could not chroot into main for artifact collection: container may have exited too quickly

https://oss-prow.knative.dev/view/gs/oss-prow/pr-logs/pull/kubeflow_pipelines/4147/kubeflow-pipeline-sample-test/1369916972707352576#1:build-log.txt%3A5681

Root cause seems to be: argoproj/argo-workflows#1256 (comment)

And workarounds can be:

  • (hacky) let the main container sleep for a while
  • (stable fix) mount the artifacts on an emptyDir
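
A minimal sketch of both workarounds, written as plain Python dicts shaped like a Kubernetes pod spec. The helper names (`with_sleep`, `with_empty_dir`), the 5-second delay, the volume name and the mount path are illustrative assumptions, not KFP or Argo APIs. The idea behind the second one, as I understand it, is that an emptyDir belongs to the pod rather than to the main container, so the files should still be readable after main exits.

```python
import copy
import shlex


def with_sleep(container, seconds=5):
    """Hacky workaround: keep the main container alive briefly after its command ends.

    Wraps the original command in `sh -c` and preserves its exit code.
    Assumes the image ships a POSIX `sh`; `seconds` is an arbitrary guess.
    """
    patched = copy.deepcopy(container)
    original = shlex.join(container.get("command", []) + container.get("args", []))
    patched["command"] = ["sh", "-c", f"{original}; rc=$?; sleep {seconds}; exit $rc"]
    patched.pop("args", None)
    return patched


def with_empty_dir(pod_spec, container_name, mount_path, volume_name="outputs"):
    """Stable fix: back the artifact directory with an emptyDir volume.

    `volume_name` and `mount_path` are hypothetical values chosen by the caller.
    """
    patched = copy.deepcopy(pod_spec)
    patched.setdefault("volumes", []).append({"name": volume_name, "emptyDir": {}})
    for container in patched["containers"]:
        if container["name"] == container_name:
            container.setdefault("volumeMounts", []).append(
                {"name": volume_name, "mountPath": mount_path}
            )
    return patched
```

Both helpers return a modified copy and leave the input untouched, so they can be applied to a pod spec before submission without side effects.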

Bobgy commented Mar 11, 2021

/assign @chensun @neuromage
/cc @Ark-kun @capri-xiyue

I think we need to discuss this problem. From the KFP side, we could automatically mount an emptyDir for users.
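
A rough sketch of what "automatically mount" could look like: derive one emptyDir per distinct parent directory of a component's output paths. The function name `empty_dir_mounts`, the `outputs-N` volume naming and the example paths are assumptions for illustration only, not existing KFP behavior.

```python
import posixpath


def empty_dir_mounts(output_paths, prefix="outputs"):
    """Return (volumes, volume_mounts) covering the parent directory of each output path.

    Paths sharing a parent directory share one volume. Names are hypothetical.
    """
    # Deduplicate and sort the parent directories so volume names are stable.
    dirs = sorted({posixpath.dirname(path) for path in output_paths})
    volumes = [{"name": f"{prefix}-{i}", "emptyDir": {}} for i in range(len(dirs))]
    mounts = [
        {"name": f"{prefix}-{i}", "mountPath": directory}
        for i, directory in enumerate(dirs)
    ]
    return volumes, mounts


volumes, mounts = empty_dir_mounts(
    ["/tmp/outputs/sum/data", "/tmp/outputs/product/data", "/tmp/outputs/sum/extra"]
)
```

One thing to keep in mind with this approach: mounting an emptyDir over a directory hides whatever the image already had at that path, so it only suits directories used purely for outputs.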


Bobgy commented Mar 11, 2021

/cc @jessesuen @alexec
Do you have any suggestions? Is my understanding of the workarounds above accurate?


Bobgy commented Mar 12, 2021

Hmm, strange. I'm not seeing the flakiness anymore.

I'll keep observing this problem.

@Bobgy Bobgy self-assigned this Mar 12, 2021

Ark-kun commented Mar 16, 2021

@Ark-kun Ark-kun self-assigned this Mar 16, 2021
@Ark-kun Ark-kun changed the title Flakiness in lightweight component sample using PNS executor Orchestration - Flakiness in small samples when using PNS executor Mar 16, 2021

Bobgy commented Mar 17, 2021

I'll revert to docker as the default executor for the current release, but we should evaluate the possibility of automatically mounting emptyDir volumes on artifact paths for users.


stale bot commented Jun 18, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Jun 18, 2021

Bobgy commented Aug 22, 2021

We now recommend the emissary executor instead: #1654 (comment)
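
For reference, a small sketch of how the executor choice could be expressed as a ConfigMap patch. This assumes an Argo Workflows version in which the executor is selected through the `containerRuntimeExecutor` key of the `workflow-controller-configmap` ConfigMap; the helper `executor_patch` is hypothetical, and the exact set of executors available depends on the Argo release in use.

```python
import json

# Executor names as used by Argo Workflows releases of that period (assumed list).
SUPPORTED_EXECUTORS = {"docker", "kubelet", "k8sapi", "pns", "emissary"}


def executor_patch(executor):
    """Build a merge-patch body that selects the given executor (hypothetical helper)."""
    if executor not in SUPPORTED_EXECUTORS:
        raise ValueError(f"unknown executor: {executor!r}")
    return {"data": {"containerRuntimeExecutor": executor}}


# Serialize the patch, e.g. to pass it to a tool that applies merge patches.
patch_json = json.dumps(executor_patch("emissary"))
```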

4 participants