Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Remove the need for pod annotations to be mounted as a volume #6022

Merged
merged 14 commits into from
Jun 22, 2021

Conversation

chazapis
Copy link
Contributor

Currently, the controller mounts a DownwardAPI volume at each executor instance. The later uses it to read pod annotations and get updates on the main container's execution status. As mentioned in #4616, DownwardAPI volumes are not available outside of Kubernetes (i.e., in serverless environments).

This patch deprecates PR #5946, and introduces a new implementation based on the POC by @alexec in PR #5950, which moves the deadline annotation to the ARGO_DEADLINE environment variable and Implements a shutdownPod action to send a SIGTERM signal to the wait container. The wait container then kills the other containers in the pod.

Checklist:

Tips:

  • Your PR needs to pass the required checks before it can be approved. If the check is not required (e.g. E2E tests) it does not need to pass
  • Sign-off your commits to pass the DCO check: git commit --signoff.
  • Run make pre-commit -B to fix codegen or lint problems.
  • Say how how you tested your changes. If you changed the UI, attach screenshots.

@chazapis chazapis mentioned this pull request May 26, 2021
1 task
@chazapis
Copy link
Contributor Author

I merged the code from PR #5950 with master, tried it out and fixed some tests. I also removed the pod annotations completely, as there is no use of keeping the "legacy" path, when sending a SIGKILL to the wait container.

@codecov
Copy link

codecov bot commented May 26, 2021

Codecov Report

Merging #6022 (6b9ce6a) into master (0e94283) will increase coverage by 0.20%.
The diff coverage is 27.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6022      +/-   ##
==========================================
+ Coverage   47.59%   47.79%   +0.20%     
==========================================
  Files         250      250              
  Lines       15809    15636     -173     
==========================================
- Hits         7524     7474      -50     
+ Misses       7347     7227     -120     
+ Partials      938      935       -3     
Impacted Files Coverage Δ
cmd/argoexec/commands/root.go 1.51% <0.00%> (-0.03%) ⬇️
workflow/common/common.go 75.00% <ø> (ø)
workflow/controller/controller.go 19.63% <0.00%> (-0.43%) ⬇️
workflow/controller/pod_cleanup_key.go 20.00% <ø> (ø)
workflow/executor/executor.go 22.71% <0.00%> (+3.99%) ⬆️
workflow/executor/resource.go 30.45% <ø> (+1.49%) ⬆️
workflow/controller/exec_control.go 56.81% <33.33%> (+4.18%) ⬆️
workflow/controller/workflowpod.go 72.43% <76.47%> (-0.23%) ⬇️
workflow/controller/operator.go 70.50% <83.33%> (-0.62%) ⬇️
workflow/controller/dag.go 72.67% <100.00%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0e94283...6b9ce6a. Read the comment docs.

Comment on lines 300 to 306
deadline := time.Time{}
if woc.workflowDeadline != nil {
deadline = *woc.workflowDeadline
}
if deadline.IsZero() || opts.executionDeadline.Before(deadline) {
deadline = opts.executionDeadline
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you put this into its own func and add some tests?

Copy link
Contributor Author

@chazapis chazapis May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I also found an issue: the second if should check that !opts.executionDeadline.IsZero(), or else it sets the deadline back to zero even if the woc.workflowDeadline is valid.

Comment on lines 1118 to 1129
str, err := json.Marshal(obj)
if err != nil {
return err
}
return template.Validate(string(str), func(tag string) error {
return errors.Errorf(errors.CodeBadRequest, "failed to resolve {{%s}}", tag)
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to keep this feature @jessesuen

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put the validation code back in the next commit.

@alexec
Copy link
Contributor

alexec commented May 26, 2021

There is a risk of this causing problems with workflows that are running during an upgrade.

  • If they exceed their deadline they should die.
  • The new code never changes the deadline or issues SIGUSR2, I think that is Ok, because the new code will kill the containers.
  • If they are shutdown by the new code, then they should exit OK.

I think we need to highlight these risks.

I'd like @jessesuen to review this PR.

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this changes the executor, and because not all of the test suites build the executor, we need to be careful.

Do you think you could identify the tests that use shutdown?

return nil, fmt.Errorf("not found")
}

func getPodDeadline(pod *apiv1.Pod) (time.Time, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could have a basic test

@@ -1199,6 +1190,29 @@ func (woc *wfOperationCtx) assessNodeStatus(pod *apiv1.Pod, node *wfv1.NodeStatu
return nil
}

func getPodTemplate(pod *apiv1.Pod) (*wfv1.Template, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could have a basic test

@chazapis chazapis force-pushed the remove-execution-control branch from 2d7d64a to 08e0e13 Compare May 27, 2021 15:09
@chazapis
Copy link
Contributor Author

I just committed the requested changes.

However, having done several runs now, I think that we may not need the ARGO_DEADLINE variable at all. Since the deadline is set in createWorkflowPod() once, we could just set the activeDeadlineSeconds for the pod and let Kubernetes handle timeouts.

@alexec
Copy link
Contributor

alexec commented May 27, 2021

However, having done several runs now, I think that we may not need the ARGO_DEADLINE variable at all. Since the deadline is set in createWorkflowPod() once, we could just set the activeDeadlineSeconds for the pod and let Kubernetes handle timeouts.

@jessesuen I'm assuming there was probably a good reason for not doing this?

@chazapis
Copy link
Contributor Author

However, having done several runs now, I think that we may not need the ARGO_DEADLINE variable at all. Since the deadline is set in createWorkflowPod() once, we could just set the activeDeadlineSeconds for the pod and let Kubernetes handle timeouts.

I added another commit for this. Now the execution control is even simpler: all deadlines are handled by Kubernetes, termination is done with a SIGTERM from the controller to the executor.

@jessesuen
Copy link
Member

@jessesuen I'm assuming there was probably a good reason for not doing this?

Yes. activeDeadlineSeconds will delete the pod. But our termination mechanism will stop the pod without deleting it. Using activeDeadlineSeconds instead of our own deadline mechanism is a change in behavior -- some would say regression (including me).

@chazapis
Copy link
Contributor Author

@jessesuen I'm assuming there was probably a good reason for not doing this?

Yes. activeDeadlineSeconds will delete the pod. But our termination mechanism will stop the pod without deleting it. Using activeDeadlineSeconds instead of our own deadline mechanism is a change in behavior -- some would say regression (including me).

I just tried this (on Kubernetes v1.19.7) and the activeDeadlineSeconds does not delete the pod, but rather marks it as DeadlineExceeded. Argo recognizes the status as Failed.

@chazapis chazapis force-pushed the remove-execution-control branch from 827dac0 to 08e0e13 Compare May 31, 2021 08:51
@chazapis
Copy link
Contributor Author

Quick update: I modified the CI pipeline to always build the executor image (in a separate branch of my fork) and indeed the last commit that completely removes the deadline breaks the tests. So, I reverted the last commit and placed it in my remove-deadline branch. Now the tests pass.

alexec
alexec previously requested changes May 31, 2021
Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One challenge with this is that we need to make sure we minimize breakage when a new version of the controller is deployed, but old versions of the executor are running, notably in pods that still use annotations and SIGUSR2.

For the template, I think we can avoid either annotation or env vars and have the controller just look at the workflow manifest.

For deadline, I hope that shutdownPod will work for both old and new pods. We will need to test manually.

workflow/controller/operator.go Outdated Show resolved Hide resolved
workflow/controller/operator.go Outdated Show resolved Hide resolved
@chazapis
Copy link
Contributor Author

chazapis commented Jun 1, 2021

One challenge with this is that we need to make sure we minimize breakage when a new version of the controller is deployed, but old versions of the executor are running, notably in pods that still use annotations and SIGUSR2.

For the template, I think we can avoid either annotation or env vars and have the controller just look at the workflow manifest.

For deadline, I hope that shutdownPod will work for both old and new pods. We will need to test manually.

I tried a manual test of starting a workflow with a previous version and then switching to the new controller. Before the latest changes, the controller could not find the template in assessNodeStatus(). With the changes, it does. The new commit passes all CI tests.

@chazapis chazapis requested a review from alexec June 3, 2021 06:58
@alexec alexec self-assigned this Jun 3, 2021
@alexec alexec added this to the v3.2 milestone Jun 3, 2021
@alexec alexec dismissed their stale review June 11, 2021 18:06

done

@alexec
Copy link
Contributor

alexec commented Jun 11, 2021

This is almost there. You'll need to sync with master and fix the conflicts and by the look of it, run make lint. I think we should be good to merge.

Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment

chazapis added 2 commits June 15, 2021 17:53
Signed-off-by: Antony Chazapis <[email protected]>
alexec
alexec previously requested changes Jun 15, 2021
Copy link
Contributor

@alexec alexec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some minor comments

workflow/executor/executor.go Outdated Show resolved Hide resolved
workflow/executor/executor.go Outdated Show resolved Hide resolved
workflow/executor/executor.go Outdated Show resolved Hide resolved
workflow/executor/executor.go Outdated Show resolved Hide resolved
@@ -16,6 +16,9 @@ import (

func SignalContainer(restConfig *rest.Config, pod *corev1.Pod, container string, s syscall.Signal) error {
command := []string{"/bin/sh", "-c", "kill -%d 1"}
if container == common.WaitContainerName {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice - I was about to say you should annotate this kill command - but this is backwards compatible

@chazapis chazapis requested a review from alexec June 16, 2021 20:26
@alexec alexec dismissed their stale review June 17, 2021 04:22

dismis

@alexec
Copy link
Contributor

alexec commented Jun 17, 2021

Can I ask you to run the e2e test suites locally one more time and confirm they pass before we merge? I don't like reverting PRs and it creates more work.

@chazapis
Copy link
Contributor Author

Can I ask you to run the e2e test suites locally one more time and confirm they pass before we merge? I don't like reverting PRs and it creates more work.

Actually, they had run yesterday successfully, before the latest merge with master. Today I see that the cron e2e test fails. I tried several things, but I could not find the cause. Running the cron e2e test locally passes. Any ideas?

@chazapis
Copy link
Contributor Author

Update: Looks like the failing test has something to do with the executor used. Probably the SIGTERM signal is not properly propagated.

@chazapis
Copy link
Contributor Author

chazapis commented Jun 22, 2021

@alexec, found the issue, which was unrelated to the refactoring, but rather on how the emissary signals the child processes (solution based on this). The new commit passes all tests.

@alexec
Copy link
Contributor

alexec commented Jun 22, 2021

tests are failing? DCO not complete?

@chazapis chazapis force-pushed the remove-execution-control branch from 25e1d5f to 2e0ac13 Compare June 22, 2021 18:48
@chazapis
Copy link
Contributor Author

tests are failing? DCO not complete?

Oops, you are right, I made some commits with another account by mistake; fixed. For the tests, I changed the E2E workflow to always build the argoexec container.

@alexec alexec enabled auto-merge (squash) June 22, 2021 22:00
@alexec alexec merged commit cecc379 into argoproj:master Jun 22, 2021
@alexec alexec mentioned this pull request Jun 28, 2021
9 tasks
uturunku1 pushed a commit to newrelic-forks/argo-workflows that referenced this pull request Jul 22, 2021
alexec added a commit that referenced this pull request Jul 26, 2021
* Update events.md (#6119)

Trying to use the argo workflows events and I noticed that some crucial explanations are missing here. I would like to add:
- A simple WorkflowTemplate bound to the WorkflowEventBinding, to show what is triggered by the curl that send the event
- Some infos about the process that bind the event to the workflow template:
   - template creation
   - event binding apply
   - api call to trigger the workflow template creation
Plus: there is a little mistake in the selector:  metadata["x-argo"] instead of metadata["X-Argo-E2E"] I would like to correct it in order to avoid mistakes during the curl.

Hope this is appreciated! ;)

Denis

Signed-off-by: uturunku1 <[email protected]>

* docs: Add note on the requirements of resource templates. Fixes #5566 (#6125)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: updated CHANGELOG.md (#6127)

Signed-off-by: GitHub <[email protected]>

Co-authored-by: alexec <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* add troubleshooting notes section for running-locally docs (#6132)

Co-authored-by: uturunku1 <“[email protected]”>
Signed-off-by: uturunku1 <[email protected]>

* fix(executor): Check whether any errors within checkResourceState() are transient. Fixes #6118. (#6134)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* build: Remove PNS_PRIVILEGED=true (#6138)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Document the extraction of data from a k8s resource (#6102)

* Document the extraction of data from a k8s resource

* remove reference to lines of a file that can be outdated

Co-authored-by: Yuan Tang <[email protected]>

* Remove yaml snippet and only keep the link to the example

Co-authored-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* build image output to docker (#6128)

Co-authored-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* chore: Update stress rig and docs. Fixes #6136 (#6141)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* chore: Upgrade Alibaba OSS to use more secure ListObjectsV2() (#6142)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix: Allow setting workflow input parameters in UI. Fixes #4234 (#5319)

* fix: Allow setting workflow input parameters in UI. Fixes #4234

Signed-off-by: Kenny Trytek <[email protected]>

* fix: Allow setting workflow input parameters in UI. Fixes #4234

 - Allow workflow input parameters as well as entrypoint parameters.

Signed-off-by: Kenny Trytek <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(controller): Performance improvement for Sprig. Fixes #6135 (#6140)

Signed-off-by: uturunku1 <[email protected]>

* update from v0.19.6 to v0.20.4 and indirect dependencies

Signed-off-by: uturunku1 <“[email protected]”>
Signed-off-by: uturunku1 <[email protected]>

* exec.GetAuthenticator takes two arguments in the k8s-client-go v0.20.4

Signed-off-by: uturunku1 <“[email protected]”>
Signed-off-by: uturunku1 <[email protected]>

* update makefile to use [email protected]

Signed-off-by: uturunku1 <“[email protected]”>
Signed-off-by: uturunku1 <[email protected]>

* docs: Fix release-notes.md

Signed-off-by: uturunku1 <[email protected]>

* docs: Update Graviti's website link (#6148)

Signed-off-by: uturunku1 <[email protected]>

* fix(ui): Fix-up local storage namespaces. Fixes #6109 (#6144)

Signed-off-by: uturunku1 <[email protected]>

* fix(executor): Capture emissary main-logs. Fixes #6145 (#6146)

Signed-off-by: uturunku1 <[email protected]>

* fix(ui): Fix event-flow scrolling. Fixes #6133 (#6147)

Signed-off-by: uturunku1 <[email protected]>

* test: Fix logging test (#6159)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat(ui): Add checkbox to check all workflows in list. Fixes #6069 (#6158)

Signed-off-by: uturunku1 <[email protected]>

* docs: Use 'depends' instead of 'dependencies' in examples (#6166)

Signed-off-by: Simon Behar <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat(server): Allow redirect_uri to be automatically resolved when using sso (#6167)

Signed-off-by: uturunku1 <[email protected]>

* fix(controller): Allow retry on transient errors when validating workflow spec. Fixes #6163 (#6178)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(controller): dehydrate workflow before deleting offloaded node status (#6112)

Signed-off-by: uturunku1 <[email protected]>

* docs: updated CHANGELOG.md (#6160)

Signed-off-by: GitHub <[email protected]>

Co-authored-by: alexec <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Remove RBAC for SSO from Roadmap (Already implemented) (#6174)

It looks like RBAC for SSO is already implemented by #4198 so hopefully it can be removed from the roadmap as it is also documented? https://argoproj.github.io/argo-workflows/argo-server-sso/#sso-rbac

Signed-off-by: uturunku1 <[email protected]>

* docs: updated CHANGELOG.md (#6187)

Signed-off-by: GitHub <[email protected]>

Co-authored-by: alexec <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Fix changelog order for .0 tags (#6188)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(controller): Wrong validate order when validate DAG task's argument (#6190)

Signed-off-by: BOOK <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix rebase conflict

Signed-off-by: uturunku1 <[email protected]>

* run go mod tidy

Signed-off-by: uturunku1 <[email protected]>

* refactor: Remove the need for pod annotations to be mounted as a volume (#6022)

Signed-off-by: Antony Chazapis <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: ContainerSets do not have 'depends' (#6199)

Signed-off-by: Simon Behar <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix: Fix security issues related to file closing and paths (G307 & G304) (#6200)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Add links to Python examples to description annotations (#6202)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs(executor): document k8s executor behaviour with program warnings (#6212)

* docs(executor): document k8s executor behaviour with program warnings

Signed-off-by: Tianchu Zhao <[email protected]>

* docs(executor): fix typo

Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix: Fix certain sibling tasks not connected to parent (#6193)

Signed-off-by: Simon Behar <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat(ui): Add copy to clipboard shortcut (#6217)

Signed-off-by: Simon Behar <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: updated CHANGELOG.md (#6220)

Signed-off-by: GitHub <[email protected]>

Co-authored-by: alexec <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Add KarrotPay in USERS.md (#6221)

Signed-off-by: Byungjin Park <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* run go mod tidy

Signed-off-by: uturunku1 <[email protected]>

* docs: Add workflow-count-resourcequota.yaml example (#6225)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix: Reduce argoexec image size (#6197)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(conttroller): Always set finishedAt dote. Fixes #6135 (#6139)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat: Add support for deletion delay when using PodGC (#6168)

Signed-off-by: Stefan Sedich <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: update bug report template (#6236)

Signed-off-by: uturunku1 <[email protected]>

* docs: updated CHANGELOG.md (#6242)

Signed-off-by: GitHub <[email protected]>

Co-authored-by: alexec <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(executor): emissary - make argoexec executable from non-root containers. Fixes #6238 (#6247)

Signed-off-by: Yuan Gong <[email protected]>

Co-authored-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat: Introduce when condition to retryStrategy (#6114)

Signed-off-by: Simon Behar <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* ci: Add Go code security scanner via gosec. Fixes #6203 (#6232)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: fix end of files, new lines and remove multiple lines (#6240)

Signed-off-by: NikeNano <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: add json destructuring example (#6250)


Signed-off-by: Michael Crenshaw <[email protected]>

Co-authored-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(executor): Tolerate docker re-creating containers. Fixes #6244 (#6252)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(executor): emissary - make /var/run/argo files readable from non-root users. Fixes #6238 (#6304)

Signed-off-by: Yuan Gong <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs(controller): add missing emissary executor (#6291)

Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: docs and hacks improvements (#6310)

Signed-off-by: Michael Crenshaw <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(cli): Only list needed fields. Fixes #6000 (#6298)

* fix(cli): Only list needed fields

Signed-off-by: Alex Collins <[email protected]>

* ok

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Fix typo (#6311)

Signed-off-by: Byungjin Park <[email protected]>

Co-authored-by: Saravanan Balasubramanian <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* require sso redirect url to be an argo url (#6211)

Signed-off-by: Brandon Goode <[email protected]>

Co-authored-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: code format (#6269)

- Add yaml rendering
- Add bash rendering

Co-authored-by: Simon Behar <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat(controller): Store artifact repository in workflow status. Fixes #6255 (#6299)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: document using ingress with TLS enabled (#6324)

Signed-off-by: valorl <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: document how to access hyphenated steps in expression templates (#6318)

Signed-off-by: Michael Crenshaw <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat(controller): Differentiate CronWorkflow submission vs invalid spec error metrics (#6309)

* feat(controller): Differentiate CronWorkflow submission vs invalid spec error metrics

Signed-off-by: Yuan Tang <[email protected]>

* Address feedback

Signed-off-by: Yuan Tang <[email protected]>

Co-authored-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* chore: deleted wft.yaml

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* ci: only run Snyk once a day on master

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(controller): Not updating StoredWorkflowSpec when WFT changed during workflow running (#6342)

Signed-off-by: Saravanan Balasubramanian <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(cli): v3.1 Argo Auth Token (#6344)

* fix(cli): v3.1 Argo Auth Token

Signed-off-by: Saravanan Balasubramanian <[email protected]>

* update

Signed-off-by: Saravanan Balasubramanian <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Add Alibaba Group to USERS.md (#6353)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(crd): temp fix 34s timeout bug for k8s 1.20+ (#6350)

Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: updated CHANGELOG.md (#6348)

Signed-off-by: GitHub <[email protected]>

Co-authored-by: sarabala1979 <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs(users): Add WooliesX (#6358)

Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(cli): Overridding name/generateName when creating CronWorkflows if specified (#6308)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat(controller): sortDAGTasks supports sort by field Depends (#6307)

Signed-off-by: BOOK <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(fields): handle nexted fields when excluding (#6359)

Signed-off-by: AntoineDao <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* feat(controller): Allow configurable host name label key when retrying different hosts (#6341)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* pull argo-events changes
update versions in go.mod and go.sum

Signed-off-by: uturunku1 <[email protected]>

* run go mod tidy

Signed-off-by: uturunku1 <[email protected]>

* fix(controller): allow workflow.duration to pass validator (#6376)

Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(controller): fix retry on transient errors when validating workflow spec (#6370)

Signed-off-by: Tianchu Zhao <[email protected]>

Co-authored-by: Saravanan Balasubramanian <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix: examples/ci.yaml indent (#6328)

Signed-off-by: kungho.back <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* chore: import grafana dashboard (#6365)

Signed-off-by: GitHub <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(gcs): throw argo not found error if key not exist (#6393)

Signed-off-by: AntoineDao <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* Revert "fix: examples/ci.yaml indent (#6328)"

This reverts commit 3f72fe5.

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix: Server crash when opening timeline tab for big workflows (#6369)

Signed-off-by: Alexander Matyushentsev <[email protected]>

Co-authored-by: Saravanan Balasubramanian <[email protected]>
Co-authored-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Add 4intelligence (#6400)

Signed-off-by: Thiago Gil <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Add note on additional required permission for createBucketIfNotPresent for OSS driver (#6378)

Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(controller): allow initial duration to be 0 instead of current_time-0 (#6389)


Signed-off-by: Tianchu Zhao <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* fix(controller): Mark workflows wait for semaphore as pending. Fixes #6351 (#6356)

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* docs: Updating upgrading.md. Closes #6314

Signed-off-by: Alex Collins <[email protected]>
Signed-off-by: uturunku1 <[email protected]>

* not need to convert to unstructured.unstructured

I was getting this error controller_test.go: pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *unstructured.Unstructured: failed to list *unstructured.Unstructured: item[0]: can't assign or convert unstructured.Unstructured into v1alpha1.Workflow

Based on this comment, it seems like the conversion is not needed: kubernetes-sigs/controller-runtime#524 (comment)

Signed-off-by: uturunku1 <[email protected]>

* run make pre-commit -B

Signed-off-by: uturunku1 <[email protected]>

* fix potential file inclusion via variable lint error

there is a risk that an unintended file path will be specified. So uuse filepath.Clean() to clean up possible bad paths

Signed-off-by: uturunku1 <[email protected]>

* fix format issue

Signed-off-by: uturunku1 <[email protected]>

Co-authored-by: Denis Bellotti <[email protected]>
Co-authored-by: Yuan Tang <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: alexec <[email protected]>
Co-authored-by: uturunku1 <“[email protected]”>
Co-authored-by: Christophe Blin <[email protected]>
Co-authored-by: meijin <[email protected]>
Co-authored-by: kennytrytek <[email protected]>
Co-authored-by: Caden <[email protected]>
Co-authored-by: Simon Behar <[email protected]>
Co-authored-by: Stefan Sedich <[email protected]>
Co-authored-by: Reijer Copier <[email protected]>
Co-authored-by: Brandon High <[email protected]>
Co-authored-by: BOOK <[email protected]>
Co-authored-by: Antony Chazapis <[email protected]>
Co-authored-by: Tianchu Zhao <[email protected]>
Co-authored-by: Byungjin Park (Claud) <[email protected]>
Co-authored-by: Yuan (Bob) Gong <[email protected]>
Co-authored-by: Niklas Hansson <[email protected]>
Co-authored-by: Michael Crenshaw <[email protected]>
Co-authored-by: Saravanan Balasubramanian <[email protected]>
Co-authored-by: brgoode <[email protected]>
Co-authored-by: Valér Orlovský <[email protected]>
Co-authored-by: Alex Collins <[email protected]>
Co-authored-by: sarabala1979 <[email protected]>
Co-authored-by: Antoine Dao <[email protected]>
Co-authored-by: KUNG HO BACK <[email protected]>
Co-authored-by: Zadkiel <[email protected]>
Co-authored-by: Alexander Matyushentsev <[email protected]>
Co-authored-by: Thiago Bittencourt Gil <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants