Add TEP-0046: Colocation of Tasks and Workspaces (formerly PipelineRun in a Pod) #318
Conversation
/assign
Thanks @jlpettersson for this TEP. Linking to #316 as both are somewhat similar. I like this approach, but I have a few comments.
teps/0046-pipelinerun-in-a-pod.md (outdated)
This TEP describes an alternative way to run a Pipeline composed of Tasks sharing data through Workspaces.

The way we run Tasks in a Pipeline that share data through a Workspace today has several problems, some of which are related to the need to use Kubernetes Persistent Volumes to back Workspaces and to schedule the different TaskRun Pods in an appropriate way, especially to allow multiple of them to run concurrently while accessing the same Workspace volume.
Is it an alternative way or an additional way? Maybe it's a nit, but I would rather use the word additional, as I don't think we should stop supporting the way PipelineRuns execute today.
The intention was not to change the current way, but to add an alternative way - a PipelineRun will use either, not both. But this TEP only describes the alternative way :) Don't know how I would phrase it in better English :)
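For context, a minimal sketch of the situation the problem statement above describes, assuming the standard v1beta1 `PipelineRun` API (the Pipeline and workspace names are hypothetical): two parallel PipelineTasks share one PVC-backed workspace, and the volume's access mode and zone placement decide whether their TaskRun Pods can actually run concurrently.

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: parallel-build-run        # hypothetical name
spec:
  pipelineRef:
    name: parallel-build          # hypothetical Pipeline with two parallel Tasks
  workspaces:
    - name: shared-data
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce       # forces all TaskRun Pods onto the same Node
          resources:
            requests:
              storage: 1Gi
```

With `ReadWriteOnce`, the two TaskRun Pods must land on the Node where the volume is attached (which is what the affinity assistant enforces); in a multi-zone cluster a placement mismatch can leave the run stuck.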
teps/0046-pipelinerun-in-a-pod.md (outdated)
Currently the only way to run a Pipeline composed of Tasks is that each Task is executed in its own Pod. If you combine Tasks in a Pipeline and they need to share data (beyond a simple [result](https://github.com/tektoncd/pipeline/blob/master/docs/pipelines.md#using-results)), you'll need to provision a PVC or do some other similar, cloud specific thing,
s/cloud specific thing/cloud-specific provisioning/ ?
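For reference, a "simple result" in the sense linked above is a small string a Task writes to a declared path; a minimal sketch using the standard v1beta1 API (the Task name is hypothetical):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: compute-digest            # hypothetical
spec:
  results:
    - name: digest
      description: Small string output; larger data needs a Workspace volume
  steps:
    - name: write-result
      image: alpine
      script: |
        # Results are capped at a few KB in total, which is why anything
        # bigger (e.g. a source tree) needs a PVC-backed Workspace instead.
        echo -n "sha256:abc123" > $(results.digest.path)
```

A downstream PipelineTask consumes it as `$(tasks.compute-digest.results.digest)`; anything larger than this string-sized channel is what forces the PVC provisioning discussed here.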
teps/0046-pipelinerun-in-a-pod.md (outdated)
- Make it possible to run a whole PipelineRun (including all TaskRuns) within a single Pod so that the workspace can be within the Pod:
  - At runtime in a PipelineRun
- This should not change anything for the author of Tasks or Pipelines:
  - No changes in the Pipeline or Task API for authors
We are only talking about the `Pipeline` and `Task` CRDs here, right? (because the `PipelineRun` CRD will most likely change)
Yeah, at least `Pipeline` and `Task` for the author. Didn't think that `PipelineRun` would need changes, but I might be wrong. And when using this alternative, the `TaskRun` CRD will not be used.
teps/0046-pipelinerun-in-a-pod.md (outdated)
### Goals

- Make it possible to run a whole PipelineRun (including all TaskRuns) within a single Pod so that the workspace can be within the Pod:
Should the goal be to run the whole `PipelineRun` in a single `Pod`, or do we see ourselves needing to support running across multiple Pods (but with some Tasks on the same Pod)? My initial thought is that it makes sense to go with the goal of "the whole `PipelineRun` in a single `Pod`" for now, and we can definitely see later if we need more, and whether something like a Pipeline `CustomTask` would be a good way to "group" Tasks into separate pipelines.
Yes, I tried to keep the scope narrow. I think support for multiple Pods is too broad in scope - that can be done in so many different ways - and can perhaps be reasoned about in other TEPs.
teps/0046-pipelinerun-in-a-pod.md (outdated)
to speed up the build and see the full PipelineRun run in a single Pod, on a single Node, WITHOUT NEEDING to mount external volumes and without worrying that its Pipeline is composed of several Pods that might be scheduled to different cloud Availability Zones and reach a deadlocked state.
- A user wants to create a Pipeline composed of Tasks from the catalog, to check out code, run unit tests and upload results,
This use-case is shared with TEP-0044 👼🏼
Something we'll need to think about carefully here is that Kubernetes doesn't support the notion of per-Container Service Accounts. A ServiceAccount attached to a Pod is given to every Container, and there isn't really a way to isolate them from each other (that I'm aware of?). This means that a PipelineRun with different Service Accounts attached to different Tasks would need to (I guess?) attach all the service accounts to the single Pod. Maybe there's another way to tackle this using per-Container Projected Volumes for the Service Account tokens that each "PipelineTask" needs? Whatever the solution, we'll need to make sure we document differences in the way PipelineRuns execute when in "single-Pod" mode versus "multi-Pod" mode.
/assign
@jlpettersson do you feel like we definitely need this TEP as well as #316?
My feeling is that TEP 44 (#316) describes overlapping (if not the same) problems; this PR adds some more use cases and requirements (and specifies configuring this at runtime, while TEP 44 has focused on authoring time), but it also describes a solution: PipelineRun in a Pod.
I'm wondering if we can add the use cases from this, expand 44 to include the ability to control this at runtime, and then have "PipelineRun in a Pod" as a possible solution.
teps/0046-pipelinerun-in-a-pod.md (outdated)
### Goals

- Make it possible to run a whole PipelineRun (including all TaskRuns) within a single Pod so that the workspace can be within the Pod:
  - At runtime in a PipelineRun
hmm interesting, this is a difference with #316
@bobcatfish we probably eventually converge to one, I think? But this is written from a different problem-perspective.
Sure, it can be added there, but then the whole TEP must also be aligned with this, including goals and requirements, if that is ok? They are slightly different in some parts.
I think the problem for this TEP has been described in Design doc: Task parallelism when using workspace, and we have been in and out of multiple alternatives, including the Affinity Assistant and a Custom Scheduler. I feel that this TEP is about "PipelineRun in a Pod", not only as a possible solution? The problem and alternatives have been discussed back and forth since May 2020.
Some evolution context here:

The high level problem description is in Design doc: Task parallelism when using workspace. Perhaps this is a bit more abstract enhancement proposal, not strictly tied to implementation:

To avoid the problems related to pod-external volumes, this enhancement proposal proposes a solution where the workspace volume can be inside the pod, and the Tasks that use that workspace volume need to be within the same scheduling unit, the Pod, to avoid the scheduling problems that the affinity assistant tried to solve but had shortcomings with when it comes to scheduling, scaling and resource allocation. For a Pipeline, the Tasks that do not use a workspace, or that use an unrelated workspace, can potentially be within a different scheduling unit (a Pod in Kubernetes).

It would be beneficial if this can be done without any API additions for the Task and Pipeline author (to reduce cognitive load for the author), similar to how it works when using the affinity assistant (it only co-schedules pods using the same PVC-workspace to the same node).
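To make the Pod-internal workspace idea concrete, a minimal plain-Kubernetes sketch (all names hypothetical): the workspace is an `emptyDir` that never leaves the Node, so no PVC access modes or zone placement apply. Note that containers in a Pod all start together, so sequencing the "Tasks" would be the controller's job, similar to how Tekton orders Steps within a TaskRun today.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pipeline-in-a-pod        # hypothetical
spec:
  restartPolicy: Never
  volumes:
    - name: workspace
      emptyDir: {}               # Pod-internal; no external storage involved
  containers:
    - name: clone                # "Task 1"
      image: alpine/git
      command: ["git", "clone", "https://github.com/tektoncd/pipeline", "/workspace/src"]
      volumeMounts:
        - name: workspace
          mountPath: /workspace
    - name: test                 # "Task 2"; ordering would be the controller's job
      image: golang:1.16
      workingDir: /workspace/src
      command: ["sh", "-c", "go test ./..."]
      volumeMounts:
        - name: workspace
          mountPath: /workspace
```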
Yes, that definitely sounds like a plan that I would like 👼🏼
Sorry I've written a long post here, but TL;DR: can we add a non-goal of handling the security consequences of running a pipeline in a pod? Or can we include the security implications of introducing this mode as part of the TEP?

Full version: I would like to focus on the fact that we are specifically talking about having different "modes" for Pipelines. There's quite a lot of literature on the ways modes can cause different kinds of problems for users, and I think some of that might apply here. Just one example - here's an article from the Nielsen Norman Group that discusses this in the context of UIs. I want to call out the following parts:

^ This could be a good argument in favor of adding a "Pipeline-in-a-Pod" mode. Are we in a situation where we have too many options and not enough available "types" (CRDs? fields?) to capture all the variations that a user may want to employ? Are we unwilling, unable or uninterested to add more "types of input"?

^ Are we going to introduce a large new suite of problems for users to navigate if we introduce this kind of execution mode toggle? To me this is made more complicated by the fact that the "user" here might actually be multiple people - the Pipeline author, the PipelineRun user, etc. As a Pipeline author I will now need to be aware that all my Tasks may be consolidated into a single Pod on a single Node. That means any purposeful isolation I've introduced (e.g. separating credential Secrets so they're only accessed by a Task with images that my org trusts) could be violated by a PipelineRun. This ability for a PipelineRun to subvert the intention of a Pipeline seems to me worth exploring more in depth. Moving on in the article, I want to call out this part:

The one I identified above (all service accounts will need to be exposed to the Pod) seems to me quite a considerable potential source of a) embarrassment and b) safety or security consequences. So if we proceed down this path I really do want us to go in eyes-wide-open on what specifically we are going to expose unknowing users to when we recommend they switch their PipelineRuns into "Pod mode" because they're struggling to use Workspaces. Sooo I think there should be more detail in this TEP on the specific set of tradeoffs that we would be willing to accept by running in Pipeline-in-a-Pod mode before we commit to introducing it. It could be that we need to add more Non-Goals, or we may want to describe the acceptable solutions. I'm personally particularly concerned about all Secrets, Service Accounts and ConfigMaps being exposed to a single Pod when the Pipeline author may have intended to isolate them into different Tasks. If we're not going to propose a solution to that as part of the TEP, can we at least say that it's a non-goal to specify any security consequences related to running everything in a single unit of execution?
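To make the isolation concern concrete: in today's multi-Pod mode, a PipelineRun can already hand each PipelineTask its own identity via `taskRunSpecs` (a real v1beta1 field; the names below are hypothetical). In a single-Pod mode these distinct identities would presumably collapse into one Pod-level ServiceAccount.

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: release-run              # hypothetical
spec:
  pipelineRef:
    name: release-pipeline       # hypothetical
  serviceAccountName: default    # fallback for unlisted tasks
  taskRunSpecs:
    - pipelineTaskName: fetch-source
      taskServiceAccountName: readonly-sa         # low-privilege identity
    - pipelineTaskName: push-image
      taskServiceAccountName: registry-pusher-sa  # only this Task holds push rights
```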
+1 to this from my pov too. Would we put this mode straight into beta, or keep it in alpha while we work on the general solution? It might be useful to include its flagged-ness or gated-ness as part of this TEP too?
For me it would definitely not be exposed in beta directly.
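If it stays alpha, gating would presumably follow Tekton's existing feature-flag pattern; a sketch assuming the standard `feature-flags` ConfigMap, with an invented flag name used purely for illustration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
  namespace: tekton-pipelines
data:
  # Hypothetical flag name, for illustration only; the real gate (if any)
  # would be defined by the TEP. Defaults to "false" while alpha.
  enable-pipelinerun-in-a-pod: "false"
```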
That is good input @sbwsg, thanks. I have added a Non-goal about addressing the security implications that follow when most Secrets are mounted by the same Pod. I have also clarified that only a single ServiceAccount can be used for a Pod - or for multiple Tasks, in this case.
@jlpettersson: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
In the most recent API working group we decided to keep this TEP and [TEP-0046](tektoncd#318) separate because they are coming at a similar problem from different angles. In the WG @jerop suggested that we update the TEPs with some info on what the overlaps + differences are and that's what this TEP is adding!
I've updated TEP-0044 with a compare + contrast with this TEP: 75d8537
From my perspective I'm happy to merge this TEP problem statement as-is and move forward
/approve
@vdemeester is also a reviewer on this one, and @sbwsg and @imjasonh had some feedback also (maybe addressed now that the TEP isn't assuming that PipelineRun in a Pod is the solution we're going with?)
/hold
- Make it possible to use a Pod-internal workspace to share data in a PipelineRun
  - The Tasks that use the workspace are scheduled to run within the same scheduling unit (Pod)
- Pipeline features in use today are still usable, e.g. concurrency and `When`-expressions
- No changes in the Pipeline or Task API for authors
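For reference, the kind of `When`-expression named in the goals above that a single-Pod mode would still need to honor (standard v1beta1 syntax; the names are hypothetical):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: conditional-pipeline     # hypothetical
spec:
  params:
    - name: run-tests
      type: string
      default: "true"
  tasks:
    - name: unit-tests
      when:
        - input: "$(params.run-tests)"
          operator: in
          values: ["true"]       # unit-tests is skipped unless this matches
      taskRef:
        name: go-test            # hypothetical Task
```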
@jlpettersson I forgot to ask you in the API working group: does this mean that you're proposing that this is something that is configured only at runtime (i.e. in the PipelineRun only)? or would you imagine the pipeline author expressing this also/instead?
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: bobcatfish. The full list of commands accepted by this bot can be found here. The pull request process is described here.
@bobcatfish @jlpettersson should we close this TEP and make sure TEP-0044 takes into account the requirements from this one?
This TEP probably addresses a slightly broader problem, since it also requires parallel/concurrent Task execution. But the TEP-0046 number is stolen now, so this cannot be merged. I am closing it.
@jlpettersson I don't want this to get lost just due to a conflicting TEP number - please let me know if I can help at all - I DO think that TEP-0044 is going to address what you mentioned around parallel and concurrent task execution in the long run.
This TEP describes an alternative way to run Pipelines composed of Tasks and Workspaces without any changes for an author and without the need for external storage to back the Workspaces. Without external storage, problems with scheduling of Pods that use the same volume are avoided.
This is a formalization of tektoncd/pipeline#3638
Some wording is stolen from @bobcatfish and TEP-0044 (these two TEPs might eventually converge?)