Support for a "volume resource"-like Task.spec field #1438
Comments
+1 for an overridable
Thanks for this. If I understand correctly, this would provide a logical name to be used in Tasks for a path on a pre-provisioned PVC. I think it would be helpful to see an example Task and how it would look in a complete example. Unfortunately I missed the meeting, and I would like to contribute my 2¢. As far as I understand, the main issue with the current solution for sharing data between Tasks is that a PVC can be expensive to provision (in terms of the time it takes to provision and also quota limits). The benefit of the proposed volume field would be that the PVC can be pre-provisioned, which takes care of the provisioning-time issue, and the same PVC could also be partitioned and used by multiple Tasks / Pipelines, which helps with quota issues. I think the current solution we have for the artifacts PVC is quite nice, as it's mostly transparent to end users, but it has three main limitations:
All this is to say that I believe we could build on top of the existing artifact PVC solution to achieve something very close to what is proposed here, which would have the extra benefit of being backward compatible, i.e. it would keep the auto-provisioning feature for users who want that.
@afrittoli -- yes, I agree that as much as we can we want to build on top of the existing artifact PVC concept. One of the main goals for this proposal is to add just enough syntax to make use of the artifact PVC (or something like it) clear. So I've taken the original proposal a little further and hopefully closer to implementable. So... to start things off I want to provide a bit more on the workspace types this proposal adds.
Here's a quick example consisting of: a task that writes a message, a task that reads a message, and a pipeline that ties the two together. (Note: I have also flattened the Task params.) This does not use all the bells and whistles that the above types offer, but hopefully gets the point across.
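(The inline YAML from this comment was not captured in the thread. The following is an illustrative sketch of the shape being described: a writer Task, a reader Task, and a Pipeline wiring them together over a shared workspace. The `workspaces` fields and binding syntax roughly follow what Tekton later shipped and should be read as assumptions, not the exact syntax proposed at the time.)

```yaml
# Illustrative writer Task: declares a "messages" workspace and writes into it.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: write-message
spec:
  workspaces:
    - name: messages
      description: Folder the message is written into
  steps:
    - name: write
      image: ubuntu
      command: ["sh", "-c", "echo hello > /workspace/messages/message"]
---
# Illustrative reader Task: declares the same workspace and reads the message back.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: read-message
spec:
  workspaces:
    - name: messages
      description: Folder the message is read from
  steps:
    - name: read
      image: ubuntu
      command: ["sh", "-c", "cat /workspace/messages/message"]
---
# Illustrative Pipeline tying the two Tasks together over one shared workspace.
apiVersion: tekton.dev/v1alpha1
kind: Pipeline
metadata:
  name: write-then-read
spec:
  workspaces:
    - name: shared
  tasks:
    - name: write
      taskRef:
        name: write-message
      workspaces:
        - name: messages      # the Task's workspace name
          workspace: shared   # the Pipeline-level workspace backing it
    - name: read
      runAfter: ["write"]
      taskRef:
        name: read-message
      workspaces:
        - name: messages
          workspace: shared
```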
Thanks for the extra specification work! Is this meant to fully replace the VolumePipelineResource?
We talked about this briefly in the Beta meeting but wanted to leave a note here as well - it looks like the shapes of WorkspaceMount and WorkspaceDevice are very similar (only the *Path param names differ). I wonder if we can de-dupe those somehow - would be cool if a Task didn't have to care about whether a mount is a Volume or Device.
@afrittoli yes, this is meant to fully replace VolumePipelineResource. There was concern that just saying that a particular PipelineResource was beta but only for
@sbwsg I had the same thought when spec'ing this out but part of my reasoning is that they are wrapping different concepts at the Pod level. In particular VolumeDevice is still relatively new and might add new fields that we might want to expose.
Thanks for the detailed proposal! A couple notes:
Thanks @dlorenc
So... what if we pared things down to really just mapping the artifact pvc into the workspace, and deferred everything else to a VolumePipelineResource when PipelineResources are ready...
Our example becomes...
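(The example referenced here was not captured either. Below is a minimal sketch of the pared-down idea, where the Pipeline only names tasks[*].workspaces and the controller maps each one onto the shared artifact PVC. The `workspaceSubPath` field name is a hypothetical placeholder, not a confirmed syntax.)

```yaml
# Pared-down sketch: only tasks[*].workspaces appear in the Pipeline; the
# controller would map them onto the artifact PVC (field names hypothetical).
apiVersion: tekton.dev/v1alpha1
kind: Pipeline
metadata:
  name: write-then-read
spec:
  tasks:
    - name: write
      taskRef:
        name: write-message
      workspaces:
        - name: messages            # workspace declared by the Task
          workspaceSubPath: shared  # hypothetical: sub-folder on the artifact PVC
    - name: read
      runAfter: ["write"]
      taskRef:
        name: read-message
      workspaces:
        - name: messages
          workspaceSubPath: shared
```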
To prevent unnecessary artifact PVC creation: if no Pipeline tasks[*].workspaces are specified then we don't create it. If a WorkspaceBinding is not provided, ephemeral storage is allocated (or we reuse the /workspace EmptyDir).
I think this is not bad and probably covers enough of the most common use-cases to be sufficient for beta. I currently use a mixture of params and PodSpec.volumes to handle configMap and secret sharing and can wait for a VolumePipelineResource without forcing the issue. WDYT?
So, to be sure I understand the proposal:
This sgtm. A few questions:
I thought I was following this pretty well until #1438 (comment), but I think that might be b/c I was viewing this proposal differently than you @skaegi @dlorenc
In my mind we don't actually need the VolumeResource b/c today you can get all the functionality it provided (minus automatic PVC creation) by using:
BUT:
I'm a bit confused about why in #1438 (comment) @skaegi you want to be "mapping the artifact pvc into the workspace" - is it important that it's the artifact PVC (which I take to mean the PVC tekton automatically creates when you use

I would prefer a solution where users can provide their own PVC or whatever volume info they want vs. trying to surface and make available what imo is an implementation detail of output -> input linking (and folks might be using something other than PVCs; we currently allow for GCS upload/download instead).

Anyway, we can talk more in the working group, but long story short I like the way that #1438 (comment) looks (and how much simpler it is than #1438 (comment)), but I'm not understanding how
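(For reference, a minimal sketch of the "provide your own PVC" approach mentioned above that already works today: the Task declares a Pod-style volume backed by an existing PVC and each step mounts it explicitly. The PVC name is illustrative.)

```yaml
# Today's workaround: Task-level volumes plus explicit step volumeMounts.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: use-existing-pvc
spec:
  steps:
    - name: write
      image: ubuntu
      command: ["sh", "-c", "echo hello > /data/message"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
  volumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: my-preprovisioned-pvc   # illustrative, pre-provisioned PVC
```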
Much clearer after your description in the working group @skaegi ! Haha I should have just waited before writing all those words, I had a feeling XD |
So far just docs and examples for declaring volumes in a Task via the "workspace" field and binding to them at runtime. This is midway between @skaegi's proposal b/c it allows the user to bring their own PVC but isn't the full cadillac version because it does not include volume devices. Will eventually fix tektoncd#1438
After my exploration in #1508, @skaegi and I discussed, and this is the latest iteration of what we discussed (similar to #1438 (comment) but without devices). Meta-type descriptions:
-----------------------------
Workspace
  name [string]
  description [string] (optional)
  mountPath [string] (defaults to /workspace/{name})

WorkspaceBinding
  name [string]
  readOnly [boolean] (defaults to false)
  volumeName [string]
  volumeSubPath [string] (optional)

WorkspaceVolume
  name [string]
  description [string] (optional)
  persistentVolumeClaim [PersistentVolumeClaimVolumeSource]

-- and now modifications to existing types...

Task
  workspaces [array of Workspace]

TaskRun
  workspaces [array of WorkspaceBinding]
  volumes (or workspaceVolumes TBD) [array of WorkspaceVolume]

Pipeline
  tasks[*].workspaces [array of WorkspaceBinding]
  volumes (or workspaceVolumes TBD) [array of WorkspaceVolume]  # Simon: I think this would be something else, like just a list of volume names; we wouldn't know what actual volumes to provide until runtime

PipelineRun
  volumes (or workspaceVolumes TBD) [array of WorkspaceVolume]

Main differences (that I remember) from #1508:
@skaegi I was trying to explore your use case for providing
I came up with a few (buggy, typo-ridden) examples:
Is it possible that (2) or (3) could meet your needs @skaegi? This would allow us to avoid specifying
I think it would look something like this (basically volumes - and their subPaths - are only specified in the TaskRun or PipelineRun):
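(The YAML for these examples did not survive the copy. A sketch of that shape, where the Task only names the workspace and the TaskRun supplies both the volume and the subPath; the binding syntax roughly follows what Tekton later shipped and is illustrative here.)

```yaml
# Authoring time: the Task only names the workspace it expects.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: read-message
spec:
  workspaces:
    - name: messages
  steps:
    - name: read
      image: ubuntu
      command: ["sh", "-c", "cat /workspace/messages/message"]
---
# Runtime: the TaskRun decides which volume and which subPath back it.
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: read-message-run
spec:
  taskRef:
    name: read-message
  workspaces:
    - name: messages
      subPath: run-1234                    # runtime-only detail
      persistentVolumeClaim:
        claimName: my-preprovisioned-pvc   # illustrative PVC name
```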
The main reason I want to push back is that I think (4) is the cleanest example, and once we have the
Thanks @bobcatfish -- and I agree FileSets are cool and think they abstract away most of what subPath was doing. Let me have a go at making an example using some fancy
So I played with your examples a bit and quite liked (1), although having now seen FileSets I agree that subPath management might be optimized. For (2) and (3) I found that having the volume subPath on the volume did not feel right. A "subPath" is a property of the volumeMount, and one of the things that I really liked about our earlier design was how cleanly a Workspace and WorkspaceBinding combined to produce exactly the fields needed to create a VolumeMount in the resulting pod. For (4) I liked how a FileSet hides a number of the details that in most cases are not important. In particular, it seems to me that it was only the resource "consumers" who care about path details, and the "producers" just wanted an arbitrary work folder.

So with that in mind, I wonder if "subPath" could be an optional field which, if not provided, uses a generated value in its workspace binding. e.g. Producers wouldn't typically provide a subPath, but Consumers who need the subPath can get it via interpolation.

This example is similar to (1) but assumes a generated subPath and uses interpolation to extract the value in the Pipeline. Since the interpolation implies ordering, I removed the runsAfter as I believe this can and should be computed internally. In the Tasks I use absolute
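(That example was also not captured. A sketch of the idea: the producer omits subPath so a value would be generated, and the consumer pulls that value via interpolation; the $(tasks.producer.workspaces.output.subPath) variable and the param wiring are hypothetical.)

```yaml
# Sketch: generated subPath plus interpolation; ordering is implied by the
# interpolation, so no runAfter is declared (hypothetical syntax throughout).
apiVersion: tekton.dev/v1alpha1
kind: Pipeline
metadata:
  name: produce-then-consume
spec:
  tasks:
    - name: producer
      taskRef:
        name: write-message
      workspaces:
        - name: output
          workspace: shared   # no subPath given, so one would be generated
    - name: consumer
      taskRef:
        name: read-message
      params:
        - name: message-dir
          # hypothetical interpolation that resolves to the generated subPath
          value: "$(tasks.producer.workspaces.output.subPath)"
      workspaces:
        - name: input
          workspace: shared
```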
That's an interesting way of looking at it - I think I've been looking at the fields we're adding less from the perspective of mapping each one perfectly to k8s concepts, and more from the perspective of who is interacting with each type at which time and what they should need to know, specifically Task and Pipeline authors vs. the folks actually running the Pipelines who provide runtime information. In my mind the path to use on a volume is runtime information - I don't really see why anyone writing a Task or a Pipeline that uses these volumes cares about where on the volume the data is, they just want the data. The exceptional case seems to be when a Pipeline wants to get data
Quick question: is there a specific reason why you prefer using an absolute path, or is it just for verbosity in the example? imo it's much more robust to use the interpolated value
I continue to be strongly suspicious that if you had the Even if
Okay sounds good, let's give it a try :D
The most recent design includes a pretty sweet model for extensibility that @sbwsg came up with where you can define your own PipelineResource types :D
ok. I'll start digging into that asap and give feedback. Sorry @sbwsg but I suspect it might be worthwhile to reconvene the resource-wg next week some time...
I suggest that we keep conversation of the new resource proposal to the main WG. If we start taking up inordinate amounts of time discussing it then a separate WG makes more sense to me, but I really want to keep the new proposal visible in the wider community if we can. Feel free to ping me with questions on slack though - happy to talk through the design or implementation details I'm working through now, or muse about where we could take it next.
I also think once we have some POCs to try out (for this proposal and for FileSet) it'll help with our discussions! :D
This allows users to use Volumes with Tasks such that:

- The actual volumes to use (or subdirectories on those volumes) are provided at runtime, not at Task authoring time
- At Task authoring time you can declare that you expect a volume to be provided and control what path that volume should end up at
- Validation will be provided that the volumes (workspaces) are actually provided at runtime

Before this change, there were two ways to use Volumes with Tasks:

- VolumeMounts were explicitly declared at the level of a step
- Volumes were declared in Tasks, meaning the Task author controlled the name of the volume being used and it wasn't possible at runtime to use a subdir of the volume
- Or the Volume could be provided via the podTemplate, if the user realized this was possible

None of this was validated and could cause unexpected and hard to diagnose errors at runtime.

We have also limited (at least initially) the types of volume source being supported instead of expanding to all volume sources, tho we can expand it later if we want to and if users need it. This would reduce the API surface that a Tekton compliant system would need to conform to (once we actually define what conformance means!).

Part of tektoncd#1438

In future commits we will add support for workspaces to Pipelines and PipelineRuns as well; for now if a user tries to use a Pipeline with a Task that requires a Workspace, it will fail at runtime because it is not (yet) possible for the Pipeline and PipelineRun to provide workspaces.

Co-authored-by: Scott <[email protected]>
This allows users to use Volumes with Tasks such that:

- The actual volumes to use (or subdirectories on those volumes) are provided at runtime, not at Task authoring time
- At Task authoring time you can declare that you expect a volume to be provided and control what path that volume should end up at
- Validation will be provided that the volumes (workspaces) are actually provided at runtime

Before this change, there were two ways to use Volumes with Tasks:

- VolumeMounts were explicitly declared at the level of a step
- Volumes were declared in Tasks, meaning the Task author controlled the name of the volume being used and it wasn't possible at runtime to use a subdir of the volume
- Or the Volume could be provided via the podTemplate, if the user realized this was possible

None of this was validated and could cause unexpected and hard to diagnose errors at runtime.

It's possible folks might be specifying volumes already in the Task or via the stepTemplate that might collide with the names we are using for the workspaces; instead of validating this and making the Task author change these, we can instead randomize them!

We have also limited (at least initially) the types of volume source being supported instead of expanding to all volume sources, tho we can expand it later if we want to and if users need it. This would reduce the API surface that a Tekton compliant system would need to conform to (once we actually define what conformance means!).

Part of tektoncd#1438

In future commits we will add support for workspaces to Pipelines and PipelineRuns as well; for now if a user tries to use a Pipeline with a Task that requires a Workspace, it will fail at runtime because it is not (yet) possible for the Pipeline and PipelineRun to provide workspaces.

Co-authored-by: Scott <[email protected]>
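(As a concrete illustration of what the commits above describe: the Task declares a workspace with an author-chosen mountPath, and the TaskRun binds it at runtime, here to an emptyDir, one of the limited volume sources mentioned. Names are illustrative.)

```yaml
# Authoring time: declare the workspace and where it should be mounted.
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: build-thing
spec:
  workspaces:
    - name: scratch
      description: Scratch space for intermediate build output
      mountPath: /scratch            # author-controlled mount path
  steps:
    - name: build
      image: ubuntu
      command: ["sh", "-c", "touch /scratch/output"]
---
# Runtime: the TaskRun provides the actual volume backing the workspace.
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: build-thing-run
spec:
  taskRef:
    name: build-thing
  workspaces:
    - name: scratch
      emptyDir: {}                   # ephemeral volume supplied at runtime
```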
For those following along, I think the two remaining pieces here are:
Anything else outstanding in the context of this issue? I figure we can follow up with brand new features around workspaces in new GH issues.
Fixes #1438. The last of the original feature requests related to workspaces was to include support for Secrets as the source of a volume mounted into Task containers. This PR introduces support for Secrets as workspaces in a TaskRun definition.
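(A minimal sketch of the Secret-backed binding described, assuming a Task that declares a "credentials" workspace; the Task and Secret names are illustrative.)

```yaml
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: deploy-run
spec:
  taskRef:
    name: deploy                       # assumed to declare a "credentials" workspace
  workspaces:
    - name: credentials
      secret:
        secretName: deploy-credentials # existing Secret provided as the workspace volume
```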
A "volume resource" like abstraction is valuable as it lets a Task specify the requirement for a work folder as well as a description of its purpose without declaring the specifics of the volume backing it.
In the resources meeting -- https://docs.google.com/document/d/1p6HJt-QMvqegykkob9qlb7aWu-FMX7ogy7UPh9ICzaE/edit#heading=h.vf9n4tewyadq -- we discussed different approaches for supporting the concept and decided the best approach was to introduce a
volumes
or similarly named Task.spec field as the best balance of having it present in the beta without forcing PipelineResources to also be present.It turns out that
volumes
is actually a poor name as it both conflicts with the same field in a Pod spec and also does not capture the fact that it's actually closer to a volumeMount. The concept is for a named folder that is (at least by default) mounted as a sub-directory of/workspace
so for now I'll useworkspaces
as the field name instead ofvolumes
.Each
workspaces
item has minimally just aname
and optionaldescription
. For example:In all containers in the resulting Pod
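(The original inline YAML was not captured here; the following is a reconstruction of the kind of Task being described, with two minimally declared workspaces and illustrative step contents.)

```yaml
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: example-task
spec:
  workspaces:
    - name: first
      description: Folder the input files are expected in
    - name: second
      description: Scratch folder for intermediate output
  steps:
    - name: combine
      image: ubuntu
      command: ["sh", "-c", "cp /workspace/first/input /workspace/second/output"]
```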
In all containers in the resulting Pod, `/workspace/first` and `/workspace/second` would be volumeMounts assigned by the TaskRun, pointing at a particular volume and an optional subPath in that volume. If a TaskRun does not provide a valid volumeMount that maps to `/workspace/{workspaces.name}`, it should fail validation.

Other considerations to think about include: reserved names, optionality, alternate mountPaths, and follow-on syntax in Pipeline, PipelineRun, and TaskRun.