-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial auto sync support design doc #2901
Conversation
Codecov Report
|
@corneliusweig FYI since I think you worked on the sync stuff recently? |
@loosebazooka hey thanks for looping me in. I'll take a look at it! |
What we might want, is an API that can accept configuration that looks like: | ||
|
||
``` | ||
sync: auto { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can drop auto
and just add another mode generated
along with existing manual
and infer
?
sync:
generated:
- command: "./gradlew jib classes"
inputs:
- src/main/java
- src/main/resources
files:
- src: "target/classes"
dest: "/app/classes"
- src: "target/resources"
dest: "/app/resources"
- command: "compile_c_stuff.sh"
inputs:
- asdf.c
files:
- src: "asdf.o"
dest: "/libs/asdf.o"
manual:
- src: "local/computer/extrafiles"
dest: "/extrafiles"
includeTools: true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this variant for the yaml
syntax. As I understand, the existing sync
specs would keep their semantics, but the generated
is a new build hook to be triggered before sync. (Maybe buildHooks
is also a more natural name than generated
). In other words:
- if
generated
is present, execute the build hooks when changes matchinginputs
are detected - if
generated
is absent -> current behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a file is present in both generated
and manual
we would either have to give one preference or throw an error. Personally I'm leaning towards throwing an error, so that user's don't end up with unexpected sync stuff happening.
should just be considered the full list of inputs that are normally | ||
watched for this build minus anything in `direct` (below). | ||
3. `files` - the built files to sync (and where to sync them) | ||
3. `direct` - same as manual sync, just copy them directly, don't wait on some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this replace manual? or another definition in auto
?
Currently, we don't support manual
and infer
sync rules together.
https://skaffold.dev/docs/how-tos/filesync/#limitations
As per this proposal, do we plan to support auto
as well manual
sync modes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that sync does not currently support multiple sync configurations, this is designed is that this is all one large block that does many things. A builder can have some files that need to be compiled, and others that don't -- so if I was configuring this from jib, I could separate them out.
However this design may be flawed in that it redoes the work that other sync options already do.
3. `files` - the built files to sync (and where to sync them) | ||
3. `direct` - same as manual sync, just copy them directly, don't wait on some | ||
command to transform them | ||
4. `includeTools` - useful for containers that don't contain, for example, `tar` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this for cases where a generated file is bundled up in the container?
So the sync will now,
- unzip the tar
- sync file
- zip again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do specify which tools to include? Is there a separate DSL for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think includeTools
maybe shouldn't be part of this design. While it's useful specifically for the default base image jib chooses (distroless/java), it could benefit be it's own design that answers these questions.
But to answer them anyway, since I'm here, I think for now it would include tar
-- since that's what we need to do a sync. If the requirements of sync change, it could include whatever else sync
needs. (that's why naming would be includeTools
instead of includeTar
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having an option to add tar
into the target container is a great idea. I also think that this is an idea worth being explored on its own. For example, it could solve #1814. Consequently, I would see this option outside of the auto
block and right under sync
, so that other sync modes can also benefit from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm. i am still confused on it makes sense for me as a user writing this config.
sync:
includeTar: true
generated:
....
If this is true, skaffold would now install tar inside all containers built ?
If this is false, skaffold sync will fail potentially #1814 ?
If above is true; then
As a user, if i have a jib builder, i need to set this true
.
I feel the users are might not be able to identify this. Can we instead, add an interface to builder
type Builder interface {
SyncTools() []tools
}
And then skaffold in dev mode if sync enabled, will install these tools in the built container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would make tar available to that specific artifacts container. Not all jib users have to do this, but the distroless
collection of images do not have tar (or any tools really) included by default.
I would think, instead of asking the user to change their base image to enable sync, we just make it easier for them by making the tools available automatically. I assume it would a mechanism similar to how debugging tools are available to a pod right now using init containers (@briandealwis ?).
Again, perhaps I should take this out of this design doc and move it to a new one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps for simplicity, this doc could assume tar
exists in the container or declares that this works only when tar
exists. Installing other builds tools could be another doc, independent of this?
The Skaffold doc already states this as a limitation to use the sync mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah understood, so maybe skaffold can be a little smart about if tar
is included in the built image.
Untill then, asking users to specify via config is easy way.
Skaffold can also suggest adding it to their config by parsing the sync error and determine if its tar related.
really only aware of `.java` files in a build. | ||
|
||
1. Why is this required? | ||
- Having a sync mode that has knowledge of files that need to be built before |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not your fault, may be we should add it to our design doc proposal,
Can we list some use cases we want to support. On the top of my mind
- Users can sync files generated files
- Users can sync generated files which are bundled inside a tar and inside a container.
- Users can sync files outside of their current workspace? ( for now users can hook up a bash script which can take input ../../)
- Users can sync remote files?
With that in mind, what is the contract between Skaffold and command
to generate these files.
Would skaffold run the command when it sees a change in the command.inputs
files? In which case, inputs
should be non-empty.
dest: "/app/classes", | ||
- src: "target/resources. | ||
dest: "/app/resources", | ||
- command: "compile_c_stuff.sh" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you thought about dependency loops here? What if the build hook is triggered by a change in file A
, the build hook runs and updates A
again. It's a case which can hardly be detected upfront as an error, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have not thought about this, I mostly work in the jib context where I don't expect that to happen. But detection of that should throw an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the question is: how can we detect it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's a hard problem :\
files: | ||
- src: "asdf.o" | ||
dest: "/libs/asdf.o" | ||
direct: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I haven't thought enough about the edge cases. But is it really necessary to separate out the direct
sync cases? What would be wrong about a sync rule that applies to both generated artifacts and direct source files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the process of doing a build when it shouldn't could be costly?
And if the build system isn't smart, it could trigger sync'ing of files that didn't need to be (and this could cause a server reload when there didn't need to be?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was that the build hook is only triggered, if one of its input files changed (and Skaffold takes care to detect this). The syncing could be considered as a different step, where everything that changed is sync'ed (be it generated or direct). And if the build hook didn't run, none of its outputs would have changed and only direct dependencies are updated. Or is that not how you planned it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. I planned for it all to exist in the auto
block in this example as a single sync rule. Naming is a little weird at the moment.
``` | ||
sync: auto { | ||
indirect: | ||
- command: "./gradlew jib classes" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a usability perspective, it is still quite some intellectual overhead to specify the build commands in the right way and define the proper sync rules. My gut feeling is that this procedure is a lot different from what most developers are used to, especially for java (if you do SpringBoot, you never care about .class
files). Do you plan a further simplification so that the sync works with smaller configuration overhead for SpringBoot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this config is mostly a manifestation of what the API would be for a builder like jib to define all this - not necessarily a user. So jib
would just tell the sync system what the config is, and it would go about doing what needs to be done.
I'm not opposed to it, but I agree, it's fairly involved for a user to configure this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still confused. Will the user be able to configure all these in skaffold.yaml
, or is it that this doc just explains a proposed internal sync API in the form? In either case, what if the user configures other current sync options in their skaffold.yaml
? Will they conflict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In much the same vein, I'd recommend to list a few other use cases for the proposed yaml changes beside better Jib support. It sounds as if your main goal is to achieve better Jib integration, but that could be done with less user-facing changes. I'm a bit worried that the current proposal is very powerful, but also rather complex (both for users and maintenance).
If the sole intended use for this is Jib, there could be a simpler option without so many user-facing changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I just needed a way to describe the API in a way that could be read. There may not be any need at all to expose this beyond an internal API. It also helps explain how this might be extended beyond just jib usage (though that might be the only usecase for now).
It looks like I should rewrite this to reflect that this isn't really exposed to the user. |
Updated this to reflect what is internal and external. |
1. run `./gradlew classes copyDependencies` | ||
2. run `skaffold dev` | ||
3. *make changes to some java file* | ||
4. run `./gradlew classes` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a similar question was asked, but what happens now when you change some Java source files is that it runs the Jib builder and rebuilds everything including .class
files. So, modifying a source file triggers a full rebuild which in turn will trigger the proposed sync. Somehow, we should be able to determine if we should forgo a full build and go for the sync route only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the assumption is that, at first, we can sync everything. Anything that jib defines as a file in the container will be sync'd when it changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
auto: {} | ||
``` | ||
|
||
#### Option 1: Delegate generation of sync tar to builder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@corneliusweig @tejal29 added in this section about the builder being responsible for the syncable tar...
I can see the problem that it kind of ties down the sync implementation to tar.
Co-Authored-By: Tejal Desai <[email protected]>
Co-Authored-By: Tejal Desai <[email protected]>
Co-Authored-By: Tejal Desai <[email protected]>
5ff1dd7
to
9d16388
Compare
Co-Authored-By: Balint Pato <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, I think we should merge it. Added my comments as I was thinking it through.
considered inputs for the container builder) | ||
- Skaffold can sync files that are generated by a buildscript | ||
- And one or both of the following: | ||
- Skaffold can be configured to run a custom script to when certain files change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this different from the Skaffold can sync files that are generated by a buildscript
feature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default build script isn't necessarily the one you want to use to generate intermediate assets, for example
- jib full container build:
./gradlew jib
- jib build requirements:
./gradlew classes
Co-Authored-By: Balint Pato <[email protected]>
Co-Authored-By: Balint Pato <[email protected]>
Merged with admin rights - CI build is nonsensical here. |
Design doc for how we might try to solve the auto synchronization of files that require a build first.
Useful if want to support java servers that can reload classes and resources -- and perhaps many other user cases!
Staged Doc: https://github.com/GoogleContainerTools/skaffold/blob/sync-auto-proposal/docs/design_proposals/sync-auto-support.md