Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

llb: fileop implementation #809

Merged
merged 25 commits into from
Mar 18, 2019
Merged

llb: fileop implementation #809

merged 25 commits into from
Mar 18, 2019

Conversation

tonistiigi
Copy link
Member

@tonistiigi tonistiigi commented Feb 1, 2019

fixes #396 Add FileOp to LLB
fixes #781 WORKDIR issue
fixes #813 ADD --chown
fixes #786 ADD wildcard tar doesn't extract tarball to destination
fixes #735 ADDing files that may be missing
fixes #790 COPY --chown changes ownership of pre-existing directory
fixes #573 using wildcard does not use cache with multistage build

@AkihiroSuda @tiborvass @ijc

Signed-off-by: Tonis Tiigi [email protected]

solver/pb/ops.proto Outdated Show resolved Hide resolved
func TestFileMkdir(t *testing.T) {
t.Parallel()

st := Image("foo").File(Mkdir("/foo", 0700))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.File doesn't seem like the right naming given it can actually do so much more. Having said that I'm not sure what would be better, FileOp perhaps, or even FsOp maybe?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at a subsequent usage:

st := Image("foo").Dir("/etc").File(Mkdir("/foo", 0700).Mkdir("bar", 0600, WithParents(true)).Mkdir("bar/baz", 0701, WithParents(false)))

I'm left wondering if we coudn't arrange for this to be:

st := Image("foo").Dir("/etc").Mkdir("/foo", 0700).Mkdir("bar", 0600, WithParents(true)).Mkdir("bar/baz", 0701, WithParents(false)).Root()

i.e. dropping the .File() and instead chain at a higher level. That feels more in keeping with the existing llb client library stuff.

I suppose the desire for .File() is to chain multiple file ops into a single llb op?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With doing actions directly with the state we can't predict what actions run as part of a single op and what runs separately. As the caching is based on op, this isn't quite meaningless. We could add aliases of course state.Mkdir() -> state.File(Mkdir()) but I think it might be confusing and promote inefficient builds.

solver/pb/ops.proto Outdated Show resolved Hide resolved
message FileActionMkFile {
string path = 1;
int32 mode = 2;
bytes data = 3;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the first construct within the LLB which has the potential to create very large protobuf in one go in a possibly unconstrained way. Everything else can make huge graphs just through composition but is fairly limited by the complexity of the most complex graph you might sanely build.

That seems pretty unavoidable but has implications elsewhere (e.g. on maximum grpc size).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unavoidable...

Or maybe... could we use the session (along the lines of fssync) to somehow provide the data in bulk oob from the llb graph itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be done with docs or we can define limits. This is not an alternative for sources but should be only used for small one-off files. Eg. moby/moby#34423 is a good candidate.

solver/pb/ops.proto Show resolved Hide resolved
int32 mode = 5;
bool followSymlink = 6;
bool dirCopyContents = 7;
bool attemptUnpack = 8;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the forward/backward compatibility implications here? Adding a new archive type that could be unpacked can't be done I think since it would change the semantics. s/attempt/must/g might be a simple way to avoid that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a very bad feature and only here because of the dockerfile compatibility requirements. Normally, extraction is too high-level for LLB and should be done with an extra op. I think I'll rename it attemptUnpackDockerCompatibility to make it clear that new developments will not be accepted. If the behavior was mustUnpack then it would be very easy to prepend the input with exec and we wouldn't need it.

bool attemptUnpack = 8;
bool createDestPath = 9;
bool allowWildcard = 10;
bool allowEmptyWildcard = 11;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is similar to bash's nullglob or something else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filepath.Match

@tonistiigi tonistiigi force-pushed the fileop branch 2 times, most recently from 0f4bb2c to b4848bb Compare February 11, 2019 23:18
@tonistiigi
Copy link
Member Author

Pushed initial solver implementation with unit tests. PTAL . A bit complex but wanted to make sure actions execute concurrently. Maybe it will become useful one day.

@AkihiroSuda
Copy link
Member

@tonistiigi Do we want to merge this before v0.4?

@AkihiroSuda
Copy link
Member

Doesn't it require adding new solver capability?

@tonistiigi
Copy link
Member Author

@AkihiroSuda No, we can merge parts of it in v0.4 if needed but I don't expect to replace defaults for dockerfile for v0.4 . For cap I'll add one the same time. Atm it is considered in development(I don't want to add a new cap when a change is needed).

@AkihiroSuda AkihiroSuda added this to the v0.5.0 milestone Mar 2, 2019
@tonistiigi tonistiigi force-pushed the fileop branch 7 times, most recently from 46a99b7 to 4187775 Compare March 11, 2019 06:11
@tonistiigi tonistiigi changed the title llb: initial fileop implementation llb: fileop implementation Mar 11, 2019
@tonistiigi
Copy link
Member Author

This is complete now and ready for reviews. Dockerfile frontend has been switched to use new implementation by default and passes existing tests. Regression tests have been added for all the issues and matrix testing for non-fileop codepaths.

PR with fsutil changes that this PR vendors: tonistiigi/fsutil#60

PTAL @AkihiroSuda @tiborvass @ijc @thaJeztah

solver/pb/ops.proto Outdated Show resolved Hide resolved
@@ -60,7 +60,7 @@ func (s *oci) New(opt ...SandboxOpt) (Sandbox, func() error, error) {
}
logs := map[string]*bytes.Buffer{}
// Include use of --oci-worker-labels to trigger https://github.com/moby/buildkit/pull/603
buildkitdArgs := []string{"buildkitd", "--oci-worker=true", "--containerd-worker=false", "--oci-worker-labels=org.mobyproject.buildkit.worker.sandbox=true"}
buildkitdArgs := []string{"buildkitd", "--oci-worker=true", "--containerd-worker=false", "--oci-worker-gc=false", "--oci-worker-labels=org.mobyproject.buildkit.worker.sandbox=true"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dt, err = ioutil.ReadFile(filepath.Join(destDir, "subdest/dir1/dir2/foo"))
require.NoError(t, err)
require.Equal(t, "foo-contents", string(dt))
if isFileOp { // non-fileop implementation is historically buggy
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a behavior change, but I verified that this is also the behavior in legacy docker build so the previous testcase was not correct.

solver/pb/ops.proto Outdated Show resolved Hide resolved
@thaJeztah
Copy link
Member

I saw you requested my review on this one; I did have a look, but, erm, I'd have to make myself more familiar with the BuildKit internals to give a fully meaningful review 😅 🤗 (I can go through the changes though if you want)

@tonistiigi
Copy link
Member Author

@thaJeztah I pinged you cause it's fixing some of the issues you reported

tonistiigi and others added 16 commits March 15, 2019 17:49
Signed-off-by: Tonis Tiigi <[email protected]>
Signed-off-by: Tonis Tiigi <[email protected]>
Signed-off-by: Tonis Tiigi <[email protected]>
Copy link
Collaborator

@tiborvass tiborvass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with moby

@tiborvass
Copy link
Collaborator

Regarding ignorecache: I don't think we should make it configurable, rather fix the UX as Tonis said, to say "EXISTS" or "FOUND".

@tonistiigi
Copy link
Member Author

There is no third state. Content was either created by exec or found in cache. Unless you want to just override all the status messages in cli but that seems pointless. It's a docs issue. If we don't want to change the docs then we just need to rerun all the time like the old behavior was (and is in this PR).

@tonistiigi
Copy link
Member Author

Good to merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants