Add IncludePatterns and ExcludePatterns options for Copy #101

Merged
aaronlehmann merged 7 commits into tonistiigi:master on May 3, 2021

aaronlehmann
Collaborator

Allow include patterns and exclude patterns to be specified, similarly
to Walker.

There is a bit of extra complexity to handle the case of a pattern like
a/*/c. In this case, creating the directories a and a/b may need
to be delayed until we encounter a/b/c.

cc @hinshun
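
To make the delayed creation concrete, here is a minimal sketch rather than the code from this PR. It assumes a sorted walk over '/'-separated paths and uses hypothetical names (`entry`, `planCopy`); it only records what would be created instead of touching the filesystem, and it ignores the case where a directory itself matches the pattern.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// entry is a hypothetical stand-in for one item seen during a sorted walk.
type entry struct {
	path  string
	isDir bool
}

// planCopy returns the paths that would be created, in order, when only
// files matching pattern are copied. Directories are not created when they
// are visited; each one is created lazily, just before the first matching
// file beneath it.
func planCopy(entries []entry, pattern string) []string {
	created := map[string]bool{}
	var out []string
	for _, e := range entries {
		if e.isDir {
			continue // deferred: nothing below it may match
		}
		if ok, _ := path.Match(pattern, e.path); !ok {
			continue
		}
		// Create each missing parent, shallowest first.
		parts := strings.Split(e.path, "/")
		for i := 1; i < len(parts); i++ {
			dir := strings.Join(parts[:i], "/")
			if !created[dir] {
				created[dir] = true
				out = append(out, dir+"/")
			}
		}
		out = append(out, e.path)
	}
	return out
}

func main() {
	entries := []entry{
		{"a", true}, {"a/b", true}, {"a/b/c", false}, {"a/b/d", false},
		{"a/x", true}, {"a/x/y", false},
	}
	fmt.Println(planCopy(entries, "a/*/c"))
}
```

In this sketch `a/x` is never created, because no file under it matches `a/*/c`.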

aaronlehmann added a commit to aaronlehmann/buildkit that referenced this pull request Apr 22, 2021
Allow include and exclude patterns to be specified for the "copy" op,
similarly to "local".

Depends on tonistiigi/fsutil#101

Signed-off-by: Aaron Lehmann <[email protected]>
prefix/match.go Outdated
)

func Match(pattern, name string) (bool, bool) {
count := strings.Count(name, string(filepath.Separator))
Owner

It is unclear whether filepath is correct here. IIUC the old code only uses this function for local paths, but I see it imported in buildkit/contenthash, where all paths should be normalized to Unix style. At least the test should not pass on non-Unix platforms, I think.

Collaborator Author

I think you are right about this. I guess https://github.com/moby/buildkit/pull/2082/files#diff-364fe775b55b3eec6b3ea0859ab920f0981c61d377e47830a46de5ec04058c44R437 does not fail on Windows because there are no test cases that depend on parent dirs being included in the checksum. Not sure if there's a way to simulate a change in parent dir metadata in that test so we can capture that scenario.

@@ -10,6 +10,7 @@ import (

"github.com/containerd/continuity/fs/fstest"
"github.com/pkg/errors"
"github.com/stretchr/testify/assert"
Owner

Is this assert intentional? If so, maybe add a comment at the check.

Collaborator Author

Not sure I follow, do you prefer require over assert? I generally use assert unless the failed assertion would put the test in a bad state where it doesn't make sense to continue.

Owner

We usually use require exclusively so it is a bit out of place and should have a comment so that it is not reverted in the future for consistency.

if !fi.IsDir() {
if include {
if err := c.createParentDirs(src, srcComponents, target, overwriteTargetMetadata); err != nil {
Owner

Doesn't rerunning these checks add significant performance overhead, given that most of the time the directories already exist? If so, I think we need some caching. Since everything is sorted, it shouldn't take much memory.

Collaborator Author

This does add overhead, but only to the case where an include pattern is matched - so it won't regress performance of any existing cases.

Happy to add caching if you think it's worth the complexity.

Owner

I think this is needed. The number of syscalls per file is quite important for copy.
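
One way such a cache could look is sketched below. This is not the caching that was eventually added; it is an assumption-laden illustration with a hypothetical `dirCreator` type that only counts simulated mkdir/stat operations. It relies on the sorted traversal mentioned above, so remembering just the most recently ensured directory is enough to skip ancestors that are already known to exist.

```go
package main

import (
	"fmt"
	"strings"
)

// dirCreator is a hypothetical helper that counts how often the (simulated)
// directory-creation step runs.
type dirCreator struct {
	last   string // most recently ensured directory
	mkdirs int    // number of simulated mkdir/stat operations
}

// ensure makes sure dir and its ancestors exist, skipping every component
// that is already covered by the previously ensured directory.
func (c *dirCreator) ensure(dir string) {
	if dir == "" || dir == c.last || strings.HasPrefix(c.last, dir+"/") {
		return // already ensured, no syscalls needed
	}
	parts := strings.Split(dir, "/")
	for i := 1; i <= len(parts); i++ {
		p := strings.Join(parts[:i], "/")
		if p == c.last || strings.HasPrefix(c.last, p+"/") {
			continue // ancestor shared with the last ensured directory
		}
		c.mkdirs++ // a real implementation would stat/mkdir here
	}
	c.last = dir
}

func main() {
	c := &dirCreator{}
	for _, d := range []string{"a/b", "a/b", "a/b/c", "a/d"} {
		c.ensure(d)
	}
	fmt.Println(c.mkdirs)
}
```

Because only one path is remembered, memory use stays constant; the trade-off is that an unsorted input would cause some directories to be checked again.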

copy/copy.go Outdated
return err
} else if !overwriteTargetMetadata {
} else if !overwriteTargetMetadata || !includeAll {
Owner

I'm a bit confused about the includeAll check here, as exclude patterns don't seem to be affected by it.

Collaborator Author

created will always be false if includeAll is false (because in that case we delay creation of parent dirs). So this is a shorter version of:

if !includeAll {
        // directory may not have been created yet, so don't try to set its metadata
        copyFileInfo = false
}

copy/copy.go Outdated
pattern = strings.TrimSuffix(pattern, string(filepath.Separator))
}

if ok, p := prefix.Match(pattern, path); ok {
Owner

Why do IncludePatterns and ExcludePatterns use different formats/algorithms? AFAICS only ExcludePatterns supports !. Is there a difference between using ExcludePatterns and potentially supporting ! in IncludePatterns?

Collaborator Author

I wanted to match the logic in

if opt != nil {

I think I'm following the same logic (it has separate handling for IncludePatterns and ExcludePatterns and doesn't appear to handle ! in IncludePatterns), but maybe I'm missing something.

Owner

I see. I don't remember why it is like this. I guess .dockerignore is the only use case at the moment that requires !, and that is the excludes list. IncludePatterns is also not used at all anymore; FollowPaths is used instead. I guess the FollowPaths case, where symlinks need to be followed, does not apply here?

I think it is important that something like COPY --filter can be implemented with this. I'm not quite sure if atm client would need to convert some of the patterns to a negative version. But I guess if we would only use one input then with current implementation depending on if user wants to mostly define include or exclude paths one of them would be much slower. Matching with the options of walker is not the most important here so if you think there is a better/simpler logic we can use instead lmk.
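
For readers unfamiliar with the ! syntax discussed in this thread, the following is a simplified sketch of .dockerignore-style exclusion, where a later pattern overrides an earlier one and a leading ! re-includes a path. The function name `isExcluded` is hypothetical, and it uses plain `path.Match` rather than the pattern matcher the library actually uses, so treat it as an illustration of the semantics only.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isExcluded reports whether name is excluded by patterns. Patterns are
// evaluated in order; the last matching pattern decides, and a pattern
// prefixed with "!" turns a previous exclusion back into an inclusion.
func isExcluded(patterns []string, name string) bool {
	result := false
	for _, p := range patterns {
		negate := strings.HasPrefix(p, "!")
		p = strings.TrimPrefix(p, "!")
		if ok, _ := path.Match(p, name); ok {
			result = !negate
		}
	}
	return result
}

func main() {
	patterns := []string{"*.log", "!keep.log"}
	for _, name := range []string{"debug.log", "keep.log", "main.go"} {
		fmt.Println(name, isExcluded(patterns, name))
	}
}
```

Under these assumptions, include patterns without ! support can stop at the first match, while exclude patterns with ! must evaluate every pattern, which is one reason the two lists behave differently.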

copy/copy.go Outdated
return false, errors.Wrap(err, "failed to match excludepatterns")
}
if m {
if fi.IsDir() && c.excludePatternMatcher.Exclusions() {
Owner

Not for this PR but this looks non-optimal. I guess ! is rarely used in practice.

Collaborator Author

Modeled on this code from walker.go:

for _, pat := range pm.Patterns() {

copy/copy.go Outdated
@@ -109,7 +114,8 @@ func Copy(ctx context.Context, srcRoot, src, dstRoot, dst string, opts ...Opt) e
if err != nil {
return err
}
if err := c.copy(ctx, srcFollowed, dst, false); err != nil {
includeAll := len(c.includePatterns) == 0
Owner

Exclude patterns don't matter here?

Collaborator Author

includeAll means we can skip evaluating the include patterns. Usually it's used when a parent dir matches an include pattern. It doesn't skip evaluating exclude patterns.

Maybe skipIncludePatterns would be a better name?

prefix/match.go Outdated
"strings"
)

func Match(pattern, name string, slashSeparator bool) (bool, bool) {

Since the return value is (bool, bool) can we use named return values and a godoc for this function?
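
As an illustration of what named return values and a doc comment might look like, here is a sketch under stated assumptions. `matchPrefix` is a hypothetical name, not the signature that was merged; it handles only '/'-separated paths via `path.Match` and omits the slashSeparator parameter shown in the diff above.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchPrefix matches a '/'-separated name against pattern.
//
// matched is true if name matches the pattern, or, when the pattern has more
// components than name, if name matches the leading components of the
// pattern. partial is true in that second case, meaning only a prefix of the
// pattern was compared (for example name "a/b" against pattern "a/*/c").
func matchPrefix(pattern, name string) (matched, partial bool) {
	depth := strings.Count(name, "/")
	if strings.Count(pattern, "/") > depth {
		// Keep only as many pattern components as name has.
		parts := strings.Split(pattern, "/")
		pattern = strings.Join(parts[:depth+1], "/")
		partial = true
	}
	matched, _ = path.Match(pattern, name)
	return matched, partial
}

func main() {
	for _, name := range []string{"a", "a/b", "a/b/c", "x"} {
		m, p := matchPrefix("a/*/c", name)
		fmt.Printf("%s matched=%v partial=%v\n", name, m, p)
	}
}
```

With named results, call sites and the doc comment can refer to `matched` and `partial` directly instead of to the first and second bool.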

pattern = strings.TrimSuffix(pattern, string(filepath.Separator))
}

if ok, p := prefix.Match(pattern, path, false); ok {

This can be simplified:

matched, partial = prefix.Match(pattern, path, false)
if matched && !partial {
    break 
}

Collaborator Author

That could potentially overwrite a true value of matched with false
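
The overwrite problem can be shown with a small sketch. The `result` type and both function names are hypothetical; each `result` stands for what one pattern returned for the same path. The first function mirrors the suggested simplification, the second keeps an earlier match "sticky".

```go
package main

import "fmt"

// result is a hypothetical record of what one pattern returned for a path.
type result struct{ matched, partial bool }

// combineOverwrite assigns every result directly, so a later non-matching
// pattern resets an earlier partial match to false.
func combineOverwrite(results []result) (matched, partial bool) {
	for _, r := range results {
		matched, partial = r.matched, r.partial
		if matched && !partial {
			break
		}
	}
	return matched, partial
}

// combineSticky only updates the outcome when a pattern matches, so an
// earlier match is never lost.
func combineSticky(results []result) (matched, partial bool) {
	for _, r := range results {
		if !r.matched {
			continue
		}
		matched, partial = true, r.partial
		if !partial {
			break // a full match cannot be improved upon
		}
	}
	return matched, partial
}

func main() {
	// First pattern matches partially, second pattern does not match.
	rs := []result{{true, true}, {false, false}}
	fmt.Println(combineOverwrite(rs))
	fmt.Println(combineSticky(rs))
}
```

In this example the overwrite variant reports no match at all, even though the first pattern partially matched.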


Ah missed that.

copy/copy.go Outdated
if err != nil {
return false, errors.Wrap(err, "failed to match excludepatterns")
}
if m {

Nit, prefer:

if !m {
    return false, nil
}

dirSlash...

@aaronlehmann
Collaborator Author

@tonistiigi: Any more feedback on this PR?

Owner

@tonistiigi tonistiigi left a comment

As commented earlier, different types for include/exclude is a bit of a problem but don't have a specific alternative proposal atm.

@aaronlehmann
Collaborator Author

Thanks!

@aaronlehmann aaronlehmann merged commit 5c8be85 into tonistiigi:master May 3, 2021