#1437 Add .gitignore and .dockerignore behavior to packaging #899
Thank you for the PR! I will review this weekend.
Are all the subprocess calls okay? Some of these repos might get pretty large. Is calling git in a subprocess for every single file acceptable UX?
@bimtauer I like this code. But @wild-endeavor is right: I see you have implemented a pattern matcher, so why not use the same for git?
Also the patterns are listed here. Only weird one is
@@ -125,8 +126,7 @@ def package(ctx, image_config, source, output, force, fast, in_container_source_
archive_fname = os.path.join(output_tmpdir, f"{digest}.tar.gz")
Not your code, but I think compute_digest is wrong! It computes the digest over the entire source, not only over the filtered files.
I cannot use that for git, since gitignore pattern matching is slightly different from dockerignore (some examples here) and git also allows multiple nested gitignores. I'll look at the digest asap, good catch!
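To make the digest concern concrete, here is a minimal sketch of a digest that only covers files that pass the ignore filter. The function name `compute_filtered_digest`, the `is_ignored` callback, and the choice of SHA-256 are assumptions made for illustration; they are not taken from the PR or from flytekit's actual `compute_digest`.

```python
import hashlib
import os
from typing import Callable


def compute_filtered_digest(source: str, is_ignored: Callable[[str], bool]) -> str:
    """Hash the relative path and content of every file that is NOT ignored.

    Hypothetical helper: `is_ignored` receives a path relative to `source`.
    """
    hasher = hashlib.sha256()
    for root, dirs, files in os.walk(source):
        dirs.sort()  # fix the traversal order so the digest is reproducible
        for name in sorted(files):
            abs_path = os.path.join(root, name)
            rel_path = os.path.relpath(abs_path, source)
            if is_ignored(rel_path):
                continue  # ignored files must not influence the digest
            hasher.update(rel_path.encode("utf-8"))
            with open(abs_path, "rb") as f:
                for chunk in iter(lambda: f.read(65536), b""):
                    hasher.update(chunk)
    return hasher.hexdigest()
```

With a digest like this, touching an ignored file (for example a cached .pyc) leaves the archive name unchanged, while editing a packaged file changes it.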
Codecov Report
@@ Coverage Diff @@
## master #899 +/- ##
==========================================
+ Coverage 86.43% 86.57% +0.14%
==========================================
Files 243 245 +2
Lines 23243 23483 +240
Branches 2618 2645 +27
==========================================
+ Hits 20089 20330 +241
+ Misses 2713 2699 -14
- Partials 441 454 +13
Developed an update to include only non-ignored files in the checksum. But unfortunately the […] Also having a look at the failing Windows tests. I don't have access to a Windows machine 🤢 but I'll try my best 🐧
@wild-endeavor @kumare3 Bigger refactoring:
PS: I might need some help with Windows debugging.
self.has_git = which("git") is not None
self.ignored = self._list_ignored()

def _list_ignored(self) -> Dict:
this is great
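One way to address the earlier concern about running git once per file is to ask git for all ignored paths in a single call and cache the result. The sketch below assumes that approach; `list_git_ignored` is a hypothetical name and this is not the PR's `_list_ignored` implementation. It relies on `git ls-files --others --ignored --exclude-standard`, which lists untracked files that git ignores.

```python
import os
import subprocess
from shutil import which
from typing import Set


def list_git_ignored(root: str) -> Set[str]:
    """Return paths (relative to root) of untracked files that git ignores.

    Hypothetical helper. Returns an empty set when the git CLI is missing
    or when `root` is not inside a git repository.
    """
    if which("git") is None:
        return set()
    result = subprocess.run(
        # -z separates entries with NUL bytes and disables path quoting
        ["git", "ls-files", "--others", "--ignored", "--exclude-standard", "-z"],
        cwd=root,
        capture_output=True,
    )
    if result.returncode != 0:
        return set()
    return {os.fsdecode(p) for p in result.stdout.split(b"\0") if p}
```

A filter class could call this once in its constructor and then answer each lookup from the cached set, so the cost of the subprocess does not grow with the number of files.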
@bimtauer some tests are failing and there are merge conflicts. But this otherwise looks awesome 🙏🏽
Signed-off-by: Tim Bauer <[email protected]>
@kumare3 I need help debugging the Windows tests. I don't have access to a Windows dev environment and cannot figure out why the tests fail there. Otherwise this should be ready.
@eapolinario Thank you for pushing it over the finish line!
TL;DR
Implements flyteorg/flyte#1437 by refactoring the tarfile filter logic to use a group of filter classes. The previous filtering logic is preserved in StandardIgnore.
Decided not to go with a separate .flyteignore. Covering .gitignore and .dockerignore, if present, should match a user's intentions and avoids having to manage yet another file. If it's in .gitignore, you probably don't want it in source control (from where you probably build at least your production images), and if it's in .dockerignore, you probably don't want it in the task-running container anyway. So following these two should be good :)
And if you like it messy, then as before, the built-in filtering saves you from packaging cache and pyc files.
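As a rough illustration of what "a group of filter classes" can look like, here is a minimal sketch. The class names `Ignore`, `StandardIgnoreSketch`, `NaiveDockerIgnore`, and `IgnoreGroup`, as well as the pattern list, are assumptions for this example and not the PR's code. In particular, `NaiveDockerIgnore` uses the standard library's fnmatch as a simplified stand-in and does not reproduce real .dockerignore semantics such as `!` negation or `**`.

```python
import os
from abc import ABC, abstractmethod
from fnmatch import fnmatch
from typing import List, Type


class Ignore(ABC):
    """Base class: decides whether a path relative to `root` is ignored."""

    def __init__(self, root: str):
        self.root = root

    @abstractmethod
    def is_ignored(self, path: str) -> bool: ...


class StandardIgnoreSketch(Ignore):
    """Built-in patterns (illustrative list) matched against each path part."""

    PATTERNS = ["*.pyc", "__pycache__", ".cache"]

    def is_ignored(self, path: str) -> bool:
        parts = path.replace(os.sep, "/").split("/")
        return any(fnmatch(part, pat) for part in parts for pat in self.PATTERNS)


class NaiveDockerIgnore(Ignore):
    """Reads <root>/.dockerignore; fnmatch is a simplification, not docker-py."""

    def __init__(self, root: str):
        super().__init__(root)
        self.patterns: List[str] = []
        path = os.path.join(root, ".dockerignore")
        if os.path.isfile(path):
            with open(path) as f:
                self.patterns = [
                    ln.strip() for ln in f if ln.strip() and not ln.startswith("#")
                ]

    def is_ignored(self, path: str) -> bool:
        posix = path.replace(os.sep, "/")
        return any(fnmatch(posix, pat) for pat in self.patterns)


class IgnoreGroup(Ignore):
    """A path is ignored if any member filter ignores it."""

    def __init__(self, root: str, ignores: List[Type[Ignore]]):
        super().__init__(root)
        self.ignores = [cls(root) for cls in ignores]

    def is_ignored(self, path: str) -> bool:
        return any(i.is_ignored(path) for i in self.ignores)
```

The group exposes the same `is_ignored` interface as its members, so the packaging code only needs one predicate regardless of how many ignore sources are active.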
Type
Are all requirements met?
Complete description
The GitIgnore checks for the presence of the git CLI and then uses it to check whether a path is ignored. Given the complexity of gitignores, with possible nesting and various wildcards, this was the safest way to reliably reproduce actual gitignore behavior.
The DockerIgnore uses the PatternMatcher from docker-py to match file paths against patterns read from a .dockerignore, if present. Docker only takes into account a .dockerignore file at the root of the build context, not in subdirectories, so we only check for that. The behavior should thus be identical to what docker build would include.
The archiving code was refactored into create_archive for less repetition and separate unit testing.
The previous filter function and its test were removed since the new feature covers that functionality.
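A minimal sketch of how an archiving helper can apply an ignore predicate through tarfile's `filter` argument follows. The name `create_archive_sketch` and its signature are assumptions for illustration; the PR's actual create_archive may differ.

```python
import tarfile
from typing import Callable, Optional


def create_archive_sketch(
    source: str, archive_path: str, is_ignored: Callable[[str], bool]
) -> None:
    """Write a gzipped tarball of `source`, skipping ignored paths.

    Hypothetical helper: `is_ignored` receives the member name, which is the
    path relative to `source` because the root is added with arcname="".
    """

    def tar_filter(info: tarfile.TarInfo) -> Optional[tarfile.TarInfo]:
        # The root directory itself has an empty name and is always kept.
        # Returning None drops the member; a dropped directory is not recursed into.
        if info.name and is_ignored(info.name):
            return None
        return info

    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source, arcname="", filter=tar_filter)
```

Keeping the archive step behind a single predicate is what allows it to be unit tested separately from the individual ignore implementations.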
Tracking Issue
flyteorg/flyte#1437
Follow-up issue
NA