Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect zipped symlinks #1140

Merged
merged 73 commits into from
Jul 26, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
73 commits
Select commit Hold shift + click to select a range
b428871
Merge pull request #809 from awslabs/develop
sanathkr Nov 29, 2018
b493394
Merge pull request #815 from awslabs/develop
jfuss Nov 30, 2018
7e08351
Merge pull request #856 from awslabs/develop
jfuss Dec 12, 2018
afce048
Merge pull request #882 from awslabs/develop
sriram-mv Dec 21, 2018
b23d425
Merge pull request #980 from awslabs/develop
jfuss Feb 5, 2019
7ddb5e4
Merge pull request #1031 from awslabs/develop
sanathkr Mar 1, 2019
d88100f
Merge pull request #1051 from awslabs/develop
jfuss Mar 11, 2019
c861d49
Merge pull request #1077 from awslabs/develop
sriram-mv Mar 25, 2019
f01b6d5
Merge pull request #1083 from awslabs/develop
sriram-mv Mar 26, 2019
21be617
Merge pull request #1089 from awslabs/develop
sriram-mv Mar 27, 2019
0c421c1
Merge pull request #1132 from awslabs/develop
sriram-mv Apr 16, 2019
f308c7e
issymlink() function to determine file type
bubba-h57 Apr 19, 2019
68449db
Refactor to simply tracking the extracted_path
bubba-h57 Apr 19, 2019
ee20909
Refactor to simply tracking the extracted_path
bubba-h57 Apr 19, 2019
9707304
Implement _extract() to handle symlinks
bubba-h57 Apr 19, 2019
6744e20
Labels are better than bare octal values
bubba-h57 Apr 19, 2019
3bc3d9c
Modify the files_with_permissions structure, and document the details
bubba-h57 Apr 19, 2019
2d93efc
Use os.lchmod() to respect symlinks.
bubba-h57 Apr 20, 2019
e8ac37b
We can use raw values.
bubba-h57 Apr 20, 2019
3ebdc6b
Configure the files for zipping.
bubba-h57 Apr 20, 2019
2dbb985
Verify that we did indeed find the correct number of regular files an…
bubba-h57 Apr 20, 2019
611fd0e
File Type Count Verification
bubba-h57 Apr 20, 2019
3c47b60
Verify external attributes.
bubba-h57 Apr 20, 2019
8a36c42
Verify a file entry.
bubba-h57 Apr 20, 2019
b21ec11
Cleanup zip creation
bubba-h57 Apr 20, 2019
b9a4d08
Cleanup must unzip test entry
bubba-h57 Apr 20, 2019
96177ad
issymlink() function to determine file type
bubba-h57 Apr 19, 2019
fdd34a1
Refactor to simply tracking the extracted_path
bubba-h57 Apr 19, 2019
dff30a5
Refactor to simply tracking the extracted_path
bubba-h57 Apr 19, 2019
26136f7
Implement _extract() to handle symlinks
bubba-h57 Apr 19, 2019
027aaa9
Labels are better than bare octal values
bubba-h57 Apr 19, 2019
9de2d8f
Modify the files_with_permissions structure, and document the details
bubba-h57 Apr 19, 2019
8117e2f
Use os.lchmod() to respect symlinks.
bubba-h57 Apr 20, 2019
897dc77
We can use raw values.
bubba-h57 Apr 20, 2019
c4abf92
Configure the files for zipping.
bubba-h57 Apr 20, 2019
37a6d4d
Verify that we did indeed find the correct number of regular files an…
bubba-h57 Apr 20, 2019
09d3f42
File Type Count Verification
bubba-h57 Apr 20, 2019
c8963e0
Verify external attributes.
bubba-h57 Apr 20, 2019
98664d7
Verify a file entry.
bubba-h57 Apr 20, 2019
a913ea2
Cleanup zip creation
bubba-h57 Apr 20, 2019
e2c1bde
Cleanup must unzip test entry
bubba-h57 Apr 20, 2019
26dcfba
Merge branch 'respect-zipped-symlinks' of github.com:stechstudio/aws-…
bubba-h57 Apr 25, 2019
7ed0edd
resolve `os.lchmod()` is available on OSX but no Travis Xenial.
bubba-h57 Apr 25, 2019
f834f5c
Merge pull request #1168 from awslabs/develop
sriram-mv May 13, 2019
a681eb4
Merge pull request #1184 from awslabs/develop
jfuss May 20, 2019
dc81449
Merge pull request #1208 from awslabs/develop
awood45 Jun 4, 2019
bac006e
issymlink() function to determine file type
bubba-h57 Apr 19, 2019
c96bb8f
Refactor to simply tracking the extracted_path
bubba-h57 Apr 19, 2019
8b67c8d
Refactor to simply tracking the extracted_path
bubba-h57 Apr 19, 2019
edff889
Implement _extract() to handle symlinks
bubba-h57 Apr 19, 2019
23150f3
Labels are better than bare octal values
bubba-h57 Apr 19, 2019
03e0d45
Modify the files_with_permissions structure, and document the details
bubba-h57 Apr 19, 2019
7b6eb18
Use os.lchmod() to respect symlinks.
bubba-h57 Apr 20, 2019
16ba85b
We can use raw values.
bubba-h57 Apr 20, 2019
69144ea
Configure the files for zipping.
bubba-h57 Apr 20, 2019
b0eaf7f
Verify that we did indeed find the correct number of regular files an…
bubba-h57 Apr 20, 2019
3134c08
File Type Count Verification
bubba-h57 Apr 20, 2019
eb7a647
Verify external attributes.
bubba-h57 Apr 20, 2019
420862e
Verify a file entry.
bubba-h57 Apr 20, 2019
ce2fdca
Cleanup zip creation
bubba-h57 Apr 20, 2019
d6c59e1
Cleanup must unzip test entry
bubba-h57 Apr 20, 2019
a89bf29
issymlink() function to determine file type
bubba-h57 Apr 19, 2019
7b947cc
Refactor to simply tracking the extracted_path
bubba-h57 Apr 19, 2019
7d15dbc
Implement _extract() to handle symlinks
bubba-h57 Apr 19, 2019
ac13511
Labels are better than bare octal values
bubba-h57 Apr 19, 2019
58ff5b8
Modify the files_with_permissions structure, and document the details
bubba-h57 Apr 19, 2019
6ec323d
We can use raw values.
bubba-h57 Apr 20, 2019
d8d378a
Configure the files for zipping.
bubba-h57 Apr 20, 2019
9cf3020
Verify external attributes.
bubba-h57 Apr 20, 2019
ea9ba58
Cleanup zip creation
bubba-h57 Apr 20, 2019
ea4afe4
Cleanup must unzip test entry
bubba-h57 Apr 20, 2019
eaa1ba1
resolve `os.lchmod()` is available on OSX but no Travis Xenial.
bubba-h57 Apr 25, 2019
aa7a1fb
Merge branch 'respect-zipped-symlinks' of github.com:stechstudio/aws-…
bubba-h57 Jun 7, 2019
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 66 additions & 5 deletions samcli/local/lambdafn/zip.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,70 @@

LOG = logging.getLogger(__name__)

S_IFLNK = 0xA


def _is_symlink(file_info):
"""
Check the upper 4 bits of the external attribute for a symlink.
See: https://unix.stackexchange.com/questions/14705/the-zip-formats-external-file-attribute

Parameters
----------
file_info : zipfile.ZipInfo
The ZipInfo for a ZipFile

Returns
-------
bool
A response regarding whether the ZipInfo defines a symlink or not.
"""

return (file_info.external_attr >> 28) == 0xA
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be using the S_IFLNK constant from above?

Why not use stat. S_IFLNK here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not looking at a file on the filesystem. It is looking at a number of bytes in the file_info entry, the External Attributes for the file. See The zip format's external file attribute If you want to dig into it. But the short answer is because it isn't a file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. You made a constant S_IFLNK = 0xA above. Can you either remove it the constant or replace the 0xA here with S_IFLNK. Otherwise, it looks like an unused constant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ that was with regard to pythons stat.S_ISLNK() ... I have no good excuse for not comparing to the S_IFLNK = 0xA constant defined directly above the function. :-)



def _extract(file_info, output_dir, zip_ref):
"""
Unzip the given file into the given directory while preserving file permissions in the process.

Parameters
----------
file_info : zipfile.ZipInfo
The ZipInfo for a ZipFile

output_dir : str
Path to the directory where the it should be unzipped to

zip_ref : zipfile.ZipFile
The ZipFile we are working with.

Returns
-------
string
Returns the target path the Zip Entry was extracted to.
"""

# Handle any regular file/directory entries
if not _is_symlink(file_info):
return zip_ref.extract(file_info, output_dir)

source = zip_ref.read(file_info.filename).decode('utf8')
link_name = os.path.normpath(os.path.join(output_dir, file_info.filename))

# make leading dirs if needed
leading_dirs = os.path.dirname(link_name)
if not os.path.exists(leading_dirs):
os.makedirs(leading_dirs)

# If the link already exists, delete it or symlink() fails
if os.path.lexists(link_name):
os.remove(link_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this have any (negative) side effects? I can't think of any at the moment but asking the question to spark a discussion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it fairly extensively and can not think of any negative side effects that might pop up. The unzip intends to overwrite anything that is there regardless ... we will simply delete it prior to writing.


# Create a symbolic link pointing to source named link_name.
os.symlink(source, link_name)

return link_name


def unzip(zip_file_path, output_dir, permission=None):
"""
Expand All @@ -40,10 +104,7 @@ def unzip(zip_file_path, output_dir, permission=None):

# For each item in the zip file, extract the file and set permissions if available
for file_info in zip_ref.infolist():
name = file_info.filename
extracted_path = os.path.join(output_dir, name)

zip_ref.extract(name, output_dir)
extracted_path = _extract(file_info, output_dir, zip_ref)
_set_permissions(file_info, extracted_path)

_override_permissions(extracted_path, permission)
Expand Down Expand Up @@ -81,7 +142,7 @@ def _set_permissions(zip_file_info, extracted_path):
"""

# Permission information is stored in first two bytes.
permission = zip_file_info.external_attr >> 16
permission = (zip_file_info.external_attr >> 16) & 511
if not permission:
# Zips created on certain Windows machines, however, might not have any permission information on them.
# Skip setting a permission on these files.
Expand Down
154 changes: 131 additions & 23 deletions tests/unit/local/lambdafn/test_zip.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,51 +14,102 @@

class TestUnzipWithPermissions(TestCase):

files_with_permissions = {
"folder1/1.txt": 0o644,
"folder1/2.txt": 0o777,
"folder2/subdir/1.txt": 0o666,
"folder2/subdir/2.txt": 0o400
"""
External Attribute Magic = type + permission + DOS is-dir flag?

TTTTugsrwxrwxrwx0000000000ADVSHR
^^^^____________________________ File Type [UPPER 4 bits, 29-32]
^___________________________ setuid [bit 28]
^__________________________ setgid [bit 27]
^_________________________ sticky [bit 26]
^^^^^^^^^________________ Permissions [bits 17-25]
^^^^^^^^________ Other [bits 9-16]
^^^^^^^^ DOS attribute bits: [LOWER 8 bits]

Interesting File Types
S_IFDIR 0040000 /* directory */
S_IFREG 0100000 /* regular */
S_IFLNK 0120000 /* symbolic link */

See: https://unix.stackexchange.com/questions/14705/%20the-zip-formats-external-file-attribute
"""

files_with_external_attr = {
"1.txt": {
"file_type": 0o10,
"contents": b'foo',
"permissions": 0o644,
},
"folder1/2.txt": {
"file_type": 0o10,
"contents": b'bar',
"permissions": 0o777,
},
"folder2/subdir/3.txt": {
"file_type": 0o10,
"contents": b'foo bar',
"permissions": 0o666,
},
"folder2/subdir/4.txt": {
"file_type": 0o10,
"contents": b'bar foo',
"permissions": 0o400,
},
"symlinkToF2": {
"file_type": 0o12,
"contents": b'1.txt',
"permissions": 0o644,
}
}

expected_files = 0
expected_symlinks = 0
actual_files = 0
actual_symlinks = 0

@parameterized.expand([param(True), param(False)])
def test_must_unzip(self, check_permissions):
def test_must_unzip(self, verify_external_attributes):
self._reset(verify_external_attributes)

with self._create_zip(self.files_with_permissions, check_permissions) as zip_file_name:
with self._create_zip(self.files_with_external_attr, verify_external_attributes) as zip_file_name:
with self._temp_dir() as extract_dir:

unzip(zip_file_name, extract_dir)

for root, dirs, files in os.walk(extract_dir):
for file in files:
filepath = os.path.join(extract_dir, root, file)
perm = oct(stat.S_IMODE(os.stat(filepath).st_mode))
key = os.path.relpath(filepath, extract_dir)
expected_permission = oct(self.files_with_permissions[key])
self._verify_file(extract_dir, file, root, verify_external_attributes)

self.assertIn(key, self.files_with_permissions)
self._verify_file_count(verify_external_attributes)

if check_permissions:
self.assertEquals(expected_permission,
perm,
"File {} has wrong permission {}".format(key, perm))
@contextmanager
def _reset(self, verify_external_attributes):
self.expected_files = 0
self.expected_symlinks = 0
self.actual_files = 0
self.actual_symlinks = 0
if verify_external_attributes:
for filename, data in self.files_with_external_attr.items():
if data["file_type"] == 0o12:
self.expected_symlinks += 1
elif data["file_type"] == 0o10:
self.expected_files += 1

@contextmanager
def _create_zip(self, files_with_permissions, add_permissions=True):
def _create_zip(self, file_dict, add_attributes=True):

zipfilename = None
data = b'hello world'
try:
zipfilename = NamedTemporaryFile(mode="w+b").name

zf = zipfile.ZipFile(zipfilename, "w", zipfile.ZIP_DEFLATED)
for filename, perm in files_with_permissions.items():
for filename, data in file_dict.items():

fileinfo = zipfile.ZipInfo(filename)

if add_permissions:
fileinfo.external_attr = perm << 16
if add_attributes:
fileinfo.external_attr = (data["file_type"] << 28) | (data["permissions"] << 16)

zf.writestr(fileinfo, data)
zf.writestr(fileinfo, data["contents"])

zf.close()

Expand All @@ -68,6 +119,63 @@ def _create_zip(self, files_with_permissions, add_permissions=True):
if zipfilename:
os.remove(zipfilename)

@contextmanager
def _verify_file(self, extract_dir, file, root, verify_external_attributes):
filepath = os.path.join(extract_dir, root, file)
key = os.path.relpath(filepath, extract_dir)
mode = os.lstat(filepath).st_mode
actual_permissions = oct(stat.S_IMODE(mode))
expected_permission = oct(self.files_with_external_attr[key]["permissions"])

self.assertIn(key, self.files_with_external_attr)
if verify_external_attributes:
self._verify_external_attributes(
actual_permissions,
expected_permission,
key,
mode)

@contextmanager
def _verify_external_attributes(self, actual_permissions, expected_permission, key,
mode):
if stat.S_ISREG(mode):
self.assertTrue(
self.files_with_external_attr[key]["file_type"] == 0o10,
"Expected a regular file."
)
self.actual_files += 1
elif stat.S_ISLNK(mode):
self.assertTrue(
self.files_with_external_attr[key]["file_type"] == 0o12,
"Expected a Symlink."
)
self.actual_symlinks += 1
return

self.assertEquals(
expected_permission,
actual_permissions,
"File {} has wrong permission {}, expected {}.".format(
key,
actual_permissions,
expected_permission
)
)

@contextmanager
def _verify_file_count(self, verify_external_attributes):
if verify_external_attributes:
self.assertEqual(
self.expected_files,
self.actual_files,
"Expected {} files but found {}.".format(self.expected_files, self.actual_files)
)
self.assertEqual(
self.expected_symlinks,
self.actual_symlinks,
"Expected {} symlinks but found {}.".format(self.expected_symlinks, self.actual_symlinks)
)

@contextmanager
def _temp_dir(self):
name = None
Expand Down