Add lenient file-type checking mode #10862 #10863
Merged
jacobtylerwalls merged 9 commits into dev/7.6.x from jtw/deprecate-file-type-checking-setting on Jun 25, 2024
Changes from 5 commits
Commits
bcaae04 Add lenient file-type checking mode #10862 (jacobtylerwalls)
56300cc nit re #10862 (jacobtylerwalls)
8125407 logic nit re #10862 (jacobtylerwalls)
ca64695 Surface error message if non-zip file given to JSON-LD importer (jacobtylerwalls)
52afd2a Complete test coverage of file validator #10862 (jacobtylerwalls)
80b39f1 Bump filetype to 1.2.0 (jacobtylerwalls)
2c5386e Continue debugging flaky test (jacobtylerwalls)
90557ee Use SimpleTestCase re #10862 (jacobtylerwalls)
28cf693 Add documentation link (content to follow) (jacobtylerwalls)
@@ -0,0 +1,144 @@
import os
import shutil
from pathlib import Path

from unittest.mock import Mock, patch

from django.conf import settings
from django.test import TestCase
from django.test.utils import override_settings

from arches.app.utils.file_validator import FileValidator

# these tests can be run from the command line via
# python manage.py test tests.utils.test_file_validator.FileValidatorTests --settings="tests.test_settings"


class MockFile:
    @staticmethod
    def read():
        """Return a jagged csv file (invalid row length)"""
        return b"col1,col2\ndata1"

    @staticmethod
    def seek(offset):
        return


class MockFileType:
    def __init__(self, extension):
        self.extension = extension


class FileValidatorTests(TestCase):
    """FILE_TYPE_CHECKING defaults to 'lenient': overridden as necessary."""

    validator = FileValidator()
    mock_file = MockFile()

    @override_settings(FILE_TYPE_CHECKING=None)
    def test_no_file_checking(self):
        errors = self.validator.validate_file_type(self.mock_file)
        self.assertEqual(errors, [])

    @patch("filetype.guess", Mock(return_value=None))
    def test_check_unknown_filetype_lenient(self):
        errors = self.validator.validate_file_type(self.mock_file)
        self.assertEqual(errors, [])

    @patch("filetype.guess", Mock(return_value=None))
    @override_settings(FILE_TYPE_CHECKING="strict")
    def test_check_unknown_filetype_strict(self):
        with self.assertLogs("arches.app.utils.file_validator", level="ERROR"):
            errors = self.validator.validate_file_type(self.mock_file)
        self.assertEqual(errors, ["File type is not permitted: None"])

    @patch("filetype.guess", Mock(return_value=MockFileType("suspicious")))
    def test_filetype_not_listed(self):
        with self.assertLogs("arches.app.utils.file_validator", level="ERROR"):
            errors = self.validator.validate_file_type(self.mock_file)
        self.assertEqual(errors, ["Unsafe file type suspicious"])

    @patch("filetype.guess", Mock(return_value=None))
    def test_check_invalid_csv(self):
        with self.assertLogs("arches.app.utils.file_validator", level="ERROR"):
            errors = self.validator.validate_file_type(self.mock_file, extension="csv")
        self.assertEqual(errors, ["Invalid csv file"])

    @patch("filetype.guess", Mock(return_value=None))
    def test_check_invalid_json(self):
        with self.assertLogs("arches.app.utils.file_validator", level="ERROR"):
            errors = self.validator.validate_file_type(self.mock_file, extension="json")
        self.assertEqual(errors, ["Invalid json file"])

    @patch("filetype.guess", Mock(return_value=None))
    def test_check_invalid_jpeg_lenient(self):
        errors = self.validator.validate_file_type(self.mock_file, extension="jpeg")
        self.assertEqual(errors, [])

    @patch("filetype.guess", Mock(return_value=None))
    @override_settings(FILE_TYPE_CHECKING="strict")
    def test_check_invalid_jpeg_strict(self):
        with self.assertLogs("arches.app.utils.file_validator", level="ERROR"):
            errors = self.validator.validate_file_type(self.mock_file, extension="jpeg")
        self.assertEqual(errors, ["Cannot validate file"])

    @patch("filetype.guess", Mock(return_value=None))
    @override_settings(FILE_TYPE_CHECKING="strict")
    def test_check_invalid_but_not_in_listed_types(self):
        with self.assertLogs("arches.app.utils.file_validator", level="ERROR"):
            errors = self.validator.validate_file_type(
                self.mock_file, extension="notlisted"
            )
        self.assertEqual(errors, ["File type is not permitted: notlisted"])

    @patch("filetype.guess", Mock(return_value=None))
    def test_check_dsstore_lenient(self):
        """In lenient mode, we assume these might be present in .zip files."""
        with self.assertLogs("arches.app.utils.file_validator", level="WARN"):
            errors = self.validator.validate_file_type(
                self.mock_file, extension="DS_Store"
            )
        self.assertEqual(errors, [])

    @patch("filetype.guess", Mock(return_value=None))
    @override_settings(FILE_TYPE_CHECKING="strict")
    def test_check_dsstore_strict(self):
        with self.assertLogs("arches.app.utils.file_validator", level="ERROR"):
            errors = self.validator.validate_file_type(
                self.mock_file, extension="DS_Store"
            )
        self.assertEqual(errors, ["File type is not permitted: DS_Store"])

    @patch("filetype.guess", Mock(return_value=None))
    @patch("arches.app.utils.file_validator.load_workbook", lambda noop: None)
    def test_valid_xlsx(self):
        errors = self.validator.validate_file_type(self.mock_file, extension="xlsx")
        self.assertEqual(errors, [])

    @patch("filetype.guess", Mock(return_value=None))
    def test_invalid_xlsx(self):
        with self.assertLogs("arches.app.utils.file_validator", level="ERROR"):
            errors = self.validator.validate_file_type(self.mock_file, extension="xlsx")
        self.assertEqual(errors, ["Invalid xlsx workbook"])

    def test_zip(self):
        # Zip up the files in the tests/fixtures/uploadedfiles dir
        # Currently, contains a single .csv file and an empty dir.
        dir_to_zip = Path(settings.MEDIA_ROOT) / "uploadedfiles"
        zip_file_path = shutil.make_archive("test", "zip", dir_to_zip, dir_to_zip)
        self.addCleanup(os.unlink, zip_file_path)

        with open(zip_file_path, "rb") as file:
            errors = self.validator.validate_file_type(file)
        self.assertEqual(errors, [])

        with (
            open(zip_file_path, "rb") as file,
            self.modify_settings(FILE_TYPES={"remove": "csv"}),
            self.assertLogs("arches.app.utils.file_validator", level="ERROR"),
        ):
            errors = self.validator.validate_file_type(file)
        self.assertEqual(
            errors, ["File type is not permitted: csv", "Unsafe zip file contents"]
        )
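Read together, the tests above imply a small decision table for files whose content type cannot be guessed (the cases where `filetype.guess` is patched to return `None`). The sketch below only restates that table; it is not the `FileValidator` implementation. The function name `check_unrecognized`, the `ALLOWED` set, and the warning text are hypothetical, and the csv, json, xlsx, and zip branches (which parse the file contents) are left out.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical allow-list for illustration; in the tests the real list
# comes from settings.FILE_TYPES.
ALLOWED = {"jpeg", "zip"}


def check_unrecognized(extension, mode, allowed=ALLOWED):
    """Return error strings for a file whose content type was not guessed.

    mode mirrors the FILE_TYPE_CHECKING values used in the tests:
    None (no checking), "lenient", or "strict".
    """
    if mode is None:
        # Checking disabled: nothing is rejected.
        return []
    if extension not in allowed:
        if mode == "lenient":
            # Lenient mode logs a warning and lets the file through.
            logger.warning("Unlisted file type allowed: %s", extension)
            return []
        return [f"File type is not permitted: {extension}"]
    if mode == "strict":
        # Listed type, but nothing confirmed the contents.
        return ["Cannot validate file"]
    return []
```

For instance, `check_unrecognized("DS_Store", "lenient")` returns an empty list and logs a warning, while the same call with `"strict"` returns the "not permitted" message, matching `test_check_dsstore_lenient` and `test_check_dsstore_strict`.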
Noticed that in #10885 I forgot to surface this error in the UI when FILE_TYPE_CHECKING is false and we go straight to trying to read the file as a zip file.
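As a rough illustration of the case described in this comment, the sketch below shows one way an importer could turn a failed zip read into an error message instead of letting the exception propagate. It uses only the standard library `zipfile` module; the helper name `read_zip_names` and the error text are assumptions, not what the JSON-LD importer in this PR does.

```python
import io
import zipfile


def read_zip_names(uploaded):
    """Return (member_names, errors) for a file-like object.

    Hypothetical helper: if the upload is not a zip archive, report
    that as an error string the UI can display.
    """
    try:
        with zipfile.ZipFile(uploaded) as archive:
            return archive.namelist(), []
    except zipfile.BadZipFile:
        return [], ["Uploaded file is not a valid zip archive"]


# A csv payload passed where a zip is expected yields an error, not a crash.
names, errors = read_zip_names(io.BytesIO(b"col1,col2\ndata1"))
```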