Fix buffer over-read and memory leaks when using long filepaths in minizip API #69677
@RevoluPowered If you're using this, you may want to review.
It may be worth checking if this PR helps with #34626. (Not blocking for a merge, but it may be worth looking into for a future PR.)
Is there anything needed to advance this PR?
I have been testing this with some troublesome .zip files which contain very long file names (file hashes and directories longer than 256/260 characters), and this PR solves the issues very nicely. I also tested that the .zip files in #34626 can at least be read by the ZIPReader now, but that might also be due to the minizip library updates made after that 2019 bug report (or maybe something other than the zip reading causes that bug). At least .zips created using Windows 10 native functionality, 7-zip and Python's zlib seem to work with this PR, long file names and all.
One suggestion I have is tweaking the implementation of godot_unzip_get_current_file_info:
    int godot_unzip_get_current_file_info(unzFile p_zip_file, unz_file_info64 &r_file_info, String &r_filepath) {
        // First, ask only for the file name size. If that succeeds, allocate memory and actually retrieve the name.
        if (unzGetCurrentFileInfo64(p_zip_file, &r_file_info, nullptr, 0, nullptr, 0, nullptr, 0) == UNZ_OK) {
            LocalVector<char> path;
            path.resize(r_file_info.size_filename);
            if (unzGetCurrentFileInfo64(p_zip_file, &r_file_info, path.ptr(), path.size(), nullptr, 0, nullptr, 0) == UNZ_OK) {
                r_filepath = String::utf8(path.ptr(), path.size());
                return UNZ_OK;
            }
        }
        return UNZ_ERRNO;
    }
While I agree that the raw memory allocation is somewhat distasteful, I kept it in the shape I found it in the original code because it follows the same style used throughout the engine: using minizip on the stack when and where possible. Your version instead does multiple memory allocations (even on the good path!) and always drops the first block on the floor (a good candidate for stack use). The way it is currently written permits that faster path of reading from the zip.
I don't want to dumb the code down, because doing so cuts corners and doesn't offer much in terms of maintenance. How often should this code change or be modified? I'd prefer the slightly uglier code that does fewer allocations and fewer repeated reads from the zip.
In saying all of that, I should probably have changed the memory allocation to use
Can anyone review this PR to move it forward / identify blockers?
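The trade-off described in this comment (try a fixed stack buffer first, and only allocate when the name does not fit) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: `FakeEntry`, `fake_read_name`, `get_entry_name` and `kStackBufferSize` are hypothetical names, the buffer size is arbitrary, and `std::string` stands in for Godot's `String`.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for an archive entry; this is not a minizip type.
struct FakeEntry {
    std::string stored_name;
};

// Illustrative size only; the right value depends on the target's stack limits.
static const size_t kStackBufferSize = 256;

// Hypothetical reader in the style of C archive APIs: copies at most buf_size
// bytes, does NOT null-terminate, and returns the full length of the name.
static size_t fake_read_name(const FakeEntry &entry, char *buf, size_t buf_size) {
    const size_t full = entry.stored_name.size();
    const size_t n = full < buf_size ? full : buf_size;
    if (n > 0) {
        std::memcpy(buf, entry.stored_name.data(), n);
    }
    return full;
}

// Fast path: one read into a stack buffer. Slow path: a second read into a
// heap buffer sized from the length reported by the first read.
std::string get_entry_name(const FakeEntry &entry, bool *used_heap = nullptr) {
    char stack_buf[kStackBufferSize];
    const size_t full = fake_read_name(entry, stack_buf, sizeof(stack_buf));
    if (full <= sizeof(stack_buf)) {
        if (used_heap) {
            *used_heap = false;
        }
        // Pass the explicit length, since the buffer is not null-terminated.
        return std::string(stack_buf, full);
    }
    std::vector<char> heap_buf(full);
    fake_read_name(entry, heap_buf.data(), heap_buf.size());
    if (used_heap) {
        *used_heap = true;
    }
    return std::string(heap_buf.data(), heap_buf.size());
}
```

In this sketch, short names cost one read and no temporary allocation, while long names cost a second read plus one temporary buffer. The returned string still allocates in both cases.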
Seems fine to me.
Note: There are a few unnecessary // comments that should be removed. And auto.
Anything needed to progress this, @Macksaur? Looks like the only remaining issue is removing the auto keyword and replacing it with an explicit
I would also reduce the stack buffer size to maybe 2048 or even 1024 instead of 16K. There are several platforms where the thread stack size is very small. At RocketChat, people porting things to WASM and Nintendo 3DS noticed that minizip blows the stack almost immediately with a 64K stack allocation, and they had to patch it.
Hello! This PR is done and approved as far as I can tell and has been that way for at least three months. I have not been notified of anything outstanding and I have already made any necessary changes. (@akien-mga @YuriSizov ?)
Regarding
As for the stack size issue? Wow! That's not good! However, I don't want to special case this PR when it's already just lingering. Searching the codebase for
Well, I think the fact that you have I'm not saying that your reasons are not valid, but that's the reason it's not been merged yet.
By the way, before this can be merged, it would be good to rebase it, since it hasn't been updated in 5 months.
…zip archive and improved robustness of long filepaths and reading files.
Thanks! And congrats on your first merged Godot contribution 🎉
List<String> of filepaths due to implicit String(c_str) constructor use.
ZIPReader::get_files().
unzGetCurrentFileInfo replacement method, godot_unzip_get_current_file_info, that safely gets the file information and filepath as a String without exposing/leaking dangerous C strings.
unzLocateFile replacement method, godot_unzip_locate_file, that handles long filepaths, as the default unzLocateFile doesn't handle paths >256 characters at all (it could if it wanted to...). This method also uses Godot's String class for comparison in order to support Unicode case-comparison.
ZIPReader::read_file, it should now safely handle uncompressed blobs of up to 2 GB and verify each action it takes before returning data.
ZipAppend enum.
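As a rough illustration of the locate-by-name idea in the list above, the sketch below scans a list of entry names and compares whole strings, so path length is not capped by a fixed-size buffer. It is only a sketch under assumptions: `locate_entry` and `equals_ascii_nocase` are hypothetical names, the entries are a plain in-memory list rather than a zip directory, and the case-insensitive branch is ASCII-only, whereas the PR description says the real method relies on Godot's String class for Unicode-aware comparison.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// ASCII-only case-insensitive equality. Assumption: a simplified stand-in for
// a Unicode-aware comparison such as the one a full string class provides.
static bool equals_ascii_nocase(const std::string &a, const std::string &b) {
    if (a.size() != b.size()) {
        return false;
    }
    for (size_t i = 0; i < a.size(); ++i) {
        const int ca = std::tolower(static_cast<unsigned char>(a[i]));
        const int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb) {
            return false;
        }
    }
    return true;
}

// Returns the index of the first entry whose name matches `wanted`, or -1 if
// none does. Names are compared in full, whatever their length.
long locate_entry(const std::vector<std::string> &names, const std::string &wanted, bool case_sensitive) {
    for (size_t i = 0; i < names.size(); ++i) {
        const bool match = case_sensitive ? (names[i] == wanted) : equals_ascii_nocase(names[i], wanted);
        if (match) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

A linear scan keeps the sketch short. It says nothing about how the actual method walks the archive's central directory.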