-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix writing multi-byte characters to tar archive #627
Conversation
Can we use codePointCount? Not saying we should, just curious. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addEntry()
seems to be called by toTarballBlob()
to add what's described as a a blob. Why don't we just pass the file contents as byte[]
— or better yet as ByteSource
— and let the caller deal with the encoding?
If we used ByteSource
then we could get rid of the other addEntry(TarArchiveEntry)
and use MoreFiles.asByteSource(filePath)
.
So it looks like the main problem has been fixed, but using multibyte characters in tar archive entry header names on windows might still be sketchy (e.g. adding non-ascii filenames might not work). Multibyte file contents should work on all platforms, though. I'll keep looking into it, but feel free to review this now so we can get the fix out. |
@briandealwis |
@TadCordle oh we should add a CHANGELOG entry for this |
Cool, just in case, have you tested |
Yeah, tested both with the same args used in the original issue. |
Fixes #626. Looks like we were using string length instead of byte array length.