-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sdist determinisitic by setting gzip mtime to 0 #870
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Can you fix the formatting?
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
@@ -64,7 +64,7 @@ def build(self, target_dir=None): # type: (Path) -> Path | |||
target = target_dir / "{}-{}.tar.gz".format( | |||
self._package.pretty_name, self._meta.version | |||
) | |||
gz = GzipFile(target.as_posix(), mode="wb") | |||
gz = GzipFile(target.as_posix(), mode="wb", mtime=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @achow101,
thanks a lot for your contribution. Fixing this looks necessary to me. Are there any reasons for settings the mtime to 0 and not to the current time via time.time()
?
fin swimmer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's literally the whole point of this PR. A fixed time is needed otherwise the resulting archive is non-deterministic. Setting to time.time()
means that the timestamp will continue to be variable which is what I'm trying to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah , sorry it was obviously to late yesterday. The current timestamp is already used if the parameter is omitted and I thought it is something similar to the already merged #1541.
So let's change my question :) Why do you want the file to be deterministic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I subscribe to the idea that the releases that you publish should be built in such a way that other people can repeat the build process and produce the exact same files, down to the bit. This makes it much easier to audit the published binaries/archives. https://reproducible-builds.org/ basically explains why.
There's also a bunch of things already in poetry which make the build results more reproducible, so reproducibility is clearly something that is intended in this project.
Thanks for your contribution! This looks reasonable to me. Could you rebase your changes onto |
Rebased |
Apoligies for dropping the ball on this one. Feel free to ping on discord if something falls through the crack again. We appricate your contributions, issues reports, PRs and doc fixes alike. @achow101 this needs to move |
@achow101 closing this here as this needs to go to poetry-core now anyway. |
Moved to poetry-core: python-poetry/poetry-core#105 |
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
A GZip file contains a timestamp with it's last modification time. This timestamp by default is the current time which will make
poetry build
produce non-deterministic sdist archives. To make sdist archives deterministic, this timestamp must be set to a fixed time. So this PR sets that time to 0.