Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better discovery of macos system version to determine proper platform tag #314

Merged
merged 9 commits into from
Oct 23, 2019

Conversation

Czaki
Copy link
Contributor

@Czaki Czaki commented Sep 24, 2019

In this moment wheel package create python wheel with same platform tag like python was compiled. But wheel may contain libraries which needs higher macos version. In this patch I add code which scan all files added to wheel and if their suffix suggest that it is library (*.so or *.dynlib) and then try to determine macos version by scan Mach-O header. This code is pure python and do not depends on external files. The struct in comments are copied from Apple heade files.

To manually check library needs use otool on file. and look on LC_BUILD_VERSION or LC_VERSION_MIN_MACOSX section.

To set macosx minimum version set environment variable MACOSX_DEPLOYMENT_TARGET before compile. If compilation is launched with python setup.py ... then the version from variable need to be higher or equal than version returned by distutils.util.get_platform()

travis ci fail is connected with python 2.7.3 and occurs also in original data. It is not connected with my changes.

pos = lib_file.tell()
segment_base = read_data(SegmentBase, lib_file)
lib_file.seek(pos)
if segment_base.cmd == LC_VERSION_MIN_MACOSX:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it guaranteed that LC_VERSION_MIN_MACOSX always appears before LC_BUILD_VERSION?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are disjoint. form macosx version 10.8 to 10.13 there is LC_VERSION_MIN_MACOSX section and since 10.14 (current) there is LC_BUILD_VERSION section.

As you can see both contains return statement so if function found one of them then it finish.

@codecov
Copy link

codecov bot commented Sep 25, 2019

Codecov Report

Merging #314 into master will increase coverage by 4.92%.
The diff coverage is 91.89%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #314      +/-   ##
==========================================
+ Coverage   59.71%   64.64%   +4.92%     
==========================================
  Files          12       13       +1     
  Lines         844      987     +143     
==========================================
+ Hits          504      638     +134     
- Misses        340      349       +9
Impacted Files Coverage Δ
wheel/bdist_wheel.py 51.61% <0%> (-0.49%) ⬇️
wheel/lib_file_analyse.py 93.51% <93.51%> (ø)
wheel/pep425tags.py 47.58% <94.59%> (+15.44%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b227ddd...9144297. Read the comment docs.

…on and description on begin of Mach-O parser
@@ -126,8 +141,23 @@ def get_platform(archive_root):
if version is not None:
base_version = max(base_version, version)
if base_version[-1] == 0:
base_version = base_version[:-1]
base_version = "_".join([str(x) for x in base_version])
fin_base_version = base_version[:-1]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like fin_base_version is undefined if the if-condition is not true. This would break the code below, right? Is there any reason why you try to support x.y.z instead of just x.y?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use x.y.z because this is format used inside libraries header. This is 190 line of file lib_file_analyse.py

uint32_t	version;	/* X.Y.Z is encoded in nibbles xxxx.yy.zz */

I push fix for undefined variable.

@codecov
Copy link

codecov bot commented Sep 26, 2019

Codecov Report

Merging #314 into master will increase coverage by 5.17%.
The diff coverage is 91.77%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #314      +/-   ##
==========================================
+ Coverage   59.71%   64.89%   +5.17%     
==========================================
  Files          12       13       +1     
  Lines         844      997     +153     
==========================================
+ Hits          504      647     +143     
- Misses        340      350      +10
Impacted Files Coverage Δ
wheel/bdist_wheel.py 51.61% <0%> (-0.49%) ⬇️
wheel/macosx_libfile.py 92.72% <92.72%> (ø)
wheel/pep425tags.py 50.32% <95.55%> (+18.18%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b227ddd...b1dd2c4. Read the comment docs.

@Czaki
Copy link
Contributor Author

Czaki commented Sep 26, 2019

All improvements comes from request from the discussion in matthew-brett/delocate#56

Copy link
Contributor

@agronholm agronholm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code style changes required. As for the actual logic, I am not qualified to review that.

setup.py Outdated Show resolved Hide resolved
wheel/pep425tags.py Outdated Show resolved Hide resolved
tests/test_lib_file_analyse.py Outdated Show resolved Hide resolved
wheel/bdist_wheel.py Show resolved Hide resolved
dirname = os.path.dirname(__file__)
dylib_dir = os.path.join(dirname, "testdata",
"macos_minimal_system_version")
with mocker.patch("distutils.util.get_platform", return_value="macosx-10.14-x86_64"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're using mocks in with blocks, what's the point of using mocker in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm learning how to use mock. If I good understand in python 2.7 there is preference to use mock library and in python 3.4+ use unittests.mock.

And I do not have idea how test this functions without mock.

How to fix it in best way. change dependency on mock. Do some if depends on python version?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python 2.6 did not have unittest.mock. But Python 2.7 does. The mocker fixture, on the other hand, works like pytest's built-in monkeypatch fixture in that it automatically undoes all the mocks that it sets up during the test request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrite tests to base on monkeypatch. First I think that it is not enough, but maybe I better understand it.

wheel/bdist_wheel.py Outdated Show resolved Hide resolved
wheel/lib_file_analyse.py Outdated Show resolved Hide resolved
@agronholm
Copy link
Contributor

Close enough. Now I'd like to find at least one qualified reviewer for the actual business logic.

@Czaki
Copy link
Contributor Author

Czaki commented Sep 26, 2019

A moment ago I realized that I'm not sure if my change do not break plat-name flag. I'm not sure if I proper understand control flow in distutils.

My changes in bdist_wheel:get_tag.

Could you check it?

@agronholm
Copy link
Contributor

Okay, I'll check.

@agronholm
Copy link
Contributor

The way I see it is that if someone has explicitly set a platform tag starting with macosx, it will get overridden by the detected tag. Couldn't this simply have been set with plat_name = self.plat_name or get_platform(self.bdist_dir)?

@Czaki
Copy link
Contributor Author

Czaki commented Sep 26, 2019

It not fix the poroblem. From the comment on begin of get_tag
# bdist sets self.plat_name if unset, we should only use it for purepy

And I check that on begin of get_tag the name is set.

Maybe finally some patch should be applied to python, but this take very long time (waiting on updates on CI servers, distribution repos etc.) and fixes of wheel package can be used earlier.

@Czaki
Copy link
Contributor Author

Czaki commented Sep 29, 2019

@agronholm I found that if user set platform name by cli argument then plat_name_supplied is set to true (in function finalize_optionswhich is called before `get_tag) so first if is satisfied and there is no overwrite of user choose.

        if self.plat_name_supplied:
            plat_name = self.plat_name

@Czaki
Copy link
Contributor Author

Czaki commented Oct 10, 2019

Did I need introduce any changes?

@agronholm
Copy link
Contributor

No, I've been on holiday for some days and I will review this shortly. Would be nice to also get a statement from someone who understands macOS binary compatibility.

@Czaki
Copy link
Contributor Author

Czaki commented Oct 13, 2019

I can send you macos headers (*.h) on which I base during writing.

@agronholm
Copy link
Contributor

I honestly don't think they would help as I have no expertise on macOS specific development. I'll ask on the distutils-sig mailing list and see if I can get someone with that expertise to review this. Other than that I have no objections to merging this.

wheel/pep425tags.py Outdated Show resolved Hide resolved
@agronholm
Copy link
Contributor

I am also reluctant to add a dependency for wheel, or to vendor such a large project.

@Czaki
Copy link
Contributor Author

Czaki commented Oct 23, 2019

@agronholm There are any news? Someone declared to review the code?

@agronholm
Copy link
Contributor

Nope, so I'm going to go ahead and merge this soon.

@agronholm agronholm merged commit b5733b8 into pypa:master Oct 23, 2019
@agronholm
Copy link
Contributor

Thank you! Now let's just hope that the (mac) world doesn't burn because of this 😄

@Czaki
Copy link
Contributor Author

Czaki commented Oct 24, 2019

Could you inform me when new release was created? I wait for this to update pypa/cibuildwheel#156

@agronholm
Copy link
Contributor

I'll let you know once I've made a new release.

agronholm added a commit that referenced this pull request Oct 25, 2019
@YannickJadoul
Copy link
Member

Just a question: any idea or plan when this new version of wheel would be released?
I definitely don't want to push/rush anything here, but just an estimate would be nice for further planning of our next cibuildwheel release.

@agronholm
Copy link
Contributor

I'm waiting for PR #317 to be finished so I can merge it. I don't want to ship this feature in its current state.

@YannickJadoul
Copy link
Member

Thanks! I had no clue that PR wasn't ready yet. But good to know this is in @Czaki's hands, then; he's also the author of the PR on the cibuildwheel side, so he can coordinate further, then :-)

@Czaki
Copy link
Contributor Author

Czaki commented Dec 26, 2019

@YannickJadoul Could you try to review all changes? (both PR). I testing it in my projects and found two bugs (fixed in PR #317) but maybe it is better to someone with more macos experience take a look on it.

@agronholm In current time I do not see any problem in PR #317 but I learn MacOs on example, so sometimes something surprise me. In this moment I do not have plans for next commits, but I use modified version of wheel in projects to find any problems.

@agronholm
Copy link
Contributor

In current time I do not see any problem in PR #317

Your last message ("I cannot create proper test file but I found such in delocate package. I add missed test case.") indicates that you plan to add a test. Is this not correct?

@Czaki
Copy link
Contributor Author

Czaki commented Dec 26, 2019

@agronholm I add it in this commit 0dd9f3e
(before I write cited comment)

@agronholm
Copy link
Contributor

I see. This is why the correct tense of the verb is important! I will review your PR tomorrow.

@YannickJadoul
Copy link
Member

@YannickJadoul Could you try to review all changes? (both PR). I testing it in my projects and found two bugs (fixed in PR #317) but maybe it is better to someone with more macos experience take a look on it.

I'm happy to take a look. Which PRs do you mean?
But I really don't have more macOS experience/knowledge than you, I'm afraid. I learned a lot about different versions of macOS to be targeted through your PRs. Though, still happy to take a look, if you want.

@Czaki
Copy link
Contributor Author

Czaki commented Dec 27, 2019

@YannickJadoul In both (This and #317) In file https://github.com/Czaki/wheel/blob/master/wheel/pep425tags.py#L112
I introduce function calculate_macosx_platform_tag function.
All tests are here: https://github.com/Czaki/wheel/blob/master/tests/test_macosx_libfile.py

If you have time to check if logic is proper and all most important cases are covered.

@agronholm Sorry. I need to remember to check text before accept comment.

Copy link
Member

@YannickJadoul YannickJadoul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Czaki If you wanted some comments on this, here you go. I don't know enough about these mac headers to say a lot about that part, and for the rest, these are mainly some suggestions and thoughts.

The only two things I'm sure of could be changed are some details in the two messages.

EDIT: If you prefer for me to do it, I can also just create a PR that fixes the typos?

Comment on lines +212 to +215
return (((x << 24) & 0xFF000000) |
((x << 8) & 0x00FF0000) |
((x >> 8) & 0x0000FF00) |
((x >> 24) & 0x000000FF))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using struct might be shorter/easier? It's also closely related to what you're trying to do (fix endianess). (cfr. https://stackoverflow.com/a/27506692)

and if apple introduce some changes both implementation will need to be updated.
"""

import ctypes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might not be worth changing everything, but isn't there also the struct built-in library that makes readying binary data easy? (See https://stackoverflow.com/a/52035410 for a discussion on struct vs. ctypes.) So struct.unpack(...) might be slightly shorter to use. But again, probably not worth changing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decide to use ctypes because I can use types like uint8. Struct library has only standard sizes (long, etc), not fixed sizes. And apple use fixed sizes in header.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, OK. But it seems that these standard sizes can be enforced:

Note

By default, the result of packing a given C struct includes pad bytes in order to maintain proper alignment for the C types involved; similarly, alignment is taken into account when unpacking. This behavior is chosen so that the bytes of a packed struct correspond exactly to the layout in memory of the corresponding C struct. To handle platform-independent data formats or omit implicit pad bytes, use standard size and alignment instead of native size and alignment: see Byte Order, Size, and Alignment for details.

https://docs.python.org/3/library/struct.html (and even already on Python 2 https://docs.python.org/2/library/struct.html).

But having uint8 might indeed be cleaner. Just thought I'd mention it :-)

problematic_files = "\n".join(problematic_files)
error_message = \
"[WARNING] This wheel needs higher macosx version than {} " \
"is set in MACOSX_DEPLOYMENT_TARGET variable. " \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to remove this line, after adding the format, later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand. I use this text in format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on if "MACOSX_DEPLOYMENT_TARGET" in os.environ, it can be that "is set in MACOSX_DEPLOYMENT_TARGET variable." will be twice in the string, since you format it later and it is already in the format string.

wheel/pep425tags.py Show resolved Hide resolved
wheel/pep425tags.py Show resolved Hide resolved
wheel/pep425tags.py Show resolved Hide resolved
wheel/pep425tags.py Show resolved Hide resolved
wheel/macosx_libfile.py Show resolved Hide resolved


def parse_version(version):
zz = version & 2**9-1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here; máybe this is easier with struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try to do this with struct but it is not intuitive for me. You have some suggestion how such code should look?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, OK, I forgot the thing was already converted into an int, and not bytes anymore.
This works, but it's not better, I think: struct.unpack('>HBB', struct.pack('>I', 0x000a0e01)). Not worth using if struct is not used anywhere else, I'd say :-)

Copy link
Member

@YannickJadoul YannickJadoul Jan 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still thinking, though, because the current code confuses me a bit. Is this cleaner? I don't know, up to you, and just close if you like your solution better:

def parse_version(version): 
    x = (version & 0xffff0000) >> 16
    y = (version & 0x0000ff00) >> 8
    z = (version & 0x000000ff)
    return x, y, z

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is much cleaner. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants