Meson errors with Python 3.10 #11864
It's completely bizarre that the symbolextractor fails without emitting an error message of any sort. It's reminiscent of pypa/pip#10875 (was a bug in the launcher *.exe files pip/distlib uses on Windows, reproducible in CI but not running interactively). |
thanks, good point. The last meson version before was built with the same setuptools version though, so there shouldn't be any difference regarding the launcher. I can try porting the meson package from setuptools to builder/installer so it gets a newer launcher. Maybe that makes a difference, or at least gives some more insight. |
Hmm, meson was built with setuptools launcher which shouldn't have this issue afaik. |
same error with the python-installer launcher (#11865) |
For the record, last I saw the latest version of distlib was the one with the wonky binaries, and pip specifically worked around this by downgrading the bundled distlib and rereleasing. I'm not sure what other installers do on Windows specifically. They may bundle different versions of distlib. |
I fetched the meson build dir from a failed CI build, but there is nothing helpful in there. Just to point out why it might be a different issue: This happens randomly, it sometimes works, sometimes doesn't in CI. It started with the Update to Python 3.10 over the last week, could be a coincidence, but mesa built fine 2 weeks ago. I'll (hackily) try forcing a sys.stderr for symbolextractor in CI and see if that helps next. |
After some more printf debugging: the internal meson command runs through fine, so something after that fails somehow. I'm giving up for now. |
Reason for retry is due to intermittent Python symbol extractor failure under MSYS2 reported in bug: msys2/MINGW-packages#11864
@lazka , after seeing you mention the issue was intermittent, I've placed our Meson builds in a diaper script to re-run on failure. Data points:
What's interesting is that the retry logic is never needed, making me suspect that running inside a shell wrapper is altering the parent-level runtime space/behavior in a way that Python likes (perhaps stacksize or other low-level stuff inside MSYS2) versus the "direct-launch" that's giving it troubles. Perhaps to reproduce this locally, you'd need to launch the CI YAML |
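For readers who want to try the same approach, here is a minimal sketch of such a retry wrapper. It is not the script used in the comment above; the function name `run_with_retry`, the attempt count, and the delay are all hypothetical choices for illustration.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, attempts=3, delay=1.0):
    """Run cmd, re-running it on a non-zero exit status.

    Returns a tuple (returncode, attempts_used).
    """
    returncode = None
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            return 0, attempt
        # Report the failure so intermittent crashes stay visible in CI logs.
        print(
            f"attempt {attempt}/{attempts} failed with status {returncode}",
            file=sys.stderr,
        )
        if attempt < attempts:
            time.sleep(delay)
    return returncode, attempts
```

A build step could then call something like `run_with_retry(["meson", "compile", "-C", "build"])`. Logging `attempts_used` is what makes it possible to tell whether the retry ever actually fired.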
I've forward ported an old ninja PR (ninja-build/ninja#1805) to show the exit status and it's STATUS_ACCESS_VIOLATION. So that could either be the launcher crashing or python itself. I'd guess python itself (?). |
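As a side note on reading such exit statuses: STATUS_ACCESS_VIOLATION is the NTSTATUS value 0xC0000005, which some tools print as the unsigned 3221225477 and others as the signed -1073741819. The sketch below normalizes both forms. The helper name `describe_exit_status` is made up for this example, and the table only lists a few well-known codes.

```python
# A few well-known NTSTATUS codes that show up as process exit statuses.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
}


def describe_exit_status(code):
    """Map a signed or unsigned 32-bit exit status to a readable name."""
    unsigned = code & 0xFFFFFFFF  # fold negative values into the unsigned range
    name = KNOWN_NTSTATUS.get(unsigned)
    if name is not None:
        return f"{name} (0x{unsigned:08X})"
    return f"exit status {code}"
```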
They may be using either .NET or NodeJS. That's what their self hosted runner and actions are using respectively at least. |
Oh wow, interesting. |
A user running GDAL was hit with
    if __name__ == "__main__":
        if sys.platform[:3] == "win":
            sys.setrecursionlimit(10000000)
            threading.stack_size(200000000)
            thread = threading.Thread(target=driver)
            thread.start()
            # launch ... |
This is (hopefully) a temporary work-around for the intermittent Python 3.10 symbol extractor failure under MSYS2, reported in ticket: msys2/MINGW-packages#11864
ELI5? Some Meson internal commands fail randomly and it's unclear whether the problem is in Ninja, Meson or Python 3.10? |
FWIW, the error in #11915 can be reproduced reliably, meaning that it's not random; it happens any time I run ninja. It may be useful to investigate this issue further. EDIT: maybe it's not related? The error message is: |
Yeah, the vala error is different topic. I can try to create a PR today to workaround that issue. |
Ah, thanks! That would be great. I'm not well-versed in valadoc 🙂 |
Not sure, but maybe these links provide some information about this bug: |
it's python itself.. edit: the faulthandler doesn't print any stacktraces, despite python crashing.. edit: I'm out of ideas now.. |
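For anyone repeating this experiment, a minimal sketch of enabling faulthandler with output sent to a file, so that a traceback (if one is produced at all) survives even when stderr is swallowed by the build tool. The helper name `enable_crash_log` and the log path are hypothetical. As noted in the comment above, faulthandler printed nothing for this particular crash, so this is a diagnostic aid, not a fix.

```python
import faulthandler


def enable_crash_log(path):
    """Enable faulthandler and direct its tracebacks to the file at path.

    The returned file object must stay open for as long as faulthandler
    is enabled, because faulthandler writes to its file descriptor.
    """
    log = open(path, "w")
    faulthandler.enable(file=log, all_threads=True)
    return log
```

The same handler can be switched on without code changes by running `python -X faulthandler` or setting `PYTHONFAULTHANDLER=1`, in which case the output goes to stderr.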
I suspect that might be because setting MSYS=winjitdebug causes the error-mode for children to be set to 0. Normally that's set to SEM_FAILCRITICALERRORS | SEM_NOOPENFILEERRORBOX (in normal windows at least). Not having SEM_FAILCRITICALERRORS will cause a bunch of behavioural differences, which could easily prevent the crash from occurring.
|
Turns out it's actually the presence of SEM_NOOPENFILEERRORBOX - I can, very very occasionally, reproduce crashes after os._exit() and before process exit. So this looks very likely to be a microsoft C runtime (or kernel) bug. In my case no msys environment is involved, fwiw. |
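A rough sketch of a harness for counting such rare exit-time failures, assuming the child is a plain `python -c` process that calls `os._exit(0)`. The function name and the run count are hypothetical, and the sketch does not change the Windows error mode itself, so it only tallies whatever exit statuses occur in the environment it runs in.

```python
import subprocess
import sys
from collections import Counter

# Child program: exit immediately via os._exit(), skipping normal cleanup.
CHILD_CODE = "import os; os._exit(0)"


def count_exit_statuses(runs=200):
    """Spawn the child `runs` times and tally the observed exit statuses."""
    statuses = Counter()
    for _ in range(runs):
        result = subprocess.run([sys.executable, "-c", CHILD_CODE])
        statuses[result.returncode] += 1
    return statuses
```

On a healthy system every run should land in the `0` bucket; any other key in the returned counter is a candidate crash worth decoding.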
Could it be a DLL doing weird things from their Actually in
Also documented on MSDN:
Unfortunately many DLLs don't check |
A quick grep through the python sources doesn't show a problematic DLL_PROCESS_DETACH callback that could be involved here. None of them do anything in the DLL_PROCESS_DETACH case. I didn't see any tls callbacks at all, but might be missing something (not a windows person). Given the crash only happens with SEM_NOGPFAULTERRORBOX set it seems more likely to be a bug in the CRT or kernel to me, since that being set should lead to less application involvement. Edit: I copy-pastoed SEM_NOOPENFILEERRORBOX instead of SEM_NOGPFAULTERRORBOX in an earlier version |
Can you reproduce locally? If so, may I ask how are you invoking python? |
Unfortunately only in CI, windows 2019 in my case. The method of invoking python doesn't seem to matter, it happens even when invoking via path/to/python.exe path/to/script.py |
Thought I'd try to correct the record a bit here since this got referenced again. error mode |
Oh, indeed. I saw 0x8001 because I was using the 'terminal app' to start cmd, rather than just cmd.exe directly. Extremely unhelpful by terminal to start the "sub terminals" with a different error mode than normal...
The reference to SEM_FAILCRITICALERRORS | SEM_NOOPENFILEERRORBOX was not about the 0x3, but what I saw in the windows terminal. 0x3 / SEM_FAILCRITICALERRORS|SEM_NOGPFAULTERRORBOX is what some msys code sets when MSYS=winjitdebug isn't set. |
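To make the two values easier to compare, here is a small sketch that decodes an error-mode bitmask into the documented `SetErrorMode` flag names. The helper name `decode_error_mode` is made up for this example; the flag values are the ones from the Windows headers.

```python
# Flag values accepted by the Win32 SetErrorMode function.
SEM_FLAGS = {
    0x0001: "SEM_FAILCRITICALERRORS",
    0x0002: "SEM_NOGPFAULTERRORBOX",
    0x0004: "SEM_NOALIGNMENTFAULTEXCEPT",
    0x8000: "SEM_NOOPENFILEERRORBOX",
}


def decode_error_mode(mode):
    """Return the flag names set in mode, joined with ' | '."""
    names = [name for bit, name in SEM_FLAGS.items() if mode & bit]
    return " | ".join(names) if names else "0"


print(decode_error_mode(0x8001))  # SEM_FAILCRITICALERRORS | SEM_NOOPENFILEERRORBOX
print(decode_error_mode(0x3))     # SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX
```

On Windows, the current value could presumably be read with `ctypes.windll.kernel32.GetErrorMode()` and passed to this helper; that call is not exercised here.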
Regarding this, how could one go about setting up Github action that can reproduce the error? |
It seems to happen fairly consistently with the MSYS build in my project repo linked in the referenced issue from 6h ago as it does a fairly involved Meson/Python build. vintagepc/MINI404#116 has been giving me grief most recently. You're more than welcome to fork it and hack up the workflow to dig in. |
I'll add in that I just tried force-downgrading to the mingw64 python 3.9 package and it seems to have made the problem go away - will rerun the workflow a few times to confirm. |
Suggested by Christoph Reiter, this is a workaround for random Python crashes in Meson which only appear on this platform. It’s being tracked upstream at msys2/MINGW-packages#11864, but unfortunately it seems hard to fix. Work around the issue the same way that Meson have in their CI, by enabling JIT debugging. See https://gitlab.gnome.org/GNOME/glib/-/merge_requests/3280#note_1678973. Signed-off-by: Philip Withnall <[email protected]>
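A minimal sketch of how a Python-based CI driver might apply this workaround, assuming the `MSYS=winjitdebug` setting mentioned earlier in the thread is what "enabling JIT debugging" refers to. The helper name is hypothetical, and treating `MSYS` as a space-separated option list is an assumption rather than something stated in the thread.

```python
import os


def env_with_winjitdebug(base=None):
    """Return a copy of the environment with winjitdebug added to MSYS.

    Assumption: MSYS holds space-separated options, so existing options
    are preserved and winjitdebug is appended only if it is missing.
    """
    env = dict(os.environ if base is None else base)
    options = env.get("MSYS", "").split()
    if "winjitdebug" not in options:
        options.append("winjitdebug")
    env["MSYS"] = " ".join(options)
    return env
```

The result can be passed as `env=` to `subprocess.run` when launching the build command.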
We've made some progress finding a potential cause here: #17415. With a bit of luck, the update to 3.11 will fix this. |
I'm closing this in favor of #17415 |
There is a longstanding bug with random crashes of Python 3.10 on CI. See: python/cpython#105400 msys2/MINGW-packages#11864 msys2/MINGW-packages#17415
One last comment here: we updated to Python 3.11 yesterday, so this should no longer be an issue. |
It randomly fails when calling into meson scripts:
There was one more package that failed that way during the Python rebuild (with "symbolextractor", I think), but sadly I didn't write it down.