datetime.now(timezone.utc) segfaults with Intel oneAPI v2022.1.2 and higher since Python 3.9 #106424
Comments
According to the original issue, -O0 doesn't cause any problem. |
It happens for -O1, -O2, -O3, it does not happen with -O0. So yes, will try to bisect the intel compiler versions now... 😩 |
Ah, thank you for the clarification. |
Building Python 3.9.17, the example segfaults with the following oneAPI compilers:
And succeeds with these:
So it could indeed be a compiler bug. Unclear how to show it is though. |
Should I leave this issue open for discoverability? Others may run into it too. I've written a summary on the Spack package manager's repo spack/spack#38710 (comment), and we're hoping someone from Intel will confirm whether it's a compiler bug or not. |
@oleksandr-pavlyk: Are you building python with recent oneapi compilers? |
@haampie cc @rscohn2 @oleksandr-pavlyk Would you like to check if the |
Strict aliasing is enabled by default (just like in the gcc/clang build)
but when I build with Also verified that defaults in clang and icx are the same: https://godbolt.org/z/jxeaW7sjj |
Hm, it looks like that configure check in Python is just there because some old GCC warned incorrectly about strict aliasing rules. (As in, it compiles some valid code with In any case, the important thing is that |
For me, it's a compiler bug; I don't see how the C code could rely on undefined behavior here. I suggest reporting the bug to the Intel compiler team. |
@rscohn2 do you want to create a bug report for the compiler? |
@rscohn2 Intel Distribution for Python ships CPython compiled with GCC 11.2:
|
If you provide detailed information with clear evidence, the compiler group will respond quickly. Otherwise it will sit in the queue. This is because they get a lot of bogus bug reports that are application errors. I don't understand how everyone can be so sure it is a compiler bug. C programs have problems with uninitialized data and stale pointers. Optimizations change the layout of memory on the stack, and bugs can be specific to compilers and releases. There is another icx coming out next week. I will try that, and if the error is still there then I will try to look at the bug. |
It's already been said in this thread: the diff after which Intel Compilers produce a segfaulting Python is very small and innocent looking: 37fcbb6. It works fine on all major compilers. For a project like Python, I can only assume it uses sanitizers to detect UB and whatnot. If this was an application error I'm sure it would've surfaced elsewhere in the last 3 years. I think it's more a matter of how much Intel folks care about having a working Python than it is up to me as a packager who is not familiar with Python internals to do all the detective work. I've bisected Python commits and the Intel compiler versions -- that's enough. Lastly, what do you mean by clear evidence? As far as I know, Intel compilers are closed source, and all you can do is look at generated assembly (are you officially allowed to?)? Isn't "it works with clang, it doesn't work with icx" sufficient, given that Intel compilers are based on LLVM? |
Segfaults with oneAPI 2023.2.0 compilers too |
With
other sanitizers I can't get to work with icx. |
I spent an afternoon narrowing down the problem to the store for this line not happening: cpython/Modules/_datetimemodule.c Line 1165 in a293fa5
I added a printf to show that Py_NewRef(offset); returns its input parameter, but the value is not stored into self->offset. I filed a ticket with the compiler team that shows the asm output for the function and why I believe the store is missing. It does look like a compiler bug to me, but let's see what they say. |
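A rough Python-level counterpart to that check, as a sketch only: it assumes that constructing a timezone goes through the C function containing the store in question, and that utcoffset() reads the stored offset back. On a correctly built interpreter the round trip returns the same timedelta; on an affected build it may crash instead of printing anything.

```python
from datetime import timedelta, timezone

# Assumption: timezone(offset) reaches the C code that stores self->offset.
offset = timedelta(hours=2)
tz = timezone(offset)

# utcoffset(None) returns the fixed offset the timezone was created with.
stored = tz.utcoffset(None)
print(stored == offset)
```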
Thanks for the detective work @rscohn2 👍 |
It was a compiler bug. The test now passes when cpython is built with an internal build. The next compiler release (likely 2024.0) should have the fix. Thanks to @haampie! |
I've bisected this to 37fcbb6 (first "bad" commit)
The following Python code is a minimal reproducer:
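A sketch of such a reproducer, assuming it is the call named in the issue title (the exact original snippet is not shown here):

```python
from datetime import datetime, timezone

# Call taken from the issue title; reported to segfault on affected builds.
now_utc = datetime.now(timezone.utc)
print(now_utc.tzinfo)
```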
When compiling Python with any of the following Intel compilers:
Any version 3.9, 3.10, 3.11 causes a segfault in add_datetime_timedelta/timezone_fromutc, where for some reason with -O1 optimizations (edit: and higher) the self->offset/delta pointer is a NULL. Adding -fno-strict-aliasing does not prevent this.
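The crash path can also be exercised directly from Python. This is a sketch that assumes timezone.fromutc() is the Python-visible entry point for timezone_fromutc and that it adds the stored offset to the given datetime. The two-hour offset and the date are arbitrary illustration values.

```python
from datetime import datetime, timedelta, timezone

# Arbitrary fixed-offset timezone used only for illustration.
tz = timezone(timedelta(hours=2))

# fromutc() requires that the datetime's tzinfo is the timezone itself.
utc_wall = datetime(2023, 1, 1, 12, 0, tzinfo=tz)

# Interprets utc_wall's clock fields as UTC and shifts them by the offset.
local = tz.fromutc(utc_wall)
print(local)
```

If the offset pointer is NULL, as described above, the addition inside fromutc() is where a crash would be expected.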