pywin32<300 causes NULL pointer dereference during referrer graph generation #25
Update: This only happens after |
I just tried, Python 3.8 on Linux:
Looks like importing itself + the given guppy code doesn't cause a crash. Is it possible for you to get a core dump or a stack trace of the crash? In the meantime, I'll try to find a Windows 7 install that I can test on. I'm not very familiar with Windows debuggers though. |
I'm not a Windows person. Would you be willing to share some steps on how to set up a test environment that is able to reproduce this crash? |
Thanks for the considerable effort. You can find a binary of Then run the following:
|
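Ok, the wheel published there (h5py‑2.10.0‑cp38‑cp38‑win32.whl) does allow me to install h5py, and I was able to successfully install keras. However, upon importing keras it says "Keras requires TensorFlow 2.2 or higher" (also happened on Linux), and I went to check the page. Only amd64 wheels are available, both from that page and PyPI. I did a bit of googling around and it seems that TensorFlow for x86 32-bit is probably very complicated and completely unsupported (https://stackoverflow.com/q/44449972). Are you using amd64 Win7? Let me try to find a test VM image for that. |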
If I'm not mistaken, when people say "amd64" it's just a silly way to say
64 bit processor, regardless of whether it's Intel or AMD. In other words,
if your Windows computer is from the last 10 years and it's not a weird
netbook or something, it's likely "amd64" :)
If the win32 wheel worked for you, I guess your VM is 32bit. You should do
a 64bit VM just because that's what 99% of Windows users do.
Also, you're going the reproduction route. Another way to tackle this would
be to get more logging output from my machine to let you figure out the
bug. If there's anything you want me to run, as long as it's not something
that requires a lot of setup and work, I'll be happy to do that.
…On Sun, Jan 17, 2021 at 12:21 PM zhuyifei1999 ***@***.***> wrote:
Ok, the wheel published there (h5py‑2.10.0‑cp38‑cp38‑win32.whl) does
allow me to install h5py, and I was able to successfully install keras.
However, upon importing keras it says "Keras requires TensorFlow 2.2 or
higher" (also happened on Linux), and I went to check the page. Only amd64
wheels are available, both from that page and PyPI.
I did a slight googling around and it seems that TensorFlow for x86 32-bit
is probably very complicated and completely unsupported (
https://stackoverflow.com/q/44449972). Are you using amd64 Win7? Let me
try to find a test VM image for that.
|
Yes, the host machine is x86 64-bit. I don't have a licensed Win 7 to test with, hence I'm looking for a VM to download. Microsoft's test VM images from https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/ are only x86 32-bit images.
I'm guessing this is a segfault somewhere in the C code. Is it possible for you to use faulthandler or get a C stack trace somehow? Is it possible to get a core dump somehow? I'm also testing on a Win 10 amd64 install (licensed copy on bare metal) and am unable to reproduce the crash ( |
Got a pirated copy of Windows 7. Will try to reproduce on that later. |
LOL, GitHub is owned by Microsoft, hope they won't notice ;) They don't even let people buy legal copies of Windows 7, and I tried multiple times. I've never used these tools you mentioned. I don't want to spend time researching, but if you'll give me lines to run, I'll run them. |
Cannot reproduce on 64 bit Win7. The installation of packages are
The one I suggested is faulthandler. faulthandler runs passively; you just need to enable it:
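The exact lines posted in the thread are not preserved here. As a minimal sketch of what enabling faulthandler can look like, assuming you want the traceback written to a log file rather than to stderr (the helper name `enable_crash_tracebacks` and the log path are illustrative, not from the thread):

```python
import faulthandler


def enable_crash_tracebacks(log_path):
    """Dump the Python traceback of every thread to log_path on a fatal error."""
    # faulthandler writes to the file descriptor directly, so the file must
    # stay open for as long as the handler is enabled. The caller keeps the
    # returned handle alive.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_crash_tracebacks("fault.log")` at the top of the failing script should be enough. The same effect is available without code changes by running `python -X faulthandler script.py` or setting the `PYTHONFAULTHANDLER` environment variable, both of which write to stderr.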
The problem is that faulthandler is only able to dump a stack trace for the interpreted python code. The fault probably happens in some native C code and it would be helpful to pinpoint the native function that faulted. I googled around a bit and found https://stackoverflow.com/a/49050274 regarding Windows Error Reporting which might be helpful in that. |
This is the output from
|
I was able to create the WER dump only when the debugger was on, not sure why. When it was off, the crash still happens just without the Windows dialog. You can download the dump here but I have no idea how you would read it. |
Looks like the last python frame is View.py#L479, which would call into hv.c#L1518. This is a rather complex C function to work around issue #7.
Looks like a minidump file:
Searching around on Google turned up a tool called Breakpad that works with this format, and I'm looking into it. |
The dumped information reports exception
and the context:
This address maps into python core
The offset matches the original description
Let me see if I can locate which function is at 0x2feaf. |
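The address-to-offset step is plain arithmetic: find the loaded module whose address range contains the faulting address, then subtract the module base. A small sketch follows. The load addresses and sizes are hypothetical values chosen for illustration, not taken from the dump; only the 0x2feaf offset comes from this thread.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Module(NamedTuple):
    name: str
    base: int
    size: int


def locate(address: int, modules: Sequence[Module]) -> Optional[Tuple[str, int]]:
    """Return (module name, offset from module base) for an address, or None."""
    for module in modules:
        if module.base <= address < module.base + module.size:
            return module.name, address - module.base
    return None


# Hypothetical module list, for illustration only.
modules = [
    Module("python38.dll", 0x7FFA_1000_0000, 0x40_0000),
    Module("ntdll.dll", 0x7FFA_2000_0000, 0x1A_0000),
]
```

With these made-up bases, `locate(0x7FFA_1000_0000 + 0x2FEAF, modules)` gives `("python38.dll", 0x2FEAF)`; the offset is what you then look up in the DLL.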
This matches the release date of Python 3.8.1... hmm Downloaded the Python 3.8 DLL from https://www.python.org/ftp/python/3.8.1/python-3.8.1-embed-amd64.zip, and
Nice. I was under the assumption that you are running under latest Python 3.8 (3.8.7). Let me see if I can reproduce it by using 3.8.1. If not I'll look deeper into the symbols. |
Not on Linux
Not on Win 7 either
AFAICT, the python38.dll does not have a symbol table
And the nearest functions to
This is the function of
Questions for myself:
|
@cool-RR can you check, if you create a new virtual environment with the latest packages, it still faults inside the virtual environment? Like:
|
Microsoft x64 calling convention: arguments are passed in RCX, RDX, R8, R9.
PyObject *
PyWeakref_NewRef(PyObject *ob, PyObject *callback)
That's weakrefobject.c#L801:
if (!PyType_SUPPORTS_WEAKREFS(Py_TYPE(ob))) {
#define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)
Then compare the context (#25 (comment))
We have an object whose type is NULL... how does that happen? |
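For context, this is roughly what that check does in the normal case, seen from Python. A type's `tp_weaklistoffset` is exposed as `__weakrefoffset__`; when it is zero, creating a weak reference raises `TypeError` instead of crashing. The sketch below only shows the healthy path. Pure Python cannot construct an object with a NULL type, which is why the crash needs a misbehaving C extension. `Plain`, `supports_weakrefs` and `try_weakref` are illustrative names.

```python
import weakref


class Plain:
    """Ordinary Python class; its instances can be weakly referenced."""


def supports_weakrefs(obj) -> bool:
    # __weakrefoffset__ mirrors the type's tp_weaklistoffset; 0 means the
    # type has no slot for a weak reference list.
    return type(obj).__weakrefoffset__ != 0


def try_weakref(obj):
    """Return a weak reference to obj, or None if its type does not allow one."""
    try:
        return weakref.ref(obj)
    except TypeError:
        return None
```

When the type pointer itself is NULL, reading `tp_weaklistoffset` dereferences that NULL pointer before any `TypeError` can be raised, which is consistent with the fault seen here.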
Looking at stack:
After patching Breakpad like:
I'm able to convert it into a core dump:
Ok we have the stack contents in core dump, just need to load it into gdb with a dummy executable:
Nice. |
For future note, gdb's threads is not useful (these are all ntdll.dll + 0x69d5a):
r11 should point to the saved return address.
This makes no sense to me. The return address is NULL? Looking at the object that's passed in
it's mapped
but the core dump does not contain the data (I'll see if I can figure out how to get it) However, the second argument
This is not mapped at all. |
I stand corrected. I found another tool (https://github.com/skelsec/minidump) to look at dumps and it is in fact mapped:
I guess I should write a tool myself to convert a mini dump into core dump |
Performed this patch to Breakpad: https://gist.github.com/zhuyifei1999/ff2094d04b91c8ef704e79ab816993aa
I'm guessing it is working? |
Return address is
Assuming a wheel install,
What could this function be? |
Educated guess: hv.c#L384. It's certain that the "something" it is creating a weak reference to is a type object... let's check its name
Googling around, I see pympler/pympler#80. Looking at the code of pywin32, I see mhammond/pywin32@daeb5f2. This was released in the latest pywin32 (https://pypi.org/project/pywin32/#history),
Also successfully reproduced this: Considering that it is not valid for an object to have a NULL as its type and be passed around to the python interpreter, I don't think this is something we should work around. @cool-RR could you please confirm that the crash is resolved with an upgrade to |
"I'm not a Windows person." You are now 😆 I've been using Windows for users and doing some development for it, and I never got as deep as you now did. That was amazing. Yes, the problem was fixed by upgrading pywin32, both in my test example and in my actual application. Thank you very much. One question that can be asked now is whether to treat this as something that could be improved in guppy3. You could maybe show a warning when someone tries to use guppy3 with an old |
All I did was figure out how to convert a minidump into a core dump; the rest is my usual GDB process, just complicated by a lack of symbols 😉
Good idea. Hmm |
Wdyt of something like:
import sys

if 'pythoncom' in sys.modules:
    try:
        import pkg_resources
        pywin32_ver = (pkg_resources.get_distribution('pywin32')
                       .parsed_version)
    except Exception:
        # pywin32 metadata is unavailable; skip the check
        pass
    else:
        if pywin32_ver.major < 300:
            import warnings
            warnings.warn(
                'pythoncom in pywin32 < 300 may cause crashes. '
                'See https://github.com/zhuyifei1999/guppy3/issues/25')
Should it be more visible? |
IIUC, that will only work when installed via pip, but some users install via a bdist_wininst executable. If you care about that case, then you can probably look for |
import os
import sys
import warnings

if 'pythoncom' in sys.modules:
    def get_pywin32_ver():
        # First try the package metadata
        try:
            import pkg_resources
            return pkg_resources.get_distribution('pywin32').version
        except Exception:
            pass
        # Fall back to pywin32.version.txt in site-packages
        try:
            import distutils.sysconfig
            site_pkg = distutils.sysconfig.get_python_lib(plat_specific=1)
            with open(os.path.join(site_pkg, 'pywin32.version.txt')) as f:
                return f.read().strip()
        except Exception:
            pass
        return None

    pywin32_ver = get_pywin32_ver()
    if pywin32_ver:
        try:
            pywin32_ver = int(pywin32_ver)
        except ValueError:
            # Not a plain build number; skip the check
            pass
        else:
            if pywin32_ver < 300:
                warnings.warn(
                    'pythoncom in pywin32 < 300 may cause crashes. See '
                    'https://github.com/zhuyifei1999/guppy3/issues/25. '
                    'You may want to upgrade to the newest version of '
                    'pywin32 by running "pip install pywin32 --upgrade"')
Wdyt? |
Looks good. |
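A note for anyone adapting this check later: `pkg_resources` has since been deprecated by setuptools. The sketch below swaps in the standard library's `importlib.metadata` (available from Python 3.8), which is a different technique from the snippets above and not what was proposed in this thread. The function names are illustrative, and it assumes pywin32 versions are plain build numbers such as "228" or "300".

```python
from importlib import metadata
from typing import Optional


def parse_build(version: str) -> Optional[int]:
    """Turn a pywin32 version string into a build number, or None."""
    try:
        return int(version.strip())
    except ValueError:
        return None


def pywin32_build() -> Optional[int]:
    """Return the installed pywin32 build number, or None if it is unknown."""
    try:
        return parse_build(metadata.version("pywin32"))
    except metadata.PackageNotFoundError:
        return None


def is_affected(build: Optional[int]) -> bool:
    # Builds before 300 are the ones this issue reports as crashing.
    return build is not None and build < 300
```

Like the `pkg_resources` lookup, this only sees installs that left package metadata behind, so a fallback such as reading `pywin32.version.txt` would still be needed for installer-based setups.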
For my future reference, lldb can natively work with minidumps:
|
guppy3==3.1.0
Using Windows 7. I ran this:
It waited for a few seconds and then the process crashed. Here are the error details from the dialog: