Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refleak in test_importlib on aarch64 RHEL8 #101766

Closed
sobolevn opened this issue Feb 9, 2023 · 14 comments
Closed

Refleak in test_importlib on aarch64 RHEL8 #101766

sobolevn opened this issue Feb 9, 2023 · 14 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@sobolevn
Copy link
Member

sobolevn commented Feb 9, 2023

I am not sure what is going on with this test run:

Ran 1377 tests in 3.713s
OK (skipped=18, expected failures=1)
......
test_importlib leaked [6, 4, 2] references, sum=12
test_importlib leaked [4, 4, 2] memory blocks, sum=10
0:32:52 load avg: 0.42 Re-running test_asyncio in verbose mode (matching: test_fork_asyncio_subprocess)
beginning 6 repetitions
123456
/home/buildbot/buildarea/pull_request.cstratak-RHEL8-aarch64.refleak/build/Lib/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=4145974) is multi-threaded, use of fork() may lead to deadlocks in the child.
  self.pid = os.fork()
test_fork_asyncio_subprocess (test.test_asyncio.test_unix_events.TestFork.test_fork_asyncio_subprocess) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.072s
OK
./home/buildbot/buildarea/pull_request.cstratak-RHEL8-aarch64.refleak/build/Lib/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=4145974) is multi-threaded, use of fork() may lead to deadlocks in the child.
  self.pid = os.fork()
test_fork_asyncio_subprocess (test.test_asyncio.test_unix_events.TestFork.test_fork_asyncio_subprocess) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.060s
OK
./home/buildbot/buildarea/pull_request.cstratak-RHEL8-aarch64.refleak/build/Lib/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=4145974) is multi-threaded, use of fork() may lead to deadlocks in the child.
  self.pid = os.fork()
test_fork_asyncio_subprocess (test.test_asyncio.test_unix_events.TestFork.test_fork_asyncio_subprocess) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.059s
OK
./home/buildbot/buildarea/pull_request.cstratak-RHEL8-aarch64.refleak/build/Lib/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=4145974) is multi-threaded, use of fork() may lead to deadlocks in the child.
  self.pid = os.fork()
test_fork_asyncio_subprocess (test.test_asyncio.test_unix_events.TestFork.test_fork_asyncio_subprocess) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.060s
OK
./home/buildbot/buildarea/pull_request.cstratak-RHEL8-aarch64.refleak/build/Lib/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=4145974) is multi-threaded, use of fork() may lead to deadlocks in the child.
  self.pid = os.fork()
test_fork_asyncio_subprocess (test.test_asyncio.test_unix_events.TestFork.test_fork_asyncio_subprocess) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.061s
OK
./home/buildbot/buildarea/pull_request.cstratak-RHEL8-aarch64.refleak/build/Lib/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=4145974) is multi-threaded, use of fork() may lead to deadlocks in the child.
  self.pid = os.fork()
test_fork_asyncio_subprocess (test.test_asyncio.test_unix_events.TestFork.test_fork_asyncio_subprocess) ... ok
----------------------------------------------------------------------
Ran 1 test in 0.057s
OK
.
1 test failed again:
    test_importlib

See https://buildbot.python.org/all/#builders/802/builds/623

Linked PRs

@sobolevn
Copy link
Member Author

First failing build that I was able to find: https://buildbot.python.org/all/#/builders/802/builds/582

See #101394

@corona10
Copy link
Member

corona10 commented Feb 14, 2023

Linux 65cf87cd990d 6.1.9-200.fc37.aarch64 #1 SMP PREEMPT_DYNAMIC Thu Feb 2 00:41:31 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux


beginning 6 repetitions
123456
......
test_importlib leaked [134, 134, 134] references, sum=402
test_importlib leaked [56, 56, 56] memory blocks, sum=168
test_importlib failed (reference leak)

== Tests result: FAILURE ==

1 test failed:
    test_importlib

Total duration: 598 ms
Tests result: FAILURE
ran 1 tests/2
exit 2
Tests failed: continuing with this subtest

Tests (1):
* test.test_importlib.test_metadata_api.APITests.test_version_egg_info_file

Bisection completed in 16 iterations and 0:00:21
[root@65cf87cd990d cpython]#

@gvanrossum
Copy link
Member

@corona10 If you found a suspect can you link it here?

@corona10
Copy link
Member

@corona10 If you found a suspect can you link it here?

Sure

@corona10
Copy link
Member

corona10 commented Feb 15, 2023

@gvanrossum @sobolevn @mdickinson @brettcannon @exarkun

I finally found the suspect:
commit: 3325f05
PR: #94504

I run ./python -m test test_importlib repeatedly 10 times on aarch64 RHEL8(rockylinux:8.7) with every single bisect commit :(
When I revert the 3325f05, no more leak is detected.

The root cause has yet to be discovered.

@gvanrossum
Copy link
Member

Is the SEND fix it?

@sobolevn
Copy link
Member Author

@corona10 great work, thank you!

@corona10
Copy link
Member

Is the SEND fix it?

Do you mean c776624? No, it doesn't solve the issue.

@gvanrossum
Copy link
Member

Usually leaks are C code. There is no C code in the importlib PR. Maybe the blocking_on dict needs to be cleared?

@corona10
Copy link
Member

Usually leaks are C code.

I think sameway.

Maybe the blocking_on dict needs to be cleared?

I will try it.

@corona10
Copy link
Member

Maybe the blocking_on dict needs to be cleared?

Yeah, this was the root cause. Amazing intuition!!

diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
index bebe7e15cb..e42e7bc653 100644
--- a/Lib/importlib/_bootstrap.py
+++ b/Lib/importlib/_bootstrap.py
@@ -85,6 +85,7 @@ def __enter__(self):
     def __exit__(self, *args, **kwargs):
         """Remove self.lock from this thread's _blocking_on list."""
         self.blocked_on.remove(self.lock)
+        del _blocking_on[self.thread_id]

I am checking the proper solution.

@gvanrossum
Copy link
Member

Whenever there's a global cache you get things like this, I've debugged these since 2000. :-)

Good work finding the missing del! And thanks for the hard work doing bisection.

Also @sobolevn thanks for flagging this -- I had seen this occasionally but always just ignored it as being a flake. It was so much more than that!

All in all great teamwork.

@corona10
Copy link
Member

Close this issue
And thank you for reporting this issue @sobolevn!!!

@corona10
Copy link
Member

Re-open the issue due to #101942 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

3 participants