Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_concurrent_futures.test_shutdown: AssertionError: b'' != b'apple' #112542

Closed
furkanonder opened this issue Nov 30, 2023 · 11 comments
Closed
Labels
3.13 bugs and security fixes OS-freebsd tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@furkanonder
Copy link
Contributor

furkanonder commented Nov 30, 2023

Bug report

Bug description:

Bug report

Bug description:

Configuration:

./configure --with-pydebug

Test Output: (full-test-output.txt)

freebsd@generic:~/cpython % ./python -m test -j2
== CPython 3.13.0a2+ (heads/main:37589d76bb, Apr 29 2023, 11:49:22) [Clang 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386
== FreeBSD-13.2-RELEASE-arm64-aarch64-64bit-ELF little-endian
== Python build: debug
== cwd: /home/freebsd/cpython/build/test_python_worker_32780æ
== CPU count: 4
== encodings: locale=UTF-8 FS=utf-8
== resources: all test resources are disabled, use -u option to unskip tests

Using random seed: 3819139510
0:00:00 load avg: 2.45 Run 469 tests in parallel using 2 worker processes
0:00:02 load avg: 2.42 [  1/469] test.test_asyncio.test_buffered_proto passed
0:00:04 load avg: 2.42 [  2/469] test.test_asyncio.test_context passed
0:00:11 load avg: 2.38 [  3/469] test.test_asyncio.test_eager_task_factory passed
0:00:15 load avg: 2.35 [  4/469] test.test_asyncio.test_base_events passed
0:00:39 load avg: 2.23 [  5/469] test.test_asyncio.test_futures passed
0:00:41 load avg: 2.21 [  6/469] test.test_asyncio.test_futures2 passed -- running (1): test.test_asyncio.test_events (30.4 sec)
0:00:49 load avg: 2.20 [  7/469] test.test_asyncio.test_locks passed -- running (1): test.test_asyncio.test_events (38.4 sec)
0:00:53 load avg: 2.26 [  8/469] test.test_asyncio.test_pep492 passed -- running (1): test.test_asyncio.test_events (41.7 sec)
0:00:59 load avg: 2.24 [  9/469] test.test_asyncio.test_proactor_events passed -- running (1): test.test_asyncio.test_events (47.8 sec)
0:01:01 load avg: 2.22 [ 10/469] test.test_asyncio.test_protocols passed -- running (1): test.test_asyncio.test_events (50.1 sec)
0:01:06 load avg: 2.20 [ 11/469] test.test_asyncio.test_queues passed -- running (1): test.test_asyncio.test_events (54.8 sec)
0:01:09 load avg: 2.19 [ 12/469] test.test_asyncio.test_runners passed -- running (1): test.test_asyncio.test_events (58.3 sec)
0:01:15 load avg: 2.17 [ 13/469] test.test_asyncio.test_selector_events passed -- running (1): test.test_asyncio.test_events (1 min 4 sec)
0:01:18 load avg: 2.17 [ 14/469] test.test_asyncio.test_events passed (1 min 6 sec)
0:01:21 load avg: 2.08 [ 15/469] test.test_asyncio.test_server passed
/home/freebsd/cpython/Lib/asyncio/events.py:84: ResourceWarning: unclosed <socket.socket fd=8, family=1, type=1, proto=0, laddr=./test_python_jpgh9dpl.sock>
  self._context.run(self._callback, *self._args)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
Executing <Task finished name='Task-44' coro=<UnixServerCleanupTests.test_unix_server_cleanup_replaced() done, defined at /home/freebsd/cpython/Lib/test/test_asyncio/test_server.py:227> result=None created at /home/freebsd/cpython/Lib/asyncio/runners.py:100> took 0.104 seconds
0:01:33 load avg: 1.91 [ 16/469] test.test_asyncio.test_sock_lowlevel passed
0:02:03 load avg: 1.62 running (2): test.test_asyncio.test_sendfile (47.9 sec), test.test_asyncio.test_ssl (30.0 sec)

0:02:33 load avg: 1.58 running (2): test.test_asyncio.test_sendfile (1 min 17 sec), test.test_asyncio.test_ssl (1 min)
0:02:37 load avg: 1.61 [ 17/469] test.test_asyncio.test_sendfile passed (1 min 22 sec) -- running (1): test.test_asyncio.test_ssl (1 min 4 sec)
0:02:46 load avg: 2.04 [ 18/469] test.test_asyncio.test_sslproto passed -- running (1): test.test_asyncio.test_ssl (1 min 12 sec)
0:02:50 load avg: 2.04 [ 19/469] test.test_asyncio.test_streams passed -- running (1): test.test_asyncio.test_ssl (1 min 17 sec)
0:03:16 load avg: 2.02 [ 20/469] test.test_asyncio.test_ssl passed (1 min 42 sec)
0:03:20 load avg: 1.94 [ 21/469] test.test_asyncio.test_subprocess passed
0:03:27 load avg: 1.95 [ 22/469] test.test_asyncio.test_taskgroups passed
0:03:30 load avg: 1.95 [ 23/469] test.test_asyncio.test_threads passed
0:03:34 load avg: 1.87 [ 24/469] test.test_asyncio.test_timeouts passed
0:03:37 load avg: 1.96 [ 25/469] test.test_asyncio.test_transports passed
0:03:46 load avg: 1.97 [ 26/469] test.test_asyncio.test_unix_events passed
0:03:50 load avg: 1.97 [ 27/469] test.test_asyncio.test_waitfor passed -- running (1): test.test_asyncio.test_tasks (30.5 sec)
0:03:52 load avg: 1.97 [ 28/469] test.test_asyncio.test_windows_events skipped -- running (1): test.test_asyncio.test_tasks (32.8 sec)
test.test_asyncio.test_windows_events skipped -- Windows only
0:03:54 load avg: 1.97 [ 29/469] test.test_asyncio.test_windows_utils skipped -- running (1): test.test_asyncio.test_tasks (34.7 sec)
test.test_asyncio.test_windows_utils skipped -- Windows only
0:04:23 load avg: 1.75 [ 30/469] test.test_concurrent_futures.test_as_completed passed -- running (1): test.test_asyncio.test_tasks (1 min 3 sec)
0:04:40 load avg: 1.84 [ 31/469] test.test_asyncio.test_tasks passed (1 min 20 sec)
0:04:45 load avg: 1.77 [ 32/469] test.test_concurrent_futures.test_future passed
0:04:54 load avg: 1.80 [ 33/469] test.test_concurrent_futures.test_init passed -- running (1): test.test_concurrent_futures.test_deadlock (30.8 sec)
0:05:24 load avg: 2.12 running (2): test.test_concurrent_futures.test_process_pool (30.0 sec), test.test_concurrent_futures.test_deadlock (1 min)
0:05:27 load avg: 2.12 [ 34/469] test.test_concurrent_futures.test_deadlock passed (1 min 4 sec) -- running (1): test.test_concurrent_futures.test_process_pool (33.8 sec)
0:05:57 load avg: 2.95 running (2): test.test_concurrent_futures.test_process_pool (1 min 3 sec), test.test_concurrent_futures.test_shutdown (30.0 sec)
0:06:09 load avg: 3.83 [ 35/469] test.test_concurrent_futures.test_process_pool passed (1 min 15 sec) -- running (1): test.test_concurrent_futures.test_shutdown (41.3 sec)
0:06:13 load avg: 3.68 [ 36/469] test.test_concurrent_futures.test_thread_pool passed -- running (1): test.test_concurrent_futures.test_shutdown (45.2 sec)
0:06:30 load avg: 3.30 [ 37/469/1] test.test_concurrent_futures.test_shutdown failed (1 failure) (1 min 2 sec)
test test.test_concurrent_futures.test_shutdown failed -- Traceback (most recent call last):
  File "/home/freebsd/cpython/Lib/test/test_concurrent_futures/test_shutdown.py", line 50, in test_interpreter_shutdown
    self.assertEqual(out.strip(), b"apple")
AssertionError: b'' != b'apple'

...
..
.

== Tests result: FAILURE ==

24 tests skipped:
    test.test_asyncio.test_windows_events
    test.test_asyncio.test_windows_utils test_ctypes test_dbm_gnu
    test_devpoll test_epoll test_idle test_ioctl test_launcher
    test_msvcrt test_perf_profiler test_perfmaps test_readline
    test_sqlite3 test_stable_abi_ctypes test_startfile test_tcl
    test_tkinter test_ttk test_ttk_textonly test_turtle
    test_winconsoleio test_winreg test_wmi

8 tests skipped (resource denied):
    test_curses test_peg_generator test_smtpnet test_socketserver
    test_urllib2net test_urllibnet test_winsound test_zipfile64

4 tests failed:
    test.test_concurrent_futures.test_shutdown
    test.test_gdb.test_backtrace test.test_gdb.test_misc
    test.test_gdb.test_pretty_print

433 tests OK.

Total duration: 57 min 8 sec
Total tests: run=39,180 failures=7 skipped=1,476
Total test files: run=461/469 failed=4 skipped=24 resource_denied=8
Result: FAILURE

Environment:

freebsd@generic:~/cpython % uname -a
FreeBSD generic 13.2-RELEASE FreeBSD 13.2-RELEASE releng/13.2-n254617-525ecfdad597 GENERIC arm64
freebsd@generic:~/cpython % 

CPython versions tested on:

CPython main branch

Operating systems tested on:

Other

@furkanonder furkanonder added the type-bug An unexpected behavior, bug, or error label Nov 30, 2023
@furkanonder
Copy link
Contributor Author

CC: @vstinner

@furkanonder furkanonder added tests Tests in the Lib/test dir OS-freebsd labels Nov 30, 2023
@vstinner
Copy link
Member

@furkanonder
Copy link
Contributor Author

Sadly, the failure is not new:

* [test_concurrent_futures.test_shutdown: test_interpreter_shutdown() fails randomly (race condition) #109047](https://github.com/python/cpython/issues/109047)

* [test_concurrent_futures: test_cancel_futures_wait_false() and test_interpreter_shutdown() failed on GHA Windows x64 #89099](https://github.com/python/cpython/issues/89099)

My analysis: #109047 (comment)

Do you want to work on a fix?

I fixed the new RuntimeError variant in #109810

Do you want to work on a fix?

Sure, I would like to work on a fix.

@furkanonder furkanonder moved this to In Progress in Multiprocessing issues Dec 8, 2023
@LeamHall
Copy link

LeamHall commented Jan 10, 2024

This just started happening today on Devuan Linux (Linux josiah 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux). It did not happen yesterday, from yesterday's pull from main. It did not happen on today's build on Fedora 37.

@vstinner
Copy link
Member

The test now fails on the Windows 11 Refleaks 3.12 buildbot: https://buildbot.python.org/all/#/builders/1103.

@sobolevn
Copy link
Member

@vstinner
Copy link
Member

It happened again on GitHub: https://github.com/python/cpython/actions/runs/7723607693/job/21054076387

That's a Windows x64 build. Apparently, the issue is specific to Windows.

@vstinner
Copy link
Member

Apparently, the issue is specific to Windows.

Sorry, I wrote too quickly. The race condition was seen on FreeBSD and Linux as well, but it's really hard to reproduce.

@colesbury
Copy link
Contributor

colesbury commented Mar 12, 2024

This happens reliably in the free-threaded build with PYTHON_GIL=0. The cause is that we are trying to start a thread at interpreter shutdown, which sometimes fails depending on if starting the thread is faster than shutting down. The "can't create new thread at interpreter shutdown" error is hidden due to #109810, which catches the exception. Nobody looks at the result, so no exception is printed.

The order of operations is:

  1. t.submit(sleep_and_print, 1.0, "apple") starts _ExecutorManagerThread and then begins interpreter shutdown
  2. _ExecutorManagerThread calls add_call_item_to_queue which starts a new thread.

If (2) happens quickly enough, then we get b"apple" and the tests pass. If the interpreter shutdown sets interp->finalizing first, then the Queue thread fails to start and nothing gets printed.

This is closely related to:

@colesbury
Copy link
Contributor

#116677 should address this as well as a number of other related issues.

@furkanonder
Copy link
Contributor Author

#116677 should address this as well as a number of other related issues.
I've been running the tests for a while, and there haven't been any failed cases. The issue appears to be solved.
Thank you for fixing it! @colesbury

@github-project-automation github-project-automation bot moved this from In Progress to Done in Multiprocessing issues Sep 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes OS-freebsd tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
Projects
Development

No branches or pull requests

5 participants