Commit

docs: rewrite the subprocess page
Now multiprocessing is first, with an example of how to use Pool
properly.
nedbat committed Dec 26, 2024
1 parent 878410c commit 81c5e43
Showing 12 changed files with 85 additions and 67 deletions.
4 changes: 4 additions & 0 deletions CHANGES.rst
Expand Up @@ -37,6 +37,10 @@ Unreleased
understand the problem or the solution, but ``git bisect`` helped find it,
and now it's fixed.

- Docs: rewrote the :ref:`subprocess` page to put multiprocessing first and to
highlight the correct use of :class:`multiprocessing.Pool
<python:multiprocessing.pool.Pool>`.

.. _issue 1874: https://github.com/nedbat/coveragepy/issues/1874
.. _issue 1875: https://github.com/nedbat/coveragepy/issues/1875
.. _issue 1902: https://github.com/nedbat/coveragepy/issues/1902
1 change: 1 addition & 0 deletions Makefile
Expand Up @@ -255,6 +255,7 @@ cogdoc: $(DOCBIN) ## Run docs through cog.

dochtml: cogdoc $(DOCBIN) ## Build the docs HTML output.
$(SPHINXBUILD) -b html doc doc/_build/html
@echo "Start at: doc/_build/html/index.html"

docdev: dochtml ## Build docs, and auto-watch for changes.
PATH=$(DOCBIN):$(PATH) $(SPHINXAUTOBUILD) -b html doc doc/_build/html
2 changes: 1 addition & 1 deletion coverage/control.py
Expand Up @@ -301,7 +301,7 @@ def __init__( # pylint: disable=too-many-arguments
context=context,
)

# If we have sub-process measurement happening automatically, then we
# If we have subprocess measurement happening automatically, then we
# want any explicit creation of a Coverage object to mean, this process
# is already coverage-aware, so don't auto-measure it. By now, the
# auto-creation of a Coverage object has already happened. But we can
2 changes: 1 addition & 1 deletion coverage/multiproc.py
Expand Up @@ -94,7 +94,7 @@ def patch_multiprocessing(rcfile: str) -> None:

# When spawning processes rather than forking them, we have no state in the
# new process. We sneak in there with a Stowaway: we stuff one of our own
# objects into the data that gets pickled and sent to the sub-process. When
# objects into the data that gets pickled and sent to the subprocess. When
# the Stowaway is unpickled, its __setstate__ method is called, which
# re-applies the monkey-patch.
# Windows only spawns, so this is needed to keep Windows working.
4 changes: 2 additions & 2 deletions doc/changes.rst
Expand Up @@ -1060,12 +1060,12 @@ Work from the PyCon 2016 Sprints!
- The ``concurrency`` option can now take multiple values, to support programs
using multiprocessing and another library such as eventlet. This is only
possible in the configuration file, not from the command line. The
configuration file is the only way for sub-processes to all run with the same
configuration file is the only way for subprocesses to all run with the same
options. Fixes `issue 484`_. Thanks to Josh Williams for prototyping.

- Using a ``concurrency`` setting of ``multiprocessing`` now implies
``--parallel`` so that the main program is measured similarly to the
sub-processes.
subprocesses.

- When using `automatic subprocess measurement`_, running coverage commands
would create spurious data files. This is now fixed, thanks to diagnosis and
8 changes: 4 additions & 4 deletions doc/cmd.rst
Expand Up @@ -176,10 +176,10 @@ You can combine multiple values for ``--concurrency``, separated with commas.
You can specify ``thread`` and also one of ``eventlet``, ``gevent``, or
``greenlet``.

If you are using ``--concurrency=multiprocessing``, you must set other options
in the configuration file. Options on the command line will not be passed to
the processes that multiprocessing creates. Best practice is to use the
configuration file for all options.
If you are using ``--concurrency=multiprocessing``, you must set your other
options in the configuration file. Options on the command line will not be
passed to the processes that multiprocessing creates. Best practice is to use
the configuration file for all options.

.. _multiprocessing: https://docs.python.org/3/library/multiprocessing.html
.. _greenlet: https://greenlet.readthedocs.io/
7 changes: 4 additions & 3 deletions doc/config.rst
Expand Up @@ -25,7 +25,7 @@ specification of options that are otherwise only available in the
:ref:`API <api>`.

Configuration files also make it easier to get coverage testing of spawned
sub-processes. See :ref:`subprocess` for more details.
subprocesses. See :ref:`subprocess` for more details.

The default name for the configuration file is ``.coveragerc``, in the same
directory coverage.py is being run in. Most of the settings in the
Expand Down Expand Up @@ -443,11 +443,12 @@ need to know the source origin.

(boolean, default False) if true, register a SIGTERM signal handler to capture
data when the process ends due to a SIGTERM signal. This includes
:meth:`Process.terminate <python:multiprocessing.Process.terminate>`, and other
:meth:`Process.terminate <python:multiprocessing.Process.terminate>` and other
ways to terminate a process. This can help when collecting data in usual
situations, but can also introduce problems (see `issue 1310`_).

Only on Linux and Mac.
The signal handler is only registered on Linux and Mac. On Windows, this
setting has no effect.

.. _issue 1310: https://github.com/nedbat/coveragepy/issues/1310

104 changes: 58 additions & 46 deletions doc/subprocess.rst
Expand Up @@ -3,57 +3,75 @@
.. _subprocess:

=======================
Measuring sub-processes
=======================
======================
Measuring subprocesses
======================

Complex test suites may spawn sub-processes to run tests, either to run them in
parallel, or because sub-process behavior is an important part of the system
under test. Measuring coverage in those sub-processes can be tricky because you
have to modify the code spawning the process to invoke coverage.py.
If your system under test spawns subprocesses, you'll have to take extra steps
to measure coverage in those processes. There are a few ways to ensure they
get measured. The approach you use depends on how you create the processes.

There's an easier way to do it: coverage.py includes a function,
:func:`coverage.process_startup` designed to be invoked when Python starts. It
examines the ``COVERAGE_PROCESS_START`` environment variable, and if it is set,
begins coverage measurement. The environment variable's value will be used as
the name of the :ref:`configuration file <config>` to use.
No matter how your subprocesses are created, you will need the :ref:`parallel
option <config_run_parallel>` to collect separate data for each process, and
the :ref:`coverage combine <cmd_combine>` command to combine them together
before reporting.

.. note::
To successfully write a coverage data file, the Python subprocess under
measurement must shut down cleanly and have a chance for coverage.py to run its
termination code. It will do that when the process ends naturally, or when a
SIGTERM signal is received.

The subprocess only sees options in the configuration file. Options set on
the command line will not be used in the subprocesses.
If your processes are ending with SIGTERM, you must enable the
:ref:`config_run_sigterm` setting to configure coverage to catch SIGTERM
signals and write its data.

Other ways of ending a process, like SIGKILL or :func:`os._exit
<python:os._exit>`, will prevent coverage.py from writing its data file,
leaving you with incomplete or non-existent coverage data.
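
As a rough illustration of why the manner of exit matters (a sketch only, not text from the commit), the example below uses the standard ``atexit`` module as a stand-in for coverage.py's termination code. The child program and the ``run_child`` helper are hypothetical names chosen for this example.

```python
import subprocess
import sys

# Child program: register an exit handler, then leave in the requested way.
# The handler stands in for "write the data file at process end".
CHILD = """
import atexit, os, sys
atexit.register(lambda: print("exit handler ran", flush=True))
if sys.argv[1] == "hard":
    os._exit(0)   # leaves immediately; atexit handlers are skipped
"""


def run_child(mode: str) -> str:
    """Run the child program in a subprocess and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("natural exit:", repr(run_child("natural")))
    print("os._exit:    ", repr(run_child("hard")))
```

The child that ends naturally prints the handler's message; the child that calls ``os._exit`` prints nothing, which is the same reason a measured process ended that way leaves no data behind.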

.. note::

If you have subprocesses created with :mod:`multiprocessing
<python:multiprocessing>`, the ``--concurrency=multiprocessing``
command-line option should take care of everything for you. See
:ref:`cmd_run` for details.
Subprocesses will only see coverage options in the configuration file.
Options set on the command line will not be visible to subprocesses.


Using multiprocessing
---------------------

When using this technique, be sure to set the parallel option to true so that
multiple coverage.py runs will each write their data to a distinct file.
The :mod:`multiprocessing <python:multiprocessing>` module in the Python
standard library provides high-level tools for managing subprocesses. If you
use it, the :ref:`concurrency=multiprocessing <config_run_concurrency>` and
:ref:`sigterm <config_run_sigterm>` settings will configure coverage to measure
the subprocesses.

Even with multiprocessing, you have to be careful that all subprocesses
terminate cleanly or they won't record their coverage measurements. For
example, the correct way to use a Pool requires closing and joining the pool
before terminating::

Configuring Python for sub-process measurement
----------------------------------------------
    with multiprocessing.Pool() as pool:
        # ... use any of the pool methods ...
        pool.close()
        pool.join()

Measuring coverage in sub-processes is a little tricky. When you spawn a
sub-process, you are invoking Python to run your program. Usually, to get
coverage measurement, you have to use coverage.py to run your program. Your
sub-process won't be using coverage.py, so we have to convince Python to use
coverage.py even when not explicitly invoked.

To do that, we'll configure Python to run a little coverage.py code when it
starts. That code will look for an environment variable that tells it to start
coverage measurement at the start of the process.
Implicit coverage
-----------------

If you are starting subprocesses another way, you can configure Python to start
coverage when it runs. Coverage.py includes a function designed to be invoked
when Python starts: :func:`coverage.process_startup`. It examines the
``COVERAGE_PROCESS_START`` environment variable, and if it is set, begins
coverage measurement. The environment variable's value will be used as the name
of the :ref:`configuration file <config>` to use.

To arrange all this, you have to do two things: set a value for the
``COVERAGE_PROCESS_START`` environment variable, and then configure Python to
invoke :func:`coverage.process_startup` when Python processes start.

How you set ``COVERAGE_PROCESS_START`` depends on the details of how you create
sub-processes. As long as the environment variable is visible in your
sub-process, it will work.
subprocesses. As long as the environment variable is visible in your
subprocess, it will work.
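
One way to make the variable visible, sketched here under stated assumptions (illustrative only, not text from the commit): pass a modified copy of the environment when launching the child. The helper names ``env_with_coverage`` and ``child_sees_variable`` are hypothetical, and using an absolute path is a choice made for this sketch so the value still makes sense if the child runs in a different working directory.

```python
import os
import subprocess
import sys


def env_with_coverage(rcfile: str) -> dict[str, str]:
    """Copy the current environment and point COVERAGE_PROCESS_START at rcfile."""
    env = dict(os.environ)
    # Assumption for this sketch: store an absolute path to the config file.
    env["COVERAGE_PROCESS_START"] = os.path.abspath(rcfile)
    return env


def child_sees_variable(rcfile: str) -> str:
    """Launch a child that reports the variable, to confirm it is inherited."""
    # A real child would be your own program; this one only echoes the value.
    code = "import os; print(os.environ.get('COVERAGE_PROCESS_START', ''))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env_with_coverage(rcfile),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sees_variable(".coveragerc"))
```

Setting the variable alone does not start measurement; Python still has to be configured to call ``coverage.process_startup`` at start-up.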

You can configure your Python installation to invoke the ``process_startup``
function in two ways:
Expand Down Expand Up @@ -84,17 +102,11 @@ start-up. Be sure to remove the change when you uninstall coverage.py, or use
a more defensive approach to importing it.


Process termination
-------------------

To successfully write a coverage data file, the Python sub-process under
analysis must shut down cleanly and have a chance for coverage.py to run its
termination code. It will do that when the process ends naturally, or when a
SIGTERM signal is received.

Coverage.py uses :mod:`atexit <python:atexit>` to handle usual process ends,
and a :mod:`signal <python:signal>` handler to catch SIGTERM signals.
Explicit coverage
-----------------

Other ways of ending a process, like SIGKILL or :func:`os._exit
<python:os._exit>`, will prevent coverage.py from writing its data file,
leaving you with incomplete or non-existent coverage data.
Another option for running coverage on your subprocesses is to run coverage
explicitly as the command for your subprocess instead of using "python" as the
command. This isn't recommended, since it requires running different code
when running coverage than when not, which can complicate your test
environment.
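
To make the drawback concrete, here is a minimal sketch (illustrative only, not text from the commit) of the branching this approach pushes into the code that launches subprocesses. The function ``worker_command`` and the script name are hypothetical; ``coverage run --parallel-mode`` is the command-line form of the parallel option.

```python
import sys


def worker_command(script: str, measure: bool) -> list[str]:
    """Build the argument list for launching a worker script.

    When measuring, the worker is started through "coverage run" instead of
    plain Python, so the launching code differs between the two modes.
    """
    if measure:
        return [sys.executable, "-m", "coverage", "run", "--parallel-mode", script]
    return [sys.executable, script]


if __name__ == "__main__":
    print(worker_command("worker.py", measure=False))
    print(worker_command("worker.py", measure=True))
```

Every place that spawns a subprocess needs this kind of switch, which is the complication the paragraph above warns about.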
2 changes: 1 addition & 1 deletion igor.py
Expand Up @@ -180,7 +180,7 @@ def run_tests_with_coverage(core, *runner_args):
context = os.environ[context[1:]]
os.environ["COVERAGE_CONTEXT"] = context + "." + core

# Create the .pth file that will let us measure coverage in sub-processes.
# Create the .pth file that will let us measure coverage in subprocesses.
# The .pth file seems to have to be alphabetically after easy-install.pth
# or the sys.path entries aren't created right?
# There's an entry in "make clean" to get rid of this file.
10 changes: 5 additions & 5 deletions tests/coveragetest.py
Expand Up @@ -377,11 +377,11 @@ def command_line(self, args: str, ret: int = OK) -> None:
coverage_command = "coverage"

def run_command(self, cmd: str) -> str:
"""Run the command-line `cmd` in a sub-process.
"""Run the command-line `cmd` in a subprocess.
`cmd` is the command line to invoke in a sub-process. Returns the
`cmd` is the command line to invoke in a subprocess. Returns the
combined content of `stdout` and `stderr` output streams from the
sub-process.
subprocess.
See `run_command_status` for complete semantics.
Expand All @@ -394,7 +394,7 @@ def run_command(self, cmd: str) -> str:
return output

def run_command_status(self, cmd: str) -> tuple[int, str]:
"""Run the command-line `cmd` in a sub-process, and print its output.
"""Run the command-line `cmd` in a subprocess, and print its output.
Use this when you need to test the process behavior of coverage.
Expand All @@ -420,7 +420,7 @@ def run_command_status(self, cmd: str) -> tuple[int, str]:
command_args = split_commandline[1:]

if command_name == "python":
# Running a Python interpreter in a sub-processes can be tricky.
# Running a Python interpreter in a subprocess can be tricky.
# Use the real name of our own executable. So "python foo.py" might
# get executed as "python3.3 foo.py". This is important because
# Python 3.x doesn't install as "python", so you might get a Python
2 changes: 1 addition & 1 deletion tests/helpers.py
Expand Up @@ -33,7 +33,7 @@


def run_command(cmd: str) -> tuple[int, str]:
"""Run a command in a sub-process.
"""Run a command in a subprocess.
Returns the exit status code and the combined stdout and stderr.
6 changes: 3 additions & 3 deletions tests/test_process.py
Expand Up @@ -606,7 +606,7 @@ def test_deprecation_warnings(self) -> None:
""")

# Some of our testing infrastructure can issue warnings.
# Turn it all off for the sub-process.
# Turn it all off for the subprocess.
self.del_environ("COVERAGE_TESTING")

out = self.run_command("python allok.py")
Expand Down Expand Up @@ -1197,9 +1197,9 @@ def test_removing_directory_with_error(self) -> None:
assert all(line in out for line in lines)


@pytest.mark.skipif(env.METACOV, reason="Can't test sub-process pth file during metacoverage")
@pytest.mark.skipif(env.METACOV, reason="Can't test subprocess pth file during metacoverage")
class ProcessStartupTest(CoverageTest):
"""Test that we can measure coverage in sub-processes."""
"""Test that we can measure coverage in subprocesses."""

def setUp(self) -> None:
super().setUp()
