PEP 670: LTO+PGO benchmark (#2161)
* Don't convert l-value macros
* Add "Examples of duplication of side effects" section
* Remove the gcc -O0 versus gcc -Og benchmark: it's not relevant for
  this PEP.
vstinner authored Nov 25, 2021
1 parent c1fda68 commit 907f8e9
Showing 1 changed file with 44 additions and 23 deletions.
67 changes: 44 additions & 23 deletions pep-0670.rst
Expand Up @@ -164,6 +164,8 @@ The following macros should not be converted:
or recent C features.
Example: ``#define Py_ALWAYS_INLINE __attribute__((always_inline))``.
* Macros that need the stringification or concatenation feature of the C preprocessor.
* Macros which can be used as an l-value in an assignment. Converting
them would be an incompatible change and is out of the scope of this PEP.


Convert static inline functions to regular functions
Expand Down Expand Up @@ -225,6 +227,9 @@ Backwards Compatibility
Removing the return value of macros is an incompatible API change made
on purpose: see the `Remove the return value`_ section.

Macros which can be used as an l-value in an assignment are not modified
by this PEP, to avoid incompatible changes.


Rejected Ideas
==============
Expand All @@ -250,6 +255,21 @@ to miss a macro pitfall when writing and reviewing macro code. Moreover, macros
are harder to read and maintain than functions.


Examples of duplication of side effects
=======================================

Macros::

    #define PySet_Check(ob) \
        (Py_IS_TYPE(ob, &PySet_Type) \
         || PyType_IsSubtype(Py_TYPE(ob), &PySet_Type))

    #define Py_IS_NAN(X) ((X) != (X))

If the *ob* or the *X* argument has a side effect, the side effect is
duplicated: it is executed twice by ``PySet_Check()`` and ``Py_IS_NAN()``.


Examples of hard to read macros
===============================

Expand Down Expand Up @@ -414,28 +434,12 @@ private static inline function has been added to the internal C API:
* ``_PyVectorcall_FunctionInline()``


Benchmarks
==========
Benchmark comparing macros and static inline functions
======================================================

Benchmarks run on Fedora 35 (Linux) with GCC 11 on a laptop with 8
Benchmark run on Fedora 35 (Linux) with GCC 11 on a laptop with 8
logical CPUs (4 physical CPU cores).


gcc -O0 versus gcc -Og
----------------------

Benchmark of the ``./python -m test -j10`` command on a Python debug
build:

* ``gcc -Og``: 220 sec ± 3 sec
* ``gcc -O0``: 360 sec ± 6 sec

Python built with ``gcc -O0`` is **1.6x slower** than Python built with
``gcc -Og``.

Replace macros with static inline functions
-------------------------------------------

The `PR 29728 <https://github.com/python/cpython/pull/29728>`_ replaces
the following existing static inline functions with macros:

Expand All @@ -449,11 +453,28 @@ existing the following static inline functions with macros:
* ``Py_NewRef()``
* ``Py_REFCNT()``, ``Py_TYPE()``, ``Py_SIZE()``

Benchmark of the ``./python -m test -j10`` command on a Python debug
build:

* Macros (PR 29728), ``gcc -O0``: 345 sec ± 5 sec
* Static inline functions (reference), ``gcc -O0``: 360 sec ± 6 sec
When static inline functions are inlined: Release build
-------------------------------------------------------

Benchmark of the ``./python -m test -j5`` command on Python built in
release mode with ``gcc -O3``, LTO and PGO:

* Macros (PR 29728): 361 sec ± 1 sec
* Static inline functions (reference): 361 sec ± 1 sec

There is **no significant performance difference** between macros and
static inline functions when static inline functions **are inlined**.


When static inline functions are not inlined: Debug build and -O0
-----------------------------------------------------------------

Benchmark of the ``./python -m test -j10`` command on Python built in
debug mode with ``gcc -O0`` (compiler optimizations explicitly disabled):

* Macros (PR 29728): 345 sec ± 5 sec
* Static inline functions (reference): 360 sec ± 6 sec

Replacing macros with static inline functions makes Python
**1.04x slower** when the compiler **does not inline** static inline
Expand Down
