Merge pull request pypa#3148 from abravalheri/manifest-links
[Docs] Improve mentions to `MANIFEST.in` and instructions for including data files
abravalheri authored Mar 5, 2022
2 parents e309995 + 1cbe68d commit e8ad85b
Showing 7 changed files with 150 additions and 26 deletions.
3 changes: 3 additions & 0 deletions changelog.d/3148.doc.1.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
Added clarifications about ``MANIFEST.in``, including links to the PyPUG docs
and more prominent mentions of using a revision control system plugin as an
alternative.
4 changes: 4 additions & 0 deletions changelog.d/3148.doc.2.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
Removed the mention of ``pkg_resources`` as the recommended way of accessing data
files, in favour of :mod:`importlib.resources`.
Additionally, more emphasis was put on the fact that *package data files* reside
**inside** the *package directory* (and therefore should be *read-only*).
4 changes: 4 additions & 0 deletions docs/conf.py
@@ -199,3 +199,7 @@
]

intersphinx_mapping['pip'] = 'https://pip.pypa.io/en/latest', None
intersphinx_mapping['PyPUG'] = ('https://packaging.python.org/en/latest/', None)
intersphinx_mapping['importlib-resources'] = (
'https://importlib-resources.readthedocs.io/en/latest', None
)
19 changes: 17 additions & 2 deletions docs/setuptools.rst
@@ -21,8 +21,9 @@ Feature Highlights:
individually in setup.py

* Automatically include all relevant files in your source distributions,
without needing to create a ``MANIFEST.in`` file, and without having to force
regeneration of the ``MANIFEST`` file when your source tree changes.
without needing to create a |MANIFEST.in|_ file, and without having to force
regeneration of the ``MANIFEST`` file when your source tree changes
[#manifest]_.

* Automatically generate wrapper scripts or Windows (console and GUI) .exe
files for any number of "main" functions in your project. (Note: this is not
@@ -211,3 +212,17 @@ set of steps to reproduce.

.. _GitHub Discussions: https://github.com/pypa/setuptools/discussions
.. _setuptools bug tracker: https://github.com/pypa/setuptools/


----


.. [#manifest] The default behaviour for ``setuptools`` will work well for pure
Python packages, or packages with simple C extensions (that don't require
any special C header). See :ref:`Controlling files in the distribution` and
:doc:`userguide/datafiles` for more information about complex scenarios, if
you want to include other types of files.
.. |MANIFEST.in| replace:: ``MANIFEST.in``
.. _MANIFEST.in: https://packaging.python.org/en/latest/guides/using-manifest-in/
75 changes: 52 additions & 23 deletions docs/userguide/datafiles.rst
@@ -5,11 +5,11 @@ Data Files Support
The distutils have traditionally allowed installation of "data files", which
are placed in a platform-specific location. However, the most common use case
for data files distributed with a package is for use *by* the package, usually
by including the data files in the package directory.
by including the data files **inside the package directory**.

Setuptools offers three ways to specify data files to be included in your
packages. First, you can simply use the ``include_package_data`` keyword,
e.g.::
Setuptools offers three ways to specify this most common type of data file to
be included in your packages [#datafiles]_.
First, you can simply use the ``include_package_data`` keyword, e.g.::

from setuptools import setup, find_packages
setup(
@@ -18,9 +18,10 @@ e.g.::
)

This tells setuptools to install any data files it finds in your packages.
The data files must be specified via the distutils' ``MANIFEST.in`` file.
The data files must be specified via the |MANIFEST.in|_ file.
(They can also be tracked by a revision control system, using an appropriate
plugin. See the section below on :ref:`Adding Support for Revision
plugin such as :pypi:`setuptools-scm` or :pypi:`setuptools-svn`.
See the section below on :ref:`Adding Support for Revision
Control Systems` for information on how to write such plugins.)

If you want finer-grained control over what files are included (for example,
@@ -87,14 +88,13 @@ When building an ``sdist``, the datafiles are also drawn from the
``package_name.egg-info/SOURCES.txt`` file, so make sure that this is removed if
the ``setup.py`` ``package_data`` list is updated before calling ``setup.py``.

(Note: although the ``package_data`` argument was previously only available in
``setuptools``, it was also added to the Python ``distutils`` package as of
Python 2.4; there is `some documentation for the feature`__ available on the
python.org website. If using the setuptools-specific ``include_package_data``
argument, files specified by ``package_data`` will *not* be automatically
added to the manifest unless they are listed in the MANIFEST.in file.)
.. note::
If using the ``include_package_data`` argument, files specified by
``package_data`` will *not* be automatically added to the manifest unless
they are listed in the |MANIFEST.in|_ file or by a plugin like
:pypi:`setuptools-scm` or :pypi:`setuptools-svn`.

__ https://docs.python.org/3/distutils/setupscript.html#installing-package-data
.. https://docs.python.org/3/distutils/setupscript.html#installing-package-data
Sometimes, the ``include_package_data`` or ``package_data`` options alone
aren't sufficient to precisely define what files you want included. For
@@ -125,11 +125,13 @@ included as a result of using ``include_package_data``.
In summary, the three options allow you to:

``include_package_data``
Accept all data files and directories matched by ``MANIFEST.in``.
Accept all data files and directories matched by |MANIFEST.in|_ or added by
a :ref:`plugin <Adding Support for Revision Control Systems>`.

``package_data``
Specify additional patterns to match files that may or may
not be matched by ``MANIFEST.in`` or found in source control.
not be matched by |MANIFEST.in|_ or added by
a :ref:`plugin <Adding Support for Revision Control Systems>`.

``exclude_package_data``
Specify patterns for data files and directories that should *not* be
@@ -154,14 +156,22 @@ Typically, existing programs manipulate a package's ``__file__`` attribute in
order to find the location of data files. However, this manipulation isn't
compatible with PEP 302-based import hooks, including importing from zip files
and Python Eggs. It is strongly recommended that, if you are using data files,
you should use the :ref:`ResourceManager API` of ``pkg_resources`` to access
them. The ``pkg_resources`` module is distributed as part of setuptools, so if
you're using setuptools to distribute your package, there is no reason not to
use its resource management API. See also `Importlib Resources`_ for
a quick example of converting code that uses ``__file__`` to use
``pkg_resources`` instead.
you should use :mod:`importlib.resources` to access them.
:mod:`importlib.resources` was added in Python 3.7, and the latest version of
the library is also available via the :pypi:`importlib-resources` backport.
See :doc:`importlib-resources:using` for detailed instructions [#importlib]_.
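
As a minimal sketch of the recommendation above, the helper below reads a text
file that lives inside a package without touching ``__file__``. The function
name ``read_package_text`` is hypothetical and only used for illustration; the
sketch assumes Python 3.9 or later, where ``importlib.resources.files`` is
available in the standard library (older versions would need the
``importlib-resources`` backport instead).

```python
from importlib import resources


def read_package_text(package: str, resource: str) -> str:
    """Return the text of a file stored inside ``package``.

    ``read_package_text`` is a hypothetical helper, not part of setuptools.
    """
    # ``files()`` returns a traversable object for the package directory;
    # it also works when the package is imported from a zip file.
    return resources.files(package).joinpath(resource).read_text(encoding="utf-8")
```

For a project that ships ``mypkg/data/defaults.json`` (a hypothetical layout),
this would be called as ``read_package_text("mypkg", "data/defaults.json")``.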

.. tip:: Files inside the package directory should be *read-only* to avoid a
series of common problems (e.g. when multiple users share a common Python
installation, when the package is loaded from a zip file, or when multiple
instances of a Python application run in parallel).

.. _Importlib Resources: https://docs.python.org/3/library/importlib.html#module-importlib.resources
If your Python package needs to write to a file for shared data or configuration,
you can use standard platform/OS-specific system directories, such as
``~/.local/config/$appname`` or ``/usr/share/$appname/$version`` (Linux specific) [#system-dirs]_.
A common approach is to add a read-only template file to the package
directory that is then copied to the correct system directory if no
pre-existing file is found.
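
The template approach described above could look roughly like the following
sketch. The helper name ``ensure_user_config`` and its parameters are
hypothetical; the target path is expected to be chosen by the caller (for
example with the help of a library such as ``platformdirs``). It assumes
Python 3.9 or later for ``importlib.resources.files`` and ``as_file``.

```python
import shutil
from importlib import resources
from pathlib import Path


def ensure_user_config(package: str, template: str, target: Path) -> Path:
    """Copy a read-only template shipped in ``package`` to ``target`` if missing.

    ``ensure_user_config`` is a hypothetical helper, not a setuptools API.
    """
    if not target.exists():
        # Create the destination directory outside of the package directory.
        target.parent.mkdir(parents=True, exist_ok=True)
        source = resources.files(package).joinpath(template)
        # ``as_file`` provides a real filesystem path even for zipped packages.
        with resources.as_file(source) as path:
            shutil.copyfile(path, target)
    return target
```

An existing file at ``target`` is left untouched, so user modifications are
never overwritten by the packaged template.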


Non-Package Data Files
@@ -174,4 +184,23 @@ fall back to the platform-specific location for installing data files, there is
no supported facility to reliably retrieve these resources.

Instead, the PyPA recommends that any data files you wish to be accessible at
run time be included in the package.
run time be included **inside the package**.


----

.. [#datafiles] ``setuptools`` considers any non-Python file **inside the
package directory** (i.e., one that co-exists in the same location as the
regular ``.py`` files being distributed) to be a *package data file*.
.. [#system-dirs] These locations can be discovered with the help of
third-party libraries such as :pypi:`platformdirs`.
.. [#importlib] Recent versions of :mod:`importlib.resources` available in
Python's standard library should be API compatible with
:pypi:`importlib-resources`. However, this might vary depending on which version
of Python is installed.
.. |MANIFEST.in| replace:: ``MANIFEST.in``
.. _MANIFEST.in: https://packaging.python.org/en/latest/guides/using-manifest-in/
63 changes: 63 additions & 0 deletions docs/userguide/miscellaneous.rst
@@ -94,3 +94,66 @@ correctly when installed as a zipfile, correct any problems if you can, and
then make an explicit declaration of ``True`` or ``False`` for the ``zip_safe``
flag, so that it will not be necessary for ``bdist_egg`` to try to guess
whether your project can work as a zipfile.


.. _Controlling files in the distribution:

Controlling files in the distribution
-------------------------------------

For the most common use cases, ``setuptools`` will automatically find out which
files are necessary for distributing the package.
This includes all :term:`pure Python modules <Pure Module>` in the
``py_modules`` or ``packages`` configuration, and the C sources (but not C
headers) listed as part of extensions when creating a :term:`Source
Distribution (or "sdist")`.

However, when building more complex packages (e.g. packages that include
non-Python files, or that need to use custom C headers), you might find that
not all files present in your project folder are included in the package
:term:`distribution archive <Distribution Package>`.

In these situations you can use a ``setuptools``
:ref:`plugin <Adding Support for Revision Control Systems>`,
such as :pypi:`setuptools-scm` or :pypi:`setuptools-svn`, to automatically
include all files tracked by your Revision Control System in the ``sdist``.

.. _Using MANIFEST.in:

Alternatively, if you need finer control, you can add a ``MANIFEST.in`` file at
the root of your project.
This file contains instructions that tell ``setuptools`` which files exactly
should be part of the ``sdist`` (or not).
A comprehensive guide to ``MANIFEST.in`` syntax is available at the
:doc:`PyPA's Packaging User Guide <PyPUG:guides/using-manifest-in>`.

Once the correct files are present in the ``sdist``, they can then be used by
binary extensions during the build process, or included in the final
:term:`wheel <Wheel>` [#build-process]_ if you configure ``setuptools`` with
``include_package_data=True``.

.. important::
Please note that, when using ``include_package_data=True``, only files **inside
the package directory** are included in the final ``wheel``, by default.

So, for example, if you create a :term:`Python project <Project>` that uses
:pypi:`setuptools-scm` and has a ``tests`` directory outside of the package
folder, the ``tests`` directory will be present in the ``sdist`` but not in the
``wheel`` [#wheel-vs-sdist]_.

See :doc:`/userguide/datafiles` for more information.
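
One rough way to observe the ``sdist`` versus ``wheel`` difference described
above is to compare the top-level entries of the two archives. The sketch
below uses only the standard library; the function name
``sdist_only_top_level`` is hypothetical, and it assumes the usual layout in
which all ``sdist`` members sit under a single ``<name>-<version>/`` directory.
Projects using a ``src/`` layout will see ``src`` reported as well, so treat
the result as a hint rather than a precise diff.

```python
import tarfile
import zipfile
from pathlib import PurePosixPath


def sdist_only_top_level(sdist_path, wheel_path):
    """Return top-level entries found in the sdist but not in the wheel.

    ``sdist_only_top_level`` is a hypothetical helper for illustration only.
    """
    with tarfile.open(sdist_path) as sdist:
        # Skip the leading "<name>-<version>/" component of every member.
        sdist_tops = {
            PurePosixPath(name).parts[1]
            for name in sdist.getnames()
            if len(PurePosixPath(name).parts) > 1
        }
    with zipfile.ZipFile(wheel_path) as wheel:
        wheel_tops = {PurePosixPath(name).parts[0] for name in wheel.namelist()}
    return sorted(sdist_tops - wheel_tops)
```

For a project like the one in the example above, the returned list would be
expected to contain ``tests`` along with build files such as ``setup.py``.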

----

.. [#build-process]
You can think about the build process as two stages: first the ``sdist``
will be created and then the ``wheel`` will be produced from that ``sdist``.
.. [#wheel-vs-sdist]
This happens because the ``sdist`` can contain files that are useful during
development or the build process itself, but not at runtime (e.g. tests,
docs, examples).
The ``wheel``, on the other hand, is a file format that has been optimized
and is ready to be unpacked into a running installation of Python or
:term:`Virtual Environment`.
Therefore it only contains items that are required at runtime.
8 changes: 7 additions & 1 deletion docs/userguide/quickstart.rst
@@ -180,7 +180,9 @@ can simply use the ``include_package_data`` keyword:
include_package_data = True
This tells setuptools to install any data files it finds in your packages.
The data files must be specified via the distutils' ``MANIFEST.in`` file.
The data files must be specified via the distutils' |MANIFEST.in|_ file
or automatically added by a :ref:`Revision Control System plugin
<Adding Support for Revision Control Systems>`.
For more details, see :doc:`datafiles`.


@@ -228,3 +230,7 @@ Resources on Python packaging
Packaging in Python can be hard and is constantly evolving.
`Python Packaging User Guide <https://packaging.python.org>`_ has tutorials and
up-to-date references that can help you when it is time to distribute your work.


.. |MANIFEST.in| replace:: ``MANIFEST.in``
.. _MANIFEST.in: https://packaging.python.org/en/latest/guides/using-manifest-in/
