New PEP to serve metadata on Simple Repository API
uranusjr committed May 10, 2021
1 parent 3e852b2 commit 640951c
178 changes: 178 additions & 0 deletions pep-9999.rst
PEP: 9999
Title: Static Distribution Metadata in the Simple Repository API
Author: Tzu-ping Chung <[email protected]>
Sponsor:
PEP-Delegate:
Discussions-To:
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-May-2021
Post-History:
Resolution:


Abstract
========

This PEP proposes adding an attribute to anchor tags to expose the
``METADATA`` file from distributions in the :pep:`503` "simple"
repository API. The ``data-dist-info-metadata`` attribute indicates
where the file from a given distribution can be fetched independently.


Motivation
==========

Package management workflows made popular by recent tooling increase
the need to inspect distribution metadata without intending to install
the distribution. Tools often download multiple distributions of a
project only to choose one based on their metadata. This means they
end up discarding much of the downloaded data, which is inefficient
and results in a bad user experience.


Rationale
=========

Tools have been exploring methods to reduce the download size by
partially downloading wheels with HTTP range requests. This, however,
adds run-time requirements to the repository server. It also still
incurs overhead, since a separate request is needed to fetch the
wheel's file listing to find the correct offset of the metadata file.
It is therefore desirable for the server to extract the metadata file
in advance and serve it as an independent file, avoiding the
additional requests and ZIP inspection.
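
As a rough illustration of the server-side extraction described above,
the following sketch locates the ``METADATA`` file inside a wheel with
Python's standard ``zipfile`` module. The names ``extract_metadata``
and ``build_sample_wheel`` and the sample archive contents are
hypothetical and exist only for this example; this PEP does not
prescribe how a repository performs the extraction.

.. code-block:: python

    import io
    import zipfile


    def extract_metadata(wheel_bytes: bytes) -> bytes:
        """Return the raw METADATA file from a wheel held in memory.

        Hypothetical helper: a repository could run something like this
        once at upload time and store the result as a separate file.
        """
        with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as archive:
            for name in archive.namelist():
                directory, _, filename = name.partition("/")
                # The file lives at "<name>-<version>.dist-info/METADATA".
                if directory.endswith(".dist-info") and filename == "METADATA":
                    return archive.read(name)
        raise ValueError("wheel does not contain a .dist-info/METADATA file")


    def build_sample_wheel() -> bytes:
        """Build a tiny, made-up wheel-like archive for demonstration."""
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w") as archive:
            archive.writestr("demo/__init__.py", "")
            archive.writestr(
                "demo-1.0.dist-info/METADATA",
                "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n",
            )
        return buffer.getvalue()

A client that performs this inspection itself must first obtain at
least the archive's file listing, which is the extra work the proposal
aims to move to the repository.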

The metadata file defined by the Core Metadata Specification
[core-metadata]_ will be served directly by repositories since it
contains the necessary information for common use cases. The metadata
served must be completely static, i.e. identical to the ``METADATA``
file in the ``.dist-info`` directory [dist-info]_ if the distribution
is installed. The repository can provide this for any distribution,
but it is expected that they will only provide it for wheels [wheel]_,
since an sdist [sdist]_ does not currently have a way to promise the
metadata will stay the same after it is built.

Since not all distributions have static metadata, an HTML attribute
on the distribution file's anchor link is needed to indicate whether a
client is able to choose the separately served metadata file instead.
The attribute can also be used to denote whether the metadata file can be
downloaded. If the attribute is missing from an anchor link, static
metadata is not available for the distribution, either because of the
distribution's content, or lack of repository support.


Specification
=============

In a simple repository's project page, each anchor tag pointing to a
distribution **MAY** have a ``data-dist-info-metadata`` attribute. The
presence of the attribute indicates the distribution represented by
the anchor tag **MUST** contain a Core Metadata file that will not be
modified when the distribution is processed and/or installed.

If a ``data-dist-info-metadata`` attribute is present, its value
**MUST** be a URL to the distribution's Core Metadata file. If the URL
is relative, its base URL **SHOULD** be the current project page, as
is the behaviour of an anchor tag's ``href`` attribute.
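
The following is a minimal client-side sketch of the rules above,
using only the Python standard library. It collects each anchor's
``href`` together with its ``data-dist-info-metadata`` value, if any,
and resolves both against the project page URL. The class name
``ProjectPageParser``, the host ``example.org``, and the file names
are placeholders invented for this example, not part of the proposal.

.. code-block:: python

    from html.parser import HTMLParser
    from urllib.parse import urljoin


    class ProjectPageParser(HTMLParser):
        """Collect (file URL, metadata URL or None) pairs from anchors."""

        def __init__(self, page_url: str) -> None:
            super().__init__()
            self.page_url = page_url
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag != "a":
                return
            attributes = dict(attrs)
            href = attributes.get("href")
            if href is None:
                return
            metadata = attributes.get("data-dist-info-metadata")
            self.links.append((
                urljoin(self.page_url, href),
                # Relative values resolve against the project page,
                # the same way href does.
                urljoin(self.page_url, metadata) if metadata else None,
            ))


    # Made-up project page; the URLs below are placeholders.
    PAGE_URL = "https://example.org/simple/demo/"
    PAGE_HTML = """
    <a href="../../files/demo-1.0-py3-none-any.whl"
       data-dist-info-metadata="../../meta/demo-1.0-wheel.txt">demo wheel</a>
    <a href="../../files/demo-1.0.tar.gz">demo sdist</a>
    """

    parser = ProjectPageParser(PAGE_URL)
    parser.feed(PAGE_HTML)

After feeding the page, ``parser.links`` holds one pair per anchor.
The sdist entry carries ``None`` in place of a metadata URL because
its anchor lacks the attribute.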

There are no restrictions on where the Core Metadata file should be
hosted relative to the distribution file or project page, as long as
clients can reach it at the given URL.
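
On the repository side, emitting the attribute could look like the
sketch below. The function ``render_anchor`` and its parameters are
hypothetical; the only parts taken from this document are the
attribute name and the fact that its value is a URL that may point
anywhere.

.. code-block:: python

    from html import escape
    from typing import Optional


    def render_anchor(
        filename: str,
        file_url: str,
        metadata_url: Optional[str] = None,
    ) -> str:
        """Render one project page anchor (hypothetical helper)."""
        attributes = ['href="{}"'.format(escape(file_url, quote=True))]
        if metadata_url is not None:
            # Only emitted when static metadata is actually available.
            attributes.append(
                'data-dist-info-metadata="{}"'.format(
                    escape(metadata_url, quote=True)
                )
            )
        return "<a {}>{}</a>".format(" ".join(attributes), escape(filename))

Passing ``metadata_url=None`` yields a plain anchor, which is how a
repository would render a distribution without static metadata.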


Backwards Compatibility
=======================

If an anchor tag lacks the ``data-dist-info-metadata`` attribute,
tools are expected to revert to their current behaviour of downloading
the distribution to inspect the metadata.
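
The fallback can be sketched as a small decision function. The
``Link`` type and ``metadata_source`` are illustrative names, not part
of any existing tool; the sketch assumes a client has already parsed
each anchor into a file URL and an optional metadata URL.

.. code-block:: python

    from typing import NamedTuple, Optional, Tuple


    class Link(NamedTuple):
        """One parsed anchor from a project page (hypothetical type)."""

        file_url: str
        metadata_url: Optional[str]


    def metadata_source(link: Link) -> Tuple[str, str]:
        """Decide what to download in order to read the metadata."""
        if link.metadata_url is not None:
            # Attribute present: fetch the small metadata file only.
            return ("metadata", link.metadata_url)
        # Attribute absent: keep the current behaviour and download
        # the whole distribution.
        return ("distribution", link.file_url)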

Older tools not supporting the new ``data-dist-info-metadata``
attribute are expected to ignore the attribute and maintain their
current behaviour of downloading the distribution to inspect the
metadata. This is similar to how prior ``data-`` attribute additions
expect existing tools to operate.


Rejected Ideas
==============

Put metadata content on the project page
----------------------------------------

Since tools generally only need dependency information from a
distribution in addition to what's already available on the project
page, it was proposed that repositories may directly include the
information on the project page, like the ``data-requires-python``
attribute specified in :pep:`503`.

This approach was abandoned since a distribution may contain
arbitrarily long lists of dependencies (including required and
optional), and it is unclear whether including the information for
every distribution in a project would result in net savings, since the
information for most distributions generally ends up unneeded. By
serving the metadata separately, performance can be better estimated
since data usage will be more proportional to the number of
distributions inspected.


Expose more files in the distribution
-------------------------------------

It was proposed to provide the entire ``.dist-info`` directory as a
separate part, instead of only the metadata file. However, serving
multiple files in one entity through HTTP requires re-archiving them
separately after they are extracted from the original distribution
by the repository server, and the usefulness of files other than
``METADATA`` is uncertain in use cases where the distribution itself
is not going to be installed.

It should also be noted that the approach taken here does not
preclude other files from being introduced in the future, whether we
want to serve them together or individually.


Require the metadata file to live alongside the distribution file
-----------------------------------------------------------------

It was proposed that the location to fetch metadata can be inferred
implicitly instead, similarly to how :pep:`503` designates the GPG
signature's location. However, since an attribute is required either
way to indicate whether a distribution has static metadata, the author
feels it is simpler to explicitly encode the location information in
the attribute instead. This also makes future extension easier if we
decide to expose more files in the distribution; instead of coming up
with a location inference rule for each file added, we will only need
to add an additional attribute.


References
==========

.. [core-metadata] https://packaging.python.org/specifications/core-metadata/
.. [dist-info] https://packaging.python.org/specifications/recording-installed-packages/
.. [wheel] https://packaging.python.org/specifications/binary-distribution-format/
.. [sdist] https://packaging.python.org/specifications/source-distribution-format/


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
