PEP 643 – Metadata for Package Source Distributions

PEP: 643
Title: Metadata for Package Source Distributions
Author: Paul Moore <p.f.moore at gmail.com>
BDFL-Delegate: Paul Ganssle <paul at ganssle.io>
Discussions-To: https://discuss.python.org/t/pep-643-metadata-for-package-source-distributions/5577
Status: Final
Type: Standards Track
Created: 24-Oct-2020
Post-History: 24-Oct-2020, 01-Nov-2020, 02-Nov-2020, 14-Nov-2020
Resolution: https://discuss.python.org/t/pep-643-metadata-for-package-source-distributions/5577/53

Abstract

Python package metadata is stored in the distribution file in a standard format, defined in the Core Metadata Specification. However, for source distributions, while the format of the data is defined, there has traditionally been a lot of inconsistency in what data is recorded in the source distribution. See here for a discussion of this issue.

As a result, metadata consumers are unable to rely on the data available from source distributions, and need to use the (costly) PEP 517 build mechanisms to extract metadata.

This PEP defines a standard that allows build backends to reliably store package metadata in the source distribution, while still retaining the necessary flexibility to handle metadata fields that have to be calculated at build time.

Motivation

There are a number of issues with the way that metadata is currently stored in source distributions:

  • The details of how to store metadata, while standardised, are not easy to find.
  • The specification requires an old metadata version, and has not been updated in line with changes to the core metadata spec.
  • There is no way in the spec to distinguish between “this field has been omitted because its value will not be known until build time” and “this field does not have a value”.
  • The core metadata specification allows most fields to be optional, meaning that the previous issue affects nearly every metadata field.

This PEP proposes an update to the metadata specification to allow recording of fields which are expected to be “filled in later”, and updates the source distribution specification to clarify that backends should record sdist metadata using that version of the spec (or later).

Rationale

This PEP allows projects to define source distribution metadata values as being “dynamic”. In this context, saying that a field is “dynamic” means that the value has not been fixed at the time that the source distribution was generated. Dynamic values will be supplied by the build backend at the time when the wheel is generated, and could depend on details of the build environment.

PEP 621 has a similar concept, of “dynamic” values that will be “filled in later”, and so we choose to use the same term here by analogy.

Specification

This PEP defines the relationship between metadata values specified in a source distribution, and the corresponding values in wheels built from it. It requires build backends to clearly mark any fields which will not simply be copied unchanged from the sdist to the wheel.

In addition, this PEP makes the PyPA Specifications document the canonical location for the specification of the source distribution format (collecting the information in PEP 517 and in this PEP).

A new field, Dynamic, will be added to the Core Metadata Specification. This field will be multiple use, and will be allowed to contain the name of another core metadata field.
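As an illustration only, the following sketch shows how a consumer might collect the Dynamic entries from an sdist's metadata using the standard library's email header parser. The PKG-INFO content, the project name and the helper name dynamic_fields are invented for this example, and lower-casing the field names for comparison is an assumption of the sketch rather than something this PEP requires.

```python
from email.parser import HeaderParser

# Hypothetical sdist metadata; the project and its values are invented.
PKG_INFO = """\
Metadata-Version: 2.2
Name: example-project
Version: 1.0
Summary: An illustrative package
Dynamic: Requires-Dist
Dynamic: Classifier
"""


def dynamic_fields(metadata_text: str) -> set[str]:
    """Return the lower-cased names of all fields marked as Dynamic."""
    msg = HeaderParser().parsestr(metadata_text)
    # Dynamic is a multiple-use field, so every occurrence is collected.
    return {value.strip().lower() for value in msg.get_all("Dynamic", [])}


print(sorted(dynamic_fields(PKG_INFO)))  # ['classifier', 'requires-dist']
```

Each Dynamic line names exactly one other field, which is why a set of names is a natural representation here.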

When found in the metadata of a source distribution, the following rules apply:

  1. If a field is not marked as Dynamic, then the value of the field in any wheel built from the sdist MUST match the value in the sdist. If the field is not in the sdist, and not marked as Dynamic, then it MUST NOT be present in the wheel.
  2. If a field is marked as Dynamic, it may contain any valid value in a wheel built from the sdist (including not being present at all).
  3. Backends MUST NOT mark a field as Dynamic if they can determine that it was generated from data that will not change at build time.
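Rules 1 and 2 could be checked by a tool along the following lines. This is a minimal sketch, not a reference implementation: the function name check_wheel_against_sdist is hypothetical, values are compared as exact strings, multiple-use fields are treated as unordered, and Metadata-Version and Dynamic themselves are skipped. All of these are simplifying assumptions of the sketch.

```python
from email.parser import HeaderParser


def check_wheel_against_sdist(sdist_text: str, wheel_text: str) -> list[str]:
    """Return the names of non-Dynamic fields whose values differ (rule 1)."""
    parser = HeaderParser()
    sdist = parser.parsestr(sdist_text)
    wheel = parser.parsestr(wheel_text)

    # Fields the sdist marks as Dynamic may take any value in the wheel (rule 2).
    dynamic = {v.strip().lower() for v in sdist.get_all("Dynamic", [])}
    # Assumption: these two fields describe the metadata file itself.
    skipped = {"metadata-version", "dynamic"}

    names = {k.lower() for k in sdist.keys()} | {k.lower() for k in wheel.keys()}
    problems = []
    for name in sorted(names - skipped - dynamic):
        # A field absent from the sdist yields [], so it must also be
        # absent from the wheel for the two lists to compare equal.
        if sorted(sdist.get_all(name, [])) != sorted(wheel.get_all(name, [])):
            problems.append(name)
    return problems
```

Rule 3 cannot be verified from the metadata alone, since it depends on how the backend derived each value.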

Backends MAY record the value they calculated for a field they mark as Dynamic in a source distribution. Consumers, however, MUST NOT treat this value as canonical, but MAY use it as a hint about what the final value in a wheel could be.

In any context other than a source distribution, if a field is marked as Dynamic, that indicates that the value was generated at wheel build time and may not match the value in the sdist (or in other builds of the project). Backends are not required to record this information, though, and consumers MUST NOT assume that the lack of a Dynamic marking has any significance, except in a source distribution.

The fields Name and Version MUST NOT be marked as Dynamic.
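A validator might enforce this restriction roughly as follows. The helper name forbidden_dynamic_fields is hypothetical, and comparing field names case-insensitively is an assumption of the sketch.

```python
from email.parser import HeaderParser

# Fields that this PEP says MUST NOT be marked as Dynamic.
FORBIDDEN_DYNAMIC = {"name", "version"}


def forbidden_dynamic_fields(metadata_text: str) -> list[str]:
    """Return any forbidden field names that the metadata marks as Dynamic."""
    msg = HeaderParser().parsestr(metadata_text)
    marked = {v.strip().lower() for v in msg.get_all("Dynamic", [])}
    return sorted(marked & FORBIDDEN_DYNAMIC)
```

An empty result means the metadata satisfies this particular rule; it says nothing about the other requirements in this section.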

As it adds a new metadata field, this PEP updates the core metadata format to version 2.2.

Source distributions SHOULD use the latest version of the core metadata specification that was available when they were created.

Build backends are strongly encouraged to only mark fields as Dynamic when absolutely necessary, and to encourage projects to avoid backend features that require the use of Dynamic. Projects should prefer to use environment markers on static values to adapt to details of the install location.

Backwards Compatibility

As this proposal increments the core metadata version, it is compatible with existing source distributions, which will use an older metadata version. Tools can determine whether a source distribution conforms to this PEP by checking the metadata version.
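The version check described above might look like the following sketch. The function name supports_dynamic is hypothetical, and the code assumes that Metadata-Version always has the form "major.minor"; anything else is treated as not conforming.

```python
from email.parser import HeaderParser


def supports_dynamic(metadata_text: str) -> bool:
    """Return True if the metadata declares version 2.2 or later."""
    msg = HeaderParser().parsestr(metadata_text)
    raw = msg.get("Metadata-Version")
    if raw is None:
        return False
    try:
        # Assumption: the version is exactly two dot-separated integers.
        major, minor = (int(part) for part in raw.strip().split("."))
    except ValueError:
        return False
    # Compare numerically so that, for example, 2.10 sorts after 2.2.
    return (major, minor) >= (2, 2)
```

For metadata older than 2.2, the absence of a Dynamic marking carries no meaning, so a consumer would still need to fall back to building the project.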

Security Implications

As this specification is purely for the storage of data that is intended to be publicly available, there are no security implications.

How to Teach This

This is a data storage format for project metadata, and so will not typically be visible to end users. There is therefore no need to teach users how to use the format. Developers wanting to reference the metadata will be able to find the details in the PyPA Specifications.

Rejected Ideas

  1. Rather than marking fields as Dynamic, fields should be assumed to be dynamic unless explicitly marked as Static.

    This is logically equivalent to the current proposal, but it implies that fields being dynamic is the norm. Packaging tools can be much more efficient in the presence of metadata that is known to be static, so the PEP chooses to make dynamic fields the exception, and require backends to “opt in” to making a field dynamic.

    In addition, if dynamic is the default, then in future, as more and more metadata becomes static, metadata files will include an increasing number of Static declarations.

  2. Rather than having a Dynamic field, add a special value that indicates that a field is “not yet defined”.

    Again, this is logically equivalent to the current proposal. It makes “being dynamic” an explicit choice, but requires a special value. As some fields can contain arbitrary text, choosing such a value is somewhat awkward (although likely not a problem in practice). There does not seem to be enough benefit to this approach to make it worth using instead of the proposed mechanism.

  3. Special handling of Requires-Python.

    Early drafts of the PEP needed special discussion of Requires-Python, because the lack of environment markers for this field meant that it might be difficult to require it to be static. The final form of the PEP no longer needs this, as the idea of a whitelist of fields allowed to be dynamic was dropped.

  4. Restrict the use of Dynamic to a minimal “whitelist” of permitted fields.

    This approach was likely to prove extremely difficult for setuptools to implement in a backward compatible way, due to the dynamic nature of the setuptools interface. Instead, the proposal now allows most fields to be dynamic, but encourages backends to avoid dynamic values unless essential.

Open Issues

None


Source: https://github.com/python-discord/peps/blob/main/pep-0643.rst

Last modified: 2021-12-13 19:17:45 GMT