PEP 516 – Build system abstraction for pip/conda etc

PEP: 516
Title: Build system abstraction for pip/conda etc
Author: Robert Collins <rbtcollins at hp.com>, Nathaniel Smith <njs at pobox.com>
BDFL-Delegate: Nick Coghlan <ncoghlan at gmail.com>
Discussions-To: distutils-sig@python.org
Status: Rejected
Type: Standards Track
Created: 26-Oct-2015
Resolution: Distutils-SIG

Abstract

This PEP specifies a programmatic interface for pip [1] and other distribution or installation tools to use when working with Python source trees (both the developer tree - e.g. the git tree - and source distributions).

The programmatic interface allows decoupling of pip from its current hard dependency on setuptools [2], which is valuable for two key reasons:

  1. It enables new build systems that may be much easier to use without requiring them to even appear to be setuptools.
  2. It facilitates setuptools itself changing its user interface without breaking pip, giving looser coupling.

The interface needed to permit pip to install build systems also enables pip to install build-time requirements for packages, which is an important step in getting pip to full feature parity with the installation components of easy_install.

As PEP 426 is draft, we cannot utilise the metadata format it defined. However PEP 427 wheels are in wide use and fairly well specified, so we have adopted the METADATA format from that for specifying distribution dependencies and general project metadata. PEP 508 provides a self-contained language for describing a dependency, which we encapsulate in a thin JSON schema to describe bootstrap dependencies.

Since Python sdists specified in PEP 314 are also source trees, this PEP is updating the definition of sdists.

PEP Rejection

The CLI based approach proposed in this PEP has been rejected in favour of the Python API based approach proposed in PEP 517. The specific CLI used to communicate with build backends running as isolated subprocesses will be considered an implementation detail of front-end developer tool implementations.

Motivation

There is significant pent-up frustration in the Python packaging ecosystem around the current lock-in between build system and pip. Breaking that lock-in is better for pip, for setuptools, and for other build systems like flit [3].

Specification

Overview

Build tools will be located by reading a file pypa.json from the root directory of the source tree. That file describes how to get the build tool and the name of the command to run to invoke the tool.

All tools will be expected to conform to a single command line interface modelled on pip’s existing use of the setuptools setup.py interface.

pypa.json

The file pypa.json acts as a neutral configuration file for pip and other tools that want to build source trees to consult for configuration. The absence of a pypa.json file in a Python source tree implies a setuptools or setuptools compatible build system.

The JSON has the following schema. Extra keys are ignored, which permits the use of pypa.json as a configuration file for other related tools. Tools that do so must namespace their chosen keys under tools:

{"tools": {"flit": ["Flits content here"]}}

schema
The version of the schema. This PEP defines version “1”. Defaults to “1” when absent. All tools reading the file must error on an unrecognised schema version.
bootstrap_requires
Optional list of PEP 508 dependency specifications that must be installed before running the build tool. For instance, if using flit, then the requirements might be:
bootstrap_requires: ["flit"]
build_command
A mandatory key, this is a list of Python format strings [8] describing the command to run. For instance, if using flit then the build command might be:
build_command: ["flit"]

If using a command which is a runnable module fred:

build_command: ["{PYTHON}", "-m", "fred"]
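
The rules above can be sketched as a small loader. This is only an illustration, not part of the specification: the function names parse_pypa_config and load_pypa_config are hypothetical, and the strict string comparison on the schema version is an assumption about how a consumer might apply the "must error" rule.

```python
import json
from pathlib import Path

SUPPORTED_SCHEMA = "1"  # the only schema version this PEP defines


def parse_pypa_config(text):
    """Parse the contents of a pypa.json file (hypothetical helper)."""
    data = json.loads(text)
    # The schema version defaults to "1"; anything unrecognised is an error.
    schema = data.get("schema", SUPPORTED_SCHEMA)
    if schema != SUPPORTED_SCHEMA:
        raise ValueError(f"unrecognised pypa.json schema version: {schema!r}")
    # build_command is mandatory and is a list of format strings.
    command = data.get("build_command")
    if (
        not isinstance(command, list)
        or not command
        or not all(isinstance(part, str) for part in command)
    ):
        raise ValueError("build_command must be a non-empty list of strings")
    # Extra keys (for example "tools") are deliberately ignored.
    return {
        "schema": schema,
        "bootstrap_requires": list(data.get("bootstrap_requires", [])),
        "build_command": command,
    }


def load_pypa_config(source_tree):
    """Return the parsed config, or None when pypa.json is absent,
    which implies a setuptools-compatible build system."""
    path = Path(source_tree) / "pypa.json"
    if not path.is_file():
        return None
    return parse_pypa_config(path.read_text(encoding="utf-8"))
```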

Process interface

The command to run is defined by a simple Python format string [8].

This permits build systems with dedicated scripts and those that are invoked using “python -m somemodule”.

Processes will be run with the current working directory set to the root of the source tree.

When run, processes should not read from stdin - while pip currently runs build systems with stdin connected to its own stdin, stdout and stderr are redirected and no communication with the user is possible.

As usual with processes, a non-zero exit status indicates an error.

Available format variables

PYTHON
The Python interpreter in use. This is important to enable calling things which are just Python entry points.
{PYTHON} -m foo
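
As a rough sketch of how a caller might expand the format variables, each element of build_command can be passed through str.format with PYTHON bound to the running interpreter. The helper name expand_build_command is hypothetical, and appending the subcommand arguments after the expanded command is an assumption based on the examples in this PEP.

```python
import sys


def expand_build_command(build_command, subcommand_args, python=sys.executable):
    """Substitute format variables into each element of build_command,
    then append the subcommand and its arguments (hypothetical helper)."""
    variables = {"PYTHON": python}  # the only format variable defined here
    expanded = [part.format(**variables) for part in build_command]
    return expanded + list(subcommand_args)
```

For instance, a build_command of ["{PYTHON}", "-m", "fred"] with the arguments ["metadata"] would expand to the interpreter path followed by -m fred metadata.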

Available environment variables

These variables are set by the caller of the build system and will always be available.

PATH
The standard system path.
PYTHON
As for format variables.
PYTHONPATH
Used to control sys.path per the normal Python mechanisms.
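
A caller following the process interface might look roughly like the sketch below. The function name run_build_tool is hypothetical, and connecting stdin to the null device is an assumption (the text only says that build tools should not read from stdin); PATH and PYTHONPATH are inherited from the caller here unless overridden through extra_env.

```python
import os
import subprocess
import sys


def run_build_tool(argv, source_tree, extra_env=None):
    """Run an already-expanded build command in the source tree root."""
    env = dict(os.environ)          # inherits PATH (and PYTHONPATH if set)
    env["PYTHON"] = sys.executable  # same value as the format variable
    env.update(extra_env or {})
    return subprocess.run(
        argv,
        cwd=source_tree,            # working directory is the tree root
        env=env,
        stdin=subprocess.DEVNULL,   # assumption: no interactive input
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,                 # non-zero exit status raises an error
    )
```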

Subcommands

There are a number of separate subcommands that build systems must support. The examples below use a build_command of flit for illustrative purposes.

build_requires
Query build requirements. Build requirements are returned as a UTF-8 encoded JSON document with one key build_requires consisting of a list of PEP 508 dependency specifications. Additional keys must be ignored. The build_requires command is the only command run without setting up a build environment.

Example command:

flit build_requires
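
A consumer could decode the build_requires output along these lines. This is a minimal sketch; parse_build_requires is a hypothetical name, and the type check on the list is an extra safeguard not required by the text above.

```python
import json


def parse_build_requires(stdout_bytes):
    """Decode the UTF-8 JSON document printed by the build_requires command."""
    document = json.loads(stdout_bytes.decode("utf-8"))
    requires = document["build_requires"]  # any additional keys are ignored
    if not isinstance(requires, list) or not all(
        isinstance(item, str) for item in requires
    ):
        raise ValueError("build_requires must be a list of PEP 508 strings")
    return requires
```
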
metadata
Query project metadata. The metadata and only the metadata should be output on stdout in UTF-8 encoding. pip would run metadata just once to determine what other packages need to be downloaded and installed. The metadata is output as a wheel METADATA file per PEP 427.

Note that the metadata generated by the metadata command, and the metadata present in a generated wheel must be identical.

Example command:

flit metadata
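
Because a wheel METADATA file uses email-style "Key: value" headers, the standard library email package is one way to read the output of the metadata command. The sketch below is illustrative only: parse_metadata is a hypothetical helper, and the choice of fields to extract is an assumption about what an installer would need.

```python
from email import message_from_string


def parse_metadata(stdout_bytes):
    """Extract a few fields from METADATA text printed on stdout."""
    message = message_from_string(stdout_bytes.decode("utf-8"))
    return {
        "name": message["Name"],
        "version": message["Version"],
        # Requires-Dist may appear zero or more times.
        "requires_dist": message.get_all("Requires-Dist") or [],
    }
```
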
wheel -d OUTPUT_DIR
Command to run to build a wheel of the project. OUTPUT_DIR will point to an existing directory where the wheel should be output. Stdout and stderr have no semantic meaning. Only one file should be output - if more are output then pip would pick an arbitrary one to consume.

Example command:

flit wheel -d /tmp/pip-build_1234
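
After the wheel command finishes, the caller has to locate the result in OUTPUT_DIR. A minimal sketch, assuming the output is identified by its .whl suffix; find_built_wheel is a hypothetical name, and taking the first file in sorted order is just one way of making the "arbitrary" choice repeatable.

```python
from pathlib import Path


def find_built_wheel(output_dir):
    """Return the path of a wheel written to output_dir."""
    wheels = sorted(Path(output_dir).glob("*.whl"))
    if not wheels:
        raise FileNotFoundError(f"no wheel was written to {output_dir}")
    # Only one file should be present; if there are several, any may be used.
    return wheels[0]
```
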
develop [--prefix PREFIX]
Command to do an in-place ‘development’ installation of the project. Stdout and stderr have no semantic meaning.

Not all build systems will be able to perform develop installs. If a build system cannot do develop installs, then it should error when run. Note that doing so will cause operations like pip install -e foo to fail.

The prefix option is used for defining an alternative prefix for the installation. While setuptools has --root and --user options, they can be done equivalently using --prefix, and pip or other tools that accept --root or --user options should translate appropriately.

The root option is used to define an alternative root within which the command should operate.

For instance:

flit develop --root /tmp/ --prefix /usr/local

This should install scripts within /tmp/usr/local/bin, even if the Python environment in use reports that sys.prefix is /usr/, which would otherwise lead to using /tmp/usr/bin/. Similar logic applies to package files etc.
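
The path arithmetic in this example can be sketched as follows. The helper name scripts_dir is hypothetical, and the bin subdirectory reflects the POSIX layout used in the example rather than anything mandated here.

```python
from pathlib import PurePosixPath


def scripts_dir(prefix, root=None):
    """Compute where scripts land for a given --prefix and optional --root."""
    target = PurePosixPath(prefix) / "bin"
    if root is not None:
        # Re-anchor the absolute prefix beneath the alternative root.
        target = PurePosixPath(root) / target.relative_to("/")
    return str(target)
```

With root set to /tmp/ and prefix set to /usr/local this gives /tmp/usr/local/bin; with the same root and a prefix of /usr it gives /tmp/usr/bin.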

The build environment

Except for the build_requires command, all commands are run within a build environment. No specific implementation is required, but a build environment must achieve the following requirements.

  1. All dependencies specified by the project’s build_requires must be available for import from within $PYTHON.
  2. All command-line scripts provided by the build-required packages must be present in $PATH.

A corollary of this is that build systems cannot assume access to any Python package that is not declared as a build_requires or in the Python standard library.
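
One possible way to satisfy both requirements, sketched under the assumption that the build requirements have already been installed into a dedicated directory, is to prepend that location to PYTHONPATH and PATH. The names build_environment, site_dir and bin_dir are hypothetical, and this approach is not hermetic because the caller's existing entries remain visible.

```python
import os


def build_environment(site_dir, bin_dir, base_env=None):
    """Return an environment mapping that exposes build requirements."""
    env = dict(os.environ if base_env is None else base_env)
    # Requirement 1: packages must be importable from $PYTHON.
    env["PYTHONPATH"] = os.pathsep.join(
        entry for entry in (site_dir, env.get("PYTHONPATH", "")) if entry
    )
    # Requirement 2: scripts of build-required packages must be on $PATH.
    env["PATH"] = os.pathsep.join(
        entry for entry in (bin_dir, env.get("PATH", "")) if entry
    )
    return env
```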

Hermetic builds

This specification does not prescribe whether builds should be hermetic or not. Existing build tools like setuptools will use installed versions of build time requirements (e.g. setuptools_scm) and only install other versions on version conflicts or missing dependencies. However, it is likely that better consistency can be achieved by always isolating builds and using only the specified dependencies.

However, there are nuanced problems there - such as how users can force the avoidance of a bad version of a build requirement which meets some package’s dependencies. Future PEPs may tackle this problem, but it is not currently in scope - it does not affect the metadata required to coordinate between build systems and things that need to do builds, and thus is not PEP material.

Upgrades

‘pypa.json’ is versioned to permit future changes without requiring compatibility.

The sequence for upgrading the schema in a new PEP will be:

  1. Issue new PEP defining an updated schema. If the schema is not entirely backward compatible then a new version number must be defined.
  2. Consumers (e.g. pip) implement support for the new schema version.
  3. Package authors opt into the new schema when they are happy to introduce a dependency on the version of ‘pip’ (and potentially other consumers) that introduced support for the new schema version.

The same process will take place for the initial deployment of this PEP: the propagation of the capability to use this PEP without a setuptools shim will be largely gated by the adoption rate of the first version of pip that supports it.

Static metadata in sdists

This PEP does not tackle the current inability to trust static metadata in sdists. That is a separate problem to identifying and consuming the build system that is in use in a source tree, whether it came from an sdist or not.

Handling of compiler options

Handling of different compiler options is out of scope for this specification.

pip currently handles compiler options by appending user supplied strings to the command line it runs when running setuptools. This approach is sufficient to work with the build system interface defined in this PEP, with the exception that globally specified options will stop working globally as different build systems evolve. That problem can be solved in pip (or conda or other installers) without affecting interoperability.

In the long term, wheels should be able to express the difference between wheels built with one compiler or options vs another, and that is PEP material.

Examples

An example ‘pypa.json’ for using flit:

{"bootstrap_requires": ["flit"],
 "build_command": ["flit"]}

When ‘pip’ reads this it would prepare an environment with flit in it before trying to use flit.

Because flit doesn’t have setup-requires support today, flit build_requires would just output a constant string:

{"build_requires": []}

flit metadata would interrogate flit.ini and marshal the metadata into a wheel METADATA file and output that on stdout.

flit wheel would need to accept a -d parameter that tells it where to output the wheel (pip needs this).

Backwards Compatibility

Older pips will remain unable to handle alternative build systems. This is no worse than the status quo - and individual build system projects can decide whether to include a shim setup.py or not.

All existing build systems that can produce wheels and do develop installs should be able to run under this abstraction and will only need a specific adapter for them constructed and published on PyPI.

In the absence of a pypa.json file, tools like pip should assume a setuptools build system and use setuptools commands directly.

Network effects

Projects that adopt build systems that are not setuptools compatible - that is that they have no setup.py, or the setup.py doesn’t accept commands that existing tools try to use - will not be installable by those existing tools.

Where those projects are used by other projects, this effect will cascade.

In particular, because pip does not handle setup-requires today, any project (A) that adopts a setuptools-incompatible build system and is consumed as a setup-requirement by a second project (B) which has not itself transitioned to having a pypa.json will make B uninstallable by any version of pip. This is because setup.py in B will trigger easy_install when ‘setup.py egg_info’ is run by pip, and that will try and fail to install A.

As such we recommend that tools which are currently used as setup-requires either ensure that they keep a setuptools shim or find their consumers and get them all to upgrade to the use of a pypa.json in advance of moving themselves. Pragmatically that is impossible, so the advice is to keep a setuptools shim indefinitely - both for projects like pbr, setuptools_scm and also projects like numpy.

setuptools shim

It would be possible to write a generic setuptools shim that looks like setup.py and under the hood uses pypa.json to drive the builds. This is not needed for pip to use the system, but would allow package authors to use the new features while still retaining compatibility with older pip versions.

Rationale

This PEP started with a long mailing list thread on distutils-sig [6]. Subsequent to that an online meeting was held to debug all the positions folk had. Minutes from that were posted to the list [7].

This specification is a translation of the consensus reached there into PEP form, along with some arbitrary choices on the minor remaining questions.

The basic heuristic for the design has been to focus on introducing an abstraction without requiring development not strictly tied to the abstraction. Where the gap to an improvement is small, or the cost of using the existing interface is very high, we have taken on the improvement as a dependency; otherwise we have deferred such improvements to future iterations.

We chose wheel METADATA files rather than defining a new specification, because pip can already handle wheel .dist-info directories which encode all the necessary data in a METADATA file. PEP 426 can’t be used as it’s still draft, and defining a new metadata format, while we should do that, is a separate problem. Using a directory on disk would not add any value to the interface (pip has to do that today due to limitations in the setuptools CLI).

The use of ‘develop’ as a command is because there is no PEP specifying the interoperability of things that do what ‘setuptools develop’ does - so we’ll need to define that before pip can take on the responsibility for doing the ‘develop’ step. Once that’s done we can issue a successor PEP to this one.

The use of a command line API rather than a Python API is a little contentious. Fundamentally anything can be made to work, and the pip maintainers have spoken strongly in favour of retaining a process based interface - something that is mature and robust in pip today.

The choice of JSON as a file format is a compromise between several constraints. Firstly, there is no stdlib YAML interpreter, nor one for any of the other low-friction structured file formats. Secondly, the INI format is a poor choice for a number of reasons, primarily that it has very minimal structure, and pip’s maintainers are not fond of it. JSON is in the stdlib and has sufficient structure to permit embedding anything we want in future without requiring embedded DSLs.

Donald suggested using setup.cfg and the existing setuptools command line rather than inventing something new. While that would permit interoperability with less visible changes, it requires nearly as much engineering on the pip side - looking for the new key in setup.cfg, implementing the non-installed environments to run the build in. And the desire from other build system authors not to confuse their users by delivering something that looks like but behaves quite differently to setuptools seems like a bigger issue than pip learning how to invoke a custom build tool.

The metadata and wheel commands are required to have consistent metadata to avoid a race condition that could otherwise happen where pip reads the metadata, acts on it, and then the resulting wheel has incompatible requirements. That race is exploited today by packages using PEP 426 environment markers, to work with older pip versions that do not support environment markers. That exploit is not needed with this PEP, because either the setuptools shim is in use (with older pip versions), or an environment marker ready pip is in use. The setuptools shim can take care of exploiting the difference older pip versions require.
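
A front end that wanted to detect a violation of this consistency requirement could compare the two sources directly. The sketch below assumes that "identical" means byte-for-byte equality and that the wheel contains exactly one .dist-info/METADATA member; both helper names are hypothetical.

```python
import zipfile


def read_wheel_metadata(wheel_path):
    """Return the raw bytes of the METADATA file inside a built wheel."""
    with zipfile.ZipFile(wheel_path) as archive:
        names = [
            name for name in archive.namelist()
            if name.endswith(".dist-info/METADATA")
        ]
        if len(names) != 1:
            raise ValueError("expected exactly one .dist-info/METADATA file")
        return archive.read(names[0])


def metadata_is_consistent(metadata_stdout, wheel_path):
    """Compare the metadata command output with the wheel's METADATA."""
    return metadata_stdout == read_wheel_metadata(wheel_path)
```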

We discussed having an sdist verb. The main driver for this was to make sure that build systems were able to produce sdists that pip can build - but this is circular: the whole point of this PEP is to let pip consume such sdists or VCS source trees reliably and without requiring an implementation of setuptools. Being able to create new sdists from existing source trees isn’t a thing pip does today, and while there is a PR to do that as part of building from source, it is contentious and lacks consensus. Rather than impose a requirement on all build systems, we are treating it as a YAGNI, and will add such a verb in a future version of the interface if required. The existing PEP 314 requirements for sdists still apply, and distutils or setuptools users can use setup.py sdist to create an sdist. Other tools should create sdists compatible with PEP 314. Note that pip itself does not require PEP 314 compatibility - it does not use any of the metadata from sdists - they are treated like source trees from disk or version control.

References

[1]
pip, the recommended installer for Python packages (http://pip.readthedocs.org/en/stable/)
[2]
setuptools, the de facto Python package build system (https://pythonhosted.org/setuptools/)
[3]
flit, a simple way to put packages in PyPI (http://flit.readthedocs.org/en/latest/)
[4]
PyPI, the Python Package Index (https://pypi.python.org/)
[5]
Shellvars, an implementation of shell variable rules for Python. (https://github.com/testing-cabal/shellvars)
[6]
The kick-off thread. (https://mail.python.org/pipermail/distutils-sig/2015-October/026925.html)
[7]
The minutes. (https://mail.python.org/pipermail/distutils-sig/2015-October/027214.html)
[8]
The Python string formatting syntax. (https://docs.python.org/3.1/library/string.html#format-string-syntax)

Source: https://github.com/python-discord/peps/blob/main/pep-0516.txt

Last modified: 2022-02-27 22:46:36 GMT