Python Enhancement Proposals

PEP 571 – The manylinux2010 Platform Tag

PEP
571
Title
The manylinux2010 Platform Tag
Author
Mark Williams <mrw at enotuniq.org>, Geoffrey Thomas <geofft at ldpreload.com>, Thomas Kluyver <thomas at kluyver.me.uk>
BDFL-Delegate
Nick Coghlan <ncoghlan at gmail.com>
Discussions-To
distutils-sig@python.org
Status
Superseded
Type
Informational
Created
05-Feb-2018
Post-History

Superseded-By
600
Resolution
Distutils-SIG

Contents

Abstract

This PEP proposes the creation of a manylinux2010 platform tag to succeed the manylinux1 tag introduced by PEP 513. It also proposes that PyPI and pip both be updated to support uploading, downloading, and installing manylinux2010 distributions on compatible platforms.

Rationale

True to its name, the manylinux1 platform tag has made the installation of binary extension modules a reality on many Linux systems. Libraries like cryptography [2] and numpy [3] are more accessible to Python developers now that their installation on common architectures does not depend on fragile development environments and build toolchains.

manylinux1 wheels achieve their portability by allowing the extension modules they contain to link against only a small set of system-level shared libraries that export versioned symbols old enough to benefit from backwards-compatibility policies. Extension modules in a manylinux1 wheel that rely on glibc, for example, must be built against version 2.5 or earlier; they may then be run systems that provide more recent glibc version that still export the required symbols at version 2.5.

PEP 513 drew its whitelisted shared libraries and their symbol versions from CentOS 5.11, which was the oldest supported CentOS release at the time of its writing. Unfortunately, CentOS 5.11 reached its end-of-life on March 31st, 2017 with a clear warning against its continued use. [4] No further updates, such as security patches, will be made available. This means that its packages will remain at obsolete versions that hamper the efforts of Python software packagers who use the manylinux1 Docker image.

CentOS 6 is now the oldest supported CentOS release, and will receive maintenance updates through November 30th, 2020. [5] We propose that a new PEP 425-style platform tag called manylinux2010 be derived from CentOS 6 and that the manylinux toolchain, PyPI, and pip be updated to support it.

This was originally proposed as manylinux2, but the versioning has been changed to use calendar years (also known as CalVer [23]). This makes it easier to define future manylinux tags out of order: for example, a hypothetical manylinux2017 standard may be defined via a new PEP before manylinux2014, or a manylinux2007 standard might be defined that targets systems older than this PEP but newer than manylinux1.

Calendar versioning also gives a rough idea of which Linux distribution versions support which tag: manylinux2010 will work on most distribution versions released since 2010. This is only an approximation, however: the actual compatibility rules are defined below, and some newer distributions may not meet them.

The manylinux2010 policy

The following criteria determine a linux wheel’s eligibility for the manylinux2010 tag:

  1. The wheel may only contain binary executables and shared objects compiled for one of the two architectures supported by CentOS 6: x86_64 or i686. [5]
  2. The wheel’s binary executables or shared objects may not link against externally-provided libraries except those in the following whitelist:
    libgcc_s.so.1
    libstdc++.so.6
    libm.so.6
    libdl.so.2
    librt.so.1
    libc.so.6
    libnsl.so.1
    libutil.so.1
    libpthread.so.0
    libresolv.so.2
    libX11.so.6
    libXext.so.6
    libXrender.so.1
    libICE.so.6
    libSM.so.6
    libGL.so.1
    libgobject-2.0.so.0
    libgthread-2.0.so.0
    libglib-2.0.so.0
    

    This list is identical to the externally-provided libraries whitelisted for manylinux1, minus libncursesw.so.5 and libpanelw.so.5. [7] libpythonX.Y remains ineligible for inclusion for the same reasons outlined in PEP 513.

    libcrypt.so.1 was retrospectively removed from the whitelist after Fedora 30 was released with libcrypt.so.2 instead.

    On Debian-based systems, these libraries are provided by the packages:

    Package Libraries
    libc6 libdl.so.2, libresolv.so.2, librt.so.1, libc.so.6, libpthread.so.0, libm.so.6, libutil.so.1, libnsl.so.1
    libgcc1 libgcc_s.so.1
    libgl1 libGL.so.1
    libglib2.0-0 libgobject-2.0.so.0, libgthread-2.0.so.0, libglib-2.0.so.0
    libice6 libICE.so.6
    libsm6 libSM.so.6
    libstdc++6 libstdc++.so.6
    libx11-6 libX11.so.6
    libxext6 libXext.so.6
    libxrender1 libXrender.so.1

    On RPM-based systems, they are provided by these packages:

    Package Libraries
    glib2 libglib-2.0.so.0, libgthread-2.0.so.0, libgobject-2.0.so.0
    glibc libresolv.so.2, libutil.so.1, libnsl.so.1, librt.so.1, libpthread.so.0, libdl.so.2, libm.so.6, libc.so.6
    libICE libICE.so.6
    libX11 libX11.so.6
    libXext: libXext.so.6
    libXrender libXrender.so.1
    libgcc: libgcc_s.so.1
    libstdc++ libstdc++.so.6
    mesa libGL.so.1
  3. If the wheel contains binary executables or shared objects linked against any whitelisted libraries that also export versioned symbols, they may only depend on the following maximum versions:
    GLIBC_2.12
    CXXABI_1.3.3
    GLIBCXX_3.4.13
    GCC_4.5.0
    

    As an example, manylinux2010 wheels may include binary artifacts that require glibc symbols at version GLIBC_2.4, because this an earlier version than the maximum of GLIBC_2.12.

  4. If a wheel is built for any version of CPython 2 or CPython versions 3.0 up to and including 3.2, it must include a CPython ABI tag indicating its Unicode ABI. A manylinux2010 wheel built against Python 2, then, must include either the cpy27mu tag indicating it was built against an interpreter with the UCS-4 ABI or the cpy27m tag indicating an interpreter with the UCS-2 ABI. (PEP 3149, [9])
  5. A wheel must not require the PyFPE_jbuf symbol. This is achieved by building it against a Python compiled without the --with-fpectl configure flag.

Compilation of Compliant Wheels

Like manylinux1, the auditwheel tool adds manylinux2010 platform tags to linux wheels built by pip wheel or bdist_wheel in a manylinux2010 Docker container.

Docker Image

Two manylinux2010 Docker images based on CentOS 6 are provided for building binary linux wheels that can reliably be converted to manylinux2010 wheels. [10] The x86_64 and i686 images comes with a new compiler suite installed (gcc, g++, and gfortran from devtoolset-8) as well as the latest releases of Python and pip.

Compatibility with kernels that lack vsyscall

A Docker container assumes that its userland is compatible with its host’s kernel. Unfortunately, an increasingly common kernel configuration breaks this assumption for x86_64 CentOS 6 Docker images.

Versions 2.14 and earlier of glibc require the kernel provide an archaic system call optimization known as vsyscall on x86_64. [11] To effect the optimization, the kernel maps a read-only page of frequently-called system calls – most notably time(2) – into each process at a fixed memory location. glibc then invokes these system calls by dereferencing a function pointer to the appropriate offset into the vsyscall page and calling it. This avoids the overhead associated with invoking the kernel that affects normal system call invocation. vsyscall has long been deprecated in favor of an equivalent mechanism known as vDSO, or “virtual dynamic shared object”, in which the kernel instead maps a relocatable virtual shared object containing the optimized system calls into each process. [12]

The vsyscall page has serious security implications because it does not participate in address space layout randomization (ASLR). Its predictable location and contents make it a useful source of gadgets used in return-oriented programming attacks. [13] At the same time, its elimination breaks the x86_64 ABI, because glibc versions that depend on vsyscall suffer from segmentation faults when attempting to dereference a system call pointer into a non-existent page. As a compromise, Linux 3.1 implemented an “emulated” vsyscall that reduced the executable code, and thus the material for ROP gadgets, mapped into the process. [14] vsyscall=emulated has been the default configuration in most distribution’s kernels for many years.

Unfortunately, vsyscall emulation still exposes predictable code at a reliable memory location, and continues to be useful for return-oriented programming. [15] Because most distributions have now upgraded to glibc versions that do not depend on vsyscall, they are beginning to ship kernels that do not support vsyscall at all. [16]

CentOS 5.11 and 6 both include versions of glibc that depend on the vsyscall page (2.5 and 2.12.2 respectively), so containers based on either cannot run under kernels provided with many distribution’s upcoming releases. [17] If Travis CI, for example, begins running jobs under a kernel that does not provide the vsyscall interface, Python packagers will not be able to use our Docker images there to build manylinux wheels. [19]

We have derived a patch from the glibc git repository that backports the removal of all dependencies on vsyscall to the version of glibc included with our manylinux2010 image. [20] Rebuilding glibc, and thus building manylinux2010 image itself, still requires a host kernel that provides the vsyscall mechanism, but the resulting image can be both run on hosts that provide it and those that do not. Because the vsyscall interface is an optimization that is only applied to running processes, the manylinux2010 wheels built with this modified image should be identical to those built on an unmodified CentOS 6 system. Also, the vsyscall problem applies only to x86_64; it is not part of the i686 ABI.

Auditwheel

The auditwheel tool has also been updated to produce manylinux2010 wheels. [21] Its behavior and purpose are otherwise unchanged from PEP 513.

Platform Detection for Installers

Platforms may define a manylinux2010_compatible boolean attribute on the _manylinux module described in PEP 513. A platform is considered incompatible with manylinux2010 if the attribute is False.

If the _manylinux module is not found, or it does not have the attribute manylinux2010_compatible, tools may fall back to checking for glibc. If the platform has glibc 2.12 or newer, it is assumed to be compatible unless the _manylinux module says otherwise.

Specifically, the algorithm we propose is:

def is_manylinux2010_compatible():
    # Only Linux, and only x86-64 / i686
    from distutils.util import get_platform
    if get_platform() not in ["linux-x86_64", "linux-i686"]:
        return False

    # Check for presence of _manylinux module
    try:
        import _manylinux
        return bool(_manylinux.manylinux2010_compatible)
    except (ImportError, AttributeError):
        # Fall through to heuristic check below
        pass

    # Check glibc version. CentOS 6 uses glibc 2.12.
    # PEP 513 contains an implementation of this function.
    return have_compatible_glibc(2, 12)

Backwards compatibility with manylinux1 wheels

As explained in PEP 513, the specified symbol versions for manylinux1 whitelisted libraries constitute an upper bound. The same is true for the symbol versions defined for manylinux2010 in this PEP. As a result, manylinux1 wheels are considered manylinux2010 wheels. A pip that recognizes the manylinux2010 platform tag will thus install manylinux1 wheels for manylinux2010 platforms – even when explicitly set – when no manylinux2010 wheels are available. [22]

PyPI Support

PyPI should permit wheels containing the manylinux2010 platform tag to be uploaded in the same way that it permits manylinux1. It should not attempt to verify the compatibility of manylinux2010 wheels.

Summary of changes to PEP 571

The following changes were made to this PEP based on feedback received after it was approved:

  • The maximum version symbol of libgcc_s was updated from GCC_4.3.0 to GCC_4.5.0 to address 32-bit Cent OS 6. This doesn’t affect x86_64 because libgcc_s for x86_64 has no additional symbol from GCC_4.3.0 to GCC_4.5.0.

References

[2]
pyca/cryptography (https://cryptography.io/)
[3]
numpy (https://numpy.org)
[4]
CentOS 5.11 EOL announcement (https://lists.centos.org/pipermail/centos-announce/2017-April/022350.html)
[5] (1, 2)
CentOS Product Specifications (https://web.archive.org/web/20180108090257/https://wiki.centos.org/About/Product)
[7]
ncurses 5 -> 6 transition means we probably need to drop some libraries from the manylinux whitelist (https://github.com/pypa/manylinux/issues/94)
[9]
SOABI support for Python 2.X and PyPy https://github.com/pypa/pip/pull/3075
[10]
manylinux2010 Docker image (https://quay.io/repository/pypa/manylinux2010_x86_64)
[11]
On vsyscalls and the vDSO (https://lwn.net/Articles/446528/)
[12]
vdso(7) (http://man7.org/linux/man-pages/man7/vdso.7.html)
[13]
Framing Signals – A Return to Portable Shellcode (http://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf)
[14]
ChangeLog-3.1 (https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.1)
[15]
Project Zero: Three bypasses and a fix for one of Flash’s Vector.<*> mitigations (https://googleprojectzero.blogspot.com/2015/08/three-bypasses-and-fix-for-one-of.html)
[16]
linux: activate CONFIG_LEGACY_VSYSCALL_NONE ? (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852620)
[17]
[Wheel-builders] Heads-up re: new kernel configurations breaking the manylinux docker image (https://mail.python.org/pipermail/wheel-builders/2016-December/000239.html)
[18]
No longer used
[19]
Travis CI (https://travis-ci.org/)
[20]
remove-vsyscall.patch https://github.com/markrwilliams/manylinux/commit/e9493d55471d153089df3aafca8cfbcb50fa8093#diff-3eda4130bdba562657f3ec7c1b3f5720
[21]
auditwheel manylinux2 branch (https://github.com/markrwilliams/auditwheel/tree/manylinux2)
[22]
pip manylinux2 branch https://github.com/markrwilliams/pip/commits/manylinux2
[23]
Calendar Versioning http://calver.org/

Source: https://github.com/python-discord/peps/blob/main/pep-0571.rst

Last modified: 2022-02-27 22:46:36 GMT