PEP 684 – A Per-Interpreter GIL
- PEP: 684
- Title: A Per-Interpreter GIL
- Author: Eric Snow <ericsnowcurrently at gmail.com>
- Discussions-To: Python-Dev
- Status: Draft
- Type: Standards Track
- Created: 08-Mar-2022
- Python-Version: 3.11
- Post-History: 08-Mar-2022
- Resolution:
Abstract
Since Python 1.5 (1997), CPython users have been able to run multiple interpreters in the same process. However, interpreters in the same process have always shared a significant amount of global state. This is a source of bugs, with a growing impact as more and more people use the feature. Furthermore, sufficient isolation would facilitate true multi-core parallelism, where interpreters no longer share the GIL. The changes outlined in this proposal will result in that level of interpreter isolation.
High-Level Summary
At a high level, this proposal changes CPython in the following ways:
- stops sharing the GIL between interpreters, given sufficient isolation
- adds several new interpreter config options for isolation settings
- adds some public C-API for fine-grained control when creating interpreters
- keeps incompatible extensions from causing problems
The GIL
The GIL protects concurrent access to most of CPython’s runtime state. So all that GIL-protected global state must move to each interpreter before the GIL can.
(In a handful of cases, other mechanisms can be used to ensure thread-safe sharing instead, such as locks or “immortal” objects.)
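As a rough illustration of what sharing a GIL means, here is a toy sketch that uses ordinary threading.Lock objects as stand-ins for the GIL. The variable names are hypothetical and nothing here touches CPython’s real lock. It only shows that a second holder is blocked when the lock is shared and is not blocked when it has its own.

```python
import threading

shared_gil = threading.Lock()      # today: one lock for all interpreters
isolated_gil = threading.Lock()    # proposed: a lock owned by one interpreter

with shared_gil:                   # the "main interpreter" is running
    # An interpreter that shares the lock cannot run at the same time.
    sub_can_run = shared_gil.acquire(blocking=False)
    # An interpreter with its own lock is not blocked.
    isolated_can_run = isolated_gil.acquire(blocking=False)
    if isolated_can_run:
        isolated_gil.release()

print(sub_can_run, isolated_can_run)
```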
CPython Runtime State
Properly isolating interpreters requires that most of CPython’s runtime state be stored in the PyInterpreterState struct. Currently, only a portion of it is; the rest is found either in global variables or in _PyRuntimeState. Most of that will have to be moved.
This directly coincides with an ongoing effort (of many years) to greatly reduce internal use of C global variables and consolidate the runtime state into _PyRuntimeState and PyInterpreterState. (See Consolidating Runtime Global State below.) That project has significant merit on its own and has faced little controversy. So, while a per-interpreter GIL relies on the completion of that effort, that project should not be considered a part of this proposal, only a dependency.
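The intended layout can be pictured with a small model. This is only a sketch: the two classes are loose Python analogues of _PyRuntimeState and PyInterpreterState, and every field name in them is made up for illustration. The point is that process-wide state lives in one place, while anything an interpreter can mutate lives on that interpreter.

```python
from dataclasses import dataclass, field

@dataclass
class ToyInterpreterState:
    """Loose analogue of PyInterpreterState: one instance per interpreter."""
    interp_id: int
    modules: dict = field(default_factory=dict)  # hypothetical per-interpreter state

@dataclass
class ToyRuntimeState:
    """Loose analogue of _PyRuntimeState: one instance per process."""
    interpreters: dict = field(default_factory=dict)
    next_id: int = 0

    def new_interpreter(self):
        interp = ToyInterpreterState(self.next_id)
        self.interpreters[interp.interp_id] = interp
        self.next_id += 1
        return interp

runtime = ToyRuntimeState()
main = runtime.new_interpreter()
sub = runtime.new_interpreter()

# State stored on one interpreter is invisible to the other.
main.modules["spam"] = object()
print("spam" in sub.modules)
```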
Other Isolation Considerations
CPython’s interpreters must be strictly isolated from each other, with few exceptions. To a large extent they already are. Each interpreter has its own copy of all modules, classes, functions, and variables. The CPython C-API docs explain further.
However, aside from what has already been mentioned (e.g. the GIL), there are a couple of ways in which interpreters still share some state.
First of all, some process-global resources (e.g. memory, file descriptors, environment variables) are shared. There are no plans to change this.
Second, some isolation is faulty due to bugs or implementations that did not take multiple interpreters into account. This includes CPython’s runtime and the stdlib, as well as extension modules that rely on global variables. Bugs should be opened in these cases, as some already have been.
Depending on Immortal Objects
PEP 683 introduces immortal objects as a CPython-internal feature. With immortal objects, we can share any otherwise immutable global objects between all interpreters. Consequently, this PEP does not need to address how to deal with the various objects exposed in the public C-API. It also simplifies the question of what to do about the builtin static types. (See Global Objects below.)
Both issues have alternate solutions, but everything is simpler with immortal objects. If PEP 683 is not accepted then this one will be updated with the alternatives. This lets us reduce noise in this proposal.
Motivation
The fundamental problem we’re solving here is a lack of true multi-core parallelism (for Python code) in the CPython runtime. The GIL is the cause. While it usually isn’t a problem in practice, at the very least it makes Python’s multi-core story murky, which makes the GIL a consistent distraction.
Isolated interpreters are also an effective mechanism to support certain concurrency models. PEP 554 discusses this in more detail.
Indirect Benefits
Most of the effort needed for a per-interpreter GIL has benefits that make those tasks worth doing anyway:
- makes multiple-interpreter behavior more reliable
- has led to fixes for long-standing runtime bugs that otherwise hadn’t been prioritized
- has been exposing (and inspiring fixes for) previously unknown runtime bugs
- has driven cleaner runtime initialization (PEP 432, PEP 587)
- has driven cleaner and more complete runtime finalization
- led to structural layering of the C-API (e.g. Include/internal)
- also see Benefits to Consolidation below
Furthermore, much of that work benefits other CPython-related projects:
- performance improvements (“faster-cpython”)
- pre-fork application deployment (e.g. Instagram)
- extension module isolation (see PEP 630, etc.)
- embedding CPython
Existing Use of Multiple Interpreters
The C-API for multiple interpreters has been used for many years. However, until relatively recently the feature wasn’t widely known, nor extensively used (with the exception of mod_wsgi).
In the last few years use of multiple interpreters has been increasing. Here are some of the public projects using the feature currently:
Note that, with PEP 554, multiple interpreter usage would likely grow significantly (via Python code rather than the C-API).
PEP 554
PEP 554 is strictly about providing a minimal stdlib module to give users access to multiple interpreters from Python code. In fact, it specifically avoids proposing any changes related to the GIL. Consider, however, that users of that module would benefit from a per-interpreter GIL, which makes PEP 554 more appealing.
Rationale
During initial investigations in 2014, a variety of possible solutions for multi-core Python were explored, but each had its drawbacks without simple solutions:
- the existing practice of releasing the GIL in extension modules
  * doesn’t help with Python code
- other Python implementations (e.g. Jython, IronPython)
  * CPython dominates the community
- remove the GIL (e.g. gilectomy, “no-gil”)
  * too much technical risk (at the time)
- Trent Nelson’s “PyParallel” project
  * incomplete; Windows-only at the time
- multiprocessing
  * too much work to make it effective enough; high penalties in some situations (at large scale, Windows)
- other parallelism tools (e.g. dask, ray, MPI)
  * not a fit for the stdlib
- give up on multi-core (e.g. async, do nothing)
  * this can only end in tears
Even in 2014, it was fairly clear that a solution using isolated interpreters did not have a high level of technical risk and that most of the work was worth doing anyway. (The downside was the volume of work to be done.)
Specification
As summarized above, this proposal involves the following changes, in the order they must happen:
- consolidate global runtime state (including objects) into _PyRuntimeState
- move nearly all of the state down into PyInterpreterState
- finally, move the GIL down into PyInterpreterState
- everything else
  * add to the public C-API
  * implement restrictions in ExtensionFileLoader
- work with popular extension maintainers to help with multi-interpreter support
Per-Interpreter State
The following runtime state will be moved to PyInterpreterState:
- all global objects that are not safely shareable (fully immutable)
- the GIL
- mutable, currently protected by the GIL
- mutable, currently protected by some other per-interpreter lock
- mutable, may be used independently in different interpreters
- all other mutable (or effectively mutable) state not otherwise excluded below
Furthermore, a number of parts of the global state have already been moved to the interpreter, such as GC, warnings, and atexit hooks.
The following state will not be moved:
- global objects that are safely shareable, if any
- immutable, often const
- treated as immutable
- related to CPython’s main() execution
- related to the REPL
- set during runtime init, then treated as immutable
- mutable, protected by some global lock
- mutable, atomic
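The two lists above amount to a decision rule. The following sketch encodes that rule as a small function. The GlobalState record, its field names, and the example entries are all hypothetical, and real cases will need human judgment rather than a lookup.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GlobalState:
    """Hypothetical description of one piece of runtime state."""
    name: str
    is_object: bool = False
    mutable: bool = True
    protected_by: str = "gil"  # "gil", "interp_lock", "global_lock", "atomic", "none"
    main_only: bool = False    # related to main() execution or the REPL

def destination(state):
    """Return "interpreter" if the state should move, else "runtime"."""
    if state.is_object:
        # Objects stay global only if they are safely shareable.
        return "interpreter" if state.mutable else "runtime"
    if state.main_only or not state.mutable:
        return "runtime"
    if state.protected_by in ("global_lock", "atomic"):
        return "runtime"
    # Everything else that is mutable moves down to the interpreter.
    return "interpreter"

examples = [
    GlobalState("cached_object", is_object=True),
    GlobalState("const_table", mutable=False),
    GlobalState("atomic_counter", protected_by="atomic"),
    GlobalState("gil_protected_list"),
    GlobalState("repl_flag", main_only=True),
]
for state in examples:
    print(state.name, "->", destination(state))
```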
Note that currently the allocators (see Objects/obmalloc.c) are shared between all interpreters, protected by the GIL. They will need to move to each interpreter (or a global lock will be needed). This is the highest risk part of the work to isolate interpreters and may require more than just moving fields down from _PyRuntimeState. Some of the complexity is reduced if CPython switches to a thread-safe allocator like mimalloc.
C-API
The following private API will be made public:
- _PyInterpreterConfig
- _Py_NewInterpreter() (as Py_NewInterpreterEx())
The following fields will be added to PyInterpreterConfig:
- own_gil - (bool) create a new interpreter lock (instead of sharing with the main interpreter)
- strict_extensions - fail import in this interpreter for incompatible extensions (see Restricting Extension Modules)
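To make the effect of the two fields concrete, here is a Python model of the configuration. It is a sketch rather than the C-API itself: the classes and new_interpreter_ex() are hypothetical analogues of PyInterpreterConfig and Py_NewInterpreterEx(), and the False defaults are an assumption, since the text above does not state defaults.

```python
import threading
from dataclasses import dataclass

@dataclass
class ToyInterpreterConfig:
    """Analogue of the two proposed fields (defaults are assumed)."""
    own_gil: bool = False
    strict_extensions: bool = False

class ToyInterpreter:
    def __init__(self, config, gil):
        self.config = config
        self.gil = gil

# The main interpreter always exists and owns the original lock.
MAIN = ToyInterpreter(ToyInterpreterConfig(), threading.Lock())

def new_interpreter_ex(config):
    """Hypothetical Python-level analogue of Py_NewInterpreterEx()."""
    # own_gil decides whether a fresh lock is created or the main one is shared.
    gil = threading.Lock() if config.own_gil else MAIN.gil
    return ToyInterpreter(config, gil)

legacy = new_interpreter_ex(ToyInterpreterConfig())
isolated = new_interpreter_ex(
    ToyInterpreterConfig(own_gil=True, strict_extensions=True)
)
print(legacy.gil is MAIN.gil, isolated.gil is MAIN.gil)
```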
Restricting Extension Modules
Extension modules have many of the same problems as the runtime when state is stored in global variables. PEP 630 covers all the details of what extensions must do to support isolation, and thus safely run in multiple interpreters at once. This includes dealing with their globals.
Extension modules that do not implement isolation will only run in the main interpreter. In all other interpreters, the import will raise ImportError. This will be done through importlib._bootstrap_external.ExtensionFileLoader.
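The rule can be sketched in a few lines of Python. This is a toy model and not the ExtensionFileLoader code: ToyExtension, load_extension(), and the supports_multiple_interpreters flag are hypothetical names standing in for however an extension ends up declaring support.

```python
class ToyExtension:
    """Hypothetical stand-in for an extension module on disk."""
    def __init__(self, name, supports_multiple_interpreters):
        self.name = name
        self.supports_multiple_interpreters = supports_multiple_interpreters

def load_extension(ext, *, in_main_interpreter):
    """Toy version of the check the loader would perform."""
    if not in_main_interpreter and not ext.supports_multiple_interpreters:
        raise ImportError(
            f"module {ext.name!r} does not support use in multiple interpreters"
        )
    return ext.name

legacy = ToyExtension("legacy_ext", supports_multiple_interpreters=False)
isolated = ToyExtension("isolated_ext", supports_multiple_interpreters=True)

# The main interpreter can load anything; other interpreters are restricted.
loaded_in_main = load_extension(legacy, in_main_interpreter=True)
loaded_in_sub = load_extension(isolated, in_main_interpreter=False)
try:
    load_extension(legacy, in_main_interpreter=False)
    legacy_failed_in_sub = False
except ImportError:
    legacy_failed_in_sub = True
```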
We will work with popular extensions to help them support use in multiple interpreters. This may involve adding to CPython’s public C-API, which we will address on a case-by-case basis.
Extension Module Compatibility
As noted in Extension Modules, many extensions work fine in multiple interpreters without needing any changes. The import system will still fail if such a module doesn’t explicitly indicate support. At first, not many extension modules will, so this is a potential source of frustration.
We will address this by adding a context manager to temporarily disable the check on multiple interpreter support: importlib.util.allow_all_extensions().
Documentation
The “Sub-interpreter support” section of Doc/c-api/init.rst will be updated with the added API.
Impact
Backwards Compatibility
No behavior or APIs are intended to change due to this proposal, with one exception noted in the next section. The existing C-API for managing interpreters will preserve its current behavior, with new behavior exposed through new API. No other API or runtime behavior is meant to change, including compatibility with the stable ABI.
See Objects Exposed in the C-API below for related discussion.
Extension Modules
Currently the most common usage of Python, by far, is with the main interpreter running by itself. This proposal has zero impact on extension modules in that scenario. Likewise, for better or worse, there is no change in behavior under multiple interpreters created using the existing Py_NewInterpreter().
Keep in mind that some extensions already break when used in multiple interpreters, due to keeping module state in global variables. They may crash or, worse, experience inconsistent behavior. That was part of the motivation for PEP 630 and friends, so this is not a new situation nor a consequence of this proposal.
In contrast, when the proposed API is used to create multiple interpreters, the default behavior will change for some extensions. In that case, importing an extension will fail (outside the main interpreter) if it doesn’t indicate support for multiple interpreters. For extensions that already break in multiple interpreters, this will be an improvement.
Now we get to the break in compatibility mentioned above. Some extensions are safe under multiple interpreters, even though they haven’t indicated that. Unfortunately, there is no reliable way for the import system to infer that such an extension is safe, so importing them will still fail. This case is addressed in Extension Module Compatibility above.
Extension Module Maintainers
One related consideration is that a per-interpreter GIL will likely drive increased use of multiple interpreters, particularly if PEP 554 is accepted. Some maintainers of large extension modules have expressed concern about the increased burden they anticipate due to increased use of multiple interpreters.
Specifically, enabling support for multiple interpreters will require substantial work for some extension modules. To add that support, the maintainer(s) of such a module (often volunteers) would have to set aside their normal priorities and interests to focus on compatibility (see PEP 630).
Of course, extension maintainers are free to not add support for use in multiple interpreters. However, users will increasingly demand such support, especially if the feature grows in popularity.
Either way, the situation can be stressful for maintainers of such extensions, particularly when they are doing the work in their spare time. The concerns they have expressed are understandable, and a partial solution is described in Restricting Extension Modules above.
Alternate Python Implementations
Other Python implementations are not required to provide support for multiple interpreters in the same process (though some do already).
Security Implications
There is no known impact to security with this proposal.
Maintainability
On the one hand, this proposal has already motivated a number of improvements that make CPython more maintainable. That is expected to continue. On the other hand, the underlying work has already exposed various pre-existing defects in the runtime that have had to be fixed. That is also expected to continue as multiple interpreters receive more use. Otherwise, there shouldn’t be a significant impact on maintainability, so the net effect should be positive.
Performance
The work to consolidate globals has already provided a number of improvements to CPython’s performance, both speeding it up and using less memory, and this should continue. Performance benefits to a per-interpreter GIL have not been explored. At the very least, it is not expected to make CPython slower (as long as interpreters are sufficiently isolated).
How to Teach This
This is an advanced feature for users of the C-API. There is no expectation that this will be taught.
That said, if it were taught then it would boil down to the following:
In addition to Py_NewInterpreter(), you can use Py_NewInterpreterEx() to create an interpreter. The config you pass it indicates how you want that interpreter to behave.
Reference Implementation
<TBD>
Open Issues
- What are the risks/hurdles involved with moving the allocators?
- Is allow_all_extensions the best name for the context manager?
Deferred Functionality
- PyInterpreterConfig option to always run the interpreter in a new thread
- PyInterpreterConfig option to assign a “main” thread to the interpreter and only run in that thread
Rejected Ideas
<TBD>
Extra Context
Consolidating Runtime Global State
As noted in CPython Runtime State above, there is an active effort (separate from this PEP) to consolidate CPython’s global state into the _PyRuntimeState struct. Nearly all the work involves moving that state from global variables. The project is particularly relevant to this proposal, so below is some extra detail.
Benefits to Consolidation
Consolidating the globals has a variety of benefits:
- greatly reduces the number of C globals (best practice for C code)
- the move draws attention to runtime state that is unstable or broken
- encourages more consistency in how runtime state is used
- makes multiple-interpreter behavior more reliable
- leads to fixes for long-standing runtime bugs that otherwise haven’t been prioritized
- exposes (and inspires fixes for) previously unknown runtime bugs
- facilitates cleaner runtime initialization and finalization
- makes it easier to discover/identify CPython’s runtime state
- makes it easier to statically allocate runtime state in a consistent way
- better memory locality for runtime state
- structural layering of the C-API (e.g. Include/internal)
Furthermore, much of that work benefits other CPython-related projects:
- performance improvements (“faster-cpython”)
- pre-fork application deployment (e.g. Instagram)
- extension module isolation (see PEP 630, etc.)
- embedding CPython
Scale of Work
The number of global variables to be moved is large enough to matter, but most are Python objects that can be dealt with in large groups (like Py_IDENTIFIER). In nearly all cases, moving these globals to the interpreter is highly mechanical. That doesn’t require cleverness but instead requires someone to put in the time.
State To Be Moved
The remaining global variables can be categorized as follows:
- global objects
  * static types (incl. exception types)
  * non-static types (incl. heap types, structseq types)
  * singletons (static)
  * singletons (initialized once)
  * cached objects
- non-objects
  * will not (or unlikely to) change after init
  * only used in the main thread
  * initialized lazily
  * pre-allocated buffers
  * state
Those globals are spread between the core runtime, the builtin modules, and the stdlib extension modules.
For a breakdown of the remaining globals, run:
./python Tools/c-analyzer/table-file.py Tools/c-analyzer/cpython/globals-to-fix.tsv
Already Completed Work
As mentioned, this work has been going on for many years. Here are some of the things that have already been done:
- cleanup of runtime initialization (see PEP 432 / PEP 587)
- extension module isolation machinery (see PEP 384 / PEP 3121 / PEP 489)
- isolation for many builtin modules
- isolation for many stdlib extension modules
- addition of _PyRuntimeState
- no more _Py_IDENTIFIER()
- statically allocated:
  * empty string
  * string literals
  * identifiers
  * latin-1 strings
  * length-1 bytes
  * empty tuple
Tooling
As already indicated, there are several tools to help identify the globals and reason about them.
- Tools/c-analyzer/cpython/globals-to-fix.tsv - the list of remaining globals
- Tools/c-analyzer/c-analyzer.py
  * analyze - identify all the globals
  * check - fail if there are any unsupported globals that aren’t ignored
- Tools/c-analyzer/table-file.py - summarize the known globals
Also, the check for unsupported globals is incorporated into CI so that no new globals are accidentally added.
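The idea behind that check can be sketched as a comparison between the globals found by analysis and a table of known ones. The sketch below is not the c-analyzer code. The two-column table layout, the file names, and the variable names are all invented for illustration.

```python
import csv
import io

# Hypothetical excerpt in the style of a known-globals table.
KNOWN_GLOBALS_TSV = (
    "filename\tname\n"
    "Modules/spam.c\tcache\n"
    "Objects/eggs.c\tfree_list\n"
)

def find_unsupported(found, known_tsv):
    """Return the globals that are not listed in the known table."""
    reader = csv.DictReader(io.StringIO(known_tsv), delimiter="\t")
    known = {(row["filename"], row["name"]) for row in reader}
    return sorted(g for g in found if g not in known)

# Pretend the analysis step found one known global and one new one.
found_by_analysis = [
    ("Modules/spam.c", "cache"),
    ("Modules/spam.c", "new_counter"),
]
unsupported = find_unsupported(found_by_analysis, KNOWN_GLOBALS_TSV)
check_passes = not unsupported  # CI would fail when this is False
```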
Global Objects
Global objects that are safe to be shared (without a GIL) between interpreters can stay on _PyRuntimeState. Not only must the object be effectively immutable (e.g. singletons, strings), but not even the refcount can change for it to be safe. Immortality (PEP 683) provides that. (The alternative is that no objects are shared, which adds significant complexity to the solution, particularly for the objects exposed in the public C-API.)
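The refcount requirement can be illustrated with a toy reference counter. This is a conceptual sketch only: the IMMORTAL sentinel and the helper functions are hypothetical, and PEP 683 defines the actual mechanism. What matters is that reference operations on an immortal object write nothing, so concurrent interpreters never race on its refcount.

```python
IMMORTAL = -1  # hypothetical sentinel marking an object as immortal

class ToyObject:
    def __init__(self, immortal=False):
        self.refcount = IMMORTAL if immortal else 1

def incref(obj):
    # Immortal objects are never written to, so sharing them is safe.
    if obj.refcount != IMMORTAL:
        obj.refcount += 1

def decref(obj):
    if obj.refcount != IMMORTAL:
        obj.refcount -= 1

shared_singleton = ToyObject(immortal=True)
ordinary = ToyObject()

incref(shared_singleton)
incref(ordinary)
singleton_count_after_incref = shared_singleton.refcount
ordinary_count_after_incref = ordinary.refcount
decref(shared_singleton)
decref(ordinary)
```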
Builtin static types are a special case of global objects that will be shared. They are effectively immutable except for one part: __subclasses__ (AKA tp_subclasses). We expect that nothing else on a builtin type will change, even the content of __dict__ (AKA tp_dict).
__subclasses__ for the builtin types will be dealt with by making it a getter that does a lookup on the current PyInterpreterState for that type.
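A Python model of that getter might look like the following. It is a sketch under assumptions: the class names, the subclasses mapping, and the way the “current” interpreter is tracked are all hypothetical. The shared type object holds no subclass list of its own and asks whichever interpreter is current.

```python
class ToyInterpreterState:
    def __init__(self, name):
        self.name = name
        self.subclasses = {}  # shared type name -> list of subclass names

class SharedStaticType:
    """Toy builtin static type shared by all interpreters."""
    def __init__(self, name, get_current_interp):
        self.name = name
        self._get_current_interp = get_current_interp

    def register_subclass(self, subclass_name):
        # The subclass is recorded on the interpreter, not on the shared type.
        interp = self._get_current_interp()
        interp.subclasses.setdefault(self.name, []).append(subclass_name)

    @property
    def subclasses(self):
        # Getter: look up this type's subclasses on the current interpreter.
        interp = self._get_current_interp()
        return list(interp.subclasses.get(self.name, []))

main = ToyInterpreterState("main")
sub = ToyInterpreterState("sub")
current = {"interp": main}
toy_int = SharedStaticType("int", lambda: current["interp"])

toy_int.register_subclass("MyInt")   # defined while main is current
current["interp"] = sub
seen_in_sub = toy_int.subclasses     # sub never subclassed the type
current["interp"] = main
seen_in_main = toy_int.subclasses
```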
Copyright
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python-discord/peps/blob/main/pep-0684.rst
Last modified: 2022-03-09 16:08:07 GMT