PEP 575 – Unifying function/method classes
PEP: 575
Title: Unifying function/method classes
Author: Jeroen Demeyer <J.Demeyer at UGent.be>
Status: Withdrawn
Type: Standards Track
Created: 27-Mar-2018
Python-Version: 3.8
Post-History: 31-Mar-2018, 12-Apr-2018, 27-Apr-2018, 05-May-2018
Contents
- Withdrawal notice
- Abstract
- Motivation
- New classes
- Calling base_function instances
- Automatic creation of built-in functions
- Further changes
- Non-CPython implementations
- Rationale
- Why not simply change existing classes?
- Why __text_signature__ is not a solution
- defined_function versus function
- Scope of this PEP: which classes are involved?
- Not treating METH_STATIC and METH_CLASS
- __self__ in base_function
- Two implementations of __doc__
- Subclassing
- Replacing tp_call: METH_PASS_FUNCTION and METH_CALL_UNBOUND
- Backwards compatibility
- Two-phase Implementation
- Reference Implementation
- Appendix: current situation
- References
- Copyright
Withdrawal notice
See PEP 580 for a better solution to allowing fast calling of custom classes.
See PEP 579 for a broader discussion of some of the other issues from this PEP.
Abstract
Reorganize the class hierarchy for functions and methods with the goal of reducing the difference between built-in functions (implemented in C) and Python functions. Mainly, make built-in functions behave more like Python functions without sacrificing performance.
A new base class base_function is introduced and the various function classes, as well as method (renamed to bound_method), inherit from it.
We also allow subclassing the Python function class.
Motivation
Currently, CPython has two different function classes: the first is Python functions, which is what you get when defining a function with def or lambda. The second is built-in functions such as len, isinstance or numpy.dot. These are implemented in C.
These two classes are implemented completely independently and have different functionality.
In particular, it is currently not possible to implement a function efficiently in C (only built-in functions can do that) while still allowing introspection like inspect.signature or inspect.getsourcefile (only Python functions can do that). This is a problem for projects like Cython [1] that want to do exactly that.
In Cython, this was worked around by inventing a new function class called cyfunction. Unfortunately, a new function class creates problems: the inspect module does not recognize such functions as being functions [2] and the performance is worse (CPython has specific optimizations for calling built-in functions).
A second motivation is more generally making built-in functions and methods behave more like Python functions and methods. For example, Python unbound methods are just functions but unbound methods of extension types (e.g. dict.get) are a distinct class. Bound methods of Python classes have a __func__ attribute, bound methods of extension types do not.
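As an illustration of the differences listed above, the following sketch inspects methods on a current CPython. The class Point is a made-up example, and the class names printed are CPython implementation details that other interpreters may not share.

```python
import types


class Point:
    """Hypothetical Python class used only for this comparison."""

    def norm(self):
        return 0


# Unbound methods: a plain function for a Python class,
# a distinct descriptor class for an extension type.
python_unbound = Point.norm
builtin_unbound = dict.get

# Bound methods: only the one from the Python class exposes __func__.
python_bound = Point().norm
builtin_bound = {}.get

print(type(python_unbound).__name__)       # function
print(type(builtin_unbound).__name__)      # method_descriptor
print(hasattr(python_bound, "__func__"))   # True
print(hasattr(builtin_bound, "__func__"))  # False
```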
Third, this PEP allows great customization of functions. The function class becomes subclassable and custom function subclasses are also allowed for functions implemented in C. In the latter case, this can be done with the same performance as true built-in functions. All functions can access the function object (the self in __call__), paving the way for PEP 573.
New classes
This is the new class hierarchy for functions and methods:
              object
                 |
                 |
          base_function
         /       |     \
        /        |      \
       /         |   defined_function
      /          |        \
cfunction (*)    |         \
                 |       function
                 |
          bound_method (*)
The two classes marked with (*) do not allow subclassing; the others do.
There is no difference between functions and unbound methods, while bound methods are instances of bound_method.
base_function
The class base_function becomes a new base class for all function types. It is based on the existing builtin_function_or_method class, but with the following differences and new features:
- It acts as a descriptor implementing __get__ to turn a function into a method if m_self is NULL. If m_self is not NULL, then this is a no-op: the existing function is returned instead.
- A new read-only attribute __parent__, represented in the C structure as m_parent. If this attribute exists, it represents the defining object. For methods of extension types, this is the defining class (__class__ in plain Python) and for functions of a module, this is the defining module. In general, it can be any Python object. If __parent__ is a class, it carries special semantics: in that case, the function must be called with self being an instance of that class. Finally, __qualname__ and __reduce__ will use __parent__ as namespace (instead of __self__ before).
- A new attribute __objclass__ which equals __parent__ if __parent__ is a class. Otherwise, accessing __objclass__ raises AttributeError. This is meant to be backwards compatible with method_descriptor.
- The field ml_doc and the attributes __doc__ and __text_signature__ (see Argument Clinic) are not supported.
- A new flag METH_PASS_FUNCTION for ml_flags. If this flag is set, the C function stored in ml_meth is called with an additional first argument equal to the function object.
- A new flag METH_BINDING for ml_flags which only applies to functions of a module (not methods of a class). If this flag is set, then m_self is set to NULL instead of the module. This allows the function to behave more like a Python function as it enables __get__.
- A new flag METH_CALL_UNBOUND to disable self slicing.
- A new flag METH_PYTHON for ml_flags. This flag indicates that this function should be treated as a Python function. Ideally, use of this flag should be avoided because it goes against the duck typing philosophy. It is still needed in a few places though, for example profiling.
The goal of base_function is that it supports all different ways of calling functions and methods in just one structure. For example, the new flag METH_PASS_FUNCTION will be used by the implementation of methods.
It is not possible to directly create instances of base_function (tp_new is NULL). However, it is legal for C code to manually create instances.
These are the relevant C structures:
PyTypeObject PyBaseFunction_Type;

typedef struct {
    PyObject_HEAD
    PyCFunctionDef *m_ml;     /* Description of the C function to call */
    PyObject *m_self;         /* __self__: anything, can be NULL; readonly */
    PyObject *m_module;       /* __module__: anything (typically str) */
    PyObject *m_parent;       /* __parent__: anything, can be NULL; readonly */
    PyObject *m_weakreflist;  /* List of weak references */
} PyBaseFunctionObject;

typedef struct {
    const char *ml_name;   /* The name of the built-in function/method */
    PyCFunction ml_meth;   /* The C function that implements it */
    int ml_flags;          /* Combination of METH_xxx flags, which mostly
                              describe the args expected by the C func */
} PyCFunctionDef;
Subclasses may extend PyCFunctionDef with extra fields.
The Python attribute __self__ returns m_self, except if METH_STATIC is set. In that case or if m_self is NULL, then there is no __self__ attribute at all. For that reason, we write either m_self or __self__ in this PEP with slightly different meanings.
cfunction
This is the new version of the old builtin_function_or_method class. The name cfunction was chosen to avoid confusion with “built-in” in the sense of “something in the builtins module”. It also fits better with the C API, which uses the PyCFunction prefix.
The class cfunction is a copy of base_function, with the following differences:
- m_ml points to a PyMethodDef structure, extending PyCFunctionDef with an additional ml_doc field to implement __doc__ and __text_signature__ as read-only attributes:
  typedef struct {
      const char *ml_name;
      PyCFunction ml_meth;
      int ml_flags;
      const char *ml_doc;
  } PyMethodDef;
  Note that PyMethodDef is part of the Python Stable ABI and it is used by practically all extension modules, so we absolutely cannot change this structure.
- Argument Clinic is supported.
- __self__ always exists. In the cases where base_function.__self__ would raise AttributeError, instead None is returned.
The type object is PyTypeObject PyCFunction_Type and we define PyCFunctionObject as an alias of PyBaseFunctionObject (except for the type of m_ml).
defined_function
The class defined_function is an abstract base class meant to indicate that the function has introspection support. Instances of defined_function are required to support all attributes that Python functions have, namely __code__, __globals__, __doc__, __defaults__, __kwdefaults__, __closure__ and __annotations__. There is also a __dict__ to support attributes added by the user.
None of these is required to be meaningful. In particular, __code__ may not be a working code object: possibly only a few of its fields are filled in. This PEP does not dictate how the various attributes are implemented. They may be simple struct members or more complicated descriptors. Only read-only support is required; none of the attributes is required to be writable.
The class defined_function is mainly meant for auto-generated C code, for example produced by Cython [1]. There is no API to create instances of it.
The C structure is the following:
PyTypeObject PyDefinedFunction_Type;

typedef struct {
    PyBaseFunctionObject base;
    PyObject *func_dict;  /* __dict__: dict or NULL */
} PyDefinedFunctionObject;
TODO: maybe find a better name for defined_function. Other proposals: inspect_function (anything that satisfies inspect.isfunction), builtout_function (a function that is better built out; pun on builtin), generic_function (original proposal but conflicts with functools.singledispatch generic functions), user_function (defined by the user as opposed to CPython).
function
This is the class meant for functions implemented in Python. Unlike the other function types, instances of function can be created from Python code. This is not changed, so we do not describe the details in this PEP.
The layout of the C structure is the following:
PyTypeObject PyFunction_Type;

typedef struct {
    PyBaseFunctionObject base;
    PyObject *func_dict;         /* __dict__: dict or NULL */
    PyObject *func_code;         /* __code__: code */
    PyObject *func_globals;      /* __globals__: dict; readonly */
    PyObject *func_name;         /* __name__: string */
    PyObject *func_qualname;     /* __qualname__: string */
    PyObject *func_doc;          /* __doc__: can be anything or NULL */
    PyObject *func_defaults;     /* __defaults__: tuple or NULL */
    PyObject *func_kwdefaults;   /* __kwdefaults__: dict or NULL */
    PyObject *func_closure;      /* __closure__: tuple of cell objects or NULL; readonly */
    PyObject *func_annotations;  /* __annotations__: dict or NULL */
    PyCFunctionDef _ml;          /* Storage for base.m_ml */
} PyFunctionObject;
The descriptor __name__ returns func_name. When setting __name__, base.m_ml->ml_name is also updated with the UTF-8 encoded name.
The _ml field reserves space to be used by base.m_ml.
A base_function instance must have the flag METH_PYTHON set if and only if it is an instance of function.
When constructing an instance of function from code and globals, an instance is created with base.m_ml = &_ml, base.m_self = NULL.
To make subclassing easier, we also add a copy constructor: if f is an instance of function, then types.FunctionType(f) copies f. This conveniently allows using a custom function type as a decorator:
>>> from types import FunctionType
>>> class CustomFunction(FunctionType):
...     pass
>>> @CustomFunction
... def f(x):
...     return x
>>> type(f)
<class '__main__.CustomFunction'>
This also removes many use cases of functools.wraps: wrappers can be replaced by subclasses of function.
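For contrast with the proposed behaviour, this short sketch shows what happens on a CPython without this PEP, where the function class cannot be subclassed. The helper name try_subclass is ours and only serves the illustration.

```python
import types


def try_subclass():
    """Return the exception raised when subclassing function, or None."""
    try:
        class CustomFunction(types.FunctionType):
            pass
    except TypeError as exc:
        # CPython rejects the class statement: function is not an
        # acceptable base type.
        return exc
    return None


error = try_subclass()
print(type(error).__name__ if error is not None else "subclassing worked")
```

Under the proposal, the same class statement would succeed and CustomFunction could be used as a decorator as shown above.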
bound_method
The class bound_method is used for all bound methods, regardless of the class of the underlying function. It adds one new attribute on top of base_function: __func__ points to that function.
bound_method replaces the old method class which was used only for Python functions bound as methods.
There is a complication because we want to allow constructing a method from an arbitrary callable. This may be an already-bound method or simply not an instance of base_function. Therefore, in practice there are two kinds of methods:
- For arbitrary callables, we use a single fixed PyCFunctionDef structure with the METH_PASS_FUNCTION flag set.
- For methods which bind instances of base_function (more precisely, which have the Py_TPFLAGS_BASEFUNCTION flag set) that have self slicing, we instead use the PyCFunctionDef from the original function. This way, we don’t lose any performance when calling bound methods. In this case, the __func__ attribute is only used to implement various attributes but not for calling the method.
When constructing a new method from a base_function, we check that the self object is an instance of __objclass__ (if a class was specified as parent) and raise a TypeError otherwise.
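Constructing a method from an arbitrary callable is already possible today through types.MethodType, which is the behaviour bound_method has to preserve. The sketch below uses a hypothetical Greeter class and an already-bound list method; neither name comes from this PEP.

```python
import types


class Greeter:
    """Hypothetical callable object that is not a function."""

    def __call__(self, target, punctuation="!"):
        return f"hello {target}{punctuation}"


# Bind a callable object: calling the method passes "world" as first argument.
bound = types.MethodType(Greeter(), "world")
greeting = bound()

# Bind an already-bound method: the new "self" is simply prepended.
values = [3, 1, 2]
nested = types.MethodType(values.append, 4)
nested()  # equivalent to values.append(4)

print(greeting)
print(values)
```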
The C structure is:
PyTypeObject PyMethod_Type;

typedef struct {
    PyBaseFunctionObject base;
    PyObject *im_func;  /* __func__: function implementing the method; readonly */
} PyMethodObject;
Calling base_function instances
We specify the implementation of __call__ for instances of base_function.
Checking __objclass__
First of all, a type check is done if the __parent__ of the function is a class (recall that __objclass__ then becomes an alias of __parent__): if m_self is NULL (this is the case for unbound methods of extension types), then the function must be called with at least one positional argument and the first (typically called self) must be an instance of __objclass__. If not, a TypeError is raised.
Note that bound methods have m_self != NULL, so the __objclass__ is not checked. Instead, the __objclass__ check is done when constructing the method.
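The check described here matches what method_descriptor already does on current CPython, which the following sketch demonstrates with dict.get. The helper call_unbound is a hypothetical name introduced for this example.

```python
def call_unbound(method, *args):
    """Call an unbound method; return the TypeError instead of raising it."""
    try:
        return method(*args)
    except TypeError as exc:
        return exc


ok = call_unbound(dict.get, {"a": 1}, "a")       # self is a dict
wrong_self = call_unbound(dict.get, ["a"], "a")  # self is a list
no_self = call_unbound(dict.get)                 # no positional argument

print(dict.get.__objclass__)
print(ok, type(wrong_self).__name__, type(no_self).__name__)
```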
Flags
For convenience, we define a new constant: METH_CALLFLAGS combines all flags from PyCFunctionDef.ml_flags which specify the signature of the C function to be called. It is equal to
METH_VARARGS | METH_FASTCALL | METH_NOARGS | METH_O | METH_KEYWORDS | METH_PASS_FUNCTION
Exactly one of the first four flags above must be set and only METH_VARARGS and METH_FASTCALL may be combined with METH_KEYWORDS. Violating these rules is undefined behaviour.
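The two rules above can be modelled in a few lines of Python. This is only a sketch: the values of the existing flags are those of CPython's methodobject.h, while the value of METH_PASS_FUNCTION and the helper is_valid_signature are assumptions made for illustration, since this PEP does not fix a number for the new flag.

```python
# Existing flag values as defined in CPython's methodobject.h.
METH_VARARGS = 0x0001
METH_KEYWORDS = 0x0002
METH_NOARGS = 0x0004
METH_O = 0x0008
METH_FASTCALL = 0x0080
# Hypothetical value, chosen only so that it does not clash with the above.
METH_PASS_FUNCTION = 0x1000

METH_CALLFLAGS = (METH_VARARGS | METH_FASTCALL | METH_NOARGS | METH_O
                  | METH_KEYWORDS | METH_PASS_FUNCTION)


def is_valid_signature(flags):
    """Check the two rules stated for the call flags."""
    flags &= METH_CALLFLAGS
    base = [f for f in (METH_VARARGS, METH_FASTCALL, METH_NOARGS, METH_O)
            if flags & f]
    # Rule 1: exactly one of the four basic conventions.
    if len(base) != 1:
        return False
    # Rule 2: METH_KEYWORDS only together with VARARGS or FASTCALL.
    if flags & METH_KEYWORDS and base[0] not in (METH_VARARGS, METH_FASTCALL):
        return False
    return True


print(is_valid_signature(METH_FASTCALL | METH_KEYWORDS))
print(is_valid_signature(METH_O | METH_KEYWORDS))
```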
There are two new flags which affect calling functions, namely METH_PASS_FUNCTION and METH_CALL_UNBOUND. Some flags are already documented in [5]. We explain the others below.
Self slicing
If the function has m_self == NULL and the flag METH_CALL_UNBOUND is not set, then the first positional argument (if any) is removed from *args and instead passed as first argument to the C function. Effectively, the first positional argument is treated as __self__. This is meant to support unbound methods such that the C function does not see the difference between bound and unbound method calls. This does not affect keyword arguments in any way.
This process is called self slicing and a function is said to have self slicing if m_self == NULL and METH_CALL_UNBOUND is not set.
Note that a METH_NOARGS function which has self slicing effectively has one argument, namely self. Analogously, a METH_O function with self slicing has two arguments.
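Self slicing can be pictured with a small pure-Python model. This is a sketch under stated assumptions: None stands in for a NULL m_self, the numeric value of METH_CALL_UNBOUND is invented, and prepare_call is a hypothetical helper rather than anything defined by this PEP.

```python
METH_CALL_UNBOUND = 0x2000  # hypothetical value, for illustration only


def prepare_call(m_self, flags, args):
    """Return (self passed to the C function, remaining positional args)."""
    has_self_slicing = m_self is None and not (flags & METH_CALL_UNBOUND)
    if has_self_slicing and args:
        # The first positional argument plays the role of __self__.
        return args[0], args[1:]
    # Bound function, METH_CALL_UNBOUND, or no positional argument at all.
    return m_self, args


print(prepare_call(None, 0, ("x", 1)))                  # unbound call
print(prepare_call("bound", 0, ("x", 1)))               # bound call
print(prepare_call(None, METH_CALL_UNBOUND, ("x", 1)))  # slicing disabled
```

In this model an unbound call f(x, 1) and a bound call x.f(1) hand the same pair to the C function, which is the point of self slicing.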
METH_PASS_FUNCTION
If this flag is set, then the C function is called with an additional first argument, namely the function itself (the base_function instance). As a special case, if the function is a bound_method, then the underlying function of the method is passed (but not recursively: if a bound_method wraps a bound_method, then __func__ is only applied once).
For example, an ordinary METH_VARARGS function has signature (PyObject *self, PyObject *args). With METH_VARARGS | METH_PASS_FUNCTION, this becomes (PyObject *func, PyObject *self, PyObject *args).
METH_FASTCALL
This is an existing but undocumented flag. We suggest officially supporting and documenting it.
If the flag METH_FASTCALL is set without METH_KEYWORDS, then the ml_meth field is of type PyCFunctionFast which takes the arguments (PyObject *self, PyObject *const *args, Py_ssize_t nargs). Such a function takes only positional arguments and they are passed as a plain C array args of length nargs.
If the flags METH_FASTCALL | METH_KEYWORDS are set, then the ml_meth field is of type PyCFunctionFastKeywords which takes the arguments (PyObject *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames). The positional arguments are passed as a C array args of length nargs. The values of the keyword arguments follow in that array, starting at position nargs. The keys (names) of the keyword arguments are passed as a tuple in kwnames. As an example, assume that 3 positional and 2 keyword arguments are given. Then args is an array of length 3 + 2 = 5, nargs equals 3 and kwnames is a 2-tuple.
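The argument layout for METH_FASTCALL | METH_KEYWORDS can be modelled in Python, with a list standing in for the C array. The helpers pack_fastcall and unpack_fastcall are hypothetical names for this sketch and do not exist in CPython.

```python
def pack_fastcall(args, kwargs):
    """Model the (args, nargs, kwnames) triple received by the C function."""
    nargs = len(args)
    kwnames = tuple(kwargs)
    # Positional values first, keyword values after them in kwnames order.
    array = list(args) + [kwargs[name] for name in kwnames]
    return array, nargs, kwnames


def unpack_fastcall(array, nargs, kwnames):
    """Recover positional and keyword arguments from the flat layout."""
    positional = tuple(array[:nargs])
    keywords = dict(zip(kwnames, array[nargs:]))
    return positional, keywords


# 3 positional and 2 keyword arguments, as in the example above.
array, nargs, kwnames = pack_fastcall((1, 2, 3), {"x": 10, "y": 20})
print(len(array), nargs, kwnames)
```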
Automatic creation of built-in functions
Python automatically generates instances of cfunction for extension types (using the PyTypeObject.tp_methods field) and modules (using the PyModuleDef.m_methods field). The arrays PyTypeObject.tp_methods and PyModuleDef.m_methods must be arrays of PyMethodDef structures.
Unbound methods of extension types
The type of unbound methods changes from method_descriptor to cfunction. The object which appears as unbound method is the same object which appears in the class __dict__. Python automatically sets the __parent__ attribute to the defining class.
Built-in functions of a module
For the case of functions of a module, __parent__ will be set to the module. Unless the flag METH_BINDING is given, __self__ will also be set to the module (for backwards compatibility).
An important consequence is that such functions by default do not become methods when used as attribute (base_function.__get__ only does that if m_self was NULL). One could consider this a bug, but this was done for backwards compatibility reasons: in an initial post on python-ideas [6] the consensus was to keep this misfeature of built-in functions.
However, to allow this anyway for specific or newly implemented built-in functions, the METH_BINDING flag prevents setting __self__.
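The non-binding behaviour that is kept for backwards compatibility can be observed on a current CPython. In this sketch, Container and python_len are made-up names used only for the comparison.

```python
import builtins
import sys


def python_len(obj):
    return len(obj)


class Container(list):
    builtin_size = len        # built-in function: not bound on attribute access
    python_size = python_len  # Python function: becomes a bound method


c = Container([1, 2, 3])

# The built-in is returned unchanged, so the instance must be passed explicitly.
print(c.builtin_size is len, c.builtin_size(c))
# The Python function is bound, so the instance is passed implicitly.
print(c.python_size())
# __self__ of a module-level built-in is the defining module.
print(len.__self__ is builtins, sys.exit.__self__ is sys)
```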
Further changes
New type flag
A new PyTypeObject flag (for tp_flags) is added: Py_TPFLAGS_BASEFUNCTION to indicate that instances of this type are functions which can be called and bound as method like a base_function.
This is different from flags like Py_TPFLAGS_LIST_SUBCLASS because it indicates more than just a subclass: it also indicates a default implementation of __call__ and __get__. In particular, such subclasses of base_function must follow the implementation from the section Calling base_function instances.
This flag is automatically set for extension types which inherit the tp_call and tp_descr_get implementation from base_function. Extension types can explicitly specify it if they override __call__ or __get__ in a compatible way. The flag Py_TPFLAGS_BASEFUNCTION must never be set for a heap type because that would not be safe (heap types can be changed dynamically).
C API functions
We list some relevant Python/C API macros and functions. Some of these are existing (possibly changed) functions, some are new:
- int PyBaseFunction_CheckFast(PyObject *op): return true if op is an instance of a class with the Py_TPFLAGS_BASEFUNCTION flag set. This is the function that you need to use to determine whether it is meaningful to access the base_function internals.
- int PyBaseFunction_Check(PyObject *op): return true if op is an instance of base_function.
- PyObject *PyBaseFunction_New(PyTypeObject *cls, PyCFunctionDef *ml, PyObject *self, PyObject *module, PyObject *parent): create a new instance of cls (which must be a subclass of base_function) from the given data.
- int PyCFunction_Check(PyObject *op): return true if op is an instance of cfunction.
- PyObject *PyCFunction_NewEx(PyMethodDef *ml, PyObject *self, PyObject *module): create a new instance of cfunction. As a special case, if self is NULL, then set self = Py_None instead (for backwards compatibility). If self is a module, then __parent__ is set to self. Otherwise, __parent__ is NULL.
- For many existing PyCFunction_... and PyMethod_... functions, we define a new function PyBaseFunction_... acting on base_function instances. The old functions are kept as aliases of the new functions.
- int PyFunction_Check(PyObject *op): return true if op is an instance of base_function with the METH_PYTHON flag set (this is equivalent to checking whether op is an instance of function).
- int PyFunction_CheckFast(PyObject *op): equivalent to PyFunction_Check(op) && PyBaseFunction_CheckFast(op).
- int PyFunction_CheckExact(PyObject *op): return true if the type of op is function.
- PyObject *PyFunction_NewPython(PyTypeObject *cls, PyObject *code, PyObject *globals, PyObject *name, PyObject *qualname): create a new instance of cls (which must be a subclass of function) from the given data.
- PyObject *PyFunction_New(PyObject *code, PyObject *globals): create a new instance of function.
- PyObject *PyFunction_NewWithQualName(PyObject *code, PyObject *globals, PyObject *qualname): create a new instance of function.
- PyObject *PyFunction_Copy(PyTypeObject *cls, PyObject *func): create a new instance of cls (which must be a subclass of function) by copying a given function.
Changes to the types module
Two types are added: types.BaseFunctionType corresponding to base_function and types.DefinedFunctionType corresponding to defined_function.
Apart from that, no changes to the types module are made. In particular, types.FunctionType refers to function. However, the actual types will change: in particular, types.BuiltinFunctionType will no longer be the same as types.BuiltinMethodType.
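For reference, this is the current situation that the last sentence refers to, checked on a CPython without this PEP. The variable names are ours.

```python
import types

# Today both names refer to the single class builtin_function_or_method.
same_type = types.BuiltinFunctionType is types.BuiltinMethodType

# A module-level built-in and a bound built-in method share that class.
function_type = type(len)
bound_method_type = type([].append)

print(same_type, function_type.__name__, bound_method_type.__name__)
```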
Changes to the inspect module
The new function inspect.isbasefunction checks for an instance of base_function.
inspect.isfunction checks for an instance of defined_function.
inspect.isbuiltin checks for an instance of cfunction.
inspect.isroutine checks isbasefunction or ismethoddescriptor.
NOTE: bpo-33261 [3] should be fixed first.
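To make the starting point concrete, the sketch below records how the existing inspect predicates classify three kinds of callables on a current CPython. The helper classify and the function py_func are hypothetical names; inspect.isbasefunction is not included because it only exists in this proposal.

```python
import inspect


def py_func():
    pass


def classify(obj):
    """Apply the existing inspect predicates to obj."""
    return {
        "isfunction": inspect.isfunction(obj),
        "isbuiltin": inspect.isbuiltin(obj),
        "ismethoddescriptor": inspect.ismethoddescriptor(obj),
        "isroutine": inspect.isroutine(obj),
    }


for candidate in (py_func, len, dict.get):
    print(type(candidate).__name__, classify(candidate))
```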
Profiling
Currently, sys.setprofile supports c_call, c_return and c_exception events for built-in functions. These events are generated when calling or returning from a built-in function. By contrast, the call and return events are generated by the function itself. So nothing needs to change for the call and return events.
Since we no longer make a difference between C functions and Python functions, we need to prevent the c_* events for Python functions. This is done by not generating those events if the METH_PYTHON flag in ml_flags is set.
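The two families of events can be observed with a small profiler on a current CPython. This is a sketch: collect_profile_events and work are hypothetical names, and the exact list of events may contain extra entries (for example for the sys.setprofile call itself) depending on the Python version.

```python
import sys


def collect_profile_events(func):
    """Run func under sys.setprofile and return (event, name) pairs."""
    events = []

    def profiler(frame, event, arg):
        if event.startswith("c_"):
            # For c_* events, arg is the C function being called.
            events.append((event, getattr(arg, "__name__", repr(arg))))
        else:
            # For call/return events, the frame identifies the Python function.
            events.append((event, frame.f_code.co_name))

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)
    return events


def work():
    return len("abc")


events = collect_profile_events(work)
for event in events:
    print(event)
```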
Non-CPython implementations
Most of this PEP is only relevant to CPython. For other implementations of Python, the two changes that are required are the base_function base class and the fact that function can be subclassed. The classes cfunction and defined_function are not required.
We require base_function for consistency but we put no requirements on it: it is acceptable if this is just a copy of object. Support for the new __parent__ (and __objclass__) attribute is not required. If there is no defined_function class, then types.DefinedFunctionType should be an alias of types.FunctionType.
Rationale
Why not simply change existing classes?
One could try to solve the problem by keeping the existing classes without introducing a new base_function class.
That might look like a simpler solution but it is not: it would require introspection support for 3 distinct classes: function, builtin_function_or_method and method_descriptor. For the latter two classes, “introspection support” would mean at a minimum allowing subclassing. But we don’t want to lose performance, so we want fast subclass checks. This would require two new flags in tp_flags. And we want subclasses to allow __get__ for built-in functions, so we should implement the LOAD_METHOD opcode for built-in functions too. More generally, a lot of functionality would need to be duplicated and the end result would be far more complex code.
It is also not clear how the introspection of built-in function subclasses would interact with __text_signature__. Having two independent kinds of inspect.signature support on the same class sounds like asking for problems.
And this would not fix some of the other differences between built-in functions and Python functions that were mentioned in the motivation.
Why __text_signature__ is not a solution
Built-in functions have an attribute __text_signature__, which gives the signature of the function as plain text. The default values are evaluated by ast.literal_eval. Because of this, it supports only a small number of standard Python classes and not arbitrary Python objects.
And even if __text_signature__ allowed arbitrary signatures somehow, that is only one piece of introspection: it does not help with inspect.getsourcefile for example.
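The limitation can be seen on a current CPython using len as an example. In this sketch the helper source_file_or_error is a hypothetical name, and the exact text of the signature string is a CPython detail that may differ between versions.

```python
import inspect

# The signature of a built-in is only available as text...
text_sig = len.__text_signature__
# ...which inspect.signature parses into a Signature object.
parsed = inspect.signature(len)


def source_file_or_error(obj):
    """Return the source file of obj, or the TypeError raised by inspect."""
    try:
        return inspect.getsourcefile(obj)
    except TypeError as exc:
        return exc


print(text_sig)
print(parsed)
print(type(source_file_or_error(len)).__name__)
```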
defined_function versus function
In many places, a decision needs to be made whether the old function class should be replaced by defined_function or the new function class. This is done by thinking of the most likely use case:
- types.FunctionType refers to function because that type might be used to construct instances using types.FunctionType(...).
- inspect.isfunction() refers to defined_function because this is the class where introspection is supported.
- The C API functions must refer to function because we do not specify how the various attributes of defined_function are implemented. We expect that this is not a problem since there is typically no reason for introspection to be done by C extensions.
Scope of this PEP: which classes are involved?
The main motivation of this PEP is fixing function classes, so we certainly want to unify the existing classes builtin_function_or_method and function.
Since built-in functions and methods have the same class, it seems natural to include bound methods too. And since there are no “unbound methods” for Python functions, it makes sense to get rid of unbound methods for extension types.
For now, no changes are made to the classes staticmethod, classmethod and classmethod_descriptor. It would certainly make sense to put these in the base_function class hierarchy and unify classmethod and classmethod_descriptor. However, this PEP is already big enough and this is left as a possible future improvement.
Slot wrappers for extension types like __init__ or __eq__ are quite different from normal methods. They are also typically not called directly because you would normally write foo[i] instead of foo.__getitem__(i). So these are left outside the scope of this PEP.
Python also has an instancemethod class, which seems to be a relic from Python 2, where it was used for bound and unbound methods. It is not clear whether there is still a use case for it. In any case, there is no reason to deal with it in this PEP.
TODO: should instancemethod be deprecated? It doesn’t seem used at all within CPython 3.7, but maybe external packages use it?
Not treating METH_STATIC and METH_CLASS
Almost nothing in this PEP refers to the flags METH_STATIC and METH_CLASS. These flags are checked only by the automatic creation of built-in functions. When a staticmethod, classmethod or classmethod_descriptor is bound (i.e. __get__ is called), a base_function instance is created with m_self != NULL. For a classmethod, this is obvious since m_self is the class that the method is bound to. For a staticmethod, one can take an arbitrary Python object for m_self. For backwards compatibility, we choose m_self = __parent__ for static methods of extension types.
__self__ in base_function
It may look strange at first sight to add the __self__ slot in base_function as opposed to bound_method. We took this idea from the existing builtin_function_or_method class. It allows us to have a single general implementation of __call__ and __get__ for the various function classes discussed in this PEP.
It also makes it easy to support existing built-in functions which set __self__ to the module (for example, sys.exit.__self__ is sys).
Two implementations of __doc__
base_function does not support function docstrings. Instead, the classes cfunction and function each have their own way of dealing with docstrings (and bound_method just takes the __doc__ from the wrapped function).
For cfunction, the docstring is stored (together with the text signature) as a C string in the read-only ml_doc field of a PyMethodDef. For function, the docstring is stored as a writable Python object and it does not actually need to be a string. It looks hard to unify these two very different ways of dealing with __doc__. For backwards compatibility, we keep the existing implementations.
For defined_function, we require __doc__ to be implemented but we do not specify how. A subclass can implement __doc__ the same way as cfunction or using a struct member or some other way.
Subclassing
We disallow subclassing of cfunction and bound_method to enable fast type checks for PyCFunction_Check and PyMethod_Check.
We allow subclassing of the other classes because there is no reason to disallow it. For Python modules, the only relevant class to subclass is function because the others cannot be instantiated anyway.
Replacing tp_call: METH_PASS_FUNCTION and METH_CALL_UNBOUND
The new flags METH_PASS_FUNCTION and METH_CALL_UNBOUND are meant to support cases where formerly a custom tp_call was used. They reduce the number of special fast paths in Python/ceval.c for calling objects: instead of treating Python functions, built-in functions and method descriptors separately, there would only be a single check.
The signature of tp_call is essentially the signature of PyBaseFunctionObject.m_ml.ml_meth with flags METH_VARARGS | METH_KEYWORDS | METH_PASS_FUNCTION | METH_CALL_UNBOUND (the only difference is an added self argument). Therefore, it should be easy to change existing tp_call slots to use the base_function implementation instead.
It also makes sense to use METH_PASS_FUNCTION without METH_CALL_UNBOUND in cases where the C function simply needs access to additional metadata from the function, such as the __parent__. This is for example needed to support PEP 573. Converting existing methods to use METH_PASS_FUNCTION is trivial: it only requires adding an extra argument to the C function.
Backwards compatibility
While designing this PEP, great care was taken to not break backwards compatibility too much. Most of the potentially incompatible changes are changes to CPython implementation details which are different anyway in other Python interpreters. In particular, Python code which correctly runs on PyPy will very likely continue to work with this PEP.
The standard classes and functions like staticmethod, functools.partial or operator.methodcaller do not need to change at all.
Changes to types and inspect
The proposed changes to types and inspect are meant to minimize changes in behaviour. However, it is unavoidable that some things change and this can cause code which uses types or inspect to break. In the Python standard library for example, changes are needed in the doctest module because of this.
Also, tools which take various kinds of functions as input will need to deal with the new function hierarchy and the possibility of custom function classes.
Python functions
For Python functions, essentially nothing changes. The attributes that existed before still exist and Python functions can be initialized, called and turned into methods as before.
The name function is kept for backwards compatibility. While it might make sense to change the name to something more specific like python_function, that would require a lot of annoying changes in documentation and test suites.
Built-in functions of a module
Also for built-in functions, nothing changes. We keep the old behaviour that such functions do not bind as methods. This is a consequence of the fact that __self__ is set to the module.
Built-in bound and unbound methods
The types of built-in bound and unbound methods will change. However, this does not affect calling such methods because the protocol in base_function.__call__ (in particular the handling of __objclass__ and self slicing) was specifically designed to be backwards compatible. All attributes which existed before (like __objclass__ and __self__) still exist.
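For reference, this is the calling behaviour and the attributes that are meant to be preserved, shown as a minimal sketch on current CPython. The variable names are illustrative only.

```python
d = {"foo": 42}

unbound = dict.get   # unbound method of a built-in type
bound = d.get        # the same method bound to an instance

# The attributes that the PEP promises to keep.
objclass_is_dict = unbound.__objclass__ is dict
self_is_instance = bound.__self__ is d

# Calling the unbound method with an explicit self argument is
# equivalent to calling the bound method without it.
via_unbound = unbound(d, "foo")
via_bound = bound("foo")
```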
New attributes
Some objects get new special double-underscore attributes. For example, the new attribute __parent__ appears on all built-in functions, and all methods get a __func__ attribute. The fact that __self__ is now a special read-only attribute for Python functions caused trouble in [4]. Generally, we expect that not much will break though.
method_descriptor and PyDescr_NewMethod
The class method_descriptor and the constructor PyDescr_NewMethod should be deprecated. They are no longer used by CPython itself but are still supported.
Two-phase Implementation
TODO: this section is optional. If this PEP is accepted, it should be decided whether to apply this two-phase implementation or not.
As mentioned above, the changes to types and inspect can break some existing code. In order to further minimize breakage, this PEP could be implemented in two phases.
Phase one: keep existing classes but add base classes
Initially, implement the base_function class and use it as common base class but otherwise keep the existing classes (but not their implementation). In this proposal, the class hierarchy would become:
                          object
                             |
                             |
                       base_function
                     /       |     \
                    /        |      \
                   /         |       \
              cfunction      |     defined_function
               |   |         |         \
               |   |      bound_method  \
               |   |                     \
               |  method_descriptor       function
               |
    builtin_function_or_method
The leaf classes builtin_function_or_method, method_descriptor, bound_method and function correspond to the existing classes (with method renamed to bound_method).
Functions created automatically in modules become instances of builtin_function_or_method. Unbound methods of extension types become instances of method_descriptor.
The class method_descriptor is a copy of cfunction except that __get__ returns a builtin_function_or_method instead of a bound_method.
The class builtin_function_or_method has the same C structure as a bound_method, but it inherits from cfunction. The __func__ attribute is not mandatory: it is only defined when binding a method_descriptor.
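The phase-one hierarchy can be modelled with empty stand-in classes to make the inheritance relations explicit. This is only an illustrative sketch: the real classes would be implemented in C, and the pure-Python definitions below carry none of their behaviour.

```python
# Illustrative stand-ins only: plain Python classes that mirror the
# proposed phase-one hierarchy, without any of the real functionality.
class base_function:
    pass


class cfunction(base_function):
    pass


class defined_function(base_function):
    pass


class bound_method(base_function):
    pass


class method_descriptor(cfunction):
    pass


class builtin_function_or_method(cfunction):
    pass


class function(defined_function):
    pass


# The four leaf classes correspond to the classes that exist today.
leaf_classes = [builtin_function_or_method, method_descriptor,
                bound_method, function]
all_share_base = all(issubclass(c, base_function) for c in leaf_classes)
```

In this model, builtin_function_or_method inherits from cfunction rather than from bound_method, matching the description above even though the two would share a C structure.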
We keep the implementation of the inspect functions as they are. Because of this and because the existing classes are kept, backwards compatibility is ensured for code doing type checks.
Since showing an actual DeprecationWarning would affect a lot of correctly-functioning code, any deprecations would only appear in the documentation. Another reason is that it is hard to show warnings for calling isinstance(x, t) (but it could be done using __instancecheck__ hacking) and impossible for type(x) is t.
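A minimal sketch of what such __instancecheck__ hacking might look like is given below. All class names here are hypothetical and this is not part of the proposal. It also shows why the approach is fragile: an identity comparison on type(x) never reaches the hook.

```python
import warnings


class WarnOnInstanceCheck(type):
    """Hypothetical metaclass that warns whenever isinstance() is used."""

    def __instancecheck__(cls, instance):
        warnings.warn(
            f"isinstance checks against {cls.__name__} are deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return super().__instancecheck__(instance)


class legacy_type(metaclass=WarnOnInstanceCheck):
    """Stand-in for a class whose use in type checks is deprecated."""


class legacy_subtype(legacy_type):
    pass


obj = legacy_subtype()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Goes through __instancecheck__ and therefore emits a warning.
    isinstance_result = isinstance(obj, legacy_type)
    warnings_after_isinstance = len(caught)
    # A plain identity comparison cannot be intercepted at all.
    identity_result = type(obj) is legacy_subtype
    warnings_after_identity = len(caught)
```

An instance of a subclass is used on purpose: CPython short-circuits isinstance() when the type of the object is exactly the class being tested, so the hook would not run in that case.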
Phase two
Phase two is what is actually described in the rest of this PEP. In terms of implementation, it would be a relatively small change compared to phase one.
Reference Implementation
Most of this PEP has been implemented for CPython at https://github.com/jdemeyer/cpython/tree/pep575
There are four steps, corresponding to the commits on that branch. After each step, CPython is in a mostly working state.
- Add the base_function class and make it a base class for cfunction. This is by far the biggest step as the complete __call__ protocol is implemented in this step.
- Rename method to bound_method and make it a subclass of base_function. Change unbound methods of extension types to be instances of cfunction such that bound methods of extension types are also instances of bound_method.
- Implement defined_function and function.
- Changes to other parts of Python, such as the standard library and testsuite.
Appendix: current situation
NOTE: This section is more useful during the draft period of the PEP, so feel free to remove this once the PEP has been accepted.
For reference, we describe in detail the relevant existing classes in CPython 3.7.
Each of the classes involved is an “orphan” class (no non-trivial subclasses nor superclasses).
builtin_function_or_method: built-in functions and bound methods
These are of type PyCFunction_Type with structure PyCFunctionObject:
typedef struct {
    PyObject_HEAD
    PyMethodDef *m_ml;          /* Description of the C function to call */
    PyObject    *m_self;        /* Passed as 'self' arg to the C func, can be NULL */
    PyObject    *m_module;      /* The __module__ attribute, can be anything */
    PyObject    *m_weakreflist; /* List of weak references */
} PyCFunctionObject;

struct PyMethodDef {
    const char  *ml_name;   /* The name of the built-in function/method */
    PyCFunction  ml_meth;   /* The C function that implements it */
    int          ml_flags;  /* Combination of METH_xxx flags, which mostly
                               describe the args expected by the C func */
    const char  *ml_doc;    /* The __doc__ attribute, or NULL */
};
where PyCFunction is a C function pointer (there are various forms of this, the most basic takes two arguments for self and *args).
This class is used both for functions and bound methods: for a method, the m_self slot points to the object:
>>> dict(foo=42).get
<built-in method get of dict object at 0x...>
>>> dict(foo=42).get.__self__
{'foo': 42}
In some cases, a function is considered a “method” of the module defining it:
>>> import os
>>> os.kill
<built-in function kill>
>>> os.kill.__self__
<module 'posix' (built-in)>
method_descriptor: built-in unbound methods
These are of type PyMethodDescr_Type with structure PyMethodDescrObject:
typedef struct {
    PyDescrObject d_common;
    PyMethodDef *d_method;
} PyMethodDescrObject;

typedef struct {
    PyObject_HEAD
    PyTypeObject *d_type;
    PyObject *d_name;
    PyObject *d_qualname;
} PyDescrObject;
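From Python, these fields are visible as attributes of the descriptor. The sketch below assumes CPython 3.7 or later (for types.MethodDescriptorType) and shows how binding through __get__ produces a builtin_function_or_method. The variable names are illustrative.

```python
import types

descr = dict.get

# d_name, d_qualname and d_type are exposed as these attributes.
name = descr.__name__
qualname = descr.__qualname__
owner = descr.__objclass__

is_method_descriptor = type(descr) is types.MethodDescriptorType

# Binding through the descriptor protocol gives a built-in bound method.
d = {"foo": 42}
bound = descr.__get__(d, dict)
is_builtin_method = type(bound) is types.BuiltinMethodType
value = bound("foo")
```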
function: Python functions
These are of type PyFunction_Type with structure PyFunctionObject:
typedef struct {
    PyObject_HEAD
    PyObject *func_code;        /* A code object, the __code__ attribute */
    PyObject *func_globals;     /* A dictionary (other mappings won't do) */
    PyObject *func_defaults;    /* NULL or a tuple */
    PyObject *func_kwdefaults;  /* NULL or a dict */
    PyObject *func_closure;     /* NULL or a tuple of cell objects */
    PyObject *func_doc;         /* The __doc__ attribute, can be anything */
    PyObject *func_name;        /* The __name__ attribute, a string object */
    PyObject *func_dict;        /* The __dict__ attribute, a dict or NULL */
    PyObject *func_weakreflist; /* List of weak references */
    PyObject *func_module;      /* The __module__ attribute, can be anything */
    PyObject *func_annotations; /* Annotations, a dict or NULL */
    PyObject *func_qualname;    /* The qualified name */

    /* Invariant:
     *     func_closure contains the bindings for func_code->co_freevars, so
     *     PyTuple_Size(func_closure) == PyCode_GetNumFree(func_code)
     *     (func_closure may be NULL if PyCode_GetNumFree(func_code) == 0).
     */
} PyFunctionObject;
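Several of these fields, and the invariant on func_closure, can be inspected from Python. The following is a small sketch with made-up function names, assuming current CPython.

```python
def make_adder(n, *, scale=1):
    """Return a closure; n and scale become free variables of add."""
    def add(x, y=0):
        return (x + y + n) * scale
    return add


add = make_adder(5)

# func_defaults and func_kwdefaults map to these attributes.
positional_defaults = add.__defaults__        # tuple of defaults for add
keyword_defaults = make_adder.__kwdefaults__  # dict of keyword-only defaults

# The invariant from the struct comment: one closure cell per free variable.
free_vars = set(add.__code__.co_freevars)
closure_matches = len(add.__closure__) == len(add.__code__.co_freevars)

# A function without free variables has no closure at all.
no_closure = make_adder.__closure__ is None

result = add(1)
```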
In Python 3, there is no “unbound method” class: an unbound method is just a plain function.
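This can be checked directly. The class Greeter below is hypothetical and only serves as an illustration.

```python
import types


class Greeter:
    def greet(self):
        return "hello"


# Accessed on the class, the method is just a plain function...
unbound = Greeter.greet
is_plain_function = type(unbound) is types.FunctionType

# ...so the instance has to be passed explicitly when calling it.
greeting = unbound(Greeter())
```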
method: Python bound methods
These are of type PyMethod_Type with structure PyMethodObject:
typedef struct {
    PyObject_HEAD
    PyObject *im_func;        /* The callable object implementing the method */
    PyObject *im_self;        /* The instance it is bound to */
    PyObject *im_weakreflist; /* List of weak references */
} PyMethodObject;
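The im_func and im_self slots are exposed in Python as __func__ and __self__. As the comment on im_func suggests, the wrapped object may be any callable, which the sketch below demonstrates with types.MethodType. The class Counter is made up for this example.

```python
import types


class Counter:
    def __init__(self, start):
        self.value = start

    def bump(self):
        self.value += 1
        return self.value


c = Counter(10)
method = c.bump

# im_func and im_self as seen from Python.
func_is_class_attr = method.__func__ is Counter.bump
self_is_instance = method.__self__ is c
is_method_type = type(method) is types.MethodType

# Any callable can be wrapped, including a built-in function.
length_of_list = types.MethodType(len, [1, 2, 3])
length = length_of_list()
```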
References
- [1] Cython (http://cython.org/)
- [2] Python bug 30071, Duck-typing inspect.isfunction() (https://bugs.python.org/issue30071)
- [3] Python bug 33261, inspect.isgeneratorfunction fails on hand-created methods (https://bugs.python.org/issue33261 and https://github.com/python/cpython/pull/6448)
- [4] Python bug 33265, contextlib.ExitStack abuses __self__ (https://bugs.python.org/issue33265 and https://github.com/python/cpython/pull/6456)
- [5] PyMethodDef documentation (https://docs.python.org/3.7/c-api/structures.html#c.PyMethodDef)
- [6] PEP proposal: unifying function/method classes (https://mail.python.org/pipermail/python-ideas/2018-March/049398.html)
Copyright
This document has been placed in the public domain.
Source: https://github.com/python-discord/peps/blob/main/pep-0575.rst
Last modified: 2022-03-09 16:04:44 GMT