PEP 505 – None-aware operators
- PEP: 505
- Title: None-aware operators
- Author: Mark E. Haase <mehaase at gmail.com>, Steve Dower <steve.dower at python.org>
- Status: Deferred
- Type: Standards Track
- Created: 18-Sep-2015
- Python-Version: 3.8
Abstract
Several modern programming languages have so-called “null-coalescing” or
“null-aware” operators, including C# [1], Dart [2], Perl, Swift, and PHP
(starting in version 7). There are also stage 3 draft proposals for their
addition to ECMAScript (a.k.a. JavaScript) [3] [4]. These operators provide
syntactic sugar for common patterns involving null references.
- The “null-coalescing” operator is a binary operator that returns its left operand if it is not null. Otherwise it returns its right operand.
- The “null-aware member access” operator accesses an instance member only if that instance is non-null. Otherwise it returns null. (This is also called a “safe navigation” operator.)
- The “null-aware index access” operator accesses an element of a collection only if that collection is non-null. Otherwise it returns null. (This is another type of “safe navigation” operator.)
This PEP proposes three None-aware operators for Python, based on the
definitions and other languages’ implementations of those above. Specifically:
- The “None coalescing” binary operator ?? returns the left hand side if it evaluates to a value that is not None, or else it evaluates and returns the right hand side. A coalescing ??= augmented assignment operator is included.
- The “None-aware attribute access” operator ?. (“maybe dot”) evaluates the complete expression if the left hand side evaluates to a value that is not None.
- The “None-aware indexing” operator ?[] (“maybe subscript”) evaluates the complete expression if the left hand side evaluates to a value that is not None.
See the Grammar changes section for specifics and examples of the required grammar changes.
See the Examples section for more realistic examples of code that could be updated to use the new operators.
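The three operators cannot be written in current Python, but their single-step behaviour can be approximated with helper functions. The following is only a sketch: the names coalesce, maybe_attr and maybe_item are hypothetical and not part of this proposal, and the right-hand side of the coalescing helper is passed as a zero-argument callable so that it is evaluated lazily, as ?? would do.

```python
def coalesce(left, make_right):
    """Emulate `left ?? right`; `make_right` is a zero-argument callable
    so the right-hand side is only evaluated when it is needed."""
    return left if left is not None else make_right()


def maybe_attr(obj, name):
    """Emulate a single `obj?.name` attribute access."""
    return None if obj is None else getattr(obj, name)


def maybe_item(obj, key):
    """Emulate a single `obj?[key]` subscript."""
    return None if obj is None else obj[key]


print(coalesce(0, lambda: 42))        # 0 is not None, so it is kept
print(coalesce(None, lambda: 42))     # falls back to the right-hand side
print(maybe_attr(None, "real"))       # the attribute is never looked up
print(maybe_attr(3, "real"))
print(maybe_item(None, "key"))        # the subscript is never applied
print(maybe_item({"key": 1}, "key"))
```

Unlike the proposed operators, these helpers guard only one access each; they do not skip the rest of a longer attribute or subscript chain.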
Syntax and Semantics
Specialness of None
The None object denotes the lack of a value. For the purposes of these
operators, the lack of a value indicates that the remainder of the expression
also lacks a value and should not be evaluated.
A rejected proposal was to treat any value that evaluates as “false” in a Boolean context as not having a value. However, the purpose of these operators is to propagate the “lack of value” state, rather than the “false” state.
Some argue that this makes None special. We contend that None is
already special, and that using it as both the test and the result of these
operators does not change the existing semantics in any way.
See the Rejected Ideas section for discussions on alternate approaches.
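The difference between testing for None and testing for a false value can be seen in current Python. In this sketch, default_if_none is a hypothetical helper that mirrors the proposed test, and the timeout value is an invented example of a legitimate false-y setting.

```python
def default_if_none(value, default):
    # Mirrors the proposed test: only None counts as "no value".
    return default if value is None else value


timeout = 0  # a legitimate setting that happens to be false-y

print(timeout or 30)                 # `or` discards the 0 and yields 30
print(default_if_none(timeout, 30))  # the 0 is preserved
print(default_if_none(None, 30))     # only None triggers the default
```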
Grammar changes
The following rules of the Python grammar are updated to read:
augassign: ('+=' | '-=' | '*=' | '@=' | '/=' | '%=' | '&=' | '|=' | '^=' |
            '<<=' | '>>=' | '**=' | '//=' | '??=')
power: coalesce ['**' factor]
coalesce: atom_expr ['??' factor]
atom_expr: ['await'] atom trailer*
trailer: ('(' [arglist] ')' |
          '[' subscriptlist ']' |
          '?[' subscriptlist ']' |
          '.' NAME |
          '?.' NAME)
The coalesce rule
The coalesce rule provides the ?? binary operator. Unlike most binary
operators, the right-hand side is not evaluated until the left-hand side is
determined to be None.
The ?? operator binds more tightly than other binary operators as most
existing implementations of these do not propagate None values (they will
typically raise TypeError). Expressions that are known to potentially
result in None can be substituted for a default value without needing
additional parentheses.
Some examples of how implicit parentheses are placed when evaluating operator
precedence in the presence of the ?? operator:
a, b = None, None
def c(): return None
def ex(): raise Exception()
(a ?? 2 ** b ?? 3) == a ?? (2 ** (b ?? 3))
(a * b ?? c // d) == a * (b ?? c) // d
(a ?? True and b ?? False) == (a ?? True) and (b ?? False)
(c() ?? c() ?? True) == True
(True ?? ex()) == True
(c ?? ex)() == c()
Particularly for cases such as a ?? 2 ** b ?? 3, parenthesizing the
sub-expressions any other way would result in TypeError, as int.__pow__
cannot be called with None (and the fact that the ?? operator is used
at all implies that a or b may be None). However, as usual,
while parentheses are not required they should be added if it helps improve
readability.
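The effect of the chosen grouping can be checked in current Python by emulating ?? with a helper. This is a sketch under that assumption: coalesce is a hypothetical stand-in, and the lambdas only exist to keep the right-hand side lazy.

```python
def coalesce(left, make_right):
    """Emulate `left ?? right`, evaluating the right side lazily."""
    return left if left is not None else make_right()


a, b = None, None

# Grouping chosen by the grammar: a ?? (2 ** (b ?? 3))
tight = coalesce(a, lambda: 2 ** coalesce(b, lambda: 3))
print(tight)

# A looser grouping, a ?? ((2 ** b) ?? 3), has to compute 2 ** None first.
loose_error = None
try:
    coalesce(a, lambda: coalesce(2 ** b, lambda: 3))
except TypeError as exc:
    loose_error = exc
print(type(loose_error).__name__)
```

With both names bound to None, the tight grouping yields 8, while the looser one fails before the fallback can be applied.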
An augmented assignment for the ?? operator is also added. Augmented
coalescing assignment only rebinds the name if its current value is None.
If the target name already has a value, the right-hand side is not evaluated.
For example:
a = None
b = ''
c = 0
a ??= 'value'
b ??= undefined_name
c ??= shutil.rmtree('/') # don't try this at home, kids
assert a == 'value'
assert b == ''
assert c == 0 and any(os.scandir('/'))
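In current Python the same behaviour is usually spelled with an explicit if statement. The sketch below uses a hypothetical expensive_default function that records each call, to show that the right-hand side runs only when the target is None.

```python
calls = []


def expensive_default():
    # Records every evaluation so skipped calls are visible.
    calls.append("called")
    return "value"


a = None
b = ""

# Equivalent of: a ??= expensive_default()
if a is None:
    a = expensive_default()

# Equivalent of: b ??= expensive_default()
if b is None:
    b = expensive_default()

print(a, repr(b), calls)
```

Only the assignment to a triggers a call; b keeps its empty string even though it is false-y.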
The maybe-dot and maybe-subscript operators
The maybe-dot and maybe-subscript operators are added as trailers for atoms, so that they may be used in all the same locations as the regular operators, including as part of an assignment target (more details below). As the existing evaluation rules are not directly embedded in the grammar, we specify the required changes below.
Assume that the atom is always successfully evaluated. Each trailer is
then evaluated from left to right, applying its own parameter (either its
arguments, subscripts or attribute name) to produce the value for the next
trailer. Finally, if present, await is applied.
For example, await a.b(c).d[e] is currently parsed as
['await', 'a', '.b', '(c)', '.d', '[e]'] and evaluated:
_v = a
_v = _v.b
_v = _v(c)
_v = _v.d
_v = _v[e]
await _v
When a None-aware operator is present, the left-to-right evaluation may be
short-circuited. For example, await a?.b(c).d?[e] is evaluated:
_v = a
if _v is not None:
    _v = _v.b
    _v = _v(c)
    _v = _v.d
    if _v is not None:
        _v = _v[e]
await _v
Note
await will almost certainly fail in this context, as it would in
the case where code attempts await None. We are not proposing to add a
None-aware await keyword here, and merely include it in this
example for completeness of the specification, since the atom_expr
grammar rule includes the keyword. If it were in its own rule, we would have
never mentioned it.
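The short-circuiting expansion can be exercised today by writing it out by hand. The following sketch wraps the expansion of a?.b(c).d?[e] (without the await) in a function; evaluate and make are hypothetical names, and SimpleNamespace objects stand in for real data.

```python
from types import SimpleNamespace


def evaluate(a, c, e):
    """Hand-written expansion of `a?.b(c).d?[e]`, without the await."""
    _v = a
    if _v is not None:
        _v = _v.b
        _v = _v(c)
        _v = _v.d
        if _v is not None:
            _v = _v[e]
    return _v


def make(d):
    # Hypothetical object whose b(c) returns something with a `.d` attribute.
    return SimpleNamespace(b=lambda c: SimpleNamespace(d=d))


print(evaluate(None, "c", 0))               # everything after a?. is skipped
print(evaluate(make(None), "c", 0))         # only the subscript is skipped
print(evaluate(make(["x", "y"]), "c", 1))   # the full chain is evaluated
```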
Parenthesised expressions are handled by the atom rule (not shown above),
which will implicitly terminate the short-circuiting behaviour of the above
transformation. For example, (a?.b ?? c).d?.e is evaluated as:
# a?.b
_v = a
if _v is not None:
    _v = _v.b
# ... ?? c
if _v is None:
    _v = c
# (...).d?.e
_v = _v.d
if _v is not None:
    _v = _v.e
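Because the parentheses end the short-circuit, the .d access in this expansion is unguarded. A runnable sketch of the same expansion makes that visible; evaluate and fallback are hypothetical names chosen for the illustration.

```python
from types import SimpleNamespace


def evaluate(a, c):
    """Hand-written expansion of `(a?.b ?? c).d?.e`."""
    _v = a
    if _v is not None:
        _v = _v.b
    if _v is None:
        _v = c
    _v = _v.d  # not guarded: the parentheses ended the short-circuit
    if _v is not None:
        _v = _v.e
    return _v


fallback = SimpleNamespace(d=SimpleNamespace(e="from fallback"))
print(evaluate(None, fallback))

try:
    evaluate(None, None)
except AttributeError as exc:
    print("AttributeError:", exc)
```

When both a and c are None, the result of the parenthesised part is None and the plain .d access raises AttributeError, just as None.d would.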
When used as an assignment target, the None-aware operations may only be
used in a “load” context. That is, a?.b = 1 and a?[b] = 1 will raise
SyntaxError. Use earlier in the expression (a?.b.c = 1) is permitted,
though unlikely to be useful unless combined with a coalescing operation:
(a?.b ?? d).c = 1
Reading expressions
For the maybe-dot and maybe-subscript operators, the intention is that
expressions including these operators should be read and interpreted as for the
regular versions of these operators. In “normal” cases, the end results are
going to be identical between an expression such as a?.b?[c] and
a.b[c], and just as we do not currently read “a.b” as “read attribute b
from a if it has an attribute b or else it raises AttributeError”, there is
no need to read “a?.b” as “read attribute b from a if a is not None”
(unless in a context where the listener needs to be aware of the specific
behaviour).
For coalescing expressions using the ?? operator, expressions should either
be read as “or … if None” or “coalesced with”. For example, the expression
a.get_value() ?? 100 would be read “call a dot get_value or 100 if None”,
or “call a dot get_value coalesced with 100”.
Note
Reading code in spoken text is always lossy, and so we make no attempt to define an unambiguous way of speaking these operators. These suggestions are intended to add context to the implications of adding the new syntax.
Examples
This section presents some examples of common None patterns and shows what
conversion to use None-aware operators may look like.
Standard Library
Using the find-pep505.py script [5], an analysis of the Python 3.7 standard
library discovered up to 678 code snippets that could be replaced with use of
one of the None-aware operators:
$ find /usr/lib/python3.7 -name '*.py' | xargs python3.7 find-pep505.py
<snip>
Total None-coalescing `if` blocks: 449
Total [possible] None-coalescing `or`: 120
Total None-coalescing ternaries: 27
Total Safe navigation `and`: 13
Total Safe navigation `if` blocks: 61
Total Safe navigation ternaries: 8
Some of these are shown below as examples before and after converting to use the new operators.
From bisect.py:
def insort_right(a, x, lo=0, hi=None):
    # ...
    if hi is None:
        hi = len(a)
    # ...
After updating to use the ??= augmented assignment statement:
def insort_right(a, x, lo=0, hi=None):
    # ...
    hi ??= len(a)
    # ...
From calendar.py:
encoding = options.encoding
if encoding is None:
    encoding = sys.getdefaultencoding()
optdict = dict(encoding=encoding, css=options.css)
After updating to use the ?? operator:
optdict = dict(encoding=options.encoding ?? sys.getdefaultencoding(),
               css=options.css)
From email/generator.py (and importantly note that there is no way to
substitute or for ?? in this situation):
mangle_from_ = True if policy is None else policy.mangle_from_
After updating:
mangle_from_ = policy?.mangle_from_ ?? True
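Why or cannot stand in for ?? here can be shown with current Python. In this sketch the two functions are hypothetical, and a SimpleNamespace with mangle_from_ set to False stands in for a real policy object.

```python
from types import SimpleNamespace


def mangle_with_none_check(policy):
    # Equivalent of `policy?.mangle_from_ ?? True` in current Python.
    value = None if policy is None else policy.mangle_from_
    return True if value is None else value


def mangle_with_or(policy):
    # Tempting but wrong: `or` also replaces an explicit False.
    return (policy and policy.mangle_from_) or True


policy = SimpleNamespace(mangle_from_=False)  # stand-in for a real policy

print(mangle_with_none_check(policy))  # the explicit False survives
print(mangle_with_or(policy))          # the explicit False is lost
print(mangle_with_none_check(None))    # no policy: default applies
```

The or-based version can never return False, so a policy that disables mangling would be silently ignored.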
From asyncio/subprocess.py:
def pipe_data_received(self, fd, data):
    if fd == 1:
        reader = self.stdout
    elif fd == 2:
        reader = self.stderr
    else:
        reader = None
    if reader is not None:
        reader.feed_data(data)
After updating to use the ?. operator:
def pipe_data_received(self, fd, data):
    if fd == 1:
        reader = self.stdout
    elif fd == 2:
        reader = self.stderr
    else:
        reader = None
    reader?.feed_data(data)
From asyncio/tasks.py:
try:
    await waiter
finally:
    if timeout_handle is not None:
        timeout_handle.cancel()
After updating to use the ?. operator:
try:
    await waiter
finally:
    timeout_handle?.cancel()
From ctypes/_aix.py:
if libpaths is None:
    libpaths = []
else:
    libpaths = libpaths.split(":")
After updating:
libpaths = libpaths?.split(":") ?? []
From os.py:
if entry.is_dir():
    dirs.append(name)
    if entries is not None:
        entries.append(entry)
else:
    nondirs.append(name)
After updating to use the ?. operator:
if entry.is_dir():
    dirs.append(name)
    entries?.append(entry)
else:
    nondirs.append(name)
From importlib/abc.py:
def find_module(self, fullname, path):
    if not hasattr(self, 'find_spec'):
        return None
    found = self.find_spec(fullname, path)
    return found.loader if found is not None else None
After partially updating:
def find_module(self, fullname, path):
    if not hasattr(self, 'find_spec'):
        return None
    return self.find_spec(fullname, path)?.loader
After extensive updating (arguably excessive, though that’s for the style guides to determine):
def find_module(self, fullname, path):
    return getattr(self, 'find_spec', None)?.__call__(fullname, path)?.loader
From dis.py:
def _get_const_info(const_index, const_list):
    argval = const_index
    if const_list is not None:
        argval = const_list[const_index]
    return argval, repr(argval)
After updating to use the ?[] and ?? operators:
def _get_const_info(const_index, const_list):
    argval = const_list?[const_index] ?? const_index
    return argval, repr(argval)
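When converting code like this, it may be worth checking whether the looked-up element can itself be None, since ?? then applies the fallback as well. The sketch below emulates both versions in current Python; the function names are hypothetical and only the argval computation is reproduced.

```python
def get_const_original(const_index, const_list):
    argval = const_index
    if const_list is not None:
        argval = const_list[const_index]
    return argval


def get_const_rewritten(const_index, const_list):
    # Emulates `const_list?[const_index] ?? const_index`.
    item = None if const_list is None else const_list[const_index]
    return const_index if item is None else item


# No constant list: both versions return the index.
print(get_const_original(0, None), get_const_rewritten(0, None))

# Ordinary constant: both versions return the element.
print(get_const_original(1, ("a", "b")), get_const_rewritten(1, ("a", "b")))

# Element that is None: the emulated rewrite falls back to the index.
print(get_const_original(0, (None,)), get_const_rewritten(0, (None,)))
```

The two forms agree whenever the stored element is not None, which is the situation the conversion is aimed at.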
jsonify
This example is from a Python web crawler that uses the Flask framework as its front-end. This function retrieves information about a web site from a SQL database and formats it as JSON to send to an HTTP client:
class SiteView(FlaskView):
    @route('/site/<id_>', methods=['GET'])
    def get_site(self, id_):
        site = db.query('site_table').find(id_)
        return jsonify(
            first_seen=site.first_seen.isoformat() if site.first_seen is not None else None,
            id=site.id,
            is_active=site.is_active,
            last_seen=site.last_seen.isoformat() if site.last_seen is not None else None,
            url=site.url.rstrip('/')
        )
Both first_seen and last_seen are allowed to be null in the
database, and they are also allowed to be null in the JSON response. JSON
does not have a native way to represent a datetime, so the server’s contract
states that any non-null date is represented as an ISO-8601 string.
Without knowing the exact semantics of the first_seen and last_seen
attributes, it is impossible to know whether the attribute can be safely or
performantly accessed multiple times.
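The concern about repeated access can be made concrete with a property that counts how often it is read. This is a sketch only: the Site class below is hypothetical and merely stands in for whatever object the database layer returns.

```python
from datetime import datetime


class Site:
    """Hypothetical record whose attribute is computed on every access."""

    def __init__(self, first_seen):
        self._first_seen = first_seen
        self.reads = 0

    @property
    def first_seen(self):
        self.reads += 1
        return self._first_seen


# Conditional expression: the attribute is read for the test and for the call.
site = Site(datetime(2015, 9, 18))
text = site.first_seen.isoformat() if site.first_seen is not None else None
reads_conditional = site.reads

# Temporary variable: the attribute is read once.
site = Site(datetime(2015, 9, 18))
first_seen_dt = site.first_seen
text_once = None if first_seen_dt is None else first_seen_dt.isoformat()
reads_temporary = site.reads

print(text, reads_conditional, reads_temporary)
```

If the property performed a database query or had side effects, the conditional-expression form would trigger it twice.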
One way to fix this code is to replace each conditional expression with an
explicit value assignment and a full if/else block:
class SiteView(FlaskView):
    @route('/site/<id_>', methods=['GET'])
    def get_site(self, id_):
        site = db.query('site_table').find(id_)
        first_seen_dt = site.first_seen
        if first_seen_dt is None:
            first_seen = None
        else:
            first_seen = first_seen_dt.isoformat()
        last_seen_dt = site.last_seen
        if last_seen_dt is None:
            last_seen = None
        else:
            last_seen = last_seen_dt.isoformat()
        return jsonify(
            first_seen=first_seen,
            id=site.id,
            is_active=site.is_active,
            last_seen=last_seen,
            url=site.url.rstrip('/')
        )
This adds ten lines of code and four new code paths to the function,
dramatically increasing the apparent complexity. Rewriting using the
None-aware attribute operator results in shorter code with clearer
intent:
class SiteView(FlaskView):
    @route('/site/<id_>', methods=['GET'])
    def get_site(self, id_):
        site = db.query('site_table').find(id_)
        return jsonify(
            first_seen=site.first_seen?.isoformat(),
            id=site.id,
            is_active=site.is_active,
            last_seen=site.last_seen?.isoformat(),
            url=site.url.rstrip('/')
        )
Grab
The next example is from a Python scraping library called Grab:
class BaseUploadObject(object):
    def find_content_type(self, filename):
        ctype, encoding = mimetypes.guess_type(filename)
        if ctype is None:
            return 'application/octet-stream'
        else:
            return ctype
class UploadContent(BaseUploadObject):
    def __init__(self, content, filename=None, content_type=None):
        self.content = content
        if filename is None:
            self.filename = self.get_random_filename()
        else:
            self.filename = filename
        if content_type is None:
            self.content_type = self.find_content_type(self.filename)
        else:
            self.content_type = content_type
class UploadFile(BaseUploadObject):
    def __init__(self, path, filename=None, content_type=None):
        self.path = path
        if filename is None:
            self.filename = os.path.split(path)[1]
        else:
            self.filename = filename
        if content_type is None:
            self.content_type = self.find_content_type(self.filename)
        else:
            self.content_type = content_type
This example contains several good examples of needing to provide default values. Rewriting to use conditional expressions reduces the overall lines of code, but does not necessarily improve readability:
class BaseUploadObject(object):
    def find_content_type(self, filename):
        ctype, encoding = mimetypes.guess_type(filename)
        return 'application/octet-stream' if ctype is None else ctype
class UploadContent(BaseUploadObject):
    def __init__(self, content, filename=None, content_type=None):
        self.content = content
        self.filename = (self.get_random_filename() if filename
                         is None else filename)
        self.content_type = (self.find_content_type(self.filename)
                             if content_type is None else content_type)
class UploadFile(BaseUploadObject):
    def __init__(self, path, filename=None, content_type=None):
        self.path = path
        self.filename = (os.path.split(path)[1] if filename is
                         None else filename)
        self.content_type = (self.find_content_type(self.filename)
                             if content_type is None else content_type)
The first ternary expression is tidy, but it reverses the intuitive order of
the operands: it should return ctype if it has a value and use the string
literal as fallback. The other ternary expressions are unintuitive and so
long that they must be wrapped. The overall readability is worsened, not
improved.
Rewriting using the None coalescing operator:
class BaseUploadObject(object):
    def find_content_type(self, filename):
        ctype, encoding = mimetypes.guess_type(filename)
        return ctype ?? 'application/octet-stream'
class UploadContent(BaseUploadObject):
    def __init__(self, content, filename=None, content_type=None):
        self.content = content
        self.filename = filename ?? self.get_random_filename()
        self.content_type = content_type ?? self.find_content_type(self.filename)
class UploadFile(BaseUploadObject):
    def __init__(self, path, filename=None, content_type=None):
        self.path = path
        self.filename = filename ?? os.path.split(path)[1]
        self.content_type = content_type ?? self.find_content_type(self.filename)
This syntax has an intuitive ordering of the operands. In find_content_type,
for example, the preferred value ctype appears before the fallback value.
The terseness of the syntax also makes for fewer lines of code and less code to
visually parse, and reading from left-to-right and top-to-bottom more accurately
follows the execution flow.
Rejected Ideas
The first three ideas in this section are oft-proposed alternatives to treating
None as special. For further background on why these are rejected, see their
treatment in PEP 531 and PEP 532 and the associated discussions.
No-Value Protocol
The operators could be generalised to user-defined types by defining a protocol
to indicate when a value represents “no value”. Such a protocol may be a dunder
method __has_value__(self) that returns True if the value should be
treated as having a value, and False if the value should be treated as no
value.
With this generalization, object would implement a dunder method equivalent
to this:
def __has_value__(self):
    return True
NoneType would implement a dunder method equivalent to this:
def __has_value__(self):
    return False
In the specification section, all uses of x is None would be replaced with
not x.__has_value__().
This generalization would allow for domain-specific “no-value” objects to be
coalesced just like None. For example, the pyasn1 package has a type
called Null that represents an ASN.1 null:
>>> from pyasn1.type import univ
>>> univ.Null() ?? univ.Integer(123)
Integer(123)
Similarly, values such as math.nan and NotImplemented could be treated
as representing no value.
However, the “no-value” nature of these values is domain-specific, which means
they should be treated as a value by the language. For example,
math.nan.imag is well defined (it’s 0.0), and so short-circuiting
math.nan?.imag to return math.nan would be incorrect.
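A rough sketch of how such a protocol might behave can be written in current Python. Everything here is an assumption made for illustration: the Null class is a hypothetical domain object (not pyasn1's type), has_value is an invented lookup rule standing in for methods that object and NoneType do not actually define, and coalesce emulates ?? with a lazy right-hand side.

```python
import math


class Null:
    """Hypothetical domain-specific "no value" object."""

    def __has_value__(self):
        return False


def has_value(obj):
    # Assumed rule: None has no value; any other object has one unless its
    # type defines __has_value__ and that method returns a false result.
    if obj is None:
        return False
    method = getattr(type(obj), "__has_value__", None)
    return True if method is None else bool(method(obj))


def coalesce(left, make_right):
    # Emulates `left ?? right` under the rejected protocol.
    return left if has_value(left) else make_right()


print(coalesce(Null(), lambda: 123))  # Null is treated like None
print(coalesce(0, lambda: 123))       # ordinary values are kept
print(math.nan.imag)                  # nan still has well-defined attributes
```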
As None is already defined by the language as being the value that
represents “no value”, and the current specification would not preclude
switching to a protocol in the future (though changes to built-in objects would
not be compatible), this idea is rejected for now.
Boolean-aware operators
This suggestion is fundamentally the same as adding a no-value protocol, and so the discussion above also applies.
Similar behavior to the ?? operator can be achieved with an or
expression; however, or checks whether its left operand is false-y and not
specifically None. This approach is attractive, as it requires fewer changes
to the language, but ultimately does not solve the underlying problem correctly.
Assuming the check is for truthiness rather than None, there is no longer a
need for the ?? operator. However, applying this check to the ?. and
?[] operators prevents perfectly valid operations from being applied.
Consider the following example, where get_log_list() may return either a
list containing current log messages (potentially empty), or None if logging
is not enabled:
lst = get_log_list()
lst?.append('A log message')
If ?. is checking for true values rather than specifically None and the
log has not been initialized with any items, no item will ever be appended. This
violates the obvious intent of the code, which is to append an item. The
append method is available on an empty list, as are all other list methods,
and there is no reason to assume that these members should not be used because
the list is presently empty.
Further, there is no sensible result to use in place of the expression. A
normal lst.append returns None, but under this idea lst?.append may
result in either [] or None, depending on the value of lst. As with
the examples in the previous section, this makes no sense.
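The two candidate semantics can be compared side by side in current Python. In this sketch, append_if_truthy and append_if_not_none are hypothetical helpers that spell out what lst?.append(item) would do under each rule.

```python
def append_if_truthy(lst, item):
    # Rejected semantics: skip the call for any false-y receiver.
    if lst:
        lst.append(item)


def append_if_not_none(lst, item):
    # Proposed semantics of `lst?.append(item)`.
    if lst is not None:
        lst.append(item)


log_truthy, log_none_aware = [], []
append_if_truthy(log_truthy, "A log message")
append_if_not_none(log_none_aware, "A log message")

print(log_truthy)      # stays empty: the first message is never recorded
print(log_none_aware)  # contains the message
```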
As checking for truthiness rather than None results in apparently valid
expressions no longer executing as intended, this idea is rejected.
Exception-aware operators
Arguably, the reason to short-circuit an expression when None is encountered
is to avoid the AttributeError or TypeError that would be raised under
normal circumstances. As an alternative to testing for None, the ?. and
?[] operators could instead handle AttributeError and TypeError
raised by the operation and skip the remainder of the expression.
This produces a transformation for a?.b.c?.d.e similar to this:
_v = a
try:
    _v = _v.b
except AttributeError:
    pass
else:
    _v = _v.c
    try:
        _v = _v.d
    except AttributeError:
        pass
    else:
        _v = _v.e
One open question is which value should be returned as the expression when an
exception is handled. The above example simply leaves the partial result, but
this is not helpful for replacing with a default value. An alternative would be
to force the result to None, which then raises the question as to why
None is special enough to be the result but not special enough to be the
test.
Secondly, this approach masks errors within code executed implicitly as part of
the expression. For ?., any AttributeError within a property or
__getattr__ implementation would be hidden, and similarly for ?[] and
__getitem__ implementations.
Similarly, simple typing errors such as {}?.ietms() could go unnoticed.
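The masking problem can be reproduced with a small sketch in current Python. The Config class and both helper functions are hypothetical; exception_aware_attr follows the "leave the partial result" variant of the rejected semantics.

```python
class Config:
    @property
    def timeout(self):
        # Bug: `self.settings` was never assigned, so this raises
        # AttributeError from inside the property.
        return self.settings["timeout"]


def exception_aware_attr(obj, name):
    # Rejected semantics: swallow AttributeError, keep the partial result.
    try:
        return getattr(obj, name)
    except AttributeError:
        return obj


def none_aware_attr(obj, name):
    # Proposed semantics of a single `obj?.name` access.
    return None if obj is None else getattr(obj, name)


cfg = Config()
print(exception_aware_attr(cfg, "timeout") is cfg)  # the bug is hidden
print(exception_aware_attr({}, "ietms"))            # the typo is hidden

try:
    none_aware_attr(cfg, "timeout")
except AttributeError as exc:
    print("AttributeError:", exc)                   # the bug surfaces
```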
Existing conventions for handling these kinds of errors in the form of the
getattr builtin and the .get(key, default) method pattern established by
dict show that it is already possible to explicitly use this behaviour.
As this approach would hide errors in code, it is rejected.
None-aware Function Call
The None-aware syntax applies to attribute and index access, so it seems
natural to ask if it should also apply to function invocation syntax. It might
be written as foo?(), where foo is only called if it is not None.
This has been deferred on the basis of the proposed operators being intended to aid traversal of partially populated hierarchical data structures, not for traversal of arbitrary class hierarchies. This is reflected in the fact that none of the other mainstream languages that already offer this syntax have found it worthwhile to support a similar syntax for optional function invocations.
A workaround similar to that used by C# would be to write
maybe_none?.__call__(arguments). If the callable is None, the
expression will not be evaluated. (The C# equivalent uses ?.Invoke() on its
callable type.)
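The workaround relies on callables exposing a __call__ attribute, which can be checked in current Python. The helper below is a hypothetical emulation of maybe_none?.__call__(arguments), not part of the proposal.

```python
def call_if_not_none(maybe_callable, *args, **kwargs):
    # Emulates `maybe_callable?.__call__(*args, **kwargs)`.
    if maybe_callable is None:
        return None
    return maybe_callable.__call__(*args, **kwargs)


print(call_if_not_none(len, "abc"))   # the callable is invoked
print(call_if_not_none(None, "abc"))  # nothing is called
```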
? Unary Postfix Operator
To generalize the None-aware behavior and limit the number of new operators
introduced, a unary, postfix operator spelled ? was suggested. The idea is
that ? might return a special object that would override dunder
methods that return self. For example, foo? would evaluate to foo if
it is not None, otherwise it would evaluate to an instance of
NoneQuestion:
class NoneQuestion():
    def __call__(self, *args, **kwargs):
        return self
    def __getattr__(self, name):
        return self
    def __getitem__(self, key):
        return self
With this new operator and new type, an expression like foo?.bar[baz]
evaluates to NoneQuestion if foo is None. This is a nifty
generalization, but it’s difficult to use in practice since most existing code
won’t know what NoneQuestion is.
Going back to one of the motivating examples above, consider the following:
>>> import json
>>> created = None
>>> json.dumps({'created': created?.isoformat()})
The JSON serializer does not know how to serialize NoneQuestion, nor will
any other API. This proposal actually requires lots of specialized logic
throughout the standard library and any third party library.
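Since the NoneQuestion class is ordinary Python, the serialization problem can be demonstrated today. In this sketch the instance is created directly, standing in for what created? would produce when created is None.

```python
import json


class NoneQuestion:
    def __call__(self, *args, **kwargs):
        return self

    def __getattr__(self, name):
        return self

    def __getitem__(self, key):
        return self


created = NoneQuestion()       # stand-in for `created?` with created = None
value = created.isoformat()    # attribute access and call both return self
print(value is created)

serialization_error = None
try:
    json.dumps({"created": value})
except TypeError as exc:
    serialization_error = exc
print(type(serialization_error).__name__)
```

The special object leaks out of the expression, and json.dumps rejects it with a TypeError instead of emitting null.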
At the same time, the ? operator may also be too general, in the sense
that it can be combined with any other operator. What should the following
expressions mean?:
>>> x? + 1
>>> x? -= 1
>>> x? == 1
>>> ~x?
This degree of generalization is not useful. The operators actually proposed herein are intentionally limited to a few operators that are expected to make it easier to write common code patterns.
Built-in maybe
Haskell has a concept called Maybe that
encapsulates the idea of an optional value without relying on any special
keyword (e.g. null) or any special instance (e.g. None). In Haskell, the
purpose of Maybe is to avoid separate handling of “something” and “nothing”.
A Python package called pymaybe provides a rough approximation. The documentation shows the following example:
>>> maybe('VALUE').lower()
'value'
>>> maybe(None).invalid().method().or_else('unknown')
'unknown'
The function maybe() returns either a Something instance or a
Nothing instance. Similar to the unary postfix operator described in the
previous section, Nothing overrides dunder methods in order to allow
chaining on a missing value.
Note that or_else() is eventually required to retrieve the underlying value
from pymaybe’s wrappers. Furthermore, pymaybe does not short circuit any
evaluation. Although pymaybe has some strengths and may be useful in its own
right, it also demonstrates why a pure Python implementation of coalescing is
not nearly as powerful as support built into the language.
The idea of adding a builtin maybe type to enable this scenario is rejected.
Just use a conditional expression
Another common way to initialize default values is to use the ternary operator. Here is an excerpt from the popular Requests package:
data = [] if data is None else data
files = [] if files is None else files
headers = {} if headers is None else headers
params = {} if params is None else params
hooks = {} if hooks is None else hooks
This particular formulation has the undesirable effect of putting the operands
in an unintuitive order: the brain thinks, “use data if possible and use
[] as a fallback,” but the code puts the fallback before the preferred
value.
The author of this package could have written it like this instead:
data = data if data is not None else []
files = files if files is not None else []
headers = headers if headers is not None else {}
params = params if params is not None else {}
hooks = hooks if hooks is not None else {}
This ordering of the operands is more intuitive, but it requires 4 extra
characters (for “not ”). It also highlights the repetition of identifiers:
data if data, files if files, etc.
When written using the None coalescing operator, the sample reads:
data = data ?? []
files = files ?? []
headers = headers ?? {}
params = params ?? {}
hooks = hooks ?? {}
References
- [1] C# Reference: Operators (https://msdn.microsoft.com/en-us/library/6a71f45d.aspx)
- [2] A Tour of the Dart Language: Operators (https://www.dartlang.org/docs/dart-up-and-running/ch02.html#operators)
- [3] Proposal: Nullish Coalescing for JavaScript (https://github.com/tc39/proposal-nullish-coalescing)
- [4] Proposal: Optional Chaining for JavaScript (https://github.com/tc39/proposal-optional-chaining)
- [5] Associated scripts (https://github.com/python/peps/tree/master/pep-0505/)
Copyright
This document has been placed in the public domain.
Source: https://github.com/python-discord/peps/blob/main/pep-0505.rst
Last modified: 2022-01-21 11:03:51 GMT