PEP 3156 – Asynchronous IO Support Rebooted: the “asyncio” Module
- PEP
- 3156
- Title
- Asynchronous IO Support Rebooted: the “asyncio” Module
- Author
- Guido van Rossum <guido at python.org>
- BDFL-Delegate
- Antoine Pitrou <antoine at python.org>
- Discussions-To
- python-tulip@googlegroups.com
- Status
- Final
- Type
- Standards Track
- Created
- 12-Dec-2012
- Post-History
- 21-Dec-2012
- Replaces
- 3153
- Resolution
- Python-Dev
Contents
- Abstract
- Introduction
- Event Loop Interface Specification
- Coroutines and the Scheduler
- Synchronization
- Miscellaneous
- Wish List
- Open Issues
- References
- Acknowledgments
- Copyright
Abstract
This is a proposal for asynchronous I/O in Python 3, starting at
Python 3.3. Consider this the concrete proposal that is missing from
PEP 3153. The proposal includes a pluggable event loop, transport and
protocol abstractions similar to those in Twisted, and a higher-level
scheduler based on yield from
(PEP 380). The proposed package
name is asyncio
.
Introduction
Status
A reference implementation exists under the code name Tulip. The
Tulip repo is linked from the References section at the end. Packages
based on this repo will be provided on PyPI (see References) to enable
using the asyncio
package with Python 3.3 installations.
As of October 20th 2013, the asyncio
package has been checked into
the Python 3.4 repository and released with Python 3.4-alpha-4, with
“provisional” API status. This is an expression of confidence and
intended to increase early feedback on the API, and not intended to
force acceptance of the PEP. The expectation is that the package will
keep provisional status in Python 3.4 and progress to final status in
Python 3.5. Development continues to occur primarily in the Tulip
repo, with changes occasionally merged into the CPython repo.
Dependencies
Python 3.3 is required for many of the proposed features. The reference implementation (Tulip) requires no new language or standard library features beyond Python 3.3, no third-party modules or packages, and no C code, except for the (optional) IOCP support on Windows.
Module Namespace
The specification here lives in a new top-level package, asyncio
.
Different components live in separate submodules of the package. The
package will import common APIs from their respective submodules and
make them available as package attributes (similar to the way the
email package works). For such common APIs, the name of the submodule
that actually defines them is not part of the specification. Less
common APIs may have to explicitly be imported from their respective
submodule, and in this case the submodule name is part of the
specification.
Classes and functions defined without a submodule name are assumed to live in the namespace of the top-level package. (But do not confuse these with methods of various classes, which for brevity are also used without a namespace prefix in certain contexts.)
Interoperability
The event loop is the place where most interoperability occurs. It should be easy for (Python 3.3 ports of) frameworks like Twisted, Tornado, or even gevents to either adapt the default event loop implementation to their needs using a lightweight adapter or proxy, or to replace the default event loop implementation with an adaptation of their own event loop implementation. (Some frameworks, like Twisted, have multiple event loop implementations. This should not be a problem since these all have the same interface.)
In most cases it should be possible for two different third-party frameworks to interoperate, either by sharing the default event loop implementation (each using its own adapter), or by sharing the event loop implementation of either framework. In the latter case two levels of adaptation would occur (from framework A’s event loop to the standard event loop interface, and from there to framework B’s event loop). Which event loop implementation is used should be under control of the main program (though a default policy for event loop selection is provided).
For this interoperability to be effective, the preferred direction of adaptation in third party frameworks is to keep the default event loop and adapt it to the framework’s API. Ideally all third party frameworks would give up their own event loop implementation in favor of the standard implementation. But not all frameworks may be satisfied with the functionality provided by the standard implementation.
In order to support both directions of adaptation, two separate APIs are specified:
- An interface for managing the current event loop
- The interface of a conforming event loop
An event loop implementation may provide additional methods and guarantees, as long as these are called out in the documentation as non-standard. An event loop implementation may also leave certain methods unimplemented if they cannot be implemented in the given environment; however, such deviations from the standard API should be considered only as a last resort, and only if the platform or environment forces the issue. (An example would be a platform where there is a system event loop that cannot be started or stopped; see “Embedded Event Loops” below.)
The event loop API does not depend on await/yield from
. Rather, it uses
a combination of callbacks, additional interfaces (transports and
protocols), and Futures. The latter are similar to those defined in
PEP 3148, but have a different implementation and are not tied to
threads. In particular, the result()
method raises an exception
instead of blocking when a result is not yet ready; the user is
expected to use callbacks (or await/yield from
) to wait for the result.
All event loop methods specified as returning a coroutine are allowed
to return either a Future or a coroutine, at the implementation’s
choice (the standard implementation always returns coroutines). All
event loop methods documented as accepting coroutine arguments must
accept both Futures and coroutines for such arguments. (A convenience
function, ensure_future()
, exists to convert an argument that is either a
coroutine or a Future into a Future.)
For users (like myself) who don’t like using callbacks, a scheduler is
provided for writing asynchronous I/O code as coroutines using the PEP
380 yield from
or PEP 492 await
expressions.
The scheduler is not pluggable;
pluggability occurs at the event loop level, and the standard
scheduler implementation should work with any conforming event loop
implementation. (In fact this is an important litmus test for
conforming implementations.)
For interoperability between code written using coroutines and other async frameworks, the scheduler defines a Task class that behaves like a Future. A framework that interoperates at the event loop level can wait for a Future to complete by adding a callback to the Future. Likewise, the scheduler offers an operation to suspend a coroutine until a callback is called.
If such a framework cannot use the Future and Task classes as-is, it
may reimplement the loop.create_future()
and
loop.create_task()
methods. These should return objects
implementing (a superset of) the Future/Task interfaces.
A less ambitious framework may just call the
loop.set_task_factory()
to replace the Task class without
implementing its own event loop.
The event loop API provides limited interoperability with threads: there is an API to submit a function to an executor (see PEP 3148) which returns a Future that is compatible with the event loop, and there is a method to schedule a callback with an event loop from another thread in a thread-safe manner.
Transports and Protocols
For those not familiar with Twisted, a quick explanation of the relationship between transports and protocols is in order. At the highest level, the transport is concerned with how bytes are transmitted, while the protocol determines which bytes to transmit (and to some extent when).
A different way of saying the same thing: a transport is an abstraction for a socket (or similar I/O endpoint) while a protocol is an abstraction for an application, from the transport’s point of view.
Yet another view is simply that the transport and protocol interfaces together define an abstract interface for using network I/O and interprocess I/O.
There is almost always a 1:1 relationship between transport and protocol objects: the protocol calls transport methods to send data, while the transport calls protocol methods to pass it data that has been received. Neither transport nor protocol methods “block” – they set events into motion and then return.
The most common type of transport is a bidirectional stream transport.
It represents a pair of buffered streams (one in each direction) that
each transmit a sequence of bytes. The most common example of a
bidirectional stream transport is probably a TCP connection. Another
common example is an SSL/TLS connection. But there are some other things
that can be viewed this way, for example an SSH session or a pair of
UNIX pipes. Typically there aren’t many different transport
implementations, and most of them come with the event loop
implementation. However, there is no requirement that all transports
must be created by calling an event loop method: a third party module
may well implement a new transport and provide a constructor or
factory function for it that simply takes an event loop as an argument
or calls get_event_loop()
.
Note that transports don’t need to use sockets, not even if they use TCP – sockets are a platform-specific implementation detail.
A bidirectional stream transport has two “ends”: one end talks to the network (or another process, or whatever low-level interface it wraps), and the other end talks to the protocol. The former uses whatever API is necessary to implement the transport; but the interface between transport and protocol is standardized by this PEP.
A protocol can represent some kind of “application-level” protocol such as HTTP or SMTP; it can also implement an abstraction shared by multiple protocols, or a whole application. A protocol’s primary interface is with the transport. While some popular protocols (and other abstractions) may have standard implementations, often applications implement custom protocols. It also makes sense to have libraries of useful third party protocol implementations that can be downloaded and installed from PyPI.
There general notion of transport and protocol includes other interfaces, where the transport wraps some other communication abstraction. Examples include interfaces for sending and receiving datagrams (e.g. UDP), or a subprocess manager. The separation of concerns is the same as for bidirectional stream transports and protocols, but the specific interface between transport and protocol is different in each case.
Details of the interfaces defined by the various standard types of transports and protocols are given later.
Event Loop Interface Specification
Event Loop Policy: Getting and Setting the Current Event Loop
Event loop management is controlled by an event loop policy, which is a global (per-process) object. There is a default policy, and an API to change the policy. A policy defines the notion of context; a policy manages a separate event loop per context. The default policy’s notion of context is defined as the current thread.
Certain platforms or programming frameworks may change the default policy to something more suitable to the expectations of the users of that platform or framework. Such platforms or frameworks must document their policy and at what point during their initialization sequence the policy is set, in order to avoid undefined behavior when multiple active frameworks want to override the default policy. (See also “Embedded Event Loops” below.)
To get the event loop for current context, use get_event_loop()
.
This returns an event loop object implementing the interface specified
below, or raises an exception in case no event loop has been set for
the current context and the current policy does not specify to create
one. It should never return None
.
To set the event loop for the current context, use
set_event_loop(event_loop)
, where event_loop
is an event loop
object, i.e. an instance of AbstractEventLoop
, or None
.
It is okay to set the current event loop to None
, in
which case subsequent calls to get_event_loop()
will raise an
exception. This is useful for testing code that should not depend on
the existence of a default event loop.
It is expected that get_event_loop()
returns a different event
loop object depending on the context (in fact, this is the definition
of context). It may create a new event loop object if none is set and
creation is allowed by the policy. The default policy will create a
new event loop only in the main thread (as defined by threading.py,
which uses a special subclass for the main thread), and only if
get_event_loop()
is called before set_event_loop()
is ever
called. (To reset this state, reset the policy.) In other threads an
event loop must be explicitly set. Other policies may behave
differently. Event loop by the default policy creation is lazy;
i.e. the first call to get_event_loop()
creates an event loop
instance if necessary and specified by the current policy.
For the benefit of unit tests and other special cases there’s a third
policy function: new_event_loop()
, which creates and returns a new
event loop object according to the policy’s default rules. To make
this the current event loop, you must call set_event_loop()
with
it.
To change the event loop policy, call
set_event_loop_policy(policy)
, where policy
is an event loop
policy object or None
. If not None
, the policy object must be
an instance of AbstractEventLoopPolicy
that defines methods
get_event_loop()
, set_event_loop(loop)
and
new_event_loop()
, all behaving like the functions described above.
Passing a policy value of None
restores the default event loop
policy (overriding the alternate default set by the platform or
framework). The default event loop policy is an instance of the class
DefaultEventLoopPolicy
. The current event loop policy object can
be retrieved by calling get_event_loop_policy()
.
TBD: describe child watchers and UNIX quirks for subprocess processing.
Passing an Event Loop Around Explicitly
It is possible to write code that uses an event loop without relying
on a global or per-thread default event loop. For this purpose, all
APIs that need access to the current event loop (and aren’t methods on
an event class) take an optional keyword argument named loop
. If
this argument is None
or unspecified, such APIs will call
get_event_loop()
to get the default event loop, but if the
loop
keyword argument is set to an event loop object, they will
use that event loop, and pass it along to any other such APIs they
call. For example, Future(loop=my_loop)
will create a Future tied
to the event loop my_loop
. When the default current event is
None
, the loop
keyword argument is effectively mandatory.
Note that an explicitly passed event loop must still belong to the
current thread; the loop
keyword argument does not magically
change the constraints on how an event loop can be used.
Specifying Times
As usual in Python, all timeouts, intervals and delays are measured in seconds, and may be ints or floats. However, absolute times are not specified as POSIX timestamps. The accuracy, precision and epoch of the clock are up to the implementation.
The default implementation uses time.monotonic()
. Books could be
written about the implications of this choice. Better read the docs
for the standard library time
module.
Embedded Event Loops
On some platforms an event loop is provided by the system. Such a
loop may already be running when the user code starts, and there may
be no way to stop or close it without exiting from the program. In
this case, the methods for starting, stopping and closing the event
loop may not be implementable, and is_running()
may always return
True
.
Event Loop Classes
There is no actual class named EventLoop
. There is an
AbstractEventLoop
class which defines all the methods without
implementations, and serves primarily as documentation. The following
concrete classes are defined:
SelectorEventLoop
is a concrete implementation of the full API based on theselectors
module (new in Python 3.4). The constructor takes one optional argument, aselectors.Selector
object. By default an instance ofselectors.DefaultSelector
is created and used.ProactorEventLoop
is a concrete implementation of the API except for the I/O event handling and signal handling methods. It is only defined on Windows (or on other platforms which support a similar API for “overlapped I/O”). The constructor takes one optional argument, aProactor
object. By default an instance ofIocpProactor
is created and used. (TheIocpProactor
class is not specified by this PEP; it is just an implementation detail of theProactorEventLoop
class.)
Event Loop Methods Overview
The methods of a conforming event loop are grouped into several categories. The first set of categories must be supported by all conforming event loop implementations, with the exception that embedded event loops may not implement the methods for starting, stopping and closing. (However, a partially-conforming event loop is still better than nothing. :-)
- Starting, stopping and closing:
run_forever()
,run_until_complete()
,stop()
,is_running()
,close()
,is_closed()
. - Basic and timed callbacks:
call_soon()
,call_later()
,call_at()
,time()
. - Thread interaction:
call_soon_threadsafe()
,run_in_executor()
,set_default_executor()
. - Internet name lookups:
getaddrinfo()
,getnameinfo()
. - Internet connections:
create_connection()
,create_server()
,create_datagram_endpoint()
. - Wrapped socket methods:
sock_recv()
,sock_sendall()
,sock_connect()
,sock_accept()
. - Tasks and futures support:
create_future()
,create_task()
,set_task_factory()
,get_task_factory()
. - Error handling:
get_exception_handler()
,set_exception_handler()
,default_exception_handler()
,call_exception_handler()
. - Debug mode:
get_debug()
,set_debug()
.
The second set of categories may be supported by conforming event
loop implementations. If not supported, they will raise
NotImplementedError
. (In the default implementation,
SelectorEventLoop
on UNIX systems supports all of these;
SelectorEventLoop
on Windows supports the I/O event handling
category; ProactorEventLoop
on Windows supports the pipes and
subprocess category.)
- I/O callbacks:
add_reader()
,remove_reader()
,add_writer()
,remove_writer()
. - Pipes and subprocesses:
connect_read_pipe()
,connect_write_pipe()
,subprocess_shell()
,subprocess_exec()
. - Signal callbacks:
add_signal_handler()
,remove_signal_handler()
.
Event Loop Methods
Starting, Stopping and Closing
An (unclosed) event loop can be in one of two states: running or stopped. These methods deal with starting and stopping an event loop:
run_forever()
. Runs the event loop untilstop()
is called. This cannot be called when the event loop is already running. (This has a long name in part to avoid confusion with earlier versions of this PEP, whererun()
had different behavior, in part because there are already too many APIs that have a method namedrun()
, and in part because there shouldn’t be many places where this is called anyway.)run_until_complete(future)
. Runs the event loop until the Future is done. If the Future is done, its result is returned, or its exception is raised. This cannot be called when the event loop is already running. The method creates a newTask
object if the parameter is a coroutine.stop()
. Stops the event loop as soon as it is convenient. It is fine to restart the loop withrun_forever()
orrun_until_complete()
subsequently; no scheduled callbacks will be lost if this is done. Note:stop()
returns normally and the current callback is allowed to continue. How soon after this point the event loop stops is up to the implementation, but the intention is to stop short of polling for I/O, and not to run any callbacks scheduled in the future; the major freedom an implementation has is how much of the “ready queue” (callbacks already scheduled withcall_soon()
) it processes before stopping.is_running()
. ReturnsTrue
if the event loop is currently running,False
if it is stopped.close()
. Closes the event loop, releasing any resources it may hold, such as the file descriptor used byepoll()
orkqueue()
, and the default executor. This should not be called while the event loop is running. After it has been called the event loop should not be used again. It may be called multiple times; subsequent calls are no-ops.is_closed()
. ReturnsTrue
if the event loop is closed,False
otherwise. (Primarily intended for error reporting; please don’t implement functionality based on this method.)
Basic Callbacks
Callbacks associated with the same event loop are strictly serialized: one callback must finish before the next one will be called. This is an important guarantee: when two or more callbacks use or modify shared state, each callback is guaranteed that while it is running, the shared state isn’t changed by another callback.
call_soon(callback, *args)
. This schedules a callback to be called as soon as possible. Returns aHandle
(see below) representing the callback, whosecancel()
method can be used to cancel the callback. It guarantees that callbacks are called in the order in which they were scheduled.call_later(delay, callback, *args)
. Arrange forcallback(*args)
to be called approximatelydelay
seconds in the future, once, unless cancelled. Returns aHandle
representing the callback, whosecancel()
method can be used to cancel the callback. Callbacks scheduled in the past or at exactly the same time will be called in an undefined order.call_at(when, callback, *args)
. This is likecall_later()
, but the time is expressed as an absolute time. Returns a similarHandle
. There is a simple equivalency:loop.call_later(delay, callback, *args)
is the same asloop.call_at(loop.time() + delay, callback, *args)
.time()
. Returns the current time according to the event loop’s clock. This may betime.time()
ortime.monotonic()
or some other system-specific clock, but it must return a float expressing the time in units of approximately one second since some epoch. (No clock is perfect – see PEP 418.)
Note: A previous version of this PEP defined a method named
call_repeatedly()
, which promised to call a callback at regular
intervals. This has been withdrawn because the design of such a
function is overspecified. On the one hand, a simple timer loop can
easily be emulated using a callback that reschedules itself using
call_later()
; it is also easy to write coroutine containing a loop
and a sleep()
call (a toplevel function in the module, see below).
On the other hand, due to the complexities of accurate timekeeping
there are many traps and pitfalls here for the unaware (see PEP 418),
and different use cases require different behavior in edge cases. It
is impossible to offer an API for this purpose that is bullet-proof in
all cases, so it is deemed better to let application designers decide
for themselves what kind of timer loop to implement.
Thread interaction
call_soon_threadsafe(callback, *args)
. Likecall_soon(callback, *args)
, but when called from another thread while the event loop is blocked waiting for I/O, unblocks the event loop. Returns aHandle
. This is the only method that is safe to call from another thread. (To schedule a callback for a later time in a threadsafe manner, you can useloop.call_soon_threadsafe(loop.call_later, when, callback, *args)
.) Note: this is not safe to call from a signal handler (since it may use locks). In fact, no API is signal-safe; if you want to handle signals, useadd_signal_handler()
described below.run_in_executor(executor, callback, *args)
. Arrange to callcallback(*args)
in an executor (see PEP 3148). Returns anasyncio.Future
instance whose result on success is the return value of that call. This is equivalent towrap_future(executor.submit(callback, *args))
. Ifexecutor
isNone
, the default executor set byset_default_executor()
is used. If no default executor has been set yet, aThreadPoolExecutor
with a default number of threads is created and set as the default executor. (The default implementation uses 5 threads in this case.)set_default_executor(executor)
. Set the default executor used byrun_in_executor()
. The argument must be a PEP 3148Executor
instance orNone
, in order to reset the default executor.
See also the wrap_future()
function described in the section about
Futures.
Internet name lookups
These methods are useful if you want to connect or bind a socket to an
address without the risk of blocking for the name lookup. They are
usually called implicitly by create_connection()
,
create_server()
or create_datagram_endpoint()
.
getaddrinfo(host, port, family=0, type=0, proto=0, flags=0)
. Similar to thesocket.getaddrinfo()
function but returns a Future. The Future’s result on success will be a list of the same format as returned bysocket.getaddrinfo()
, i.e. a list of(address_family, socket_type, socket_protocol, canonical_name, address)
whereaddress
is a 2-tuple(ipv4_address, port)
for IPv4 addresses and a 4-tuple(ipv4_address, port, flow_info, scope_id)
for IPv6 addresses. If thefamily
argument is zero or unspecified, the list returned may contain a mixture of IPv4 and IPv6 addresses; otherwise the addresses returned are constrained by thefamily
value (similar forproto
andflags
). The default implementation callssocket.getaddrinfo()
usingrun_in_executor()
, but other implementations may choose to implement their own DNS lookup. The optional arguments must be specified as keyword arguments.Note: implementations are allowed to implement a subset of the full socket.getaddrinfo() interface; e.g. they may not support symbolic port names, or they may ignore or incompletely implement the
type
,proto
andflags
arguments. However, iftype
andproto
are ignored, the argument values passed in should be copied unchanged into the return tuples’socket_type
andsocket_protocol
elements. (You can’t ignorefamily
, since IPv4 and IPv6 addresses must be looked up differently. The only permissible values forfamily
aresocket.AF_UNSPEC
(0
),socket.AF_INET
andsocket.AF_INET6
, and the latter only if it is defined by the platform.)getnameinfo(sockaddr, flags=0)
. Similar tosocket.getnameinfo()
but returns a Future. The Future’s result on success will be a tuple(host, port)
. Same implementation remarks as forgetaddrinfo()
.
Internet connections
These are the high-level interfaces for managing internet connections. Their use is recommended over the corresponding lower-level interfaces because they abstract away the differences between selector-based and proactor-based event loops.
Note that the client and server side of stream connections use the same transport and protocol interface. However, datagram endpoints use a different transport and protocol interface.
create_connection(protocol_factory, host, port, <options>)
. Creates a stream connection to a given internet host and port. This is a task that is typically called from the client side of the connection. It creates an implementation-dependent bidirectional stream Transport to represent the connection, then callsprotocol_factory()
to instantiate (or retrieve) the user’s Protocol implementation, and finally ties the two together. (See below for the definitions of Transport and Protocol.) The user’s Protocol implementation is created or retrieved by callingprotocol_factory()
without arguments(*). The coroutine’s result on success is the(transport, protocol)
pair; if a failure prevents the creation of a successful connection, an appropriate exception will be raised. Note that when the coroutine completes, the protocol’sconnection_made()
method has not yet been called; that will happen when the connection handshake is complete.(*) There is no requirement that
protocol_factory
is a class. If your protocol class needs to have specific arguments passed to its constructor, you can uselambda
. You can also pass a triviallambda
that returns a previously constructed Protocol instance.The <options> are all specified using optional keyword arguments:
ssl
: PassTrue
to create an SSL/TLS transport (by default a plain TCP transport is created). Or pass anssl.SSLContext
object to override the default SSL context object to be used. If a default context is created it is up to the implementation to configure reasonable defaults. The reference implementation currently usesPROTOCOL_SSLv23
and sets theOP_NO_SSLv2
option, callsset_default_verify_paths()
and setsverify_mode
toCERT_REQUIRED
. In addition, whenever the context (default or otherwise) specifies averify_mode
ofCERT_REQUIRED
orCERT_OPTIONAL
, if a hostname is given, immediately after a successful handshakessl.match_hostname(peercert, hostname)
is called, and if this raises an exception the connection is closed. (To avoid this behavior, pass in an SSL context that hasverify_mode
set toCERT_NONE
. But this means you are not secure, and vulnerable to for example man-in-the-middle attacks.)family
,proto
,flags
: Address family, protocol and flags to be passed through togetaddrinfo()
. These all default to0
, which means “not specified”. (The socket type is alwaysSOCK_STREAM
.) If any of these values are not specified, thegetaddrinfo()
method will choose appropriate values. Note:proto
has nothing to do with the high-level Protocol concept or theprotocol_factory
argument.sock
: An optional socket to be used instead of using thehost
,port
,family
,proto
andflags
arguments. If this is given,host
andport
must be explicitly set toNone
.local_addr
: If given, a(host, port)
tuple used to bind the socket to locally. This is rarely needed but on multi-homed servers you occasionally need to force a connection to come from a specific address. This is how you would do that. The host and port are looked up usinggetaddrinfo()
.server_hostname
: This is only relevant when using SSL/TLS; it should not be used whenssl
is not set. Whenssl
is set, this sets or overrides the hostname that will be verified. By default the value of thehost
argument is used. Ifhost
is empty, there is no default and you must pass a value forserver_hostname
. To disable hostname verification (which is a serious security risk) you must pass an empty string here and pass anssl.SSLContext
object whoseverify_mode
is set tossl.CERT_NONE
as thessl
argument.
create_server(protocol_factory, host, port, <options>)
. Enters a serving loop that accepts connections. This is a coroutine that completes once the serving loop is set up to serve. The return value is aServer
object which can be used to stop the serving loop in a controlled fashion (see below). Multiple sockets may be bound if the specified address allows both IPv4 and IPv6 connections.Each time a connection is accepted,
protocol_factory
is called without arguments(**) to create a Protocol, a bidirectional stream Transport is created to represent the network side of the connection, and the two are tied together by callingprotocol.connection_made(transport)
.(**) See previous footnote for
create_connection()
. However, sinceprotocol_factory()
is called once for each new incoming connection, it should return a new Protocol object each time it is called.The <options> are all specified using optional keyword arguments:
ssl
: Pass anssl.SSLContext
object (or an object with the same interface) to override the default SSL context object to be used. (Unlike forcreate_connection()
, passingTrue
does not make sense here – theSSLContext
object is needed to specify the certificate and key.)backlog
: Backlog value to be passed to thelisten()
call. The default is implementation-dependent; in the default implementation the default value is100
.reuse_address
: Whether to set theSO_REUSEADDR
option on the socket. The default isTrue
on UNIX,False
on Windows.family
,flags
: Address family and flags to be passed- through to
getaddrinfo()
. The family defaults toAF_UNSPEC
; the flags default toAI_PASSIVE
. (The socket type is alwaysSOCK_STREAM
; the socket protocol always set to0
, to letgetaddrinfo()
choose.)
sock
: An optional socket to be used instead of using thehost
,port
,family
andflags
arguments. If this is given,host
andport
must be explicitly set toNone
.
create_datagram_endpoint(protocol_factory, local_addr=None, remote_addr=None, <options>)
. Creates an endpoint for sending and receiving datagrams (typically UDP packets). Because of the nature of datagram traffic, there are no separate calls to set up client and server side, since usually a single endpoint acts as both client and server. This is a coroutine that returns a(transport, protocol)
pair on success, or raises an exception on failure. If the coroutine returns successfully, the transport will call callbacks on the protocol whenever a datagram is received or the socket is closed; it is up to the protocol to call methods on the protocol to send datagrams. The transport returned is aDatagramTransport
. The protocol returned is aDatagramProtocol
. These are described later.Mandatory positional argument:
protocol_factory
: A class or factory function that will be called exactly once, without arguments, to construct the protocol object to be returned. The interface between datagram transport and protocol is described below.
Optional arguments that may be specified positionally or as keyword arguments:
local_addr
: An optional tuple indicating the address to which the socket will be bound. If given this must be a(host, port)
pair. It will be passed togetaddrinfo()
to be resolved and the result will be passed to thebind()
method of the socket created. Ifgetaddrinfo()
returns more than one address, they will be tried in turn. If omitted, nobind()
call will be made.remote_addr
: An optional tuple indicating the address to which the socket will be “connected”. (Since there is no such thing as a datagram connection, this just specifies a default value for the destination address of outgoing datagrams.) If given this must be a(host, port)
pair. It will be passed togetaddrinfo()
to be resolved and the result will be passed tosock_connect()
together with the socket created. Ifgetaddrinfo()
returns more than one address, they will be tried in turn. If omitted, nosock_connect()
call will be made.
The <options> are all specified using optional keyword arguments:
family
,proto
,flags
: Address family, protocol and flags to be passed through togetaddrinfo()
. These all default to0
, which means “not specified”. (The socket type is alwaysSOCK_DGRAM
.) If any of these values are not specified, thegetaddrinfo()
method will choose appropriate values.
Note that if both
local_addr
andremote_addr
are present, all combinations of local and remote addresses with matching address family will be tried.
Wrapped Socket Methods
The following methods for doing async I/O on sockets are not for
general use. They are primarily meant for transport implementations
working with IOCP through the ProactorEventLoop
class. However,
they are easily implementable for other event loop types, so there is
no reason not to require them. The socket argument has to be a
non-blocking socket.
sock_recv(sock, n)
. Receive up ton
bytes from socketsock
. Returns a Future whose result on success will be a bytes object.sock_sendall(sock, data)
. Send bytesdata
to socketsock
. Returns a Future whose result on success will beNone
. Note: the name usessendall
instead ofsend
, to reflect that the semantics and signature of this method echo those of the standard library socket methodsendall()
rather thansend()
.sock_connect(sock, address)
. Connect to the given address. Returns a Future whose result on success will beNone
.sock_accept(sock)
. Accept a connection from a socket. The socket must be in listening mode and bound to an address. Returns a Future whose result on success will be a tuple(conn, peer)
whereconn
is a connected non-blocking socket andpeer
is the peer address.
I/O Callbacks
These methods are primarily meant for transport implementations
working with a selector. They are implemented by
SelectorEventLoop
but not by ProactorEventLoop
. Custom event
loop implementations may or may not implement them.
The fd
arguments below may be integer file descriptors, or
“file-like” objects with a fileno()
method that wrap integer file
descriptors. Not all file-like objects or file descriptors are
acceptable. Sockets (and socket file descriptors) are always
accepted. On Windows no other types are supported. On UNIX, pipes
and possibly tty devices are also supported, but disk files are not.
Exactly which special file types are supported may vary by platform
and per selector implementation. (Experimentally, there is at least
one kind of pseudo-tty on OS X that is supported by select
and
poll
but not by kqueue
: it is used by Emacs shell windows.)
add_reader(fd, callback, *args)
. Arrange forcallback(*args)
to be called whenever file descriptorfd
is deemed ready for reading. Callingadd_reader()
again for the same file descriptor implies a call toremove_reader()
for the same file descriptor.add_writer(fd, callback, *args)
. Likeadd_reader()
, but registers the callback for writing instead of for reading.remove_reader(fd)
. Cancels the current read callback for file descriptorfd
, if one is set. If no callback is currently set for the file descriptor, this is a no-op and returnsFalse
. Otherwise, it removes the callback arrangement and returnsTrue
.remove_writer(fd)
. This is toadd_writer()
asremove_reader()
is toadd_reader()
.
Pipes and Subprocesses
These methods are supported by SelectorEventLoop
on UNIX and
ProactorEventLoop
on Windows.
The transports and protocols used with pipes and subprocesses differ from those used with regular stream connections. These are described later.
Each of the methods below has a protocol_factory
argument, similar
to create_connection()
; this will be called exactly once, without
arguments, to construct the protocol object to be returned.
Each method is a coroutine that returns a (transport, protocol)
pair on success, or raises an exception on failure.
connect_read_pipe(protocol_factory, pipe)
: Create a unidrectional stream connection from a file-like object wrapping the read end of a UNIX pipe, which must be in non-blocking mode. The transport returned is aReadTransport
.connect_write_pipe(protocol_factory, pipe)
: Create a unidrectional stream connection from a file-like object wrapping the write end of a UNIX pipe, which must be in non-blocking mode. The transport returned is aWriteTransport
; it does not have any read-related methods. The protocol returned is aBaseProtocol
.subprocess_shell(protocol_factory, cmd, <options>)
: Create a subprocess fromcmd
, which is a string using the platform’s “shell” syntax. This is similar to the standard librarysubprocess.Popen()
class called withshell=True
. The remaining arguments and return value are described below.subprocess_exec(protocol_factory, *args, <options>)
: Create a subprocess from one or more string arguments, where the first string specifies the program to execute, and the remaining strings specify the program’s arguments. (Thus, together the string arguments form thesys.argv
value of the program, assuming it is a Python script.) This is similar to the standard librarysubprocess.Popen()
class called withshell=False
and the list of strings passed as the first argument; however, wherePopen()
takes a single argument which is list of strings,subprocess_exec()
takes multiple string arguments. The remaining arguments and return value are described below.
Apart from the way the program to execute is specified, the two
subprocess_*()
methods behave the same. The transport returned is
a SubprocessTransport
which has a different interface than the
common bidirectional stream transport. The protocol returned is a
SubprocessProtocol
which also has a custom interface.
The <options> are all specified using optional keyword arguments:
stdin
: Either a file-like object representing the pipe to be connected to the subprocess’s standard input stream usingconnect_write_pipe()
, or the constantsubprocess.PIPE
(the default). By default a new pipe will be created and connected.stdout
: Either a file-like object representing the pipe to be connected to the subprocess’s standard output stream usingconnect_read_pipe()
, or the constantsubprocess.PIPE
(the default). By default a new pipe will be created and connected.stderr
: Either a file-like object representing the pipe to be connected to the subprocess’s standard error stream usingconnect_read_pipe()
, or one of the constantssubprocess.PIPE
(the default) orsubprocess.STDOUT
. By default a new pipe will be created and connected. Whensubprocess.STDOUT
is specified, the subprocess’s standard error stream will be connected to the same pipe as the standard output stream.bufsize
: The buffer size to be used when creating a pipe; this is passed tosubprocess.Popen()
. In the default implementation this defaults to zero, and on Windows it must be zero; these defaults deviate fromsubprocess.Popen()
.executable
,preexec_fn
,close_fds
,cwd
,env
,startupinfo
,creationflags
,restore_signals
,start_new_session
,pass_fds
: These optional arguments are passed tosubprocess.Popen()
without interpretation.
Signal callbacks
These methods are only supported on UNIX.
add_signal_handler(sig, callback, *args)
. Whenever signalsig
is received, arrange forcallback(*args)
to be called. Specifying another callback for the same signal replaces the previous handler (only one handler can be active per signal). Thesig
must be a valid signal number defined in thesignal
module. If the signal cannot be handled this raises an exception:ValueError
if it is not a valid signal or if it is an uncatchable signal (e.g.SIGKILL
),RuntimeError
if this particular event loop instance cannot handle signals (since signals are global per process, only an event loop associated with the main thread can handle signals).remove_signal_handler(sig)
. Removes the handler for signalsig
, if one is set. Raises the same exceptions asadd_signal_handler()
(except that it may returnFalse
instead raisingRuntimeError
for uncatchable signals). ReturnsTrue
if a handler was removed successfully,False
if no handler was set.
Note: If these methods are statically known to be unsupported, they
may raise NotImplementedError
instead of RuntimeError
.
Mutual Exclusion of Callbacks
An event loop should enforce mutual exclusion of callbacks, i.e. it
should never start a callback while a previously callback is still
running. This should apply across all types of callbacks, regardless
of whether they are scheduled using call_soon()
, call_later()
,
call_at()
, call_soon_threadsafe()
, add_reader()
,
add_writer()
, or add_signal_handler()
.
Exceptions
There are two categories of exceptions in Python: those that derive
from the Exception
class and those that derive from
BaseException
. Exceptions deriving from Exception
will
generally be caught and handled appropriately; for example, they will
be passed through by Futures, and they will be logged and ignored when
they occur in a callback.
However, exceptions deriving only from BaseException
are typically
not caught, and will usually cause the program to terminate with a
traceback. In some cases they are caught and re-raised. (Examples of
this category include KeyboardInterrupt
and SystemExit
; it is
usually unwise to treat these the same as most other exceptions.)
The event loop passes the latter category into its exception handler. This is a callback which accepts a context dict as a parameter:
def exception_handler(context):
...
context may have many different keys but several of them are very widely used:
'message'
: error message.'exception'
: exception instance;None
if there is no exception.'source_traceback'
: a list of strings representing stack at the point the object involved in the error was created.'handle_traceback'
: a list of strings representing the stack at the moment the handle involved in the error was created.
The loop has the following methods related to exception handling:
get_exception_handler()
returns the current exception handler registered for the loop.set_exception_handler(handler)
sets the exception handler.default_exception_handler(context)
the default exception handler for this loop implementation.call_exception_handler(context)
passes context into the registered exception handler. This allows handling uncaught exceptions uniformly by third-party libraries.The loop uses
default_exception_handler()
if the default was not overridden by explicitset_exception_handler()
call.
Debug Mode
By default the loop operates in release mode. Applications may enable debug mode better error reporting at the cost of some performance.
In debug mode many additional checks are enabled, for example:
- Source tracebacks are available for unhandled exceptions in futures/tasks.
- The loop checks for slow callbacks to detect accidental blocking for I/O.
The
loop.slow_callback_duration
attribute controls the maximum execution time allowed between two yield points before a slow callback is reported. The default value is 0.1 seconds; it may be changed by assigning to it.
There are two methods related to debug mode:
get_debug()
returnsTrue
if debug mode is enabled,False
otherwise.set_debug(enabled)
enables debug mode if the argument isTrue
.
Debug mode is automatically enabled if the PYTHONASYNCIODEBUG
environment variable is defined and not empty.
Handles
The various methods for registering one-off callbacks
(call_soon()
, call_later()
, call_at()
and
call_soon_threadsafe()
) all return an object representing the
registration that can be used to cancel the callback. This object is
called a Handle
. Handles are opaque and have only one public
method:
cancel()
: Cancel the callback.
Note that add_reader()
, add_writer()
and
add_signal_handler()
do not return Handles.
Servers
The create_server()
method returns a Server
instance, which
wraps the sockets (or other network objects) used to accept requests.
This class has two public methods:
close()
: Close the service. This stops accepting new requests but does not cancel requests that have already been accepted and are currently being handled.wait_closed()
: A coroutine that blocks until the service is closed and all accepted requests have been handled.
Futures
The asyncio.Future
class here is intentionally similar to the
concurrent.futures.Future
class specified by PEP 3148, but there
are slight differences. Whenever this PEP talks about Futures or
futures this should be understood to refer to asyncio.Future
unless
concurrent.futures.Future
is explicitly mentioned. The supported
public API is as follows, indicating the differences with PEP 3148:
cancel()
. If the Future is already done (or cancelled), do nothing and returnFalse
. Otherwise, this attempts to cancel the Future and returnsTrue
. If the cancellation attempt is successful, eventually the Future’s state will change to cancelled (so thatcancelled()
will returnTrue
) and the callbacks will be scheduled. For regular Futures, cancellation will always succeed immediately; but for Tasks (see below) the task may ignore or delay the cancellation attempt.cancelled()
. ReturnsTrue
if the Future was successfully cancelled.done()
. ReturnsTrue
if the Future is done. Note that a cancelled Future is considered done too (here and everywhere).result()
. Returns the result set withset_result()
, or raises the exception set withset_exception()
. RaisesCancelledError
if cancelled. Difference with PEP 3148: This has no timeout argument and does not wait; if the future is not yet done, it raises an exception.exception()
. Returns the exception if set withset_exception()
, orNone
if a result was set withset_result()
. RaisesCancelledError
if cancelled. Difference with PEP 3148: This has no timeout argument and does not wait; if the future is not yet done, it raises an exception.add_done_callback(fn)
. Add a callback to be run when the Future becomes done (or is cancelled). If the Future is already done (or cancelled), schedules the callback to usingcall_soon()
. Difference with PEP 3148: The callback is never called immediately, and always in the context of the caller – typically this is a thread. You can think of this as calling the callback throughcall_soon()
. Note that in order to match PEP 3148, the callback (unlike all other callbacks defined in this PEP, and ignoring the convention from the section “Callback Style” below) is always called with a single argument, the Future object. (The motivation for strictly serializing callbacks scheduled withcall_soon()
applies here too.)remove_done_callback(fn)
. Remove the argument from the list of callbacks. This method is not defined by PEP 3148. The argument must be equal (using==
) to the argument passed toadd_done_callback()
. Returns the number of times the callback was removed.set_result(result)
. The Future must not be done (nor cancelled) already. This makes the Future done and schedules the callbacks. Difference with PEP 3148: This is a public API.set_exception(exception)
. The Future must not be done (nor cancelled) already. This makes the Future done and schedules the callbacks. Difference with PEP 3148: This is a public API.
The internal method set_running_or_notify_cancel()
is not
supported; there is no way to set the running state. Likewise,
the method running()
is not supported.
The following exceptions are defined:
InvalidStateError
. Raised whenever the Future is not in a state acceptable to the method being called (e.g. callingset_result()
on a Future that is already done, or callingresult()
on a Future that is not yet done).InvalidTimeoutError
. Raised byresult()
andexception()
when a nonzerotimeout
argument is given.CancelledError
. An alias forconcurrent.futures.CancelledError
. Raised whenresult()
orexception()
is called on a Future that is cancelled.TimeoutError
. An alias forconcurrent.futures.TimeoutError
. May be raised byrun_until_complete()
.
A Future is associated with an event loop when it is created.
A asyncio.Future
object is not acceptable to the wait()
and
as_completed()
functions in the concurrent.futures
package.
However, there are similar APIs asyncio.wait()
and
asyncio.as_completed()
, described below.
A asyncio.Future
object is acceptable to a yield from
expression
when used in a coroutine. This is implemented through the
__iter__()
interface on the Future. See the section “Coroutines
and the Scheduler” below.
When a Future is garbage-collected, if it has an associated exception
but neither result()
nor exception()
has ever been called, the
exception is logged. (When a coroutine uses yield from
to wait
for a Future, that Future’s result()
method is called once the
coroutine is resumed.)
In the future (pun intended) we may unify asyncio.Future
and
concurrent.futures.Future
, e.g. by adding an __iter__()
method
to the latter that works with yield from
. To prevent accidentally
blocking the event loop by calling e.g. result()
on a Future
that’s not done yet, the blocking operation may detect that an event
loop is active in the current thread and raise an exception instead.
However the current PEP strives to have no dependencies beyond Python
3.3, so changes to concurrent.futures.Future
are off the table for
now.
There are some public functions related to Futures:
asyncio.async(arg)
. This takes an argument that is either a coroutine object or a Future (i.e., anything you can use withyield from
) and returns a Future. If the argument is a Future, it is returned unchanged; if it is a coroutine object, it wraps it in a Task (remember thatTask
is a subclass ofFuture
).asyncio.wrap_future(future)
. This takes a PEP 3148 Future (i.e., an instance ofconcurrent.futures.Future
) and returns a Future compatible with the event loop (i.e., aasyncio.Future
instance).
Transports
Transports and protocols are strongly influenced by Twisted and PEP 3153. Users rarely implement or instantiate transports – rather, event loops offer utility methods to set up transports.
Transports work in conjunction with protocols. Protocols are typically written without knowing or caring about the exact type of transport used, and transports can be used with a wide variety of protocols. For example, an HTTP client protocol implementation may be used with either a plain socket transport or an SSL/TLS transport. The plain socket transport can be used with many different protocols besides HTTP (e.g. SMTP, IMAP, POP, FTP, IRC, SPDY).
The most common type of transport is a bidirectional stream transport.
There are also unidirectional stream transports (used for pipes) and
datagram transports (used by the create_datagram_endpoint()
method).
Methods For All Transports
get_extra_info(name, default=None)
. This is a catch-all method that returns implementation-specific information about a transport. The first argument is the name of the extra field to be retrieved. The optional second argument is a default value to be returned. Consult the implementation documentation to find out the supported extra field names. For an unsupported name, the default is always returned.
Bidirectional Stream Transports
A bidirectional stream transport is an abstraction on top of a socket or something similar (for example, a pair of UNIX pipes or an SSL/TLS connection).
Most connections have an asymmetric nature: the client and server
usually have very different roles and behaviors. Hence, the interface
between transport and protocol is also asymmetric. From the
protocol’s point of view, writing data is done by calling the
write()
method on the transport object; this buffers the data and
returns immediately. However, the transport takes a more active role
in reading data: whenever some data is read from the socket (or
other data source), the transport calls the protocol’s
data_received()
method.
Nevertheless, the interface between transport and protocol used by bidirectional streams is the same for clients as it is for servers, since the connection between a client and a server is essentially a pair of streams, one in each direction.
Bidirectional stream transports have the following public methods:
write(data)
. Write some bytes. The argument must be a bytes object. ReturnsNone
. The transport is free to buffer the bytes, but it must eventually cause the bytes to be transferred to the entity at the other end, and it must maintain stream behavior. That is,t.write(b'abc'); t.write(b'def')
is equivalent tot.write(b'abcdef')
, as well as to:t.write(b'a') t.write(b'b') t.write(b'c') t.write(b'd') t.write(b'e') t.write(b'f')
writelines(iterable)
. Equivalent to:for data in iterable: self.write(data)
write_eof()
. Close the writing end of the connection. Subsequent calls towrite()
are not allowed. Once all buffered data is transferred, the transport signals to the other end that no more data will be received. Some protocols don’t support this operation; in that case, callingwrite_eof()
will raise an exception. (Note: This used to be calledhalf_close()
, but unless you already know what it is for, that name doesn’t indicate which end is closed.)can_write_eof()
. ReturnTrue
if the protocol supportswrite_eof()
,False
if it does not. (This method typically returns a fixed value that depends only on the specific Transport class, not on the state of the Transport object. It is needed because some protocols need to change their behavior whenwrite_eof()
is unavailable. For example, in HTTP, to send data whose size is not known ahead of time, the end of the data is typically indicated usingwrite_eof()
; however, SSL/TLS does not support this, and an HTTP protocol implementation would have to use the “chunked” transfer encoding in this case. But if the data size is known ahead of time, the best approach in both cases is to use the Content-Length header.)get_write_buffer_size()
. Return the current size of the transport’s write buffer in bytes. This only knows about the write buffer managed explicitly by the transport; buffering in other layers of the network stack or elsewhere of the network is not reported.set_write_buffer_limits(high=None, low=None)
. Set the high- and low-water limits for flow control.These two values control when to call the protocol’s
pause_writing()
andresume_writing()
methods. If specified, the low-water limit must be less than or equal to the high-water limit. Neither value can be negative.The defaults are implementation-specific. If only the high-water limit is given, the low-water limit defaults to an implementation-specific value less than or equal to the high-water limit. Setting high to zero forces low to zero as well, and causes
pause_writing()
to be called whenever the buffer becomes non-empty. Setting low to zero causesresume_writing()
to be called only once the buffer is empty. Use of zero for either limit is generally sub-optimal as it reduces opportunities for doing I/O and computation concurrently.pause_reading()
. Suspend delivery of data to the protocol until a subsequentresume_reading()
call. Betweenpause_reading()
andresume_reading()
, the protocol’sdata_received()
method will not be called.resume_reading()
. Restart delivery of data to the protocol viadata_received()
. Note that “paused” is a binary state –pause_reading()
should only be called when the transport is not paused, whileresume_reading()
should only be called when the transport is paused.close()
. Sever the connection with the entity at the other end. Any data buffered bywrite()
will (eventually) be transferred before the connection is actually closed. The protocol’sdata_received()
method will not be called again. Once all buffered data has been flushed, the protocol’sconnection_lost()
method will be called withNone
as the argument. Note that this method does not wait for all that to happen.abort()
. Immediately sever the connection. Any data still buffered by the transport is thrown away. Soon, the protocol’sconnection_lost()
method will be called withNone
as argument.
Unidirectional Stream Transports
A writing stream transport supports the write()
, writelines()
,
write_eof()
, can_write_eof()
, close()
and abort()
methods
described for bidirectional stream transports.
A reading stream transport supports the pause_reading()
,
resume_reading()
and close()
methods described for
bidirectional stream transports.
A writing stream transport calls only connection_made()
and
connection_lost()
on its associated protocol.
A reading stream transport can call all protocol methods specified in
the Protocols section below (i.e., the previous two plus
data_received()
and eof_received()
).
Datagram Transports
Datagram transports have these methods:
sendto(data, addr=None)
. Sends a datagram (a bytes object). The optional second argument is the destination address. If omitted,remote_addr
must have been specified in thecreate_datagram_endpoint()
call that created this transport. If present, andremote_addr
was specified, they must match. The (data, addr) pair may be sent immediately or buffered. The return value isNone
.abort()
. Immediately close the transport. Buffered data will be discarded.close()
. Close the transport. Buffered data will be transmitted asynchronously.
Datagram transports call the following methods on the associated
protocol object: connection_made()
, connection_lost()
,
error_received()
and datagram_received()
. (“Connection”
in these method names is a slight misnomer, but the concepts still
exist: connection_made()
means the transport representing the
endpoint has been created, and connection_lost()
means the
transport is closed.)
Subprocess Transports
Subprocess transports have the following methods:
get_pid()
. Return the process ID of the subprocess.get_returncode()
. Return the process return code, if the process has exited; otherwiseNone
.get_pipe_transport(fd)
. Return the pipe transport (a unidirectional stream transport) corresponding to the argument, which should be 0, 1 or 2 representing stdin, stdout or stderr (of the subprocess). If there is no such pipe transport, returnNone
. For stdin, this is a writing transport; for stdout and stderr this is a reading transport. You must use this method to get a transport you can use to write to the subprocess’s stdin.send_signal(signal)
. Send a signal to the subprocess.terminate()
. Terminate the subprocess.kill()
. Kill the subprocess. On Windows this is an alias forterminate()
.close()
. This is an alias forterminate()
.
Note that send_signal()
, terminate()
and kill()
wrap the
corresponding methods in the standard library subprocess
module.
Protocols
Protocols are always used in conjunction with transports. While a few common protocols are provided (e.g. decent though not necessarily excellent HTTP client and server implementations), most protocols will be implemented by user code or third-party libraries.
Like for transports, we distinguish between stream protocols, datagram protocols, and perhaps other custom protocols. The most common type of protocol is a bidirectional stream protocol. (There are no unidirectional protocols.)
Stream Protocols
A (bidirectional) stream protocol must implement the following methods, which will be called by the transport. Think of these as callbacks that are always called by the event loop in the right context. (See the “Context” section way above.)
connection_made(transport)
. Indicates that the transport is ready and connected to the entity at the other end. The protocol should probably save the transport reference as an instance variable (so it can call itswrite()
and other methods later), and may write an initial greeting or request at this point.data_received(data)
. The transport has read some bytes from the connection. The argument is always a non-empty bytes object. There are no guarantees about the minimum or maximum size of the data passed along this way.p.data_received(b'abcdef')
should be treated exactly equivalent to:p.data_received(b'abc') p.data_received(b'def')
eof_received()
. This is called when the other end calledwrite_eof()
(or something equivalent). If this returns a false value (includingNone
), the transport will close itself. If it returns a true value, closing the transport is up to the protocol. However, for SSL/TLS connections this is ignored, because the TLS standard requires that no more data is sent and the connection is closed as soon as a “closure alert” is received.The default implementation returns
None
.pause_writing()
. Asks that the protocol temporarily stop writing data to the transport. Heeding the request is optional, but the transport’s buffer may grow without bounds if you keep writing. The buffer size at which this is called can be controlled through the transport’sset_write_buffer_limits()
method.resume_writing()
. Tells the protocol that it is safe to start writing data to the transport again. Note that this may be called directly by the transport’swrite()
method (as opposed to being called indirectly usingcall_soon()
), so that the protocol may be aware of its paused state immediately afterwrite()
returns.connection_lost(exc)
. The transport has been closed or aborted, has detected that the other end has closed the connection cleanly, or has encountered an unexpected error. In the first three cases the argument isNone
; for an unexpected error, the argument is the exception that caused the transport to give up.
Here is a table indicating the order and multiplicity of the basic calls:
connection_made()
– exactly oncedata_received()
– zero or more timeseof_received()
– at most onceconnection_lost()
– exactly once
Calls to pause_writing()
and resume_writing()
occur in pairs
and only between #1 and #4. These pairs will not be nested. The
final resume_writing()
call may be omitted; i.e. a paused
connection may be lost and never be resumed.
Datagram Protocols
Datagram protocols have connection_made()
and
connection_lost()
methods with the same signatures as stream
protocols. (As explained in the section about datagram transports, we
prefer the slightly odd nomenclature over defining different method
names to indicating the opening and closing of the socket.)
In addition, they have the following methods:
datagram_received(data, addr)
. Indicates that a datagramdata
(a bytes objects) was received from remote addressaddr
(an IPv4 2-tuple or an IPv6 4-tuple).error_received(exc)
. Indicates that a send or receive operation raised anOSError
exception. Since datagram errors may be transient, it is up to the protocol to call the transport’sclose()
method if it wants to close the endpoint.
Here is a chart indicating the order and multiplicity of calls:
connection_made()
– exactly oncedatagram_received()
,error_received()
– zero or more timesconnection_lost()
– exactly once
Subprocess Protocol
Subprocess protocols have connection_made()
, connection_lost()
,
pause_writing()
and resume_writing()
methods with the same
signatures as stream protocols. In addition, they have the following
methods:
pipe_data_received(fd, data)
. Called when the subprocess writes data to its stdout or stderr.fd
is the file descriptor (1 for stdout, 2 for stderr).data
is abytes
object.pipe_connection_lost(fd, exc)
. Called when the subprocess closes its stdin, stdout or stderr.fd
is the file descriptor.exc
is an exception orNone
.process_exited()
. Called when the subprocess has exited. To retrieve the exit status, use the transport’sget_returncode()
method.
Note that depending on the behavior of the subprocess it is possible
that process_exited()
is called either before or after
pipe_connection_lost()
. For example, if the subprocess creates a
sub-subprocess that shares its stdin/stdout/stderr and then itself
exits, process_exited()
may be called while all the pipes are
still open. On the other hand, when the subprocess closes its
stdin/stdout/stderr but does not exit, pipe_connection_lost()
may
be called for all three pipes without process_exited()
being
called. If (as is the more common case) the subprocess exits and
thereby implicitly closes all pipes, the calling order is undefined.
Callback Style
Most interfaces taking a callback also take positional arguments. For
instance, to arrange for foo("abc", 42)
to be called soon, you
call loop.call_soon(foo, "abc", 42)
. To schedule the call
foo()
, use loop.call_soon(foo)
. This convention greatly
reduces the number of small lambdas required in typical callback
programming.
This convention specifically does not support keyword arguments. Keyword arguments are used to pass optional extra information about the callback. This allows graceful evolution of the API without having to worry about whether a keyword might be significant to a callee somewhere. If you have a callback that must be called with a keyword argument, you can use a lambda. For example:
loop.call_soon(lambda: foo('abc', repeat=42))
Coroutines and the Scheduler
This is a separate toplevel section because its status is different from the event loop interface. Usage of coroutines is optional, and it is perfectly fine to write code using callbacks only. On the other hand, there is only one implementation of the scheduler/coroutine API, and if you’re using coroutines, that’s the one you’re using.
Coroutines
A coroutine is a generator that follows certain conventions. For
documentation purposes, all coroutines should be decorated with
@asyncio.coroutine
, but this cannot be strictly enforced.
Coroutines use the yield from
syntax introduced in PEP 380,
instead of the original yield
syntax.
The word “coroutine”, like the word “generator”, is used for two different (though related) concepts:
- The function that defines a coroutine (a function definition
decorated with
asyncio.coroutine
). If disambiguation is needed we will call this a coroutine function. - The object obtained by calling a coroutine function. This object represents a computation or an I/O operation (usually a combination) that will complete eventually. If disambiguation is needed we will call it a coroutine object.
Things a coroutine can do:
result = yield from future
– suspends the coroutine until the future is done, then returns the future’s result, or raises an exception, which will be propagated. (If the future is cancelled, it will raise aCancelledError
exception.) Note that tasks are futures, and everything said about futures also applies to tasks.result = yield from coroutine
– wait for another coroutine to produce a result (or raise an exception, which will be propagated). Thecoroutine
expression must be a call to another coroutine.return expression
– produce a result to the coroutine that is waiting for this one usingyield from
.raise exception
– raise an exception in the coroutine that is waiting for this one usingyield from
.
Calling a coroutine does not start its code running – it is just a
generator, and the coroutine object returned by the call is really a
generator object, which doesn’t do anything until you iterate over it.
In the case of a coroutine object, there are two basic ways to start
it running: call yield from coroutine
from another coroutine
(assuming the other coroutine is already running!), or convert it to a
Task (see below).
Coroutines (and tasks) can only run when the event loop is running.
Waiting for Multiple Coroutines
To wait for multiple coroutines or Futures, two APIs similar to the
wait()
and as_completed()
APIs in the concurrent.futures
package are provided:
asyncio.wait(fs, timeout=None, return_when=ALL_COMPLETED)
. This is a coroutine that waits for the Futures or coroutines given byfs
to complete. Coroutine arguments will be wrapped in Tasks (see below). This returns a Future whose result on success is a tuple of two sets of Futures,(done, pending)
, wheredone
is the set of original Futures (or wrapped coroutines) that are done (or cancelled), andpending
is the rest, i.e. those that are still not done (nor cancelled). Note that with the defaults fortimeout
andreturn_when
,done
will always be an empty list. Optional argumentstimeout
andreturn_when
have the same meaning and defaults as forconcurrent.futures.wait()
:timeout
, if notNone
, specifies a timeout for the overall operation;return_when
, specifies when to stop. The constantsFIRST_COMPLETED
,FIRST_EXCEPTION
,ALL_COMPLETED
are defined with the same values and the same meanings as in PEP 3148:ALL_COMPLETED
(default): Wait until all Futures are done (or until the timeout occurs).FIRST_COMPLETED
: Wait until at least one Future is done (or until the timeout occurs).FIRST_EXCEPTION
: Wait until at least one Future is done but not cancelled with an exception set. (The exclusion of cancelled Futures from the condition is surprising, but PEP 3148 does it this way.)
asyncio.as_completed(fs, timeout=None)
. Returns an iterator whose values are Futures or coroutines; waiting for successive values waits until the next Future or coroutine from the setfs
completes, and returns its result (or raises its exception). The optional argumenttimeout
has the same meaning and default as it does forconcurrent.futures.wait()
: when the timeout occurs, the next Future returned by the iterator will raiseTimeoutError
when waited for. Example of use:for f in as_completed(fs): result = yield from f # May raise an exception. # Use result.
Note: if you do not wait for the values produced by the iterator, your
for
loop may not make progress (since you are not allowing other tasks to run).asyncio.wait_for(f, timeout)
. This is a convenience to wait for a single coroutine or Future with a timeout. When a timeout occurs, it cancels the task and raises TimeoutError. To avoid the task cancellation, wrap it inshield()
.asyncio.gather(f1, f2, ...)
. Returns a Future which waits until all arguments (Futures or coroutines) are done and return a list of their corresponding results. If one or more of the arguments is cancelled or raises an exception, the returned Future is cancelled or has its exception set (matching what happened to the first argument), and the remaining arguments are left running in the background. Cancelling the returned Future does not affect the arguments. Note that coroutine arguments are converted to Futures usingasyncio.async()
.asyncio.shield(f)
. Wait for a Future, shielding it from cancellation. This returns a Future whose result or exception is exactly the same as the argument; however, if the returned Future is cancelled, the argument Future is unaffected.A use case for this function would be a coroutine that caches a query result for a coroutine that handles a request in an HTTP server. When the request is cancelled by the client, we could (arguably) want the query-caching coroutine to continue to run, so that when the client reconnects, the query result is (hopefully) cached. This could be written e.g. as follows:
@asyncio.coroutine def handle_request(self, request): ... cached_query = self.get_cache(...) if cached_query is None: cached_query = yield from asyncio.shield(self.fill_cache(...)) ...
Sleeping
The coroutine asyncio.sleep(delay)
returns after a given time delay.
Tasks
A Task is an object that manages an independently running coroutine.
The Task interface is the same as the Future interface, and in fact
Task
is a subclass of Future
. The task becomes done when its
coroutine returns or raises an exception; if it returns a result, that
becomes the task’s result, if it raises an exception, that becomes the
task’s exception.
Cancelling a task that’s not done yet throws an
asyncio.CancelledError
exception into the coroutine. If the
coroutine doesn’t catch this (or if it re-raises it) the task will be
marked as cancelled (i.e., cancelled()
will return True
); but
if the coroutine somehow catches and ignores the exception it may
continue to execute (and cancelled()
will return False
).
Tasks are also useful for interoperating between coroutines and callback-based frameworks like Twisted. After converting a coroutine into a Task, callbacks can be added to the Task.
To convert a coroutine into a task, call the coroutine function and
pass the resulting coroutine object to the loop.create_task()
method. You may also use asyncio.ensure_future()
for this purpose.
You may ask, why not automatically convert all coroutines to Tasks?
The @asyncio.coroutine
decorator could do this. However, this would
slow things down considerably in the case where one coroutine calls
another (and so on), as switching to a “bare” coroutine has much less
overhead than switching to a Task.
The Task
class is derived from Future
adding new methods:
current_task(loop=None)
. A class method returning the currently running task in an event loop. If loop isNone
the method returns the current task for the default loop. Every coroutine is executed inside a task context, either aTask
created usingensure_future()
orloop.create_task()
, or by being called from another coroutine usingyield from
orawait
. This method returnsNone
when called outside a coroutine, e.g. in a callback scheduled usingloop.call_later()
.all_tasks(loop=None)
. A class method returning a set of all active tasks for the loop. This uses the default loop if loop isNone
.
The Scheduler
The scheduler has no public interface. You interact with it by using
yield from future
and yield from task
. In fact, there is no
single object representing the scheduler – its behavior is
implemented by the Task
and Future
classes using only the
public interface of the event loop, so it will work with third-party
event loop implementations, too.
Convenience Utilities
A few functions and classes are provided to simplify the writing of basic stream-based clients and servers, such as FTP or HTTP. These are:
asyncio.open_connection(host, port)
: A wrapper forEventLoop.create_connection()
that does not require you to provide aProtocol
factory or class. This is a coroutine that returns a(reader, writer)
pair, wherereader
is an instance ofStreamReader
andwriter
is an instance ofStreamWriter
(both described below).asyncio.start_server(client_connected_cb, host, port)
: A wrapper forEventLoop.create_server()
that takes a simple callback function rather than aProtocol
factory or class. This is a coroutine that returns aServer
object just ascreate_server()
does. Each time a client connection is accepted,client_connected_cb(reader, writer)
is called, wherereader
is an instance ofStreamReader
andwriter
is an instance ofStreamWriter
(both described below). If the result returned byclient_connected_cb()
is a coroutine, it is automatically wrapped in aTask
.StreamReader
: A class offering an interface not unlike that of a read-only binary stream, except that the various reading methods are coroutines. It is normally driven by aStreamReaderProtocol
instance. Note that there should be only one reader. The interface for the reader is:readline()
: A coroutine that reads a string of bytes representing a line of text ending in'\n'
, or until the end of the stream, whichever comes first.read(n)
: A coroutine that reads up ton
bytes. Ifn
is omitted or negative, it reads until the end of the stream.readexactly(n)
: A coroutine that reads exactlyn
bytes, or until the end of the stream, whichever comes first.exception()
: Return the exception that has been set on the stream usingset_exception()
, or None if no exception is set.
The interface for the driver is:
feed_data(data)
: Appenddata
(abytes
object) to the internal buffer. This unblocks a blocked reading coroutine if it provides sufficient data to fulfill the reader’s contract.feed_eof()
: Signal the end of the buffer. This unblocks a blocked reading coroutine. No more data should be fed to the reader after this call.set_exception(exc)
: Set an exception on the stream. All subsequent reading methods will raise this exception. No more data should be fed to the reader after this call.
StreamWriter
: A class offering an interface not unlike that of a write-only binary stream. It wraps a transport. The interface is an extended subset of the transport interface: the following methods behave the same as the corresponding transport methods:write()
,writelines()
,write_eof()
,can_write_eof()
,get_extra_info()
,close()
. Note that the writing methods are _not_ coroutines (this is the same as for transports, but different from theStreamReader
class). The following method is in addition to the transport interface:drain()
: This should be called withyield from
after writing significant data, for the purpose of flow control. The intended use is like this:writer.write(data) yield from writer.drain()
Note that this is not technically a coroutine: it returns either a Future or an empty tuple (both can be passed to
yield from
). Use of this method is optional. However, when it is not used, the internal buffer of the transport underlying theStreamWriter
may fill up with all data that was ever written to the writer. If an app does not have a strict limit on how much data it writes, it _should_ callyield from drain()
occasionally to avoid filling up the transport buffer.
StreamReaderProtocol
: A protocol implementation used as an adapter between the bidirectional stream transport/protocol interface and theStreamReader
andStreamWriter
classes. It acts as a driver for a specificStreamReader
instance, calling its methodsfeed_data()
,feed_eof()
, andset_exception()
in response to various protocol callbacks. It also controls the behavior of thedrain()
method of theStreamWriter
instance.
Synchronization
Locks, events, conditions and semaphores modeled after those in the
threading
module are implemented and can be accessed by importing
the asyncio.locks
submodule. Queues modeled after those in the
queue
module are implemented and can be accessed by importing the
asyncio.queues
submodule.
In general these have a close correspondence to their threaded
counterparts, however, blocking methods (e.g. acquire()
on locks,
put()
and get()
on queues) are coroutines, and timeout
parameters are not provided (you can use asyncio.wait_for()
to add
a timeout to a blocking call, however).
The docstrings in the modules provide more complete documentation.
Locks
The following classes are provided by asyncio.locks
. For all
these except Event
, the with
statement may be used in
combination with yield from
to acquire the lock and ensure that
the lock is released regardless of how the with
block is left, as
follows:
with (yield from my_lock):
...
Lock
: a basic mutex, with methodsacquire()
(a coroutine),locked()
, andrelease()
.Event
: an event variable, with methodswait()
(a coroutine),set()
,clear()
, andis_set()
.Condition
: a condition variable, with methodsacquire()
,wait()
,wait_for(predicate)
(all three coroutines),locked()
,release()
,notify()
, andnotify_all()
.Semaphore
: a semaphore, with methodsacquire()
(a coroutine),locked()
, andrelease()
. The constructor argument is the initial value (default1
).BoundedSemaphore
: a bounded semaphore; this is similar toSemaphore
but the initial value is also the maximum value.
Queues
The following classes and exceptions are provided by asyncio.queues
.
Queue
: a standard queue, with methodsget()
,put()
(both coroutines),get_nowait()
,put_nowait()
,empty()
,full()
,qsize()
, andmaxsize()
.PriorityQueue
: a subclass ofQueue
that retrieves entries in priority order (lowest first).LifoQueue
: a subclass ofQueue
that retrieves the most recently added entries first.JoinableQueue
: a subclass ofQueue
withtask_done()
andjoin()
methods (the latter a coroutine).Empty
,Full
: exceptions raised whenget_nowait()
orput_nowait()
is called on a queue that is empty or full, respectively.
Miscellaneous
Logging
All logging performed by the asyncio
package uses a single
logging.Logger
object, asyncio.logger
. To customize logging
you can use the standard Logger
API on this object. (Do not
replace the object though.)
SIGCHLD
handling on UNIX
Efficient implementation of the process_exited()
method on
subprocess protocols requires a SIGCHLD
signal handler. However,
signal handlers can only be set on the event loop associated with the
main thread. In order to support spawning subprocesses from event
loops running in other threads, a mechanism exists to allow sharing a
SIGCHLD
handler between multiple event loops. There are two
additional functions, asyncio.get_child_watcher()
and
asyncio.set_child_watcher()
, and corresponding methods on the
event loop policy.
There are two child watcher implementation classes,
FastChildWatcher
and SafeChildWatcher
. Both use SIGCHLD
.
The SafeChildWatcher
class is used by default; it is inefficient
when many subprocesses exist simultaneously. The FastChildWatcher
class is efficient, but it may interfere with other code (either C
code or Python code) that spawns subprocesses without using an
asyncio
event loop. If you are sure you are not using other code
that spawns subprocesses, to use the fast implementation, run the
following in your main thread:
watcher = asyncio.FastChildWatcher()
asyncio.set_child_watcher(watcher)
Wish List
(There is agreement that these features are desirable, but no implementation was available when Python 3.4 beta 1 was released, and the feature freeze for the rest of the Python 3.4 release cycle prohibits adding them in this late stage. However, they will hopefully be added in Python 3.5, and perhaps earlier in the PyPI distribution.)
- Support a “start TLS” operation to upgrade a TCP socket to SSL/TLS.
Former wish list items that have since been implemented (but aren’t specified by the PEP):
- UNIX domain sockets.
- A per-loop error handling callback.
Open Issues
(Note that these have been resolved de facto in favor of the status quo by the acceptance of the PEP. However, the PEP’s provisional status allows revising these decisions for Python 3.5.)
- Why do
create_connection()
andcreate_datagram_endpoint()
have aproto
argument but notcreate_server()
? And why are the family, flag, proto arguments forgetaddrinfo()
sometimes zero and sometimes named constants (whose value is also zero)? - Do we need another inquiry method to tell whether the loop is in the process of stopping?
- A fuller public API for Handle? What’s the use case?
- A debugging API? E.g. something that logs a lot of stuff, or logs unusual conditions (like queues filling up faster than they drain) or even callbacks taking too much time…
- Do we need introspection APIs? E.g. asking for the read callback given a file descriptor. Or when the next scheduled call is. Or the list of file descriptors registered with callbacks. Right now these all require using internals.
- Do we need more socket I/O methods, e.g.
sock_sendto()
andsock_recvfrom()
, and perhaps others likepipe_read()
? I guess users can write their own (it’s not rocket science). - We may need APIs to control various timeouts. E.g. we may want to
limit the time spent in DNS resolution, connecting, ssl/tls handshake,
idle connection, close/shutdown, even per session. Possibly it’s
sufficient to add
timeout
keyword arguments to some methods, and other timeouts can probably be implemented by clever use ofcall_later()
andTask.cancel()
. But it’s possible that some operations need default timeouts, and we may want to change the default for a specific operation globally (i.e., per event loop).
References
- PEP 492 describes the semantics of
async/await
. - PEP 380 describes the semantics of
yield from
. - Greg Ewing’s
yield from
tutorials: http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/yield_from.html - PEP 3148 describes
concurrent.futures.Future
. - PEP 3153, while rejected, has a good write-up explaining the need to separate transports and protocols.
- PEP 418 discusses the issues of timekeeping.
- Tulip repo: http://code.google.com/p/tulip/
- PyPI: the Python Package Index at http://pypi.python.org/
- Nick Coghlan wrote a nice blog post with some background, thoughts
about different approaches to async I/O, gevent, and how to use
futures with constructs like
while
,for
andwith
: http://python-notes.boredomandlaziness.org/en/latest/pep_ideas/async_programming.html - TBD: references to the relevant parts of Twisted, Tornado, ZeroMQ, pyftpdlib, libevent, libev, pyev, libuv, wattle, and so on.
Acknowledgments
Apart from PEP 3153, influences include PEP 380 and Greg Ewing’s
tutorial for yield from
, Twisted, Tornado, ZeroMQ, pyftpdlib, and
wattle (Steve Dower’s counter-proposal). My previous work on
asynchronous support in the NDB library for Google App Engine provided
an important starting point.
I am grateful for the numerous discussions on python-ideas from September through December 2012, and many more on python-tulip since then; a Skype session with Steve Dower and Dino Viehland; email exchanges with and a visit by Ben Darnell; an audience with Niels Provos (original author of libevent); and in-person meetings (as well as frequent email exchanges) with several Twisted developers, including Glyph, Brian Warner, David Reid, and Duncan McGreggor.
Contributors to the implementation include Eli Bendersky, Gustavo Carneiro (Gambit Research), Saúl Ibarra Corretgé, Geert Jansen, A. Jesse Jiryu Davis, Nikolay Kim, Charles-François Natali, Richard Oudkerk, Antoine Pitrou, Giampaolo Rodolá, Andrew Svetlov, and many others who submitted bugs and/or fixes.
I thank Antoine Pitrou for his feedback in his role of official PEP BDFL.
Copyright
This document has been placed in the public domain.
Source: https://github.com/python-discord/peps/blob/main/pep-3156.txt
Last modified: 2022-02-27 22:46:36 GMT