PEP 391 – Dictionary-Based Configuration For Logging
- PEP
- 391
- Title
- Dictionary-Based Configuration For Logging
- Author
- Vinay Sajip <vinay_sajip at red-dove.com>
- Status
- Final
- Type
- Standards Track
- Created
- 15-Oct-2009
- Python-Version
- 2.7, 3.2
- Post-History
Abstract
This PEP describes a new way of configuring logging using a dictionary to hold configuration information.
Rationale
The present means for configuring Python’s logging package is either by using the logging API to configure logging programmatically, or else by means of ConfigParser-based configuration files.
Programmatic configuration, while offering maximal control, fixes the configuration in Python code. This does not facilitate changing it easily at runtime, and, as a result, the ability to flexibly turn the verbosity of logging up and down for different parts of a using application is lost. This limits the usability of logging as an aid to diagnosing problems - and sometimes, logging is the only diagnostic aid available in production environments.
The ConfigParser-based configuration system is usable, but does not allow its users to configure all aspects of the logging package. For example, Filters cannot be configured using this system. Furthermore, the ConfigParser format appears to engender dislike (sometimes strong dislike) in some quarters. Though it was chosen because it was the only configuration format supported in the Python standard at that time, many people regard it (or perhaps just the particular schema chosen for logging’s configuration) as ‘crufty’ or ‘ugly’, in some cases apparently on purely aesthetic grounds.
Recent versions of Python include JSON support in the standard library, and this is also usable as a configuration format. In other environments, such as Google App Engine, YAML is used to configure applications, and usually the configuration of logging would be considered an integral part of the application configuration. Although the standard library does not contain YAML support at present, support for both JSON and YAML can be provided in a common way because both of these serialization formats allow deserialization to Python dictionaries.
By providing a way to configure logging by passing the configuration in a dictionary, logging will be easier to configure not only for users of JSON and/or YAML, but also for users of custom configuration methods, by providing a common format in which to describe the desired configuration.
Another drawback of the current ConfigParser-based configuration system is that it does not support incremental configuration: a new configuration completely replaces the existing configuration. Although full flexibility for incremental configuration is difficult to provide in a multi-threaded environment, the new configuration mechanism will allow the provision of limited support for incremental configuration.
Specification
The specification consists of two parts: the API and the format of the dictionary used to convey configuration information (i.e. the schema to which it must conform).
Naming
Historically, the logging package has not been PEP 8 conformant. At some future time, this will be corrected by changing method and function names in the package in order to conform with PEP 8. However, in the interests of uniformity, the proposed additions to the API use a naming scheme which is consistent with the present scheme used by logging.
API
The logging.config module will have the following addition:
- A function, called
dictConfig()
, which takes a single argument - the dictionary holding the configuration. Exceptions will be raised if there are errors while processing the dictionary.
It will be possible to customize this API - see the section on API Customization. Incremental configuration is covered in its own section.
Dictionary Schema - Overview
Before describing the schema in detail, it is worth saying a few words about object connections, support for user-defined objects and access to external and internal objects.
Object connections
The schema is intended to describe a set of logging objects - loggers, handlers, formatters, filters - which are connected to each other in an object graph. Thus, the schema needs to represent connections between the objects. For example, say that, once configured, a particular logger has attached to it a particular handler. For the purposes of this discussion, we can say that the logger represents the source, and the handler the destination, of a connection between the two. Of course in the configured objects this is represented by the logger holding a reference to the handler. In the configuration dict, this is done by giving each destination object an id which identifies it unambiguously, and then using the id in the source object’s configuration to indicate that a connection exists between the source and the destination object with that id.
So, for example, consider the following YAML snippet:
formatters:
brief:
# configuration for formatter with id 'brief' goes here
precise:
# configuration for formatter with id 'precise' goes here
handlers:
h1: #This is an id
# configuration of handler with id 'h1' goes here
formatter: brief
h2: #This is another id
# configuration of handler with id 'h2' goes here
formatter: precise
loggers:
foo.bar.baz:
# other configuration for logger 'foo.bar.baz'
handlers: [h1, h2]
(Note: YAML will be used in this document as it is a little more readable than the equivalent Python source form for the dictionary.)
The ids for loggers are the logger names which would be used
programmatically to obtain a reference to those loggers, e.g.
foo.bar.baz
. The ids for Formatters and Filters can be any string
value (such as brief
, precise
above) and they are transient,
in that they are only meaningful for processing the configuration
dictionary and used to determine connections between objects, and are
not persisted anywhere when the configuration call is complete.
Handler ids are treated specially, see the section on Handler Ids, below.
The above snippet indicates that logger named foo.bar.baz
should
have two handlers attached to it, which are described by the handler
ids h1
and h2
. The formatter for h1
is that described by id
brief
, and the formatter for h2
is that described by id
precise
.
User-defined objects
The schema should support user-defined objects for handlers, filters and formatters. (Loggers do not need to have different types for different instances, so there is no support - in the configuration - for user-defined logger classes.)
Objects to be configured will typically be described by dictionaries
which detail their configuration. In some places, the logging system
will be able to infer from the context how an object is to be
instantiated, but when a user-defined object is to be instantiated,
the system will not know how to do this. In order to provide complete
flexibility for user-defined object instantiation, the user will need
to provide a ‘factory’ - a callable which is called with a
configuration dictionary and which returns the instantiated object.
This will be signalled by an absolute import path to the factory being
made available under the special key '()'
. Here’s a concrete
example:
formatters:
brief:
format: '%(message)s'
default:
format: '%(asctime)s %(levelname)-8s %(name)-15s %(message)s'
datefmt: '%Y-%m-%d %H:%M:%S'
custom:
(): my.package.customFormatterFactory
bar: baz
spam: 99.9
answer: 42
The above YAML snippet defines three formatters. The first, with id
brief
, is a standard logging.Formatter
instance with the
specified format string. The second, with id default
, has a
longer format and also defines the time format explicitly, and will
result in a logging.Formatter
initialized with those two format
strings. Shown in Python source form, the brief
and default
formatters have configuration sub-dictionaries:
{
'format' : '%(message)s'
}
and:
{
'format' : '%(asctime)s %(levelname)-8s %(name)-15s %(message)s',
'datefmt' : '%Y-%m-%d %H:%M:%S'
}
respectively, and as these dictionaries do not contain the special key
'()'
, the instantiation is inferred from the context: as a result,
standard logging.Formatter
instances are created. The
configuration sub-dictionary for the third formatter, with id
custom
, is:
{
'()' : 'my.package.customFormatterFactory',
'bar' : 'baz',
'spam' : 99.9,
'answer' : 42
}
and this contains the special key '()'
, which means that
user-defined instantiation is wanted. In this case, the specified
factory callable will be used. If it is an actual callable it will be
used directly - otherwise, if you specify a string (as in the example)
the actual callable will be located using normal import mechanisms.
The callable will be called with the remaining items in the
configuration sub-dictionary as keyword arguments. In the above
example, the formatter with id custom
will be assumed to be
returned by the call:
my.package.customFormatterFactory(bar='baz', spam=99.9, answer=42)
The key '()'
has been used as the special key because it is not a
valid keyword parameter name, and so will not clash with the names of
the keyword arguments used in the call. The '()'
also serves as a
mnemonic that the corresponding value is a callable.
Access to external objects
There are times where a configuration will need to refer to objects
external to the configuration, for example sys.stderr
. If the
configuration dict is constructed using Python code then this is
straightforward, but a problem arises when the configuration is
provided via a text file (e.g. JSON, YAML). In a text file, there is
no standard way to distinguish sys.stderr
from the literal string
'sys.stderr'
. To facilitate this distinction, the configuration
system will look for certain special prefixes in string values and
treat them specially. For example, if the literal string
'ext://sys.stderr'
is provided as a value in the configuration,
then the ext://
will be stripped off and the remainder of the
value processed using normal import mechanisms.
The handling of such prefixes will be done in a way analogous to
protocol handling: there will be a generic mechanism to look for
prefixes which match the regular expression
^(?P<prefix>[a-z]+)://(?P<suffix>.*)$
whereby, if the prefix
is recognised, the suffix
is processed in a prefix-dependent
manner and the result of the processing replaces the string value. If
the prefix is not recognised, then the string value will be left
as-is.
The implementation will provide for a set of standard prefixes such as
ext://
but it will be possible to disable the mechanism completely
or provide additional or different prefixes for special handling.
Access to internal objects
As well as external objects, there is sometimes also a need to refer
to objects in the configuration. This will be done implicitly by the
configuration system for things that it knows about. For example, the
string value 'DEBUG'
for a level
in a logger or handler will
automatically be converted to the value logging.DEBUG
, and the
handlers
, filters
and formatter
entries will take an
object id and resolve to the appropriate destination object.
However, a more generic mechanism needs to be provided for the case
of user-defined objects which are not known to logging. For example,
take the instance of logging.handlers.MemoryHandler
, which takes
a target
which is another handler to delegate to. Since the system
already knows about this class, then in the configuration, the given
target
just needs to be the object id of the relevant target
handler, and the system will resolve to the handler from the id. If,
however, a user defines a my.package.MyHandler
which has a
alternate
handler, the configuration system would not know that
the alternate
referred to a handler. To cater for this, a
generic resolution system will be provided which allows the user to
specify:
handlers:
file:
# configuration of file handler goes here
custom:
(): my.package.MyHandler
alternate: cfg://handlers.file
The literal string 'cfg://handlers.file'
will be resolved in an
analogous way to the strings with the ext://
prefix, but looking
in the configuration itself rather than the import namespace. The
mechanism will allow access by dot or by index, in a similar way to
that provided by str.format
. Thus, given the following snippet:
handlers:
email:
class: logging.handlers.SMTPHandler
mailhost: localhost
fromaddr: my_app@domain.tld
toaddrs:
- support_team@domain.tld
- dev_team@domain.tld
subject: Houston, we have a problem.
in the configuration, the string 'cfg://handlers'
would resolve to
the dict with key handlers
, the string 'cfg://handlers.email
would resolve to the dict with key email
in the handlers
dict,
and so on. The string 'cfg://handlers.email.toaddrs[1]
would
resolve to 'dev_team.domain.tld'
and the string
'cfg://handlers.email.toaddrs[0]'
would resolve to the value
'support_team@domain.tld'
. The subject
value could be accessed
using either 'cfg://handlers.email.subject'
or, equivalently,
'cfg://handlers.email[subject]'
. The latter form only needs to be
used if the key contains spaces or non-alphanumeric characters. If an
index value consists only of decimal digits, access will be attempted
using the corresponding integer value, falling back to the string
value if needed.
Given a string cfg://handlers.myhandler.mykey.123
, this will
resolve to config_dict['handlers']['myhandler']['mykey']['123']
.
If the string is specified as cfg://handlers.myhandler.mykey[123]
,
the system will attempt to retrieve the value from
config_dict['handlers']['myhandler']['mykey'][123]
, and fall back
to config_dict['handlers']['myhandler']['mykey']['123']
if that
fails.
Handler Ids
Some specific logging configurations require the use of handler levels to achieve the desired effect. However, unlike loggers which can always be identified by their names, handlers have no persistent handles whereby levels can be changed via an incremental configuration call.
Therefore, this PEP proposes to add an optional name
property to
handlers. If used, this will add an entry in a dictionary which maps
the name to the handler. (The entry will be removed when the handler
is closed.) When an incremental configuration call is made, handlers
will be looked up in this dictionary to set the handler level
according to the value in the configuration. See the section on
incremental configuration for more details.
In theory, such a “persistent name” facility could also be provided
for Filters and Formatters. However, there is not a strong case to be
made for being able to configure these incrementally. On the basis
that practicality beats purity, only Handlers will be given this new
name
property. The id of a handler in the configuration will
become its name
.
The handler name lookup dictionary is for configuration use only and will not become part of the public API for the package.
Dictionary Schema - Detail
The dictionary passed to dictConfig()
must contain the following
keys:
- version - to be set to an integer value representing the schema version. The only valid value at present is 1, but having this key allows the schema to evolve while still preserving backwards compatibility.
All other keys are optional, but if present they will be interpreted
as described below. In all cases below where a ‘configuring dict’ is
mentioned, it will be checked for the special '()'
key to see if a
custom instantiation is required. If so, the mechanism described
above is used to instantiate; otherwise, the context is used to
determine how to instantiate.
- formatters - the corresponding value will be a dict in which each
key is a formatter id and each value is a dict describing how to
configure the corresponding Formatter instance.
The configuring dict is searched for keys
format
anddatefmt
(with defaults ofNone
) and these are used to construct alogging.Formatter
instance. - filters - the corresponding value will be a dict in which each key
is a filter id and each value is a dict describing how to configure
the corresponding Filter instance.
The configuring dict is searched for key
name
(defaulting to the empty string) and this is used to construct alogging.Filter
instance. - handlers - the corresponding value will be a dict in which each
key is a handler id and each value is a dict describing how to
configure the corresponding Handler instance.
The configuring dict is searched for the following keys:
class
(mandatory). This is the fully qualified name of the handler class.level
(optional). The level of the handler.formatter
(optional). The id of the formatter for this handler.filters
(optional). A list of ids of the filters for this handler.
All other keys are passed through as keyword arguments to the handler’s constructor. For example, given the snippet:
handlers: console: class : logging.StreamHandler formatter: brief level : INFO filters: [allow_foo] stream : ext://sys.stdout file: class : logging.handlers.RotatingFileHandler formatter: precise filename: logconfig.log maxBytes: 1024 backupCount: 3
the handler with id
console
is instantiated as alogging.StreamHandler
, usingsys.stdout
as the underlying stream. The handler with idfile
is instantiated as alogging.handlers.RotatingFileHandler
with the keyword argumentsfilename='logconfig.log', maxBytes=1024, backupCount=3
. - loggers - the corresponding value will be a dict in which each key
is a logger name and each value is a dict describing how to
configure the corresponding Logger instance.
The configuring dict is searched for the following keys:
level
(optional). The level of the logger.propagate
(optional). The propagation setting of the logger.filters
(optional). A list of ids of the filters for this logger.handlers
(optional). A list of ids of the handlers for this logger.
The specified loggers will be configured according to the level, propagation, filters and handlers specified.
- root - this will be the configuration for the root logger.
Processing of the configuration will be as for any logger, except
that the
propagate
setting will not be applicable. - incremental - whether the configuration is to be interpreted as
incremental to the existing configuration. This value defaults to
False
, which means that the specified configuration replaces the existing configuration with the same semantics as used by the existingfileConfig()
API.If the specified value is
True
, the configuration is processed as described in the section on Incremental Configuration, below. - disable_existing_loggers - whether any existing loggers are to be
disabled. This setting mirrors the parameter of the same name in
fileConfig()
. If absent, this parameter defaults toTrue
. This value is ignored if incremental isTrue
.
A Working Example
The following is an actual working configuration in YAML format (except that the email addresses are bogus):
formatters:
brief:
format: '%(levelname)-8s: %(name)-15s: %(message)s'
precise:
format: '%(asctime)s %(name)-15s %(levelname)-8s %(message)s'
filters:
allow_foo:
name: foo
handlers:
console:
class : logging.StreamHandler
formatter: brief
level : INFO
stream : ext://sys.stdout
filters: [allow_foo]
file:
class : logging.handlers.RotatingFileHandler
formatter: precise
filename: logconfig.log
maxBytes: 1024
backupCount: 3
debugfile:
class : logging.FileHandler
formatter: precise
filename: logconfig-detail.log
mode: a
email:
class: logging.handlers.SMTPHandler
mailhost: localhost
fromaddr: [email protected]
toaddrs:
- [email protected]
- [email protected]
subject: Houston, we have a problem.
loggers:
foo:
level : ERROR
handlers: [debugfile]
spam:
level : CRITICAL
handlers: [debugfile]
propagate: no
bar.baz:
level: WARNING
root:
level : DEBUG
handlers : [console, file]
Incremental Configuration
It is difficult to provide complete flexibility for incremental configuration. For example, because objects such as filters and formatters are anonymous, once a configuration is set up, it is not possible to refer to such anonymous objects when augmenting a configuration.
Furthermore, there is not a compelling case for arbitrarily altering the object graph of loggers, handlers, filters, formatters at run-time, once a configuration is set up; the verbosity of loggers and handlers can be controlled just by setting levels (and, in the case of loggers, propagation flags). Changing the object graph arbitrarily in a safe way is problematic in a multi-threaded environment; while not impossible, the benefits are not worth the complexity it adds to the implementation.
Thus, when the incremental
key of a configuration dict is present
and is True
, the system will ignore any formatters
and
filters
entries completely, and process only the level
settings in the handlers
entries, and the level
and
propagate
settings in the loggers
and root
entries.
It’s certainly possible to provide incremental configuration by other
means, for example making dictConfig()
take an incremental
keyword argument which defaults to False
. The reason for
suggesting that a value in the configuration dict be used is that it
allows for configurations to be sent over the wire as pickled dicts
to a socket listener. Thus, the logging verbosity of a long-running
application can be altered over time with no need to stop and
restart the application.
Note: Feedback on incremental configuration needs based on your practical experience will be particularly welcome.
API Customization
The bare-bones dictConfig()
API will not be sufficient for all
use cases. Provision for customization of the API will be made by
providing the following:
- A class, called
DictConfigurator
, whose constructor is passed the dictionary used for configuration, and which has aconfigure()
method. - A callable, called
dictConfigClass
, which will (by default) be set toDictConfigurator
. This is provided so that if desired,DictConfigurator
can be replaced with a suitable user-defined implementation.
The dictConfig()
function will call dictConfigClass
passing
the specified dictionary, and then call the configure()
method on
the returned object to actually put the configuration into effect:
def dictConfig(config):
dictConfigClass(config).configure()
This should cater to all customization needs. For example, a subclass
of DictConfigurator
could call DictConfigurator.__init__()
in
its own __init__()
, then set up custom prefixes which would be
usable in the subsequent configure() call
. The dictConfigClass
would be bound to the subclass, and then dictConfig()
could be
called exactly as in the default, uncustomized state.
Change to Socket Listener Implementation
The existing socket listener implementation will be modified as
follows: when a configuration message is received, an attempt will be
made to deserialize to a dictionary using the json module. If this
step fails, the message will be assumed to be in the fileConfig format
and processed as before. If deserialization is successful, then
dictConfig()
will be called to process the resulting dictionary.
Configuration Errors
If an error is encountered during configuration, the system will raise
a ValueError
, TypeError
, AttributeError
or ImportError
with a suitably descriptive message. The following is a (possibly
incomplete) list of conditions which will raise an error:
- A
level
which is not a string or which is a string not corresponding to an actual logging level - A
propagate
value which is not a boolean - An id which does not have a corresponding destination
- A non-existent handler id found during an incremental call
- An invalid logger name
- Inability to resolve to an internal or external object
Discussion in the community
The PEP has been announced on python-dev and python-list. While there hasn’t been a huge amount of discussion, this is perhaps to be expected for a niche topic.
Discussion threads on python-dev:
https://mail.python.org/pipermail/python-dev/2009-October/092695.html https://mail.python.org/pipermail/python-dev/2009-October/092782.html https://mail.python.org/pipermail/python-dev/2009-October/093062.html
And on python-list:
https://mail.python.org/pipermail/python-list/2009-October/1223658.html https://mail.python.org/pipermail/python-list/2009-October/1224228.html
There have been some comments in favour of the proposal, no objections to the proposal as a whole, and some questions and objections about specific details. These are believed by the author to have been addressed by making changes to the PEP.
Reference implementation
A reference implementation of the changes is available as a module dictconfig.py with accompanying unit tests in test_dictconfig.py, at:
http://bitbucket.org/vinay.sajip/dictconfig
This incorporates all features other than the socket listener change.
Copyright
This document has been placed in the public domain.
Source: https://github.com/python-discord/peps/blob/main/pep-0391.txt
Last modified: 2022-01-21 11:03:51 GMT