Python 3

PEP 3333 aims to resolve issues with Python 3 and WSGI.

This page is intended to collect ideas and proposals about WSGI amendments for Python 3.

See also Amendments to WSGI 1.0

Latest discussions

Proposals

There has been a lot of discussion, in various places, about which type of data (bytes or unicode) each part of the specification should use.

The current competing proposals are:

mod_wsgi
    [Ochtman2010]
all unicode
    [Ronacher2009]
web3
    [McDonough2010]
flat
    optimized for ease of validation and low cognitive overhead (inputs are native except for the byte stream, all outputs are bytes)

The table below summarizes the bytes/unicode differences between these proposals.

                                      WSGI 1.0  mod_wsgi           all unicode         web3       flat
environ keys                          bytes     native             native              native     native
CGI values                            bytes     native             unicode             bytes      native (PEP 383)
SCRIPT_NAME, PATH_INFO, QUERY_STRING  bytes     native             unicode (utf-8)     bytes      native (PEP 383)
wsgi.url_scheme                       bytes     native             unicode             bytes      native
wsgi.input                            bytes     bytes              bytes               bytes      bytes
status line                           bytes     bytes (or native)  unicode (or bytes)  bytes      bytes
headers                               bytes     bytes (or native)  unicode or bytes    bytes      bytes
response iterable                     bytes     bytes (or native)  bytes               bytes      bytes
write() callback                      bytes     bytes (or native)  (deprecated)        (removed)  (removed)
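
As an illustration of the flat column only, here is a minimal sketch of an application that reads native strings from the environment and emits bytes for the status line, headers, and body. The names app and call_app are hypothetical, and call_app is a toy harness that checks output types, not a real server or part of any proposal.

```python
def app(environ, start_response):
    # Input: in the "flat" column, CGI values such as PATH_INFO are native strings.
    path = environ["PATH_INFO"]
    body = ("Hello from " + path).encode("utf-8")
    # Output: the status line, header names/values and body are all bytes.
    start_response(b"200 OK", [
        (b"Content-Type", b"text/plain; charset=utf-8"),
        (b"Content-Length", str(len(body)).encode("ascii")),
    ])
    return [body]


def call_app(application, environ):
    """Hypothetical harness: runs the app once and checks the output types."""
    captured = {}

    def start_response(status, headers):
        assert isinstance(status, bytes)
        for name, value in headers:
            assert isinstance(name, bytes) and isinstance(value, bytes)
        captured["status"] = status
        captured["headers"] = headers

    chunks = list(application(environ, start_response))
    assert all(isinstance(chunk, bytes) for chunk in chunks)
    return captured["status"], captured["headers"], b"".join(chunks)


status, headers, body = call_app(app, {"PATH_INFO": "/demo"})
```

Under the other columns the same application would differ only in the types it passes to start_response and reads from the environment.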

Notes:

  • a native string is the primary string type for a particular Python implementation:
    • for Python 2.x this is a byte string,
    • for Python 3.x this is a Unicode string
  • unless otherwise stated, all unicode strings are decoded using ISO-8859-1
  • when SCRIPT_NAME and PATH_INFO are ‘native’ or ‘unicode’, the environment should contain two additional values, wsgi.script_name and wsgi.path_info, which hold the raw bytes. (The flat proposal is the exception: it assumes CGI variables are decoded as UTF-8 with the PEP 383 surrogateescape error handler, so the raw bytes can be recovered by re-encoding.)
  • details about the mod_wsgi proposal:
    • it is already implemented in mod_wsgi 3.0
    • it is almost entirely compatible with current WSGI 1.0 for Python 2
    • it runs the WSGI 1.0 ‘Hello World!’ unchanged
  • details about the all unicode proposal:
    • SCRIPT_NAME and PATH_INFO are decoded as UTF-8. If that fails, they are decoded as ISO-8859-1. The name of the codec that succeeded is stored in wsgi.uri_encoding.
    • the REQUEST_URI variable is optional and stores the full URI as requested by the client.
  • details about the web3 proposal:
    • this proposal does not try to be compatible with WSGI 1.0. It targets Python 2.6+ and Python 3.1+.
    • all wsgi.* variables are intentionally renamed web3.* in the document.
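
The decoding rules in the notes above can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not code from any of the proposals: the helper names decode_flat, encode_flat and decode_all_unicode are hypothetical, and so is the exact spelling of the codec name that would be stored in wsgi.uri_encoding.

```python
def decode_flat(raw):
    # "flat" proposal: UTF-8 with the PEP 383 surrogateescape error handler,
    # so undecodable bytes survive as lone surrogates instead of raising.
    return raw.decode("utf-8", "surrogateescape")


def encode_flat(text):
    # Re-encoding with the same handler recovers the original raw bytes.
    return text.encode("utf-8", "surrogateescape")


def decode_all_unicode(raw):
    """"all unicode" proposal: try UTF-8, fall back to ISO-8859-1.

    Returns (text, codec_name); codec_name is what a server might store
    in wsgi.uri_encoding (the spelling used here is an assumption).
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("iso-8859-1"), "iso-8859-1"


raw_path = b"/caf\xe9"  # ISO-8859-1 bytes that are not valid UTF-8

flat_text = decode_flat(raw_path)              # contains the surrogate '\udce9'
round_trip = encode_flat(flat_text)            # equals raw_path again
text, codec = decode_all_unicode(raw_path)     # falls back to ISO-8859-1
```

The round trip through surrogateescape is what lets the flat proposal drop the separate wsgi.script_name and wsgi.path_info raw-bytes values.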