If you wanted to write a Python web application a few years ago, you’d be faced with quite a glut of choices. You’d have to choose among a bunch of great web frameworks, and then figure out a reasonable way to deploy the application in production. It became a running joke that Python was the language of a thousand frameworks.
The Python community had two options to solve the problem: cull the number of frameworks or embrace the diversity. Given the nature of the community, culling didn’t seem like an attractive option, so PEP 333 was written to lower the barriers to using Python as a language for web development, and the Web Server Gateway Interface (WSGI) was born.
WSGI separates the web application from the web server, much as the Java Servlet API does. In this way, web framework authors can worry about the best way to implement a web application, and leave the server implementation details to those working on the opposite side of the WSGI “tube.”
Although the intent of WSGI is to give web framework developers an easy way to interface with web servers, WSGI is also a pretty fun way to build web applications. Ian Bicking, in his presentation “WSGI: An Introduction” at PyCon 2007, compared WSGI to the early days of CGI programming. It turns out that, despite its problems, early CGI was a great encapsulation that provided clean separation between the server and the application. The server was responsible for marshalling the request details into environment variables and passing any request body to the application on stdin. The application responded with data (usually HTML) on stdout. Of course, CGI was slow and cumbersome, but it encapsulated things really nicely, and was easy to wrap your head around.
WSGI is similar to CGI in that the interface is simple. So simple, in fact, that it often throws people off. When you assume that deploying web applications is difficult, the usual reaction to WSGI is shock. Here’s a basic example:
def hi(environ, start_response):
    start_response('200 OK', [('content-type','text/html')])
    return [b"HI!"]  # the body is an iterable of byte strings

from wsgiref.simple_server import make_server
make_server('', 8080, hi).serve_forever()
The application is the function hi, which takes two arguments: the environment (a dictionary) and a function called start_response. The first line of the hi application, start_response('200 OK', [('content-type','text/html')]), declares that the request was good by setting the HTTP status to 200, and tells the client that what follows has the mimetype text/html. The application then returns the response body, in this case the simple phrase “HI!” It’s fairly similar to the CGI way of receiving the request through the environment and stdin, and sending the response to stdout.
That function is all that’s required of a full WSGI application. It’s trivial to plug the hi application into a WSGI container and run it. The final two lines of the script do just that:
from wsgiref.simple_server import make_server
make_server('', 8080, hi).serve_forever()
I’m using the WSGI reference server, wsgiref, included in the Python standard library since Python 2.5. I could just as easily substitute it with a FastCGI, AJP, SCGI, or Apache container. In that way, it’s a write once, run anywhere, plug and play kind of web application.
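To see why swapping containers is so painless, it can help to play the server yourself. The following is only a sketch for Python 3: call_app is a hypothetical helper, not part of wsgiref, and it leans on wsgiref.util.setup_testing_defaults to fill in the environ keys a real server would supply. The hi function is the same application with its body wrapped in a list of bytes.

```python
from wsgiref.util import setup_testing_defaults


def hi(environ, start_response):
    # The hello-world application, returning its body as a list of bytes.
    start_response('200 OK', [('content-type', 'text/html')])
    return [b"HI!"]


def call_app(app, path='/'):
    """Hypothetical helper: act as the server for a single request."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys a real server would set
    environ['PATH_INFO'] = path

    captured = {}

    def start_response(status, headers, exc_info=None):
        # A real server would start writing the HTTP response here;
        # this sketch just remembers what the application asked for.
        captured['status'] = status
        captured['headers'] = headers

    body = b"".join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(hi)
print(status, headers, body)
```

Anything that can build an environ dictionary and supply a start_response callback can host the application, which is all a WSGI container really does.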
Now that you’re over the hello world hump, it’s time to build a useful application. On August 4th, 2007, my wife (Camri) and I had our first child, Mr. William Christopher McAvoy. Since then, we’ve taken thousands of photographs. All of them are stored in a neatly organized series of folders on an external hard drive on my desk. When Camri wants to find pictures to give to the grandparents, she has to wheel herself over to my computer and look through them. We tried a shared drive, but it was just too slow. I did a little bit of looking for a web application that would read a big filesystem of pictures, but couldn’t find any. The existing galleries all wanted you to upload pictures; none assumed a pre-existing series of folders.
I puttered around for a few hours in the airport on a trip, and came up with a relatively usable WSGI application that converts web paths to directory paths, dynamically creates thumbnails, and generally makes it easy to browse a big listing of jpegs. When we got home, I plugged the app into a mod_wsgi container on my desktop installation of Apache, and it ran as well as it did in the WSGI reference server that I was using for development.
The full source of the application is available on my public Google Code page. The guts of the application are in the class fsPicture. “A class?” you say, “I thought WSGI apps were supposed to be functions?!” Sort of. They’re supposed to be callable, function-like objects, which is a way of saying that they can be objects, as long as the object’s class defines the __call__ magic method.
This sounds simple enough, but it really confused me when I first started playing with WSGI, so let me spend a minute on it. If I declare a class that looks like this:
class Something(object):
    def __call__(self):
        return "Hi there!"
And then instantiate the class like so:
s = Something()
I can call ‘s’ as if it were a function, like ‘s()’. It’s functionally equivalent to creating an s function, like this:
def s():
    return "Hi there!"
This is really great, because it means that you can create objects as WSGI applications, which is a lot cleaner than creating a WSGI application with a function as its base.
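One reason the object form is cleaner is that an instance can carry its own configuration between requests. Here is a small Python 3 sketch of that idea. Greeter and fake_start_response are made-up names for illustration and are not part of the photo application.

```python
class Greeter(object):
    """A WSGI application that remembers its configuration between requests."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, environ, start_response):
        # Same signature as a WSGI function, plus self.
        start_response('200 OK', [('content-type', 'text/plain')])
        return [self.greeting.encode('utf-8')]


def fake_start_response(status, headers, exc_info=None):
    """Stand-in for the callback a real server would pass in."""
    pass


# Two independent applications built from the same class.
english = Greeter("Hi there!")
spanish = Greeter("Hola!")

print(b"".join(english({}, fake_start_response)))
print(b"".join(spanish({}, fake_start_response)))
```

With a plain function, the greeting would have to live in a global or a closure. Here it travels with the instance, much as fsPicture keeps its root directory and template.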
So, let’s walk through this very simple class.
import os
from glob import glob
from io import BytesIO
from urllib.parse import parse_qs

from PIL import Image  # provided by the Pillow package

# Stand-in for the base template from the full source; any string
# with a single %s placeholder works.
base_template = "<html><body>%s</body></html>"

class fsPicture(object):
    def __init__(self, root, template=base_template):
        self.template = template
        self.root = root

    def split_path_from_item(self, item):
        """removes the root directory from the path.
        This lets us use the result as a web path."""
        return "/" + item.replace(self.root, '')

    def directory_listing(self, directory, path):
        """returns html for a directory listing"""
        files = ""
        directories = ""
        for item in glob(directory + "/*"):
            web_path = self.split_path_from_item(item)
            if os.path.isdir(item):
                directories += """<div class="directory">
                    <a href="%s" class="directory">%s </a></div>""" % (web_path, web_path)
            elif os.path.isfile(item) and item.lower().endswith('.jpg'):
                files += """<div class="image">
                    <img src="%s?thumbnail=200"><br/>
                    <a href="%s">%s</a>
                    </div>""" % (web_path, web_path, web_path)
        html = ""
        if directories:
            html += """<h2>Directories</h2>
                <div id="directories">%s</div>""" % directories
        if files:
            html += """<h2>Pictures</h2>
                <div id="pictures">%s</div>""" % files
        return html

    def picture(self, image, path):
        """returns raw binary image. If query string of "thumbnail" is
        passed to the app the image is resized to a maximum of the argument.

        For instance: /some_image.jpg?thumbnail=100"""
        i = Image.open(image)
        if self.query_string:
            try:
                # parse_qs maps each key to a list of values
                size = int(parse_qs(self.query_string)['thumbnail'][0])
                i.thumbnail((size, size))
            except (KeyError, ValueError):
                pass  # missing or non-numeric size: serve the full image
        s = BytesIO()
        i.save(s, 'JPEG')
        return s.getvalue()

    def find_object(self, path):
        """finds the directory or picture referenced, returns the response
        and the mimetype"""
        item = os.path.join(self.root, *path.split('/'))
        if os.path.isdir(item):
            html = self.template % self.directory_listing(item, path)
            return ([html.encode('utf-8')], 'text/html')
        elif os.path.isfile(item) and item.lower().endswith('.jpg'):
            return ([self.picture(item, path)], 'image/jpeg')
        else:
            return ([(self.template % 'not found').encode('utf-8')], 'text/html')

    def __call__(self, environ, start_response):
        """the entry point to the application"""
        self.query_string = environ.get('QUERY_STRING', False)
        response, mimetype = self.find_object(environ['PATH_INFO'])
        start_response('200 OK', [('content-type',mimetype)])
        return response
__call__ is the main entry point for the WSGI server to call. Like a WSGI callable function, it takes ‘environ’ and ‘start_response’ as arguments, along with the ‘self’ argument that every method requires. It sets the object attribute ‘query_string’ to the environ ‘QUERY_STRING’ value, which is everything in the URL after the ?. You’ll use it later to determine whether you should resize a photograph. It then calls the object method find_object, passing it the environ variable ‘PATH_INFO’, which is everything in the URL after the domain and before the ?. That call returns the response, which could be a binary JPEG or HTML, along with its mimetype.
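To make the split between PATH_INFO and QUERY_STRING concrete, here is a small Python 3 sketch using the standard library’s urllib.parse. The URL is a made-up example, and a real server does this splitting for you before calling the application.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical request URL, for illustration only.
url = "http://localhost:8080/2007/august/william.jpg?thumbnail=200"
parts = urlsplit(url)

path_info = parts.path      # what a server would put in environ['PATH_INFO']
query_string = parts.query  # what a server would put in environ['QUERY_STRING']

# parse_qs maps each key to a LIST of values, because a key may repeat.
params = parse_qs(query_string)
size = int(params['thumbnail'][0])

print(path_info, query_string, params, size)
```

Note that the thumbnail size has to be picked out of a list before it can be converted to an integer.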
find_object’s job is to translate the ‘PATH_INFO’ into an object on the filesystem. That translation really begins in the object’s __init__ method, where the root property is set. Root, in this context, is the base directory on the filesystem where you want to find all your jpegs. It also takes a template as an argument, or uses the base template (really just a big fat string) declared above. When find_object is called, it combines the web path with the root property (again, the directory on the filesystem that holds all your jpegs) and then determines whether the referenced file is a directory or a jpeg.
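The translation from web path to filesystem path can be sketched on its own. This is a Python 3 illustration, not the class’s actual code: resolve and classify are hypothetical helpers, and the realpath check that rejects paths climbing out of the root (such as /../etc) is an extra precaution assumed here, not something the original application does.

```python
import os
import tempfile


def resolve(root, path_info):
    """Map a web path onto the filesystem; return None if it escapes root."""
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, *path_info.split('/')))
    # Assumption added for this sketch: refuse anything outside the root.
    if candidate != root and not candidate.startswith(root + os.sep):
        return None
    return candidate


def classify(root, path_info):
    """Decide what a web path refers to, mirroring find_object's branches."""
    item = resolve(root, path_info)
    if item is None:
        return 'not found'
    if os.path.isdir(item):
        return 'directory'
    if os.path.isfile(item) and item.lower().endswith('.jpg'):
        return 'jpeg'
    return 'not found'


# Build a throwaway photo tree and try a few web paths against it.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, '2007'))
    with open(os.path.join(root, '2007', 'william.jpg'), 'wb') as f:
        f.write(b'')  # empty placeholder; only the name matters here
    paths = ('/', '/2007', '/2007/william.jpg', '/../etc', '/missing')
    results = [classify(root, p) for p in paths]

print(results)
```

The os.path.join call with the split web path is the same trick find_object uses; everything else is scaffolding to show the three possible outcomes.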
If the file is a directory, find_object calls directory_listing, passing it the directory on the filesystem as well as the web path, and gets back a nice chunk of HTML that lists the contents of that directory. If the file is a jpeg, it passes the call off to picture, which returns the raw JPEG-encoded binary data, and pairs that data with the mimetype “image/jpeg”. If the user passed in the query string “thumbnail” variable, the image is resized on the fly to fit the constraints passed in the query string. This is how you get nice little thumbnails of each photo inside a directory.
In reality, for anything other than a trivial web application, you’re probably better off building on top of an existing web framework, like Django or Pylons, both of which can be served as WSGI applications themselves. That said, building an application with WSGI, rather than a high-level framework, gives you a sense of what’s happening under the covers in your Python web framework of choice. This article focused on building web applications with WSGI, but didn’t touch on WSGI middleware, which allows you to insert chunks of code before or after the request is processed by the application. If you’re really committed to building a non-trivial application in WSGI, you might want to check out Paste, a series of libraries by Ian Bicking that wrap up a lot of common web patterns in a clean API, and WebOb, also by Ian, a minimalist library of request and response objects that grew out of Paste. Ian even writes a similar file-serving application as a demonstration of WebOb’s use.
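For a taste of what middleware looks like, here is a minimal Python 3 sketch. RequestLog is a made-up example, not something from Paste or WebOb: it is simply a WSGI application that wraps another WSGI application, running a little code before the request reaches it and a little more when the response starts.

```python
class RequestLog(object):
    """Hypothetical middleware: records request paths and tags responses."""

    def __init__(self, app):
        self.app = app
        self.seen = []

    def __call__(self, environ, start_response):
        # Runs BEFORE the wrapped application sees the request.
        self.seen.append(environ.get('PATH_INFO', ''))

        def tagged_start_response(status, headers, exc_info=None):
            # Runs AFTER the application has chosen its status and headers.
            headers = list(headers) + [('x-wrapped-by', 'RequestLog')]
            return start_response(status, headers, exc_info)

        return self.app(environ, tagged_start_response)


def hi(environ, start_response):
    start_response('200 OK', [('content-type', 'text/html')])
    return [b"HI!"]


# Wrap the application, then play the server for one request.
wrapped = RequestLog(hi)
captured = {}


def record(status, headers, exc_info=None):
    captured['status'] = status
    captured['headers'] = headers


body = b"".join(wrapped({'PATH_INFO': '/photos'}, record))
print(captured['status'], captured['headers'], body, wrapped.seen)
```

Because the wrapper is itself a WSGI callable, it can be handed to make_server or mod_wsgi exactly as the bare application would be, and wrappers can be stacked.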