Creating Excel Files with Python and Django
When a client needs to manipulate data, it's easiest to give them an Excel spreadsheet. Creating a spreadsheet is easy with Python, and making it available for web download is just as easy with the Django web framework. After a brief tangent into evolutionary history, you'll find out how to do both.
Sharks haven't evolved much in millions of years, mostly because they're wildly efficient. They haven't had to adapt, or change, because they're really good at their job. How could you possibly improve on a shark? They're perfectly terrifying fish eating machines.
Much like sharks, spreadsheets haven't evolved much in the past few decades. Following the shark pattern, the reason for their near-stagnation is because they're really good at their job. Take a bunch of data, present it in a table, allow non-programmers to sort and manipulate the numbers to achieve new views of the data. They're perfectly terrifying data eating machines.
Given the efficiency of spreadsheets, sometimes it pays to acknowledge their power, and not waste your time trying to improve on it. A huge swath of web applications can be described as "taking a database and putting it on the web." In an iterative development environment, the first few cycles are taken by "getting the data into the database." Once that's completed, product owners start asking for views of the data in the web app. This is where it gets sticky.
It's rare to find a client that knows exactly how they want to manipulate the data you're going to be presenting to them. At times, they might think they know what they want, but as soon as you complete the work, it's most likely not really what they wanted.
This isn't some flaw in a product owner's mind; rather, it's a direct result of their probably never having actually seen all the data in one place before. The fact is, once they start playing with this new source of information, patterns they hadn't considered emerge.
Given the above, whenever possible, I like to give a product owner an Excel representation of their data. I usually allow them to download it through a web application. When they've played around with the data in Excel, they'll have a better idea of how they use the data, and then know how they ultimately want to see the data presented. It saves me time, and leads to a better end product.
In many cases, a "download data as Excel" button is also a great way to make data portable. As the application matures away from needing the Excel view, the app will keep its Excel export function as a way for a user to take data offline. Any way you look at it, having the ability to download to Excel is an attractive feature that solves a lot of headaches.
Now that I've established that spreadsheets are great, how do you create them and allow them to be downloaded on the fly from a web application? I'll start with creation, and then explain how to make them available for download.
The following example uses Django as the web application framework, but the general concept is portable to any toolkit you like.
Creating an Excel Spreadsheet with Python
Excel is (obviously) a proprietary format, but it tries its best to be open (sort of). It's pretty easy to open a comma-separated file from Excel and treat it as if it's a spreadsheet. At the same time, it's pretty easy to create a comma-separated file from Python. Clearly, Python and Excel are made for each other, their common language being csv files.
The Python standard library comes with a nice csv writer class. This brief example shows its use:
import csv w = csv.writer(open('output.csv','w')) for i in range(10): w.writerow(range(10))
The contents of output.csv look like:
0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9 0,1,2,3,4,5,6,7,8,9
When the file is opened in Excel, you're presented with a dialog that asks for the separator character. Tell it comma and you're all set.
That said, the extra step of working with a csv file might not be attractive to an end user. Luckily, Roman Kiseliov has created an excellent library for writing binary Excel files with Python. The library is called pyExcelerator and is available from its project page on Sourceforge. The writer doesn't require Windows, or Excel, which makes it even easier to run.
Similar to the previous csv example, you can write a 10 by 10 grid with pyExcelerator:
from pyExcelerator import * wb = Workbook() ws0 = wb.add_sheet('0') for x in range(10): for y in range(10): # writing to a specific x,y ws0.write(x,y,"this is cell %s, %s" % (x,y)) wb.save('output.xls')
The scripts are similar, but there are a few key differences. With pyExcelerator, you have to explicitly create a Workbook object and add a Worksheet object to it. When you write to the Worksheet, you write to a specific x,y location. This gives you much more flexibility, but also requires you to visualize the grid as you write to it.
Although pyExcelerator is trickier to use, it's much more powerful. In addition to writing simple grids of data, you can format cells, insert Excel formulas, and generally take advantage of all the features you'd expect from Excel. With the csv writer, you're limited to simple grids.