Computers have vastly enhanced our ability to collect large amounts of
data. It’s perfectly feasible for most retail establishments to hook up
their network to every cash register these days, and capture the time, date,
and details of every purchase. With customer rewards cards, these can be
matched back to consumer details. Throw in basic weather information, the
checker’s name, the store number, and you can have terabytes of data piling
up in no time.
But there’s a problem with all that data: it’s hard for people to figure out
what it all means. That’s fueled a rise in Business Intelligence (BI)
software: software designed to help extract patterns from large amounts of data.
Most BI software lets you summarize large amounts of data into simple tables and
graphs. A new entrant in the market, Tableau, raises the bar a bit
further. The vendor refers to Tableau as “visual analysis” software. The goal of
Tableau is to take masses of numeric data and turn it into charts and graphs
with minimal user intervention, letting you spot patterns via color, shape, and
placement: skills that most of us are pretty good at. I used version 1 of
Tableau to explore some data, and overall the experience was a good one.
Getting Started with Tableau
Like an Excel pivot chart, a Tableau sheet starts blank, with a batch of
places you can drag data fields to. You connect to a data source (which can be
Access, Excel, Microsoft Analysis Services, MySQL, or various other things)
and it takes a guess which fields are dimensions and which are measures. You
can change the guess if it’s wrong, but it’s usually pretty accurate. Figure 1
shows a new Tableau sheet connected to an Excel worksheet full of sample data.
The left side of the screen shows the various dimensions and measures, and the
right side is waiting for these fields to be dragged to it.
To analyze the data, you just drag and drop. You can drag to a “shelf” to
designate fields for columns, or rows, or filters – but you can also designate
fields for marker shapes, or colors, or sizes. And depending on what you drag
where, and which menu choices you make, Tableau will generate standard text
crosstabs, or amazingly clever graphics.
For a quick start, drag the Date dimension to the Columns shelf, the Sales
measure to the Rows shelf, and the Product Type dimension to the Color shelf.
Tableau turns this into the bar chart shown in Figure 2, complete with legend.
With most BI products, turning a crosstab into a chart would be a several-step
process; Tableau effectively short-circuits the process to let you create the
charts directly.
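To make concrete what the crosstab behind such a chart summarizes, here is a minimal Python sketch of the aggregation: total sales for each combination of year and product type. The sample rows and the `crosstab` function are hypothetical and exist only for illustration; this is not how Tableau itself computes anything.

```python
from collections import defaultdict

# Hypothetical sample rows: (year, product_type, sales).
# These values are invented and are not from the review's data set.
rows = [
    (2004, "Coffee", 120.0),
    (2004, "Tea", 80.0),
    (2005, "Coffee", 150.0),
    (2005, "Tea", 60.0),
    (2005, "Coffee", 30.0),
]


def crosstab(records):
    """Sum sales for each (year, product type) pair."""
    totals = defaultdict(float)
    for year, product_type, sales in records:
        totals[(year, product_type)] += sales
    return dict(totals)


table = crosstab(rows)
for (year, product_type), total in sorted(table.items()):
    print(year, product_type, total)
```

Each key in `table` corresponds to one colored bar segment in a chart of this kind: the year picks the column, and the product type picks the color.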
But just knowing sales by product by year is a very coarse measure of
performance. It’s easy to drill in and get more information out of Tableau by
adding more fields to the worksheet. Drag the Market field on the Columns shelf,
and it adds subcolumns to the layout. Similarly, drag the Sales field to the
Rows shelf and the Profit measure to the Size shelf. Figure 3 shows the
result.
Let’s take a moment and look at some of the information that you can see on
this worksheet now, after just half a dozen drag-and-drop operations:
- In decaf beverages, the profit is all in smoothies.
- Almost all the profit in decaf is coming from the Central and West
regions.
- There are no tea sales in the South region.
- The East region sells a lot of espresso but doesn’t manage to make
any money doing it.
These facts would be there whether you were looking at a numeric spreadsheet
or even the raw data, of course. The nice thing about Tableau is that the
automatic use of factors such as size and color makes this sort of information
jump out at the human eye. We’re very good at spotting such visual patterns.
Going Beyond Simplicity
Tableau’s capabilities don’t end with drag-and-drop chart construction,
impressive though that is. For example, you can easily get the details for that
mysteriously underperforming espresso in the East region. Just use Ctrl-click to
select the two bar segments representing that data, then right-click and select
View Underlying Data. Tableau opens up a data sheet, as shown in Figure 4, with
the actual data represented by that portion of the chart. You can also choose to
export the data behind any piece of a Tableau worksheet to an Access
database. Or, if you’re preparing a presentation, you can copy the table as
an image for easy pasting into your slides.
Filtering data is easy as well. You can use the dropdown arrow next to any
field that you’ve dragged to the worksheet to filter on that field, selecting
one, some, or all values from the field to display. You can also drag any other
field to the Filters shelf, and use that field to filter the data without it
otherwise affecting the display. Done with a filter? Just drag it off the
Filters shelf to get rid of it.
For finer distinctions, you can drag a field to the Level of Detail shelf.
For example, drop the Product field there and each bar in the bar charts becomes
segmented to show the contribution of individual products to its size and
height. Hovering the cursor over any segment brings up a little window (similar
to a tooltip) with the full details about what the segment represents.
In addition to displaying and working with data contained in the original
data source, Tableau can also create calculated data fields. There’s a
reasonably rich set of operators for creating calculated fields, including
numeric, date, and string functions. You can also create binned dimensions,
which is useful for building histograms from raw numeric data.
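As a rough illustration of what binning does, the following Python sketch groups raw numeric values into fixed-width bins and counts how many fall into each, which is the data behind a histogram. The `bin_values` function, the bin width, and the sample amounts are all assumptions made for this example; the review does not describe how Tableau implements binned dimensions.

```python
from collections import Counter


def bin_values(values, bin_width):
    """Map each value to the lower edge of its bin and count values per bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical sale amounts, invented for illustration.
amounts = [3, 7, 12, 18, 19, 25, 41]
histogram = bin_values(amounts, 10)

# Print a crude text histogram, one row per non-empty bin.
for lower_edge, count in histogram.items():
    print(f"{lower_edge:>3}-{lower_edge + 9:<3} {'#' * count}")
```

The bin's lower edge acts like a new dimension: a continuous measure has been turned into a small set of categories that can be placed on a shelf like any other field.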
Overall Impressions
All in all, I’m favorably impressed with Tableau as a data analysis tool. For
its core purpose – picking out patterns from large masses of denormalized data –
it works very well indeed. The tool is easy to learn, and a gallery of sample
charts in the help file makes it easy to figure out how to produce everything
from Gantt charts to simple line graphs from straightforward data sets. I
tested with data up to a few hundred thousand rows on my LAN, and on
that amount of data Tableau’s response was very fast.
Of course, no tool is perfect, especially in its first release. Perhaps the
most glaring limit here is the lack of support for a wide variety of data
sources. Oracle is notably missing from the list of supported databases, and
even the supported types are constrained in which version you can use. SQL
Server 7.0, for example, is not supported (only SQL Server 2000 can be used with
Tableau). If your data warehouse is in an unsupported format you’re looking at a
potentially time-consuming and annoying conversion to pull it into a SQL Server,
Access, or other supported database.
You may also find yourself having to massage your data a bit before moving to
Tableau, since Tableau itself expects to work with a single table, view, or OLAP
cube at a time. If the data in question is spread across a batch of normalized
tables, you need to design a view that denormalizes it before your Tableau
session. This is only a minor nuisance, but a nuisance nonetheless.
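A denormalizing view of this kind might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in so the example is runnable; SQLite is not one of the data sources named in this review, and the table names, column names, and sample rows are all hypothetical. The same JOIN-based view could be defined in SQL Server or another supported database, with syntax adjusted as needed.

```python
import sqlite3

# In-memory database with three small normalized tables (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT,
        product_type TEXT
    );
    CREATE TABLE markets (
        market_id INTEGER PRIMARY KEY,
        market_name TEXT
    );
    CREATE TABLE sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES products(product_id),
        market_id INTEGER REFERENCES markets(market_id),
        sale_date TEXT,
        amount REAL
    );

    INSERT INTO products VALUES
        (1, 'Decaf Espresso', 'Espresso'),
        (2, 'Green Tea', 'Tea');
    INSERT INTO markets VALUES (1, 'East'), (2, 'West');
    INSERT INTO sales VALUES
        (1, 1, 1, '2005-01-15', 40.0),
        (2, 2, 2, '2005-01-16', 25.0),
        (3, 1, 2, '2005-02-01', 55.0);

    -- The view flattens the three tables into one wide row per sale.
    CREATE VIEW sales_flat AS
    SELECT s.sale_date, m.market_name, p.product_type, p.product_name, s.amount
    FROM sales AS s
    JOIN products AS p ON p.product_id = s.product_id
    JOIN markets AS m ON m.market_id = s.market_id;
""")

flat_rows = conn.execute(
    "SELECT * FROM sales_flat ORDER BY sale_date"
).fetchall()
for row in flat_rows:
    print(row)
```

An analysis tool that expects a single table can then be pointed at `sales_flat` rather than at the underlying tables.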
Still, I think these annoyances are far outweighed by the integration of
graphics with analysis. In most BI products, you massage the data in a purely
numeric grid, and then turn it into a graphic when you’re satisfied with your
results. This means that you spend a lot of time staring at rows and columns of
numbers, trying to make sense out of them and spot patterns. With Tableau, you
can use the graphics to help spot and refine the patterns even as you’re slicing
and dicing the data. This makes a huge difference in the ability to find useful
patterns in the first place, and should justify the purchase price for many
organizations.
Pricing and Specifications
Tableau 1.0 is available in three editions. The $999 Standard Edition can
connect to Microsoft Excel, Microsoft Access, or plain text files. The $1299
Professional (MySQL) edition adds MySQL to the list of supported data sources,
while the $1799 Professional edition extends the list to include Microsoft SQL
Server and Microsoft SQL Server Analysis Services. If you have the full
Professional Edition, you can also purchase a separate server-based product to
add connectivity to Hyperion EssBase and IBM DB2 OLAP Server databases, though
the vendor doesn’t publicize the pricing for that product. All prices include a year
of maintenance.
Tableau runs on Windows XP or Windows 2000. You’ll need 128MB of RAM (though,
as always, more is better) and 50MB of free disk space to install the product.
If you’re interested in evaluating Tableau with your own data, you can apply for
a 30-day free trial at
the Tableau Software Web site.
About the Author
Mike Gunderloy is the author of over 20 books and numerous articles on
development topics, and the lead developer for Larkware. Check out his latest books, Coder to Developer (from which this
article was partially adapted) and Developer to Designer, both from Sybex. When
he’s not writing code, Mike putters in the garden on his farm in eastern
Washington state.