Discover the Network Search Capabilities of the Boost Graph Library
The Boost C++ libraries, or simply Boost, are an unusual beast in the domain of language and library development. First and foremost, Boost provides free peer-reviewed portable C++ source libraries. Unlike application-specific solutions, Boost libraries are intended to be widely useful and usable across a broad spectrum of applications. Boost includes more than just containers, iterators, and algorithms; you also get a powerful Regular Expression (regex), threads, advanced math, I/O, memory management, and parsing librariesjust to name a few. The license encourages both commercial and non-commercial use.
More impressively, Boost has been pushing the C++ standard quite hard in two distinct ways:
- It has legitimized some bleeding-edge techniques, such as template metaprogramming.
- It has led the way for expanding the scope of the C++ Library like no other project has.
As a result, 10 Boost libraries are already included in the C++ Standards Committee's Library Technical Report #1 (TR1), which is a step toward becoming part of a future C++ Standard, and more Boost libraries are proposed for the upcoming TR2. (This push probably is due in large part to the participation of industry heavyweights such as Dave Abrahams, Nicolai Josuttis, and others who participate in ANSI/ISO C++ committees.)
Covering all the Boost libraries is beyond the scope of any single article. This article concentrates on the Boost Graph Library (BGL), which can offer a quick and easy solution for some graph searching problems you encounter. For Win32 developers, getting started requires only downloading and unzipping the source files. Unlike most traditional libraries, you don't need to link with any .LIB or .SO files. Because Boost uses the STL approach, using it is a matter of including the header files and compiling your own code. (Boost newbies should check out Beyond The C++ Standard Library: An Introduction to Boost by Björn Karlsson, 2005.)
Boost works on almost any modern operating system, including UNIX and Windows variants. In fact, most popular Linux and Unix distributions such as Fedora, Debian, and NetBSD come with pre-built Boost packages.
Boost Graph Library: What's a Graph Anyway?
If you expected to read about graphics when you saw Boost Graph Library, you may be disappointed. Graphs are not graphics, but rather a fundamental area of study in discrete mathematics. Basically, a graph is a set of vertices connected by edges. A vertex (or node) might represent a state of operation ("stopped"), a destination ("Los Angeles"), or whatever your imagination requires. The edges (or arcs) connect the nodes and may be one-way ("directed") or two-way ("undirected"). Edges can even be assigned a weight, which might represent distance, cost, or time. Our old friend the binary-tree is but one special example of a graph.
How does this relate to application development today? Well, if you used MapQuest this week, made a call over a long-distance network, used the Linked In contact manager, or ran a "link-checker" on your Web site, you implicitly entered data into someone's graph solver system. Nowadays, data mining has moved past the obvious connections and into the realm of discovering and optimizing relationships. To enable you to bring your applications into this realm, Boost integrates with your code in the same way that other template library algorithms and containers do.
Sample App: Six Degrees of Kevin Bacon
I asked author Jeremy Siek for help in producing a BGL demo that would both solve a problem that many people recognize and have some application to searching networks. We came up with a compact solution for the Six Degrees of Kevin Bacon game. In a nutshell, the idea is to link actor "X" with a movie that Kevin Bacon starred in using the minimum number of links. Confused? Here's an example showing the relationship between Star Wars' Mark Hamill and Kevin Bacon:
- Mark Hamill was in The Little Mermaid (1989) with Jim Cummings.
- Jim Cummings was in Balto (1995) with Kevin Bacon.
Therefore, Mark Hamill has a Bacon number of 2.
Aficionados of graph theory immediately recognize this as a classic breadth-first search problem. The breadth-first search (BFS) exhaustively searches a graph and can solve several queries, including finding the shortest path between two unweighted nodes and testing for bipartiteness. BFS leaves no stone unturned, so it is the perfect algorithm for when you require 100 percent coverage of the graph. An important characteristic of BFS is that it discovers vertices with smaller shortest-path distances before vertices with larger distances.