Programming the Perl DBI
© Copyright O'Reilly & Associates. All rights reserved.
Chapter 4: Programming with DBI
In this chapter, we'll discuss in detail the actual programming nterface defined by the DBI module. We'll start with the very architecture of DBI, continue with explaining how to use the handles that DBI provides to interact with databases, then cover simple tasks such as connecting and disconnecting from databases. Finally, we'll discuss the important topic of error handling and describe some of the DBI's utility methods and functions. Future chapters will discuss how to manipulate data within your databases, as well as other advanced functionality.
The DBI architecture is split into two main groups of software: the DBI itself, and the drivers. The DBI defines the actual DBI programming interface, routes method calls to the appropriate drivers, and provides various support services to them. Specific drivers are implemented for each different type of database and actually perform the operations on the databases. Figure 4-1 illustrates this architecture.
Therefore, if you are authoring software using the DBI programming interface, the method you use is defined within the DBI module. From there, the DBI module works out which driver should handle the execution of the method and passes the method to the appropriate driver for actual execution. This is more obvious when you recognize that the DBI module does not perform any database work itself, nor does it even know about any types of databases whatsoever. Figure 4-2 shows the flow of data from a Perl script through to the database.
Under this architecture, it is relatively straightforward to implement a driver for any type of database. All that is required is to implement the methods defined in the DBI specification, as supported by the DBI module, in a way that is meaningful for that database. The data returned from this module is passed back into the DBI module, and from there it is returned to the Perl program. All the information that passes between the DBI and its drivers is standard Perl datatypes, thereby preserving the isolation of the DBI module from any knowledge of databases.
The separation of the drivers from the DBI itself makes the DBI a powerful programming interface that can be extended to support almost any database available today. Drivers currently exist for many popular databases including Oracle, Informix, mSQL, MySQL, Ingres, Sybase, DB2, Empress, SearchServer, and PostgreSQL. There are even drivers for XBase and CSV files.
These drivers can be used interchangeably with little modification to your programs. Couple this database-level portability with the portability of Perl scripts across multiple operating systems, and you truly have a rapid application development tool worthy of notice.
Drivers are also called database drivers, or DBDs, after the namespace in which they are declared. For example, Oracle uses
DBD::Oracle, Informix uses
DBD::Informix, and so on. A useful tip in remembering the DBI architecture is that DBI can stand for DataBase Independent and DBD can stand for DataBase Dependent.
Because DBI uses Perl's object-orientation features, it is extremely simple to initialize DBI for use within your programs. This can be achieved by adding the line:
to the top of your programs. This line locates and loads the core DBI module. Individual database driver modules are loaded as required, and should generally not be explicitly loaded.
The DBI defines three main types of objects that you may use to interact with databases. These objects are known as handles. There are handles for drivers, which the DBI uses to create handles for database connections, which, in turn, can be used to create handles for individual database commands, known as statements. Figure 4-3 illustrates the overall structure of the way in which handles are related, and their meanings are described in the following sections.
Driver handles represent loaded drivers and are created when the driver is loaded and initialized by the DBI. There is exactly one driver handle per loaded driver. Initially, the driver handle is the only contact the DBI has with the driver, and at this stage, no contact has been made with any database through that driver.
The only two significant methods available through the driver handle are
data_sources(), to enumerate what can be connected to, and
connect(), to actually make a connection. These methods are more commonly invoked as DBI class methods, however, which we will discuss in more detail later in this chapter.
Since a driver handle completely encapsulates a driver, there's no reason why multiple drivers can't be simultaneously loaded. This is part of what makes the DBI such a powerful interface.
For example, if a programmer is tasked with the job of transferring data from an Oracle database to an Informix database, it is possible to write a single DBI program that connects simultaneously to both databases and simply passes the data backwards and forwards as needed. In this case, two driver handles would be created, one for Oracle and one for Informix. No problems arise from this situation, since each driver handle is a completely separate Perl object.
Within the DBI specification, a driver handle is usually referred to as
Driver handles should not normally be referenced within your programs. The actual instantiation of driver handles happens "under the hood" of DBI, typically when
DBI->connect() is called.
Database handles are the first step towards actually doing work with the database, in that they encapsulate a single connection to a particular database. Prior to executing SQL statements within a database, we must actually connect to the database. This is usually achieved through the DBI's
$dbh = DBI->connect( $data_source, ... );
The majority of databases nowadays tend to operate in a multiuser mode, allowing many simultaneous connections, and database handles are designed accordingly. An example might be if you wanted to write a stock-monitoring program that simultaneously monitored data in tables within different user accounts in the database. A DBI script could make multiple connections to the database, one for each user account, and execute SQL statements on each. Database handles are completely encapsulated objects, meaning that transactions from one database handle cannot "cross-over" or "leak" into another.
Database handles are children of their corresponding driver handle, which supports the notion that we could also make multiple simultaneous connections to multiple database types, as well as multiple simultaneous connections to databases of the same type. For example, a more complicated DBI script could make two connections to each of an Oracle and an Informix database to perform the above-mentioned monitoring. Figure 4-3, shown earlier, illustrates the capability of having multiple database handles connecting through a driver handle to an Oracle database.
Keep in mind that had the monitoring program been written in C, two copies of code would be required, one for Oracle's programming interface and one for Informix's. DBI levels the playing field.
Within the DBI specification and sample code, database handles are usually referred to as
Statement handles are the final type of object that DBI defines for database interaction and manipulation. These handles actually encapsulate individual SQL statements to be executed within the database.
Statement handles are children of their corresponding database handle. Since statement handles are objects in their own right, data within one statement is protected from tampering or modification by other statement handles.
For a given database handle, there is no practical limit to the number of statement handles that can be created and executed. Multiple statements can be created and executed within one script, and the data can be processed as it returns. A good example of this might be a data-mining robot that connects to a database, then executes a large number of queries that return all sorts of different types of information. Instead of attempting to write convoluted SQL to correlate the information within the database, the Perl script fetches all the data being returned from the many statements and performs analysis there, using the fully featured text and data manipulation routines that Perl has to offer.
Within the DBI specification and sample code, statement handles are generally referred