Almost every dynamic Web application eventually boils down to accessing, manipulating, and presenting information. A forum is an idea example of this type of application. Users register at the site; their information is added to the database. Someone wants to post a new message? The database is queried to validate the user and again to store the message once the user has finished writing it. Displaying each page, the forum listing, the threads listing, the message listing, and the user profile pages relies on accessing stored information. All apparently separate sets of information are stored together in a single database and it’s all pulled together with languages such as PHP and SQL.
This article is the first of four parts, focusing on building database-driven applications with PHP and MySQL. In this part, you will be introduced to MySQL and SQL, discover what databases look like, and begin to learn the commands to work with them.
There’s a plethora of database products out there, from Oracle, mSQL, and PostgreSQL to MS Access. Even amid all this competition, MySQL has managed to become and remain exceptionally popular in the *nix programming community and the first choice for PHP developers.
There are several reasons for this. The biggest draw is the fact that MySQL is completely free under the GPL license, which gives it an instant price advantage over often-expensive commercial options. It is also open source, enjoying the benefits of community development and quick attention to bugs. And because it is natively supported by PHP without compiling in any additional modules, it is a natural choice for Web hosts that want to offer database support.
These all add up to make MySQL a compelling choice.
Free and accessible do not equate a less capable product, though; for small to medium-large applications, MySQL is also fast, adaptable, and feature-rich. Performance benchmarks against other databases are frequently biased (in one direction or another), but they generally give MySQL high marks. MySQL’s own documentation indicates that, barring an operating system’s own limitations, databases can be as large as 8 million terabytes.
Talking to Databases Using SQL
SQL is an abbreviation for Structured Query Language. It is a surprising natural language that allows us to “talk” to relational databases, such as MySQL, to request and modify data. It also allows us to get information about and modify the framework that contains the data, databases themselves, and tables within them. The queries formed with SQL can range from the simple to the very complex, which makes it an extremely powerful tool for working with data.
SQL is not exclusive to MySQL, nor should it be mistaken as being a part of PHP. It’s also not the only database query language out there. It is, however, the most popular. If you learn how to speak SQL, that knowledge translates to a range of other popular database products.
The Anatomy of a Database
Simply put, a database is a collection of structured information. Relational databases, such as MySQL, allow information to broken into sets called tables. Take a look at this simple illustration of a table with some data in it:
Tables are very similar to associative arrays. This table has three rows of distinct information representing different users. Rows are also known as records. It is also divided into three columns: id, name, and email. The intersection of a row and a column is considered a field (The difference between a column and a field is a technical one and the terms are frequently used interchangeably), which is populated with the actual data.
Keep the structure of this sample database in mind; we’ll be returning to it later in the article to illustrate several examples.
Working with Existing Databases Through PHP
Whether you are using PHP and MySQL, or another combination, the same basic procedure for working with a database applies.
First, a connection to the database server is established. This connection usually lasts for the lifetime of the running script. Additionally, as you will later see, there are instances where they can be contiguous across scripts.
Next, a specific database is selected to work with. Only one database at a time can be selected per connection, but you can switch to different databases without terminating the connection. You can also have multiple connections established within the script, each with an independent selected database.
Finally, once you have a connection established and a database selected, you can begin to work with the tables within them by using SQL to issue various commands. Data can be retrieved, added, modified, or deleted, and changes can be made to the selected database itself.
When finished, the connection to the database server is terminated manually with mysql_close(), or automatically with the end of the script.
Let’s take a closer look at each step separately.
Connecting to MySQL
Just like a multi-user computer system or an FTP server, MySQL allows different user accounts to be established with individual levels of access. To establish a connection to MySQL, three pieces of information are required: the hostname of the database server, the username for the account, and the associated password.
This information is passed to the mysql_connect() function, which attempts to establish the connection:
The function returns a value TRUE on success, and FALSE on failure, so it is useful (though not necessary) to assign the function to a variable. In the above example, $connect is used. It can be evaluated to avoid performing any additional queries if the connection attempt failed:
It is also useful to store the results in a variable because the function returns a unique connection identifier upon success. The variable then can be passed to any of PHP’s MySQL handling functions (typically as the second parameter) to specify that this database connection should be used, as opposed to any other active ones within the script.
Selecting a Database
Once a connection is set up, the next step is to select a database. To do this, all that is needed is the name of an existing database:
As with mysql_connect(), this function returns TRUE or FALSE depending on whether the database was successfully selected. To specify which connection the function should use, we can pass a connection identifier to the function:
Without the second parameter, the function uses the database connection that was last established within the script.
Querying the Database
The mysql_query() function is essentially a PHP wrapper for performing SQL commands. Any valid and complete SQL statement can be passed as a parameter to this function.
Four types of queries make up SQL’s data manipulation statements. They are as follows: SELECT, UPDATE, INSERT, and DELETE.
The SELECT query is used to retrieve data from one or more tables within a database. As you’ll learn later, we can also select data from multiple tables using Table Joins, but the simplest type of SELECT statement works with only one table at a time. The syntax looks like this:
Essentially, this statement says, “Select all the fields in all the rows of the table called users.” The asterisk is a wild card character that tells MySQL that it should grab anything available in the specified tables.
To request this data from a PHP script, we pass the query to the mysql_query() function. Because we are expecting records to be returned from the database, the function must be assigned to a variable. The query is a string, so PHP requires it to be surrounded by quotes. (Variables are often passed as part of queries and SQL requires that string values in WHERE clauses and INSERT statements be surrounded with single quotes. Though single quotes can be used here, it is generally preferable to use double quotes for clarity.)
Just like mysql_connect() and mysql_select_db(), this function will return FALSE if there is an error in the SQL. If the function is successful, however, $result will hold a resource identifier.
Because the variable just holds an identifier, we cannot get any of our records by outputting $result. To get the data, we need to pass the result to another PHP function, mysql_fetch_array(). This function returns an array of each record from a SELECT query, line by line. Unless you know for a fact that your query will only return a single record, it is advisable to use a while loop to itinerate through each row of the result set. The function maintains an internal place mark of which record it is on, so the loop will terminate once all of the records have been returned. Let’s return to our sample database from the start of the article and the select statement above to see how you would output the data:
Notice that the array indexes of $row match the column names of the table. The array generated by mysql_fetch_array() can optionally be an associative or a numeric array. By default, records are accessible by both types of keys. So, the id column (the first column in the table) can be accessed as $row[’id’] or $row;
Limiting Returned Fields
The SELECT query we’ve looked at so far is very simple, but SQL gives us the ability to form more complex statements. The more detail added to the query, the more we drill down our results.
Let’s say we only wanted the id and email of our users. Rather than selecting all the fields in all the records, we only want to retrieve two fields in each record. To do this, one change is made to the query:
Notice that the wildcard character has been replaced with the specific columns we want to return. In large tables, it is always advisable to only select those columns you actually need. This saves on both processing power and memory.
The Power of WHERE and Limiting Returned Records
In addition to limiting a query by which fields are returned, we also can specify which records should be returned. Suppose we only wanted to return two rows from our example table, the one where the email address is [email protected] and the one where the email address is [email protected]. The query would look like this:
The WHERE clause allows logical statements to be appended to the query. These statements limit which records returned. WHERE is also used in DELETE and UPDATE statements to determine which records should be affected. The clause typically includes one or more fields and conditions that the contents of each must meet. In this way, WHERE acts a lot like an If Statement: A record is tested against the conditions in the WHERE clause. If the conditions are found to be true, the record is returned.
Additionally, there are always ways to directly limit the number of records that are returned. Logically enough, this is done by appending LIMIT to the query. Here are a few examples:
When two numbers are included following LIMIT, the first denotes the starting point, or offset. The second number defines how many records from that starting point should be returned. If no offset is specified, the records will be returned starting from the first match.
Sorting Returned Records: ORDER BY
Sometimes, it is useful to be able to sort the rows that are returned in a result. For example: Imagine you had a simple script that displayed messages posted by users on your site; these messages were stored in a table. When a user posts a new message, it gets added to the table. Retrieve the messages from the database and output them, and they will show up in the same order that they were added. The newest message will be shown last because it was added last.
But what if you wanted to display the newest message first? You could load all the results from the query into an array, use an array sorting function, and then output it all. But there is an easier way built right into SQL:
The ORDER BY clause can be appended to any select statement—after the WHERE clause, and before the LIMIT clause—to order the results by one or more columns. In the case of string fields, the records are sorted alphabetically, from a-z, by default. Numeric fields are sorted from smallest to largest by default. Multiple columns can be included in the ORDER BY clause by separating them with commas.
INSERT is used to add a new record to a table. The syntax looks like this:
By now, the first part of this query should look fairly familiar to you. It begins with the SQL command we are going to perform, in this case INSERT. Because we are inserting a new record, INTO is included, followed by the name of the table we are adding it to.
The first set of parentheses surrounds a list of the fields within the table. These must be listed in the same order that they occur in the actual table.
The second portion defines the values that are going to be added. They must match the order of the fields within the query and the table for them to be inserted in the proper location.
Like the field names, SQL requires that each value be surrounded by quotes, because they are strings. If there is no value to be inserted, the position for the value must still be reserved by using empty quotes.
If you are inserting values for all of the fields, the field listing may be left out:
MySQL will insert the values into the columns in the order that they appear.
So far we’ve covered selecting and adding information to a table, but what if you want to change existing data? UPDATE allows us to do just that. The basic syntax for an UPDATE statement looks like this:
In this case, we’re updating table users, and changing the value of the column called email to “[email protected]”. Take another look at the query. You’ll notice that we have not specified which records should have the field email updated. If this query were run, all the fields called “email” in all the records would be changed. To confine the changes to a single (or multiple) specific records, the WHERE clause makes a re-appearance. In the select statement, it was used to evaluate which records should be returned; here, it is used to evaluate which records should be updated:
Now, only records where the id number has a value of one will be updated.
Multiple fields within the same record can also be updated. In this example, both name and email are altered:
What gets added must eventually get deleted. The DELETE query is the simplest query and looks like this:
Because entire records are deleted, there is no need to specify a field list. The WHERE clause is essential; without it, all the records in the table would be removed.
Creating New Databases and Tables
Creating and managing databases can be done using SQL via the command line interface to MySQL, through PHP queries you write yourself (mysql_query()), through popular managers like PhpMyAdmin (http://sourceforge.net/projects/phpmyadmin/), or through MySQL’s own GUI (http://www.mysql.com/products/mysqlcc/). Most Web hosts now include a default installation of PhpMyAdmin.
Because there are numerous options, this section will briefly focus on the SQL commands for creating databases and tables, rather than a specific method of executing them.
Setting up a new database is a two-step process. First, the database itself has to be created:
The new database is completely empty (and relatively useless) until some tables are added to it. Here, things get a little trickier.
Remember, each table is made up of multiple columns (fields). When a new table is created, we need to assign names for all of the fields we want included. Additionally, each must be assigned a MySQL data type and, in some instances, a length (some MySQL data types require a field length to be defined; others do not.)
MySQL’s data types determine what kind of information can be stored in a field. There are a wide range of data types available, which can loosely be classified into separate groups—string, numeric, and date/time types. Some of the most common types include:
- INT(225) (integer)—Holds integer values up to 255 characters. Length must be specified!
- VARCHAR(225)—Holds any string up to 255 characters. Length must be specified!
- TEXT—Holds strings with no maximum lengths
- DATETIME—Holds formatted date and time information.
To create the users table from the original example, the SQL would look like this:
The query begins with the CREATE TABLE command. Next is the unique name of the new table. Within the parentheses, each new field name is specified, followed by the field type and the maximum number of characters, if required.
The Importance of Primary IDs
You’ll notice that the id field is followed by “AUTO_INCREMENT” and that it is also marked as a Primary Key after all the field names are defined.
Good database design says that in almost every instance a table should have an id column that is a unique value. This gives each record an identifier that can be used for editing, deleting and updating, and for joining tables. When AUTO_INCREMENT is applied to a field, MySQL will automatically populate the field with unique number that increments every time a record is inserted. The field that serves as the unique identifier is the table’s primary key, and should be defined as such.
When there is something wrong with your PHP code, the processor automatically spits out an error message along with the line number where the error occurred. This makes troubleshooting exceptionally easy. At first glance, though, no such luck with MySQL queries. As mentioned before, the PHP functions that execute SQL queries just return FALSE when there is a problem. Not very helpful.
Fortunately, you are not left in the dark; there is a way to get more detailed error information. The PHP function mysql_error() returns the exact error message of the last failed query, according to MySQL. You’ll frequently see it used to do quick and dirty error output like this:
In effect, this line of code says “Execute this SQL statement. If the command fails, stop processing the running script and output the error.”
The error message can also be assigned to a variable for use in creating more graceful error handling, or in situations where you do not wish to terminate the script:
This article has introduced you to the basics of using SQL to work with MySQL through PHP. Next time, we’ll build on the concepts introduced here to build a working application.
Liz Fulghum currently lives in Annapolis, MD where she works as a web designer for a custom shirt retailer. Liz was convinced to try PHP as an alternative to Perl; she has been a fan ever since and frequently works as a freelance developer.
# # #