October 31, 2014
Hot Topics:
RSS RSS feed Download our iPhone app

The Basics of SQL Joins in MySQL

  • March 29, 2010
  • By Jason Gilmore
  • Send Email »
  • More Articles »

When making your first forays into relational database development, you can use simple SQL statements to mine your data easily enough. However, as your data grows in both size and breadth, you'll need to begin employing more sophisticated strategies for exploiting increasingly complex data relations. After all, pulling data from a single table is easy, but what if you need to query for data spanning three, four, or even eight different tables?

Retrieving interrelated data stored within multiple tables is most effectively accomplished using a SQL JOIN clause, of which there are several variants. The JOIN clause relies upon related fields found in two tables to determine the commonality of the data stored within each, producing a data set that you can then easily save or further manipulate.

In this article, I introduce three of the most commonly used JOIN variants: the INNER JOIN, OUTER JOIN, and SELF JOIN, and provides MySQL examples for using them. Although these examples will be MySQL-specific, you'll be able to use what you learn here within most -- if not all -- other relational database solutions.

The Inner Join

Suppose you were tasked with creating reports that extracted data from a customer relationship management (CRM) application. Specifically, the sales team wanted a list of all customers and their associated professions. Simplified versions of the customers and professions tables look like this:

CREATE TABLE customers (
 -> id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
 -> name VARCHAR(255) NOT NULL,
 -> profession_id INTEGER UNSIGNED NOT NULL
);


CREATE TABLE professions (
 -> id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
 -> name VARCHAR(255) NOT NULL
);


The shared commonality between these two tables is a profession ID, represented by the id column in the professions table, and the profession_id column in the customers table. Knowing this, you can use an INNER JOIN to retrieve each customer name and the associated profession name like this:

SELECT * FROM customers INNER JOIN professions ON customers.profession_id = professions.id;


Executing this will produce a result set containing all of the values found in each interrelated row:

+----+------------+---------------+----+---------------+
| id | name       | profession_id | id | name          |
+----+------------+---------------+----+---------------+
|  1 | Acme, Inc. |             1 |  1 | Manufacturing | 
+----+------------+---------------+----+---------------+


Because you're interested only in the customer and profession names, you can revise the query to look like this:

SELECT customers.name, professions.name FROM customers 
  INNER JOIN professions 
    ON customers.profession_id = professions.id;


Executing the revised query produces the following output:

+------------+---------------+
| name       | name          |
+------------+---------------+
| Acme, Inc. | Manufacturing | 
+------------+---------------+


Filtering Records

You can attach other SQL clauses to JOINs to produce sorted or filtered output. For instance, to retrieve only customers whose professions are "Technical writers," you would use this query:

SELECT customers.name, FROM customers 
  INNER JOIN professions 
    ON customers.profession_id = professions.id
  WHERE professions.name = "Technical writer";


The OUTER JOIN

The INNER JOIN will return rows only when a matching value is found in both tables. However, what if you discovered some data inconsistencies, which arose due to the mass import of customers from another CRM solution, and you needed to know which customers have not yet been assigned a profession? You can use an OUTER JOIN for that job, of which there are two types: the LEFT OUTER JOIN (or LEFT JOIN) and RIGHT OUTER JOIN (RIGHT JOIN).

The LEFT OUTER JOIN will retrieve all rows in the table located on the "left" side of the JOIN, regardless of whether a related row is found in the table located on the JOIN's "right" side. In cases where no matching row is found, NULL will serve as a placeholder. For instance, the following LEFT JOIN will produce a list of all customers and professions, even if no profession assignment has been made:

SELECT customers.name, professions.name FROM customers 
LEFT JOIN professions 
  ON customers.profession_id = professions.id;


Suppose the customer "Taylor Made Teapots" lacked a corresponding profession. Executing a LEFT JOIN would produce the following output:

+---------------------+---------------+
| name                | name          |
+---------------------+---------------+
| Acme, Inc.          | Manufacturing | 
| Taylor Made Teapots | NULL          | 
+---------------------+---------------+


What if you wanted to retrieve just a list of customers lacking a profession designation? You can use the WHERE clause in conjunction with the IS NULL predicate, like this:

SELECT customers.name, professions.name FROM customers 
LEFT JOIN professions 
  ON customers.profession_id = professions.id;
WHERE professions.name IS NULL;


The RIGHT JOIN works identically to the LEFT JOIN, except that all rows on the right side of the JOIN will be returned, regardless of whether a shared row is found in the table residing on the left side of the JOIN. For instance, you could use a RIGHT JOIN to determine which professions are not represented within the customer database:

SELECT professions.name FROM customers 
RIGHT JOIN professions 
  ON customers.profession_id = professions.id 
WHERE customers.name IS NULL;


Executing this RIGHT JOIN produces output similar to the following:

+---------------+
| name          |
+---------------+
| Plumber       | 
| Airline Pilot | 
+---------------+






Page 1 of 2



Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 


Sitemap | Contact Us

Rocket Fuel