Since SQL is a set-oriented query language, statements that group or summarize data are its bread and butter. In conjunction with aggregate functions, they are the means by which the real work of SQL queries is performed. Developers familiar with DBMS products that lean more toward single-record handling find this peculiar because they are accustomed to working with data one row at a time. Generating summary information by looping through a table is a common technique in older database products—but not in SQL Server. A single SQL statement can perform tasks that used to require an entire COBOL program to complete. This magic is performed using SELECT's GROUP BY clause and Transact-SQL aggregate functions. Here's an example:
SELECT customers.CustomerNumber,SUM(orders.Amount) AS TotalOrdersFROM customers JOIN ordersON customers.CustomerNumber=orders.CustomerNumberGROUP BY customers.CustomerNumber
This query returns a list of all customers and the total amount of each customer's orders.
-- Bad SQL - don't do thisSELECT customers.CustomerNumber, customers.LastName,SUM(orders.Amount) AS TotalOrdersFROM customers JOIN ordersON customers.CustomerNumber=orders.CustomerNumberGROUP BY customers.CustomerNumber
This query won't execute because it's missing a column in the GROUP BY clause. Instead, it should read:
GROUP BY customers.CustomerNumber, customers.LastName
Note that the addition of the LastName column doesn't really affect the results since CustomerNumber is a unique key. That is, including LastName as a GROUP BY column won't cause any additional grouping levels to be produced since there is only one LastName for each CustomerNumber.
The HAVING clause is used to limit the rows returned by a SELECT with GROUP BY. Its relationship to GROUP BY is similar to the relationship between the WHERE clause and the SELECT itself. Like the WHERE clause, it restricts the rows returned by a SELECT statement. Unlike WHERE, it operates on the rows in the result set rather than the rows in the query's tables. Here's the previous query modified to include a HAVING clause:
SELECT customers.CustomerNumber, customers.LastName, SUM(orders.Amount)AS TotalOrdersFROM customers JOIN ordersON customers.CustomerNumber=orders.CustomerNumberGROUP BY customers.CustomerNumber, customers.LastNameHAVING SUM(orders.Amount) > 700CustomerNumber LastName TotalOrders-------------- -------- -----------3 Citizen 86753.091 Doe 802.35
There is often a better way of qualifying a query than by using a HAVING clause. In general, HAVING is less efficient than WHERE because it qualifies the result set after it's been organized into groups; WHERE does so beforehand. Here's an example that improperly uses the HAVING clause:
-- Bad SQL - don't do thisSELECT customers.LastName, COUNT(*) AS NumberWithNameFROM customersGROUP BY customers.LastNameHAVING customers.LastName<>'Citizen'
Properly written, this query's filter criteria should be in its WHERE clause, like so:
SELECT customers.LastName, COUNT(*) AS NumberWithNameFROM customersWHERE customers.LastName<> 'Citizen'GROUP BY customers.LastName
In fact, SQL Server recognizes this type of HAVING misuse and translates HAVING into WHERE during query execution. Regardless of whether SQL Server catches errors like these, it's always better to write optimal code in the first place.
The ORDER BY clause is used to order the rows returned by a query. It follows the WHERE and GROUP BY clauses (if they exist) and sorts the result set just prior to returning it. Here's an example:
SELECT LastName, StateFROM customersORDER BY State
Here's another example:
SELECT FirstName, LastNameFROM customersORDER BY LastName DESC
Note the use of the DESC keyword to sort the rows in descending order. If not directed otherwise, ORDER BY always sorts in ascending order.
Page 8 of 10