Register | Login 
View Article  

Current Articles | Categories | Search | Syndication

Common Table Expressions

By Scott Klein on Saturday, December 09, 2006 :: 1409 Views :: 0 Comments :: SQL Server

There is a plethora of new T-SQL features and enhancements in SQL Server 2005 and you could probably write a small book talking about all of them.  However, this article will focus on one of them, specifically, Common Table Expressions because personally they have come in handy on many occasions.

I have used a few of the other new features and enhancements and if you have not had a chance to delve into any of them, I would highly recommend familiarizing yourself with some of them, such as the new xml data type, improved error handling, query and event notifications, new DDL (Data Definition Language) triggers, ranking functions, TOP operator enhancements, the PIVOT and UNPIVOT operators, and the new APPLY operator.  There are a few others and they all add tremendous value to T-SQL developers.

This article will introduce common table expressions and go as in-depth as space will allow, showing how CTE’s can be used to improve TSQL’s capabilities and how they can help better your T-SQL.

Common Table Expressions (CTE’s)

A common table expression is expressed as a temporary table or view defined within an executing statement such as SELECT INSERT, UPDATE, or DELETE.  Think of a CTE as somewhat like a derived table in that a CTE only lasts throughout the duration of the executing statement.  When that execution of the statement finishes, the CTE is gone.  However, there are a few differences between a derived table and a CTE:

          *              A CTE can reference itself.  A derived table cannot.

          *              A CTE can be referenced multiple times in the same query.

So what is the difference between a CTE and a true temporary table?  The quick and easy answer is the overhead of having to create a temporary table and the performance consequence of using a temporary table.  Think about what happens when you use a true temporary table.  First you have to define and create the table then populate it, then run queries against it.  When using CTE’s you can forgo the first step and the associated overhead and use the built in CTE which provides all of that functionality for you.

CTE Structure

The structure of CTE is fairly simple.  It basically contains two main parts, the first part being the CTE followed by the second part which is the normal execution statement.  The basic syntax for a CTE is as follows:

WITH common_table_expression_name [column(s)]

AS

(

  query_definition

)

EXECUTION STATEMENT

 

As defined in the syntax above, a common table expression contains several arguments which are defined as follows:

common_table_expression_name: The identifier for the common table expression.  This name must be different than any other table or view used within the CTE.  It can, however, be named the same as a base table or view but when the common table expression is referenced in the normal execution statement the common table expression name must be used, not the base name.

column(s): The column names in the common table expression.  These column names must be unique (meaning, no duplicate names allowed) and must match the same number of columns returned by the query_definition.

query_definition: The SELECT statement in which the results are used to populate the common table expression.  This statement follows the same requirements that the normal execution statement follows.

A CTE expression cannot define another CTE expression.

The following example uses the syntax outlined above to create a very simple common table expression query.  Open a query window in SQL Server Management Studio and execute the following query against the AdventureWorks database.

WITH CTE_ProdMod (ProductModelID, ProductModelCount) AS

(

SELECT   ProductModelID, COUNT(*)

FROM     Production.Product

WHERE    ProductModelID IS NOT NULL

GROUP BY ProductModelID

)

SELECT   ProductModelID, ProductModelCount

FROM     CTE_ProdMod

ORDER BY ProductModelID

 

The example above uses a common table expression to count the number of products for each product model in the Production.Product table.  Run this query and take a look at the results.

A simple example, yet not that impressive because it doesn’t quite showcase the true power and flexibility of what a common table expression can really do.  So, the following example uses a common table expression to return the count of products for each product model.  The count is returned in the CTE, then the CTE is used to filter the results even further based on what was returned in the CTE.

WITH CTE_TranHist (ProductID, OrderQuantity) AS

(

SELECT   ProductID, COUNT(*)

FROM     Production.TransactionHistory

WHERE    TransactionDate > '05/01/2004'

GROUP BY ProductID

)

SELECT cte.ProductID, cte.OrderQuantity, pp.productmodelid, ProductSubcategoryID

FROM Production.Product pp

INNER JOIN CTE_TranHist AS cte ON pp.productID = cte.productid

WHERE cte.productid BETWEEN 500 AND 750

AND ProductModelID IS NOT NULL

ORDER BY cte.ProductID

 

The example above uses a common table expression to count the number of products for each product model in the Production.Product table.  Run this query and look at the results.

Imagine now that syntax you would have had to use to get the same results prior to SQL Server 2005.  It would have looked something like this:

SELECT TranHist.ProductID, TranHist.OrderQuantity, pp.ProductModelID, pp.ProductSubcategoryID

FROM Production.Product pp

INNER JOIN (SELECT ProductID, COUNT(*)

               FROM Production.TransactionHistory

               WHERE TransactionDate > '05/01/2004'

               GROUP BY ProductID) AS TranHist (ProductID, OrderQuantity)

ON pp.productID = TranHist.productid

WHERE TranHist.productid BETWEEN 500 AND 750

AND ProductModelID IS NOT NULL

 

Now, which would you rather read, let alone write?  I thought so.  Common table expressions have a number of advantages including better readability and make it much easier to maintain more complex queries. 

Now that you have a general understanding of CTE’s, the true value of CTE’s appear when you have the need for recursive queries.  This topic is discussed next.

Recursive Queries

If you’re not excited about CTE’s yet then you will be shortly because the true power of CTE’s is the ability to do something that SQL developers have been praying for a very long time, and that is querying hierarchical data.  Didn’t you just cringe when your boss asked you to create a company employee org chart?  At long last the prayers of SQL developers have been answered.

In straightforward terms, a recursive CTE is a CTE that references itself.  It is executed over and over until no more results are returned.  Just like a normal CTE, a recursive CTE can be used with a normal SELECT, UPDATE, INSERT, AND DELETE statement, and even a CREATE VIEW statement.

The general syntax for create a recursive CTE is as follows:

WITH common_table_expression_name [column(s)]

AS

(

  query_definition_anchor_member

  [UNION ALL | UNION | EXCEPT | INTERSECT]

  query_definition_recursive_member

)

EXECUTION STATEMENT

 

A recursive CTE, while similar to that of a normal of a normal CTE, contains three parts versus the one of a normal CTE.  Those three parts are explained below.

common_table_expression_name: No different than that of a normal CTE, this is the identifier for the common table expression.  This name must be different than any other table or view used within the CTE.  It can, however, be named the same as a base table or view but when the common table expression is referenced in the normal execution statement the common table expression name must be used, not the base name.

query_definition_anchor_member: The anchor member is the query definition which forms the base result set.  Anchor members must precede the recursive member definition joined together by a UNION ALL operator. 

query_definition_recursive_member: The recursive query definition includes the query definitions joined to the anchor member by a UNION ALL operator.

A correctly coded recursive CTE must contain at least one anchor member query definition and one recursive member query definition.

 

A Simple Example

To understand the recursive CTE structure and they work, a simple sample is in order.

WITH CTE_Sample (Col1, Col2, Col3) AS

(

SELECT   Column1, Column2, Column3

FROM     Table1

WHERE    Some_Where_Clause

 UNION ALL

SELECT   Column1, Column2, Column3

FROM     Table1

 INNER JOIN CTE_Sample ON Table1.Column1 = CTE_Sample.Column1

)

SELECT Col1, Col2, Col3

FROM CTE_Sample

 

Examining the example above you should see that a recursive CTE works as follows:

First, the anchor query definition returns a base result set.  Next, the recursive query definition queries and returns information based on the anchor query definition, accomplished by joining the recursive query definition to the anchor query definition via a UNION ALL operator.  The recursive query is executed again using the results returned from the first iteration as the base query (or input value to the next iteration of the recursive query).  The recursive query is executed again and again until it finds no more results, at which point the entire result set is returned to the execution statement.

As a more detailed example, consider the following:

WITH CTE_Sample (MngrID, EmpID) AS

(

SELECT   ManagerID, EmployeeID

FROM     HumanResources.Employee

WHERE    ManagerID IS NULL

 UNION ALL

SELECT   ManagerID, EmployeeID

FROM     HumanResources.Employee

 INNER JOIN CTE_Sample ON Employee.ManagerID = CTE_Sample.EmpID

)

SELECT MngrID, EmpID

FROM CTE_Sample

 

This example, a scaled down version of the example found in the SQL Server 2005 BOL, follows the steps outlined in the previous paragraph.  The anchor query returns the base result set, in this case, the big cheese since he has no ManagerID.  The recursive query is then executed, joining on the anchor query to return all employees who has a ManagerID of the person found in the anchor query (in this case, the big cheese).  The recursive query is then executed again using those employees recently found in the first iteration of the recursive query, using those as the base query or input value to the current recursive query.

As stated above, a recursive CTE is executed over and over until no more results are returned. It does this by providing an internal, implicit termination check.  With each iteration of the CTE, it basically asks itself “hey, have we retrieved all the rows?” and if the results of that check are true then it says “OK, we’re done!” and exits the recursion.

During the 1960’s there was a popular TV show called “Lost in Space” in which an unnamed robot and his human counterpart, “Will Robinson” traveled through space tackling one adventure after another.  Part of the responsibility of the robot was to “watch the back” of Will and warn him of any potentially dangerous situations.  When a situation did arise that could be or indeed was dangerous, the robot would utter the phrase “Danger, Will Robinson!” and our hero would spring into action.

Why does this have to do with recursive CTE’s?  Because as cool as recursive CTE’s are, there exists the possibility of mistakenly creating a recursive CTE that results in an infinite loop.  Not good.  “Danger, Will Robinson!”.  This happens when a recursive member of the CTE returns the same value for both the parent and child columns.  Bam, you’re in an infinite loop.

Summary

OK, there you have it.  CTE’s in a nutshell.  The intent of this article is to provide a helpful explanation and overview of common table expressions in SQL Server 2005, and show the true power and flexibility they provide with recursive query capability.  We started out covering simple common table expressions and showed how much easier and clean they are to use over what you used to have to use in previous versions of SQL Server, and from there we discussed recursive CTE’s and how to use them to retrieve hierarchical data.  Very cool stuff.  Try not to drool when you start using common table expressions.  They have that affect.


Previous Page | Next Page

COMMENTS

Currently, there are no comments. Be the first to post one!
Click here to post a comment

Copyright (c) 2008 GSP Developers
Walling Info Systems | Terms Of Use | Privacy Statement