SQL has always had aggregate functions such as SUM() and MAX(); without them, many routine SQL queries would be impossible. The SQL Standard introduced window functions in ANSI SQL:2003 to deal with a range of processing tasks that require aggregating over a partition of a set, and extended them in ANSI SQL:2008. Microsoft introduced the first of this class of functions in SQL Server 2005, in two flavors, ranking and aggregate, and released further functions in subsequent versions.
When were Window Functions introduced in SQL Server?
Let’s take a quick look at some abridged history of the implementation of window functions in SQL Server.
| SQL | Function Types | Functions | Remark |
|-----|----------------|-----------|--------|
| 2005 | Ranking | ROW_NUMBER, DENSE_RANK, RANK, NTILE | Supports PARTITION BY (optional) and ORDER BY (required) in the OVER clause. |
| 2005 | Aggregates | SUM, AVG, COUNT, MIN, MAX | Requires PARTITION BY in the OVER clause. |
| 2008 | Ranking | ROW_NUMBER, DENSE_RANK, RANK, NTILE | No relevant changes to supported functionality. |
| 2008 | Aggregates | SUM, AVG, COUNT, MIN, MAX | |
| 2012 | Ranking | ROW_NUMBER, DENSE_RANK, RANK, NTILE | No relevant changes to supported functionality. |
| 2012 | Aggregates | SUM, AVG, COUNT, MIN, MAX | Window frame capability (ROWS/RANGE) was added. |
| 2012 | Analytic | CUME_DIST, LEAD, FIRST_VALUE, PERCENTILE_CONT, LAG, PERCENTILE_DISC, LAST_VALUE, PERCENT_RANK | Added. |
There are plenty of excellent articles that explain the workings of the SQL window aggregate functions, for example:
- This recent one by Joe Celko: Window Functions in SQL.
- An earlier article by Microsoft Certified Master Wayne Sheffield (recently updated): The new Analytic functions in SQL Server 2012
They’re useful, but do they perform well?
I’ve seen very few discussions of the performance characteristics of these functions, so today we will attempt to do just that. We will assume that you have a basic understanding of how both SQL aggregate and window aggregate functions work. If you are unsure or they are new to you, you should probably read either/both Joe Celko’s or Wayne Sheffield’s articles first.
Before window functions existed, we had to produce quite complex code to do those tasks for which window functions were designed. Nowadays, we find window functions so useful and flexible that their use has become ubiquitous in SQL.
Although there is no doubt that the window functions add richness to the SQL language, greatly simplifying the syntax and queries they appear in, we’re still left with the nagging doubt as to whether they are as fast as the older methods. They’re more easily maintained, but are they faster? This is what we want to find out. We haven’t the space to look at every window function in detail, so we’ll limit our testing to just a few, and offer some references where you can look for additional comparisons.
Sample Data and a Simple Example
The simple table and sample data we’ll use for this example is the same as one that I’ve published in a Simple Talk article in the past. You’ll see the reference when we get around to discussing PERCENTILE_CONT.
```sql
CREATE TABLE #SampleTable
(
    ID INT
    ,N INT
);
CREATE INDEX i1 ON #SampleTable (ID, N);

-- Basic test data
INSERT INTO #SampleTable
VALUES (1,1),(1,2),(1,3),(1,4),(1,5)
    ,(2,1),(2,2),(2,3),(2,4)
    ,(3,10),(3,2),(3,10),(3,4)
    ,(4,1),(4,5),(4,1),(4,3),(4,3);

-- SQL 2005 window aggregate example
SELECT ID, N, [Rows]=COUNT(*) OVER (PARTITION BY ID)
FROM #SampleTable
--WHERE ID = 1;
```
If we run this code with the WHERE clause uncommented, it produces this result set, showing how the COUNT of the rows within the partition is appended as a column to each row.
```
ID   N    Rows
1    1    5
1    2    5
1    3    5
1    4    5
1    5    5
```
Even though they're named the same, the window aggregates operate differently from the original aggregate functions, in that the result appends a column across the entire row set rather than condensing the set to one row per partition, as a GROUP BY does. With just a little extra effort on the coding side, it is possible to use the original SQL aggregate functions to mimic what the window aggregates do.
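For contrast, here is a quick sketch of the plain GROUP BY aggregate, which condenses the sample data to one row per ID instead of appending the count to every row:

```sql
-- Traditional GROUP BY aggregate: one row per ID; the N detail rows are lost
SELECT ID, [Rows]=COUNT(*)
FROM #SampleTable
GROUP BY ID;
```

Against the sample data above, this returns just four rows (IDs 1 through 4 with counts 5, 4, 4 and 5), which is why a JOIN or CROSS APPLY back to the table is needed to mimic the window aggregate.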
In SQL 2000 and earlier, the equivalent result could be obtained using the following code pattern:
```sql
-- SQL 2000 equivalent
SELECT a.ID, N, [Rows]
FROM #SampleTable a
JOIN
(
    SELECT ID, [Rows]=COUNT(*)
    FROM #SampleTable b
    GROUP BY ID
) b ON a.ID = b.ID;
```
And when SQL 2005 came along, this could also be done with a CROSS APPLY:
```sql
-- SQL 2005 equivalent
SELECT a.ID, N, [Rows]
FROM #SampleTable a
CROSS APPLY
(
    SELECT ID, [Rows]=COUNT(*)
    FROM #SampleTable b
    WHERE a.ID = b.ID
    GROUP BY ID
) b;
```
When I compared the SQL 2000 pattern and the CROSS APPLY pattern available in SQL 2005, I noted no discernible performance difference. Since I like CROSS APPLY and I'm the writer ( ... and I'm the editor: Ed), our initial comparison will be between the window aggregate COUNT and the SQL 2005 CROSS APPLY pattern.
The Performance Test Harness
If we TRUNCATE the data we put into our table initially, we can add a large number of rows to it using the following test harness.
```sql
DECLARE @N INT = 10000;    -- 10,000 = ~1,000,000 rows

WITH Tally (n) AS
(
    SELECT TOP (@N) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns a CROSS JOIN sys.all_columns b
)
INSERT INTO #SampleTable
SELECT a.n, ABS(CHECKSUM(NEWID()))%@N
-- Always create exactly 100 IDs from the Tally table
FROM (SELECT TOP 100 n FROM Tally) a
CROSS APPLY
(
    SELECT TOP (@N) n FROM Tally
) b;
```
Our initial test at 1 million rows, shunting the results into a local variable, generated these timing results for COUNT.
```
SQL 2008 Window Aggregate: COUNT
 SQL Server Execution Times:
   CPU time = 3105 ms, elapsed time = 2187 ms.

SQL 2005 Equivalent (CROSS APPLY): COUNT
 SQL Server Execution Times:
   CPU time = 390 ms, elapsed time = 376 ms.
```
You can already see that this particular window aggregate seemingly cannot be classified as “high-performance” SQL. But we don’t want to generalize, so let’s test each of the window aggregate functions vs. its corresponding traditional pattern and graphically demonstrate the performance results.
Performance of Traditional Aggregates vs. Window Aggregates
Since it is easy to replace COUNT(*) with XXX(N) (where XXX=SUM, AVG, MIN and MAX) in each of the two queries we’re testing, we can easily generate queries that compare each aggregate function against its corresponding window aggregate. Resource files (described at the end) are provided if you want to see each of these queries or run these performance tests yourself.
We’ll also include a test case with all of the aggregates in one query, to determine whether the cost is cumulative or not. Those two queries look like this.
```sql
SELECT ID, N
    ,[Rows]=COUNT(*) OVER (PARTITION BY ID)
    ,[Sum]=SUM(N) OVER (PARTITION BY ID)
    ,[Average]=AVG(N) OVER (PARTITION BY ID)
    ,[Minimum]=MIN(N) OVER (PARTITION BY ID)
    ,[Maximum]=MAX(N) OVER (PARTITION BY ID)
FROM #SampleTable;

SELECT a.ID, N
    ,[Rows], [Sum], [Average], [Minimum], [Maximum]
FROM #SampleTable a
CROSS APPLY
(
    SELECT ID
        ,[Rows]=COUNT(*)
        ,[Sum]=SUM(N)
        ,[Average]=AVG(N)
        ,[Minimum]=MIN(N)
        ,[Maximum]=MAX(N)
    FROM #SampleTable b
    WHERE a.ID = b.ID
    GROUP BY ID
) b;
```
The histograms show the comparative elapsed times for each of the aggregates, with the last column showing the case of all aggregates applied to the same partition within one query, with and without the suggested INDEX.
Many of these cases perform better without the index than with it.
This table shows the CPU and elapsed time, indicating the “cost” of each window aggregate compared to the traditional aggregation method.
**With INDEX**

| Measure | Query | COUNT | SUM | AVG | MIN | MAX | ALL |
|---------|-------|------:|----:|----:|----:|----:|----:|
| CPU (ms) | Window Aggregate (WA) | 3039 | 3136 | 3167 | 2901 | 3028 | 3556 |
| | Traditional Aggregate (TA) | 390 | 499 | 531 | 453 | 452 | 858 |
| | Cost | 679% | 528% | 496% | 540% | 570% | 314% |
| Elapsed (ms) | Window Aggregate (WA) | 2208 | 2260 | 2419 | 2236 | 2233 | 2676 |
| | Traditional Aggregate (TA) | 385 | 491 | 525 | 465 | 457 | 855 |
| | Cost | 474% | 360% | 361% | 381% | 389% | 213% |
**Without INDEX**

| Measure | Query | COUNT | SUM | AVG | MIN | MAX | ALL |
|---------|-------|------:|----:|----:|----:|----:|----:|
| CPU (ms) | Window Aggregate (WA) | 4945 | 5226 | 5130 | 5257 | 5008 | 6287 |
| | Traditional Aggregate (TA) | 1141 | 1357 | 1391 | 1200 | 1247 | 1529 |
| | Cost | 333% | 285% | 269% | 338% | 302% | 311% |
| Elapsed (ms) | Window Aggregate (WA) | 1933 | 1967 | 1921 | 1663 | 1823 | 2050 |
| | Traditional Aggregate (TA) | 441 | 440 | 453 | 416 | 423 | 648 |
| | Cost | 338% | 347% | 324% | 300% | 331% | 216% |
Cost in all cases is calculated as: 100*(WA-TA)/TA
Wherever you stand, my valued reader, this seems a pretty steep price to pay for streamlining the query syntax! These timing results were obtained in SQL 2008 but, sadly, there was no perceptible improvement when running the equivalent test in SQL 2012. One might have hoped that Microsoft had noticed, or been informed of, this performance lag and done something to improve it.
Redemption in the Ranking Functions
When I first started using the window ranking functions in SQL 2005, I was very impressed. All of them solve some very sticky querying predicaments and they do so with excellent speed. In fact, I’d be hard-pressed to come up with any SQL 2000 alternatives that have even reasonable performance (without using temporary tables) to compare against so I won’t even try.
Kudos to Microsoft on this one!
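To illustrate why (a quick example of my own, not drawn from the articles above), consider the classic "top row per group" problem, which ROW_NUMBER dispatches in a single pass over the data:

```sql
-- Return the row with the highest N for each ID in #SampleTable
SELECT ID, N
FROM
(
    SELECT ID, N
        ,rn=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY N DESC)
    FROM #SampleTable
) a
WHERE rn = 1;
```

A SQL 2000 alternative needs a correlated subquery or a self-JOIN against a MAX aggregate, and still has to break ties somehow.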
SQL 2012: Enter the Window Frame for the Aggregate Functions
In past versions of SQL Server, a few problems posed some very difficult cases for producing a high-performing query. The most notable of these are the following:
- The Running Totals problem
- Calculating values within a Rolling Window (e.g., getting a rolling 12 months total)
Both of these problems are now quite easily solved by the SQL 2012 window frame approach. Rather than go through all of the details here, I’ve provided some useful links above where you can take a look at these problems yourself. I will summarize the results you can expect here.
- Both problems can be solved using a SQL 2012 window frame applied to the window aggregate SUM function.
- You can expect the window frame approach to be faster than any triangular or semi-triangular JOIN approach (these are explained in the two linked articles above).
- Neither problem can be solved faster with a window frame than with what is known as the Quirky Update (QU). The article by SQL MVP Jeff Moden linked above describes in great detail all the rules needed to correctly implement the QU. My article on calculating values within a rolling window shows how the QU is much faster than the window frame approach for that particular problem. Microsoft Certified Master Wayne Sheffield showed how the QU is faster than a SQL 2012 window frame for the running totals problem in his blog Running totals in “Denali” CTP3. Even though that study was done in a pre-release of SQL 2012, it is doubtful that the performance characteristics have changed.
Overall, the SQL 2012 window frame offers a welcome addition to the SQL window functions. The performance is better than most methods and given that its behavior is fully documented (unlike the QU), it is a prime choice to use for these problems, unless you absolutely must squeak out the highest possible performance and you’re willing to live with the “undocumented behavior” of the QU approach.
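As a minimal sketch of the syntax (using our #SampleTable, with N standing in for a period value), the two problems above reduce to a choice of window frame:

```sql
-- Running total: everything from the start of the partition to this row
SELECT ID, N
    ,RunningTotal=SUM(N) OVER (PARTITION BY ID ORDER BY N
                               ROWS UNBOUNDED PRECEDING)
FROM #SampleTable;

-- Rolling window: the current row plus the 11 preceding rows,
-- i.e., a rolling 12-period total when each row is one month
SELECT ID, N
    ,Rolling12=SUM(N) OVER (PARTITION BY ID ORDER BY N
                            ROWS BETWEEN 11 PRECEDING AND CURRENT ROW)
FROM #SampleTable;
```

Note the use of ROWS rather than RANGE in the frame: besides giving the expected row-by-row behavior, ROWS generally performs better in SQL 2012 because it can use an in-memory worktable.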
SQL 2012: And Then There are the Analytic Functions
The analytic functions in SQL 2012 offer unprecedented new functionality that is extremely useful in statistical analyses. Just the fact that they exist will save many developers of scientific, statistical and engineering applications lots of headaches trying to replicate the same functionality within complex SQL code. Of course, I’m sure CLR versions exist, but there will be cases when they can’t be used due to development constraints. Overall I have to give kudos to Microsoft for these as well.
But in the world of SQL, it always depends. Sometimes there are alternatives that may be just enough faster that you’d want to consider using them even though they may look more complicated.
PERCENTILE_CONT
A case in point is the PERCENTILE_CONT analytic function. Its most common usage is to calculate the median of a statistical sample. With the PARTITION clause, it can do so against partitioned sets. Let’s compare a couple of queries that calculate median for this case. We can use the same sample data that we created before.
```sql
-- Calculate Median in SQL 2012
SELECT ID, N
    ,Median=PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY N)
        OVER (PARTITION BY ID)
FROM #SampleTable;

-- One Way to Calculate Median in SQL 2008
WITH Counts AS
(
    SELECT ID, b.c
    FROM
    (
        SELECT ID, c=COUNT(*)
        FROM #SampleTable
        GROUP BY ID
    ) a
    CROSS APPLY
    (
        VALUES((c+1)/2),(CASE c%2 WHEN 0 THEN 1+c/2 ELSE 0 END)
    ) b(c)
),
CalculateMedian AS
(
    SELECT a.ID, Median=CAST(AVG(0.+b.N) AS FLOAT)
    FROM
    (
        SELECT ID, N
            ,rn=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY N)
        FROM #SampleTable a
    ) a
    CROSS APPLY
    (
        SELECT N
        FROM Counts b
        WHERE a.ID = b.ID AND a.rn = b.c
    ) b
    GROUP BY a.ID
)
SELECT a.ID, N, Median
FROM #SampleTable a
JOIN CalculateMedian b ON a.ID = b.ID;
```
While the second query may look extraordinarily complex, the results when we run it are much better than PERCENTILE_CONT.
| MEDIAN | With INDEX: CPU (ms) | With INDEX: Elapsed (ms) | Without INDEX: CPU (ms) | Without INDEX: Elapsed (ms) |
|--------|---------------------:|-------------------------:|------------------------:|----------------------------:|
| SQL 2012: PERCENTILE_CONT | 6520 | 5842 | 13385 | 3786 |
| SQL 2008: Complex Median Query | 2340 | 2327 | 5477 | 1897 |
| Cost | 179% | 151% | 144% | 100% |
And there are even (much) faster solutions available for calculating the median in cases where the suggested index is present. The test harness for these comparisons is included in this article's resources, described at the end.
On the other hand, PERCENTILE_CONT does a lot more than just calculate the median. We'll let you explore Microsoft's description of the function, linked from the function name, to find out more.
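As one quick sketch of that extra flexibility (my own example), the same WITHIN GROUP syntax computes any percentile, and the companion PERCENTILE_DISC returns an actual value from the set rather than an interpolated one:

```sql
-- 90th percentile per ID: interpolated (CONT) vs. discrete (DISC)
SELECT DISTINCT ID
    ,P90_Cont=PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY N)
        OVER (PARTITION BY ID)
    ,P90_Disc=PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY N)
        OVER (PARTITION BY ID)
FROM #SampleTable;
```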
LAG and LEAD Applied to Finding Gaps in Sequence Numbers
Recently, Simple Talk published an article of mine entitled The SQL of Gaps and Islands in Sequences. In that article I compared three “traditional” solutions for the gaps problem to one of my own design. We'll now take the 1M-row test harness provided there and modify it (see the description of the resource files at the end) to include the SQL 2012 methods, where either LAG or LEAD can be applied to identify gaps. Often you'll see one or the other of these solutions suggested; here we'll show both.
```sql
SELECT ID, StartGap=SeqNo+1, EndGap=nxt-1
FROM
(
    SELECT ID, SeqNo
        ,nxt=LEAD(SeqNo, 1) OVER (PARTITION BY ID ORDER BY SeqNo)
    FROM dbo.GapsIslands
) a
WHERE nxt - SeqNo <> 1;

SELECT ID, StartGap=prv+1, EndGap=SeqNo-1
FROM
(
    SELECT ID, SeqNo
        ,prv=LAG(SeqNo, 1) OVER (PARTITION BY ID ORDER BY SeqNo)
    FROM dbo.GapsIslands
) a
WHERE SeqNo - prv <> 1;
```
Once again, these solutions are nice and concise, but let’s see how they perform in our test harness.
| GAPS | CPU (ms) | Elapsed (ms) |
|------|---------:|-------------:|
| Gaps Solution #1 from SQL MVP Deep Dives | 780 | 782 |
| Gaps Solution #2 from SQL MVP Deep Dives | 4665 | 1451 |
| Gaps Solution #3 from SQL MVP Deep Dives | 1233 | 1219 |
| CROSS APPLY VALUES – Islands to Gaps | 421 | 462 |
| SQL 2012 LEAD | 1747 | 1740 |
| SQL 2012 LAG | 1357 | 1363 |
In this case, all of the traditional solutions outperformed the SQL 2012 LEAD function, and all but one outperformed LAG, in elapsed time. Only one traditional solution consumed more CPU time. My conclusion from this is that, while LAG (or LEAD) may improve the clarity of your code, opting for one of the well-documented, traditional approaches to gaps will solve the problem (particularly against very large row sets) more efficiently.
Wayne Sheffield also blogged on some of the existing high performance approaches to gaps in SQL 2012 Performance Test: Gap Detection. In this blog, he compares the performance of LEAD only to a couple of competing high-performance methods that can be run in earlier versions of SQL. His test of LEAD also showed it to be much slower (cost about 250% according to my prior calculation method) than the other approaches that he tested. Using LAG would be expected to have approximately the same performance characteristic (although in my test harness it appears to be slightly better).
Conclusions
Just as newer doesn't always mean better, simpler doesn't always translate into better either, particularly where performance is concerned. Ultimately you must be the judge of whether simpler query forms are more desirable from a maintenance perspective than a better-performing solution.
As far as the SQL window ranking functions are concerned, there is probably no better approach available in earlier versions of SQL. On the other hand, the SQL window aggregate functions can be much improved upon with respect to performance by relying on only a slightly less simple code pattern based on standard aggregate functions.
For the SQL 2012 enhancements to the window aggregates (the window frame), for problems such as running totals they’re probably a good alternative if you don’t like or understand how the Quirky Update works. LEAD and LAG show promise, but there remain a few traditional methods that outperform them for problems like finding gaps in sequence numbers.
All of the new SQL 2012 analytic window functions provide unprecedented flexibility. But using PERCENTILE_CONT to calculate the median can be improved upon, from a performance perspective, with more traditional, albeit more complicated, code patterns.
In the end, using the latest query forms may offer a shorter path to getting your code working, because the queries tend to be simpler. This article is simply a suggestion that you consider looking into and testing other code patterns in cases where performance is really important to you.
The resource files provided with this article are:
- Window Aggregate Functions Test Harness.sql – This test harness is initially set up to create a 1,000,000 row test table and then display the timing results for window aggregates vs. corresponding traditional aggregations, while suppressing the actual query results by assigning to a local variable. Comments in the beginning will guide you if you want to just display results for the simple test data shown in the initial example.
- Median Test Harness.sql – This test harness is initially set up to create a 1,000,000 row (approximate) test table and then display the timing results for the SQL 2012 vs. an alternative median solution, while suppressing the actual query results by assigning to a local variable. Comments in the beginning will guide you if you want to just display results for the simple test data shown in the initial example.
- Gaps Test Harness.sql – This test harness is initially set up to create a 1,000,000 row (approximate) test table filled with gaps and then test 4 “traditional” gaps solutions vs. the SQL 2012 LEAD and LAG functions, while suppressing the actual query results by assigning to a local variable. Comments in the beginning will guide you if you want to just display results for the simple test data shown in the initial example.