Hi Friends,

Here is a query and I want to create the right indexes for it. Let us do things step by step: (I am using AdventureWorks2008 for this)

In the above query we are doing a SUM on SubTotal and grouping by CustomerID – the business requires total sales for all the customers.

When we run the above query, we get the following execution plan (provided you have not done any changes to the default indexes on SalesOrderHeader table)

1_Indexing_for_Aggregates_in_SQL_Server

As you can see it does a clustered index scan which is fine as it needs to read all the records on the pages. But we also see that there is a Hash Match and the moment we see Hash operations, we can try to improve upon it.

Furthermore, my business tells me that they need the reported sorted by customerID. So I modify the query as follows and run it.

When I observe the execution plan, this is what I get.

2_Indexing_for_Aggregates_in_SQL_Server

And rightly so, the optimizer adds a Sort operator. Now there is certainly some room for improvement here – look at the 56% cost of the Sort Operator and we are dealing with 32000 rows approx..

When I run the missing index DMV, it does not suggest me anything at all. You can try it as well to see if you get any recommendations:

Coming back to our example, we shall try to improve the performance of the query by creating the right indexes that cover the query.

Let us try the first option:

This index creates a covering index as it includes SubTotal & CustomerID and covers the query. But notice the order of columns. Let us run the query and see if the optimizer picks up this index or not.

3_Indexing_for_Aggregates_in_SQL_Server

Yes, the index gets picked up. But is there a performance improvement? Run both the queries as a batch and observe their relative cost:

Select both the queries in your query window and press F5. I run the first one as before where the optimizer chooses the best option. I run the second one with the hint to use the clustered index, mimicking the previous behavior. Now, observe their relative cost in the execution plan.

4_Indexing_for_Aggregates_in_SQL_Server

You will observe that there is slight performance improvement. 44% vs 56%.

Is there any scope of further improvement? Yes there is, since we have still not got rid of Sort and Hash operator out there. This improvement is just because we are using a covering index. But this index does not have the columns in the right order (remember the order of the columns when we created the index)

Let us create another index:

This time I change the order of columns ensuring that my index is sorted on CustomerID.

Run the query to see if the optimizer picks up this index. Remember, we have not dropped the other indexes.

5_Indexing_for_Aggregates_in_SQL_Server

Yes, the optimizer picks this up. And most interestingly, it changes the HASH operator to Stream Aggregate and the Sort operator is also gone. Why? Simply because now the data is in sorted on customerID and Stream Aggregate needs that and gets that. But is there any performance improvement?

Let us run this query and the previous query in a batch and observe the relative cost.

Just like before I run the first one (let SQL Server decide the best approach) and the second one with a hint to use Option1. Select both the queries in SSMS and press F5.

6_Indexing_for_Aggregates_in_SQL_Server

Observe the performance improvement. 6% vs 94%.

Having the right indexes is very critical to get the right performance from your queries. I would also like to point out that covering indexes are good but they are covering with respect to a particular query. Creating too many covering indexes to cover your queries is also not the right approach. You need to strike a fine balance here. We could have also created Option2 with included columns (putting SubTotal in include list) which is also the right approach but in this case we would have hardly got any performance improvement given the size of data and also that our index keys are very small. But you can certainly try that option as well – don’t rule that out.

Hope you enjoyed reading this article.

Do post your comments and spread the word.