SQL Server – using OPTIMIZE FOR query hint

Hi Friends,

I really do not know what motivates me or drives me to write this blog post at 12 midnight, just landed from Singapore and waiting for my BA flight to Redmond – I guess it you, the readers and audience of SQLServerGeeks.com that motivates each author who is blogging here – when they see thousands of hits and comments, that really motivates :). I am so sleepy, yet blogging on query hint; if I make mistakes, pardon me 🙂

There is something called parameter sniffing – the query optimizer sniffs the values/constants specified in your query predicates and produces and optimized query plan based on the sniffed value.

For example:

USE AdventureWorks
GO
 
DBCC FREEPROCCACHE;
Go
 
SELECT SalesOrderId, OrderDate
from Sales.SalesOrderHeader
WHERE SalesPersonID=288

When you run the above code snippet with parameter value 288 (turn on actual execution plan), you will observe Index Seek, bookmark lookup and nested loop join. Note that the parameter value 288 produces only 16 rows.

When you run the following code, you will see in the execution plan, Seek, idx_date scan & hash match. Note that this query produces 473 records.

SELECT SalesOrderId, OrderDate
from Sales.SalesOrderHeader
WHERE SalesPersonID=277

The difference between the above two queries is just the constant value. Thus, the optimizer sniffs the values, uses statistics, computes the correct cardinality estimation and produces an optimized plan, which in our case is different.

Now run the same queries again, but this time using a variable and observe the execution plan.

DECLARE @SalesPersonID int;
SET @SalesPersonID = 288;
SELECT SalesOrderId, OrderDate
from Sales.SalesOrderHeader
WHERE SalesPersonID= @SalesPersonID
 
 
DECLARE @SalesPersonID int;
SET @SalesPersonID = 277;
SELECT SalesOrderId, OrderDate
from Sales.SalesOrderHeader
WHERE SalesPersonID= @SalesPersonID

You will observe that both the above queries produce the same hash match plan. Simply because now the optimizer does not get the variable value until runtime and uses default selectivity. In my case, the default selectivity was 223.882 and thus a hash match was a better plan for this value.  You can see this value when you compare estimated number of rows with actual number of rows. Both the above queries produce different actual number of rows, but the same estimated number of rows; and that is because of default selectivity.

   

Now, if you know (from the past usage of the table) that most of the times users punch in a SalesPersonID that produces a Nested Loop plan, you can use OPTIMIZE FOR hint. (I am using the term ‘nested loop plan’ and ‘hash plan’ just to differentiate between the two plans). And it might make business sense that there will be few sales persons who would be star performers and will produce records more than 224; majority will have much lesser sales 🙂 (I don’t deny it could be otherwise)

So, this is how I construct the query:

DECLARE @SalesPersonID int;
SET @SalesPersonID = 277;
SELECT SalesOrderId, OrderDate
from Sales.SalesOrderHeader
WHERE SalesPersonID= @SalesPersonID
OPTION (OPTIMIZE FOR (@SalesPersonID = 288));

Now, whatever be your sales person id, you will always get a nested loop plan. OPTIMIZE FOR hint now instructs the query optimizer to use 288 for @SalesPersonID when the query is compiled and optimized. This value is used only during query optimization, and not during query execution.

Advantage: you get optimized plan for most of your queries.

Downside: some queries that could benefit from the hash loop plan, will have to live with the nested loop plan.

There is always a tradeoff.

Many of you who are reading this now will love to talk more about other solutions and techniques but as you know the concept gets bigger and more complicated when we look into other hints like RECOMPILE, re-usage of plans, plan caching & recompilation, etc. You are most welcome to put in your comments. My intention here was to just introduce the OPTIMIZE FOR hint. And I hope you enjoyed reading this 🙂

 

 

   

About Amit Bansal

Amit Bansal is always brainstorming around SQL Server. Despite working with SQL since 1997, he is amazed that he keeps learning new things every single day. SQL Server is AB's first love, and his wife does not mind that. He tries to share as much and spreads the SQL goodness. Internals and Performance Tuning excites him, and also gives him sleepless nights at times, simply because he is not a genius, but quite a hard worker and does not give up. It has been a long and exciting journey since 1997, you can read here: http://sqlmaestros.com/amit-bansal/ He is on Twitter: https://www.twitter.com/A_Bansal

View all posts by Amit Bansal →

5 Comments on “SQL Server – using OPTIMIZE FOR query hint”

  1. I was thinking where I can use this to get the benefit in my conversion project. Use OPTIMIZE hint in a Stored Procedure or View will be not safe. Based on how much the filter condition is selective the SQL Server prepares the Query Plan. The number of records for a given value may not same all the time. In future when, developers estimation will not match with actual count, this hint may be cause of a bad query performance. For multiple parameters, defining the set of combination will be little difficult. I would like to use this hint in a query which I just prepare for a current situation and not use after that.

    Thanks for nice post.

  2. Hi Amit,

    Few months ago, I was facing the same problem of parameter sniffing. Later I was able to solve this issue by applying the CU2 on SQL Server 2008 R2.

    Please see the snippet from MSDN blog:

    “The change described in KB980653 is implemented via a SQL Server trace flag: 4136. When implemented, SQL Server will use an “average” estimate of selectivity derived from the All Density portion of optimizer statistics. This is identical to the behavior enabled through the new SQL2008 OPTION(OPTIMIZE FOR ‘UNKNOWN’) hint.”

    -Ravi

  3. Voow. I didn’t know about this before and only were using simple JOIN hints to decide HASH or NESTED join to use. Thanks

  4. I really should pay more attention to execution plans, but I really don’t like them because overall, they are generally wrong as far as estimated cost. Comparing fixed numbers the first one cost 9% and the second one 91% and putting the optimization hint, both of course took 50% of the cost. Using getdate() without the optimization hint the first query took 6 milliseconds and the second 206. With optimization 0 and 206. With fixed values 0 and 206.

    A clustered index scan, to me implies a table scan because the index would be ignored and if that really happened, both queries really would cost 50%. Since the cost isn’t anywhere near that what I thought it implied is wrong. It’s obvious the optimize hint, doesn’t improve or worsen the large result query, but it does help the small result, so forcing the one query type makes sense. Can the hint use a variable to define the type of query? IE (OPTIMIZE FOR (@SalesPersonID = @id));

Leave a Reply

Your email address will not be published.