In this blog I’ll look at EXCEPT vs NOT IN performance differences.
EXCEPT command introduced in SQL Server 2005 is used to compare two tables/view/sub queries. It returns distinct rows from left result set which are not present in the right result set. The NOT IN operator does the same however it doesn’t returns the distinct result set. Let’s look at an example
Execute the below script to create dummy tables and insert rows.
-- Create dummy tables
Create table tblone (sno int identity,col1 char(1))
create table tbltwo (sno int identity, col1 char(1))
insert into tblone values('A'),('A'),('B'),('C')
insert into tbltwo values('B'),('E'),('F')
Let’s now find all col1 values from tblone which are not in tbltwo using EXCEPT and NOT IN operator and compare the result.
The EXCEPT returns distinct rows whereas NOT IN didn’t return distinct values. Let’s now analyze the execution plan for both the queries.
As shown in above snapshot, the EXCEPT query is slower than NOT IN. The distinct sort operator in the EXCEPT costs 63% of the total execution time. Let’s create a non-clustered index on tblone.col1 and then analyze the execution plan.
CREATE INDEX ix_col1 on tblone(col1)
As shown in above snapshot, the performance of EXCEPT and NOT IN is same as the distinct sort operator is replace by the index scan on tblone.ix_col1.