Hi Friends,

This one is about SQL Server removing duplicate records in SQL Server 2005 and above. Consider the below table with duplicate records.

The above insert statement is valid only from sql server 2008 onwards. The below code can be used to delete duplicate records from the above table.

  • Query 1: Using Row_Number with CTE

  • Query 2: Using Row_Number in a subquery

The table tbldup now as 5 unique rows has shown below.


Query1 and Query 2 are similar in performance as shown in the below execution plan.


Removing duplicate using above queries can be much slower for large tables having lot of duplicate records. It’s because, lot of T-log entries will be created for the rows being deleted also the triggers and indexes can make delete statement to crawl.The fastest way is to bulk-insert the unique rows into a new table. Bulk-Insert is a minimally logged operation under Bulk-Logged and simple recover model.  The code below bulk-inserts the unique rows from the table tbldup into the table tblUnique.



Ahmad Osama

Like us on FaceBook Follow us on Twitter | Join the fastest growing SQL Server group on FaceBook

Follow me on TwitterFollow me on FaceBook