Click here to Skip to main content
15,867,568 members
Articles / Database Development / SQL Server

Remove Duplicate Rows from a Table in SQL Server

Rate me:
Please Sign up or sign in to vote.
4.84/5 (49 votes)
15 Feb 2011CPOL2 min read 608K   52   50
Easiest way to delete more than one duplicate row from a table in SQL Server

Introduction

Most of the times, we use primary key or unique key for preventing insertion of duplicate rows in SQL Server. But if we don't use these keys, then it's obvious that duplicate rows could be entered by the user. After inserting duplicate rows into table, it becomes a major issue to delete those duplicate rows. In that time, we need to delete those duplicate rows to resolve the issue. So this topic will help us to delete those duplicate rows from the specific table.

Background

I used some basic T-SQL code to accomplish the target. So you don't need to worry to understand this code.

Problem

Firstly, we will create a table, where we will insert some duplicate rows to understand the topic properly. Create a table called ATTENDANCE by using the following code:

SQL
CREATE TABLE [dbo].[ATTENDANCE](
	[EMPLOYEE_ID] [varchar](50) NOT NULL,
	[ATTENDANCE_DATE] [date] NOT NULL
) ON [PRIMARY]   

Now insert some data into this table.

SQL
INSERT INTO dbo.ATTENDANCE (EMPLOYEE_ID,ATTENDANCE_DATE)VALUES
			('A001',CONVERT(DATETIME,'01-01-11',5)) 
INSERT INTO dbo.ATTENDANCE (EMPLOYEE_ID,ATTENDANCE_DATE)VALUES
			('A001',CONVERT(DATETIME,'01-01-11',5)) 
INSERT INTO dbo.ATTENDANCE (EMPLOYEE_ID,ATTENDANCE_DATE)VALUES
			('A002',CONVERT(DATETIME,'01-01-11',5)) 
INSERT INTO dbo.ATTENDANCE (EMPLOYEE_ID,ATTENDANCE_DATE)VALUES
			('A002',CONVERT(DATETIME,'01-01-11',5)) 
INSERT INTO dbo.ATTENDANCE (EMPLOYEE_ID,ATTENDANCE_DATE)VALUES
			('A002',CONVERT(DATETIME,'01-01-11',5)) 
INSERT INTO dbo.ATTENDANCE (EMPLOYEE_ID,ATTENDANCE_DATE)VALUES
			('A003',CONVERT(DATETIME,'01-01-11',5)) 

After inserting the data, check the data of the below table. If we grouped the employee_id and attendance_date, then A001 and A002 become duplicates.

EMPLOYEE_IDATTENDANCE_DATE
A001 2011-01-01
A001 2011-01-01
A002 2011-01-01
A002 2011-01-01
A002 2011-01-01
A003 2011-01-01

So how can we delete those duplicate data?

Solution

First, insert an identity column in that table by using the following code:

SQL
ALTER TABLE dbo.ATTENDANCE ADD AUTOID INT IDENTITY(1,1)  

Now the table data will be like the following table:

EMPLOYEE_IDATTENDANCE_DATEAUTOID
A0012011-01-011
A0012011-01-012
A0022011-01-013
A0022011-01-014
A0022011-01-015
A0032011-01-016

Check the AUTOID column. Now we will start playing the game with this column.

Now use the following code to find out the duplicate rows that exist in the table.

SQL
SELECT * FROM dbo.ATTENDANCE WHERE AUTOID NOT IN (SELECT MIN(AUTOID) _
	FROM dbo.ATTENDANCE GROUP BY EMPLOYEE_ID,ATTENDANCE_DATE)  

The above code will give us the following result:

EMPLOYEE_IDATTENDANCE_DATEAUTOID
A0012011-01-012
A0022011-01-014
A0022011-01-015

Ultimately, these are the duplicate rows which we want to delete to resolve the issue. Use the following code to resolve it.

SQL
DELETE FROM dbo.ATTENDANCE WHERE AUTOID NOT IN (SELECT MIN(AUTOID) _
	FROM dbo.ATTENDANCE GROUP BY EMPLOYEE_ID,ATTENDANCE_DATE) 

Now check the data. No duplicate rows exist in the table.

Is it too complicated?

History

  • 16th February, 2011: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Founder Codexplorer Technologies
Bangladesh Bangladesh
I am:
Founder & Technical Head at Codexplorer Technologies.
IT Consultant at Meridian Group.

I was:
Manager (IT) at Meridian Group.
Assistant Manager (Software Division) at KDS Garment Industries Limited.
Assistant Manager (Software Division) at E-Vision Software Limited.

My blog:
crea8ivecode

My preferred work area:
ASP.NET & SQL SERVER.

My email:
sadeque.sharif@yahoo.com

Follow me:
twitter | facebook | linkedin

Comments and Discussions

 
QuestionWorks like a charm Pin
Member 1365154629-Jan-18 23:45
Member 1365154629-Jan-18 23:45 
GeneralWorth it Pin
Bhuvanesh Mohankumar28-Jul-16 19:28
Bhuvanesh Mohankumar28-Jul-16 19:28 
QuestionRemove Duplicate Rows from a Table in SQL Server Pin
Member 1074885415-Feb-16 3:39
Member 1074885415-Feb-16 3:39 
SuggestionMessage Closed Pin
27-Jul-15 4:02
Member 1129842627-Jul-15 4:02 
GeneralRe: Delete All Rows From All Tables in SQL Server Pin
Sadeque Sharif27-Jul-15 23:19
professionalSadeque Sharif27-Jul-15 23:19 
GeneralRe: Delete All Rows From All Tables in SQL Server Pin
harsimranb18-Nov-15 5:29
harsimranb18-Nov-15 5:29 
GeneralMy vote of 4 Pin
Abhishek Maurya25-Jul-15 0:46
Abhishek Maurya25-Jul-15 0:46 
GeneralRe: My vote of 4 Pin
Sadeque Sharif25-Jul-15 1:19
professionalSadeque Sharif25-Jul-15 1:19 
GeneralThank You Pin
Member 1026850628-Jun-15 10:56
Member 1026850628-Jun-15 10:56 
GeneralComment: Good Helpful Pin
UsmanMalik883129-Dec-14 20:18
UsmanMalik883129-Dec-14 20:18 
GeneralGood Idea to delete the duplicate data Pin
Member 1098618129-Dec-14 13:31
Member 1098618129-Dec-14 13:31 
GeneralMy vote of 4 Pin
ravithejag9-Dec-14 22:15
ravithejag9-Dec-14 22:15 
QuestionThanks a Lot Pin
Er Ayush Gupta14-May-14 23:00
professionalEr Ayush Gupta14-May-14 23:00 
QuestionFor versions from SQL Server 2005 on just use ROW_NUMBER Pin
MartinSmith10000002-Mar-14 0:27
MartinSmith10000002-Mar-14 0:27 
AnswerRe: For versions from SQL Server 2005 on just use ROW_NUMBER Pin
Member 108079769-May-14 18:28
Member 108079769-May-14 18:28 
QuestionAnother way to remove duplicates by using the HashBytes() function Pin
ntissier21-Feb-14 3:49
ntissier21-Feb-14 3:49 
QuestionThanks Pin
e.b_star27-Jan-14 2:07
e.b_star27-Jan-14 2:07 
Questionthanks! Pin
Daan Tuynman17-Nov-13 22:35
Daan Tuynman17-Nov-13 22:35 
GeneralTrue Pin
ksanjay8011-Oct-13 0:41
ksanjay8011-Oct-13 0:41 
GeneralMy vote of 3 Pin
Mas1115-Sep-13 20:03
Mas1115-Sep-13 20:03 
GeneralMy vote of 5 Pin
Niranjan27908-Sep-13 20:17
Niranjan27908-Sep-13 20:17 
GeneralMy vote of 2 Pin
archana naik6-Sep-13 20:19
professionalarchana naik6-Sep-13 20:19 
GeneralRe: My vote of 2 Pin
Sadeque Sharif6-Sep-13 20:28
professionalSadeque Sharif6-Sep-13 20:28 
GeneralMy vote of 5 Pin
ChargerIIC29-Jul-13 6:37
ChargerIIC29-Jul-13 6:37 
SuggestionRemove Duplicate Rows from a Table in SQL Server Pin
shaankundan28-May-13 3:10
professionalshaankundan28-May-13 3:10 
first of all copy your data into new table like this
SQL
select distinct * into new_table from old_table


then delete the old table from sql server database
SQL
drop table old_table


then rename your new_table

SQL
sp_rename 'new_table','old_table'


enjoy!! Smile | :)
Kundan Kumar

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.