Groups | Blog | Home
all groups > sql server data mining > november 2003 >

sql server data mining : How to delete duplicates


Derek Shi
11/26/2003 10:54:24 AM
Hello,

I want to find out which is the best and efficient way to
delete duplicate records in one table? Any ideas?

Thanks,
Vishal Parkar
12/2/2003 2:37:23 AM
Derek,

Refer to following url:
Deleting duplicate rows from the table.

http://support.microsoft.com/default.aspx?scid=KB;en-us;q139444

See following example as well

--if you have primary key defined in the table then you can have query as specified in below example
where filename and filepath are repeating

create table [file](id int ,FileName varchar(500),FilePath varchar(500))

--sample data
insert into [file]
select 1 ,'DailyReport','C:\...\dailyreport.rpt' union all
select 2 ,'DailyReport','C:\...\dailyreport.rpt' union all
select 3 ,'DailyReport','C:\...\dailyreportNew.rpt' union all
select 4 ,'DailyReport','C:\...\dailyreportNew.rpt' union all
select 5 ,'MonthlyReport','C:\...\monthlyreport.rpt' union all
select 6 ,'YearlyReport','C:\...\yearlyreport.rpt'

--query to keep one record out of all the duplicates assuming id as a unique key.

delete from [file]
where id not in
(select min(id) from [file] b where [file].filename=b.filename and
[file].filepath=b.filepath )



--
- Vishal


AddThis Social Bookmark Button