all groups > sql server programming > january 2006 >
You're in the

sql server programming

group:

Selecting Duplicates - all data


Re: Selecting Duplicates - all data Trey Walpole
1/13/2006 3:18:20 PM
sql server programming:
you could use a derived table, e.g.


select <column list>
from yourtable
join (
select lastname, dob, count(*) as dup_count
from yourtable
group by lastname, dob
having count(*)>1
) x
on yourtable.lastname = x.lastname
and yourtable.dob = x.dob
order by yourtable.lastname, yourtable.dob

[quoted text, click to view]
Re: Selecting Duplicates - all data Louis Davidson
1/13/2006 4:37:59 PM
It is almost the same idea, just use a derived table via an EXISTS criteria.
I used my own table names since you didn't give us a create script. The
idea is that I am checking the last two columns for dups, and returning all
rows:

declare @findDups table
(
findDupsId int primary key,
somecolumn varchar(10),
checkColumn1 int,
checkColumn2 int
)
insert into @findDups
select 1,'barney', 1,1
union all
select 2,'barney', 1,2
union all
select 3,'bam-bam',1,1
union all
select 4,'pebbles',1,3
union all
select 5,'pebbles',1,4
union all
select 6,'fred',1,2

--Something like:

select *
from @findDups as mainTable
where exists ( select *
from (
select checkColumn1, checkColumn2
from @findDups
group by checkColumn1, checkColumn2
having count(*) > 1) as dups
where dups.checkColumn1 = mainTable.checkColumn1
and dups.checkColumn2 = mainTable.checkColumn2)
order by checkColumn1, checkColumn2

Which returns:

findDupsId somecolumn checkColumn1 checkColumn2
----------- ---------- ------------ ------------
1 barney 1 1
3 bam-bam 1 1
6 fred 1 2
2 barney 1 2

--
----------------------------------------------------------------------------
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP
"Arguments are to be avoided: they are always vulgar and often convincing."
(Oscar Wilde)

[quoted text, click to view]

Re: Selecting Duplicates - all data Jim Underwood
1/13/2006 4:41:13 PM
There may be an easier way to do this... but this should do the trick...

Select * from SomeTable
where ColA in
(
SELECT ColA
FROM SomeTable
GROUP BY ColA
HAVING count(ColA) > 1
)
or ColB in
(
SELECT ColB
FROM SomeTable
GROUP BY ColB
HAVING count(ColB) > 1
)

Now just add the additional "Or Col# in (subquery)" for each additional
duplicate that you are looking for. The OR statements and (possibly) large
in clauses may cause performance problems, so you may want to change teh
subqueries to exists or joins...

The exist would look like...

Select * from SomeTable Tab
where exists
(
SELECT count(ColA)
FROM SomeTable Tab1
where Tab1.ColA = Tab.ColA
GROUP BY ColA
HAVING count(ColA) > 1
)
or exists
(
SELECT count(ColB)
FROM SomeTable Tab2
where Tab2.ColB = Tab.ColB
GROUP BY ColB
HAVING count(ColB) > 1
)


[quoted text, click to view]

Selecting Duplicates - all data Simon
1/13/2006 8:50:51 PM
Hi all,

I have found many solutions for selecting duplicates with the most
popular being:

SELECT ColA, COUNT(ColA)
FROM SomeTable
GROUP BY ColA
HAVING count(ColA) > 1

However, what I am after is a query that returns all of columns and all
of the duplicate rows where duplicates exist on a number of columns.

eg

MYID FirstName LastName DOB
1 Jon Smith 12/12/2000
2 Brian Smith 12/12/2000

For my logic, duplicates are defined as any rows where LastName and DOB
are the same.

I need to return a result set that displays all rows that are duplicates
including all the fields in these rows. Not just a group by scenario
where I get the last name and a count of how many times it is duplicated.

Hope this makes sense, and thanks in advance.

Re: Selecting Duplicates - all data Simon
1/13/2006 11:27:23 PM
Thanks to all of the respondents. Just what I was looking for and the
problem is solved.

Thanks

Simon

[quoted text, click to view]
AddThis Social Bookmark Button