sql server programming:
you could use a derived table, e.g. select <column list> from yourtable join ( select lastname, dob, count(*) as dup_count from yourtable group by lastname, dob having count(*)>1 ) x on yourtable.lastname = x.lastname and yourtable.dob = x.dob order by yourtable.lastname, yourtable.dob [quoted text, click to view] Simon wrote: > Hi all, > > I have found many solutions for selecting duplicates with the most > popular being: > > SELECT ColA, COUNT(ColA) > FROM SomeTable > GROUP BY ColA > HAVING count(ColA) > 1 > > However, what I am after is a query that returns all of columns and all > of the duplicate rows where duplicates exist on a number of columns. > > eg > > MYID FirstName LastName DOB > 1 Jon Smith 12/12/2000 > 2 Brian Smith 12/12/2000 > > For my logic, duplicates are defined as any rows where LastName and DOB > are the same. > > I need to return a result set that displays all rows that are duplicates > including all the fields in these rows. Not just a group by scenario > where I get the last name and a count of how many times it is duplicated. > > Hope this makes sense, and thanks in advance. >
It is almost the same idea, just use a derived table via an EXISTS criteria. I used my own table names since you didn't give us a create script. The idea is that I am checking the last two columns for dups, and returning all rows: declare @findDups table ( findDupsId int primary key, somecolumn varchar(10), checkColumn1 int, checkColumn2 int ) insert into @findDups select 1,'barney', 1,1 union all select 2,'barney', 1,2 union all select 3,'bam-bam',1,1 union all select 4,'pebbles',1,3 union all select 5,'pebbles',1,4 union all select 6,'fred',1,2 --Something like: select * from @findDups as mainTable where exists ( select * from ( select checkColumn1, checkColumn2 from @findDups group by checkColumn1, checkColumn2 having count(*) > 1) as dups where dups.checkColumn1 = mainTable.checkColumn1 and dups.checkColumn2 = mainTable.checkColumn2) order by checkColumn1, checkColumn2 Which returns: findDupsId somecolumn checkColumn1 checkColumn2 ----------- ---------- ------------ ------------ 1 barney 1 1 3 bam-bam 1 1 6 fred 1 2 2 barney 1 2 -- ---------------------------------------------------------------------------- Louis Davidson - http://spaces.msn.com/members/drsql/ SQL Server MVP "Arguments are to be avoided: they are always vulgar and often convincing." (Oscar Wilde) [quoted text, click to view] "Simon" <simon_nospam_rigby@nodomain.com> wrote in message news:e1u2LMIGGHA.2064@TK2MSFTNGP09.phx.gbl... > Hi all, > > I have found many solutions for selecting duplicates with the most popular > being: > > SELECT ColA, COUNT(ColA) > FROM SomeTable > GROUP BY ColA > HAVING count(ColA) > 1 > > However, what I am after is a query that returns all of columns and all of > the duplicate rows where duplicates exist on a number of columns. > > eg > > MYID FirstName LastName DOB > 1 Jon Smith 12/12/2000 > 2 Brian Smith 12/12/2000 > > For my logic, duplicates are defined as any rows where LastName and DOB > are the same. > > I need to return a result set that displays all rows that are duplicates > including all the fields in these rows. Not just a group by scenario where > I get the last name and a count of how many times it is duplicated. > > Hope this makes sense, and thanks in advance. > > Simon
There may be an easier way to do this... but this should do the trick... Select * from SomeTable where ColA in ( SELECT ColA FROM SomeTable GROUP BY ColA HAVING count(ColA) > 1 ) or ColB in ( SELECT ColB FROM SomeTable GROUP BY ColB HAVING count(ColB) > 1 ) Now just add the additional "Or Col# in (subquery)" for each additional duplicate that you are looking for. The OR statements and (possibly) large in clauses may cause performance problems, so you may want to change teh subqueries to exists or joins... The exist would look like... Select * from SomeTable Tab where exists ( SELECT count(ColA) FROM SomeTable Tab1 where Tab1.ColA = Tab.ColA GROUP BY ColA HAVING count(ColA) > 1 ) or exists ( SELECT count(ColB) FROM SomeTable Tab2 where Tab2.ColB = Tab.ColB GROUP BY ColB HAVING count(ColB) > 1 ) [quoted text, click to view] "Simon" <simon_nospam_rigby@nodomain.com> wrote in message news:e1u2LMIGGHA.2064@TK2MSFTNGP09.phx.gbl... > Hi all, > > I have found many solutions for selecting duplicates with the most > popular being: > > SELECT ColA, COUNT(ColA) > FROM SomeTable > GROUP BY ColA > HAVING count(ColA) > 1 > > However, what I am after is a query that returns all of columns and all > of the duplicate rows where duplicates exist on a number of columns. > > eg > > MYID FirstName LastName DOB > 1 Jon Smith 12/12/2000 > 2 Brian Smith 12/12/2000 > > For my logic, duplicates are defined as any rows where LastName and DOB > are the same. > > I need to return a result set that displays all rows that are duplicates > including all the fields in these rows. Not just a group by scenario > where I get the last name and a count of how many times it is duplicated. > > Hope this makes sense, and thanks in advance. > > Simon
Hi all, I have found many solutions for selecting duplicates with the most popular being: SELECT ColA, COUNT(ColA) FROM SomeTable GROUP BY ColA HAVING count(ColA) > 1 However, what I am after is a query that returns all of columns and all of the duplicate rows where duplicates exist on a number of columns. eg MYID FirstName LastName DOB 1 Jon Smith 12/12/2000 2 Brian Smith 12/12/2000 For my logic, duplicates are defined as any rows where LastName and DOB are the same. I need to return a result set that displays all rows that are duplicates including all the fields in these rows. Not just a group by scenario where I get the last name and a count of how many times it is duplicated. Hope this makes sense, and thanks in advance.
Thanks to all of the respondents. Just what I was looking for and the problem is solved. Thanks Simon [quoted text, click to view] Louis Davidson wrote: > It is almost the same idea, just use a derived table via an EXISTS criteria. > I used my own table names since you didn't give us a create script. The > idea is that I am checking the last two columns for dups, and returning all > rows: > > declare @findDups table > ( > findDupsId int primary key, > somecolumn varchar(10), > checkColumn1 int, > checkColumn2 int > ) > insert into @findDups > select 1,'barney', 1,1 > union all > select 2,'barney', 1,2 > union all > select 3,'bam-bam',1,1 > union all > select 4,'pebbles',1,3 > union all > select 5,'pebbles',1,4 > union all > select 6,'fred',1,2 > > --Something like: > > select * > from @findDups as mainTable > where exists ( select * > from ( > select checkColumn1, checkColumn2 > from @findDups > group by checkColumn1, checkColumn2 > having count(*) > 1) as dups > where dups.checkColumn1 = mainTable.checkColumn1 > and dups.checkColumn2 = mainTable.checkColumn2) > order by checkColumn1, checkColumn2 > > Which returns: > > findDupsId somecolumn checkColumn1 checkColumn2 > ----------- ---------- ------------ ------------ > 1 barney 1 1 > 3 bam-bam 1 1 > 6 fred 1 2 > 2 barney 1 2
Don't see what you're looking for? Try a search.
|