Groups | Blog | Home
all groups > sql server (alternate) > november 2004 >

sql server (alternate) : SQL - filter out duplicates but keeping all columns


skeetx NO[at]SPAM hotmail.com
11/4/2004 10:07:09 PM
Hi guys,

I'm trying to compose a query that will select all columns from a
table, but without any duplicates.

E.g.

table name is 'tblShipment'
columns are:
fldUnique(pk) | fldShipNo | fldDate | fldValue
001 | 123 | 02/02/02 | 2000
002 | 222 | 01/01/01 | 3000
003 | 123 | 03/03/03 | 4000
004 | 444 | 04/04/04 | 5000

I want to be able to select all columns (4 columns) without
duplicates, which should give me a recordset:

fldUnique(pk) | fldShipNo | fldDate | fldValue
001 | 123 | 02/02/02 | 2000
002 | 222 | 01/01/01 | 3000
004 | 444 | 04/04/04 | 5000

** 003 is not selected becoz it's fldShipNo is identical to that of
001.

How should I go about querying that?

I've tried to select it with DISTINCT, but i'm unable to use DISTINCT
to select all columns. It is important that I have all columns
selected, especially the primary key (pk), becoz I need it for
reference.

I'm really having trouble trying to achieve this, so I hope someone
can help me out.

Thanks,
Jesper
11/5/2004 2:01:07 PM
There may be something more elegant than this one, but it should work:

select * from tblShipment
where fldUnique not in ( select fldUnique from tblShipment group by
fldUnique having count(*)>1)
and fldShipNo not in ( select fldShipNo from tblShipment group by fldShipNo
having count(*)>1)
and fldDate not in ( select fldDate from tblShipment group by fldDate having
count(*)>1)
and fldValue not in ( select fldValue from tblShipment group by fldValue
having count(*)>1)
order by fldUnique

And since the pk is unique already, you can eliminate the first where clause
so you end up with:

select * from tblShipment
where fldShipNo not in ( select fldShipNo from tblShipment group by
fldShipNo having count(*)>1)
and fldDate not in ( select fldDate from tblShipment group by fldDate having
count(*)>1)
and fldValue not in ( select fldValue from tblShipment group by fldValue
having count(*)>1)
order by fldUnique


Jesper.

[quoted text, click to view]

Erland Sommarskog
11/5/2004 10:39:25 PM
[posted and mailed, please reply in news]

Shawn (skeetx@hotmail.com) writes:
[quoted text, click to view]

SELECT *
FROM tblShipment t
JOIN (SELECT fldShipNo, fldUnique = MIN(fldUnique)
FROM tblShipment
GROUP BY fldShipNo) AS t1 ON t.fldUnique = t1.fldUnique

This is an untested solution. Had you included CREATE TABLE statements
for your table and your sample data as INSERT statements, you would
have gotten a tested solution.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
skeetx NO[at]SPAM hotmail.com
11/7/2004 4:35:32 PM
[quoted text, click to view]

Thank-you Erland, your query did exactly what I wanted it to do.
I really need to learn more complex SQL queries for my future
projects. :)

[quoted text, click to view]

Unfortunately, that didn't work, because it wasn't giving me the
record i wanted. It did not include any record with multiple identical
fields, which wasn't what I wanted.
AddThis Social Bookmark Button