One way to do it,
Create a temporary table and load it with the duplicate customer numbers,
Create Table #DupCust (Cust# int Primary Key)
Insert #DupCust (Cust#)
Select f.Cust#
From Foo f
Group By f.Cust#
Having Count(*) > 1
Create a second temporary table and load it with duplicate customer numbers
where the rows are not 100% identical
Create Table #DupCustNotIdentical (Cust# int Primary Key)
Insert #DupCustNotIdentical (Cust#)
Select x.Cust#
From (Select Distinct f.Cust#, f.Name, f.Street, f.Town, f.Age
From Foo f) As x
Group By x.Cust#
Having Count(*) > 1
Create a third temporary table and load it with duplicate customer numbers
where one of the non address columns is not identical
Create Table #DupCustNotOnlyAddress (Cust# int Primary Key)
Insert #DupCustNotOnlyAddress (Cust#)
Select x.Cust#
From (Select Distinct f.Cust#, f.Name, f.Age
From Foo f) As x
Group By x.Cust#
Having Count(*) > 1
Now any row with a customer number in the first table is a duplicate. If
the customer number is in the first table, but not the second, it is
category 'A', if it is in the first table and the second table, but not the
third table, it is a category 'B', and all others are category 'C', so you
can get the result you want with
Select f.Cust#, f.Name, f.Street, f.Town, f.Age,
Case
When Not Exists (Select * From #DupCustNotIdentical ni
Where f.Cust# = ni.Cust#) Then 'A'
When Not Exists (Select * From #DupCustNotOnlyAddress noa
Where f.Cust# = noa.Cust#) Then 'B'
Else 'C'
End As Category
From Foo f
Inner Join #DupCust dc On f.Cust# = dc.Cust#
Tom
[quoted text, click to view] "Toe Dipper" <send_rubbish_here734@hotmail.com> wrote in message
news:qWVoh.55292$HV6.269@newsfe1-gui.ntli.net...
> Hi all,
>
> SQL 2000
>
> I am cracking up in work with what appeared to be a simple task.
>
> I have a script in work which returns a data set concerning duplicate
> data. This is by design and a third party then takes the results and
> goes of and decides using their head which duplicates should be deleted.
> They have now asked that we put in some logic which will help them
> decide which duplicates to delete.
>
> SO imagine my select script returns the below. This is not it, there
> are a lot more rows but the logic is the same.
>
> Cust# Name Street Town Age
> 1234 Tom 15 Disneyland LA 25
> 1234 Tom 15 Disneyland LA 25
>
> 3421 Harry 16 Disneyland CA 29
> 3421 Harry 16 Disneyland CA 29
> 3421 Harry 17 Disneyland CA 29
>
> 9876 Dick 10 Disneyland NY 32
> 9876 Dick 10 Disneyland NY 32
> 9876 Dick 10 Disneyland NY 32
>
> Now, as I say someone takes the above and goes off and decides what to
> do with the duplicates.
>
> Now they want me to add in an extra field beside each row and this will
> contain a category off either A, B or C.
>
> The rule is that for each duplicated row if the rows are 100% true
> duplicates then these are category A. So the first cust 1234 is
> duplicated twice and the detail is 100% match so the cat row will have an
> A.
>
> The same applies to the 9876 cust, duplicated 3 times, 100% match so a
> category A
>
> Now, if the rows do not match 100% but the address is different (or
> other rows except address the same) as in the 3421 customer then this
> will be a category B.
>
> Everything else not covered under the A +B rules will be category C.
>
> So the new select would return -
>
> Cust# Name Street Town Age Cat
> 1234 Tom 15 Disneyland LA 25 A
> 1234 Tom 15 Disneyland LA 25 A
>
> 3421 Harry 16 Disneyland CA 29 B
> 3421 Harry 16 Disneyland CA 29 B
> 3421 Harry 17 Disneyland CA 29 B
>
> 9876 Dick 10 Disneyland NY 32 A
> 9876 Dick 10 Disneyland NY 32 A
> 9876 Dick 10 Disneyland NY 32 A
>
>
>
> Any ideas on how I can achieve this? On paper this looked easy but I
> think I've opened a real can of worms.
>
> Thanks in advance
>
> toe