all groups > sql server misc > january 2007 >
You're in the

sql server misc

group:

work problem cracking me up


Re: work problem cracking me up AlterEgo
1/9/2007 4:20:59 PM
sql server misc:
Toe,

This should work for A and B. They are exclusive sets so a union all works.
You are going to have to be more specific about the rules for C. Does the
customer number have to be the same to qualify, what if the name and age and
town are the only common items?

select
CustNo
, CustName
, Street
, Town
, Age
, 'A' Category
from
(
select
CustNo
, CustName
, Street
, Town
, Age
from MyTable
group by
CustNo
, CustName
, Street
, Town
, Age
having count(*) > 1
) t
union all
select
m1.CustNo
, m1.CustName
, m1.Street
, m1.Town
, m1.Age
, 'B' Category
from Mytable m1
inner join MyTable m2 on
m1.CustNo = m2.CustNo
and m1.CustName = m2.CustName
and m1.Town = m2.Town
and m1.Age = m2.Age
where m1.Street <> m2.Street

-- Bill

[quoted text, click to view]

Re: work problem cracking me up Tom Cooper
1/9/2007 8:45:16 PM
One way to do it,

Create a temporary table and load it with the duplicate customer numbers,

Create Table #DupCust (Cust# int Primary Key)
Insert #DupCust (Cust#)
Select f.Cust#
From Foo f
Group By f.Cust#
Having Count(*) > 1

Create a second temporary table and load it with duplicate customer numbers
where the rows are not 100% identical

Create Table #DupCustNotIdentical (Cust# int Primary Key)
Insert #DupCustNotIdentical (Cust#)
Select x.Cust#
From (Select Distinct f.Cust#, f.Name, f.Street, f.Town, f.Age
From Foo f) As x
Group By x.Cust#
Having Count(*) > 1

Create a third temporary table and load it with duplicate customer numbers
where one of the non address columns is not identical

Create Table #DupCustNotOnlyAddress (Cust# int Primary Key)
Insert #DupCustNotOnlyAddress (Cust#)
Select x.Cust#
From (Select Distinct f.Cust#, f.Name, f.Age
From Foo f) As x
Group By x.Cust#
Having Count(*) > 1

Now any row with a customer number in the first table is a duplicate. If
the customer number is in the first table, but not the second, it is
category 'A', if it is in the first table and the second table, but not the
third table, it is a category 'B', and all others are category 'C', so you
can get the result you want with

Select f.Cust#, f.Name, f.Street, f.Town, f.Age,
Case
When Not Exists (Select * From #DupCustNotIdentical ni
Where f.Cust# = ni.Cust#) Then 'A'
When Not Exists (Select * From #DupCustNotOnlyAddress noa
Where f.Cust# = noa.Cust#) Then 'B'
Else 'C'
End As Category
From Foo f
Inner Join #DupCust dc On f.Cust# = dc.Cust#

Tom

[quoted text, click to view]

work problem cracking me up Toe Dipper
1/9/2007 11:53:58 PM
Hi all,

SQL 2000

I am cracking up in work with what appeared to be a simple task.

I have a script in work which returns a data set concerning duplicate
data. This is by design and a third party then takes the results and
goes of and decides using their head which duplicates should be deleted.
They have now asked that we put in some logic which will help them
decide which duplicates to delete.

SO imagine my select script returns the below. This is not it, there
are a lot more rows but the logic is the same.

Cust# Name Street Town Age
1234 Tom 15 Disneyland LA 25
1234 Tom 15 Disneyland LA 25

3421 Harry 16 Disneyland CA 29
3421 Harry 16 Disneyland CA 29
3421 Harry 17 Disneyland CA 29

9876 Dick 10 Disneyland NY 32
9876 Dick 10 Disneyland NY 32
9876 Dick 10 Disneyland NY 32

Now, as I say someone takes the above and goes off and decides what to
do with the duplicates.

Now they want me to add in an extra field beside each row and this will
contain a category off either A, B or C.

The rule is that for each duplicated row if the rows are 100% true
duplicates then these are category A. So the first cust 1234 is
duplicated twice and the detail is 100% match so the cat row will have an A.

The same applies to the 9876 cust, duplicated 3 times, 100% match so a
category A

Now, if the rows do not match 100% but the address is different (or
other rows except address the same) as in the 3421 customer then this
will be a category B.

Everything else not covered under the A +B rules will be category C.

So the new select would return -

Cust# Name Street Town Age Cat
1234 Tom 15 Disneyland LA 25 A
1234 Tom 15 Disneyland LA 25 A

3421 Harry 16 Disneyland CA 29 B
3421 Harry 16 Disneyland CA 29 B
3421 Harry 17 Disneyland CA 29 B

9876 Dick 10 Disneyland NY 32 A
9876 Dick 10 Disneyland NY 32 A
9876 Dick 10 Disneyland NY 32 A



Any ideas on how I can achieve this? On paper this looked easy but I
think I've opened a real can of worms.

Thanks in advance

AddThis Social Bookmark Button