all groups > sql server programming > january 2006 >
You're in the

sql server programming

group:

duplicate number


Re: duplicate number Tom Moreau
1/7/2006 7:24:42 AM
sql server programming: Try:

select
accountnr
, customername
, min (address)
from
Customers
group by
accountnr
, customername

--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinpub.com

[quoted text, click to view]
I have a table with customers that is imported from Axapta 3.0. I have
imported them into a table that is a clone of Axapta table.

I see that i have rows where account number is the same. So it appears that
they have inserted same customer with same accountnumber several times.

So lets say the columns are:
accountnr (varchar), customername (varchar), address(varchar)

Typical data is then:

1000, companyA, streetA
1000, companyA, streetB
1001, companyB, streetC
1001, companyB, streetE

So how can i select only one of them so that i have
1000, companyA, streetA
1001, companyB, streetC

DISTINCT wont do it since street is all different. In real table there are
some 93 columns so i simplifyed a lil. It appears to me that in duplicate
company registrations there are only small changes.

It does not matter wich one that gets selected in the SQl expression i am
looking for.

I have an SQL 2000 std server.

best regards
Trond





Re: duplicate number Tom Moreau
1/7/2006 7:35:15 AM
Well, you'll have to come up with some other business rules to break the
ties.

--
Tom

----------------------------------------------------
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinpub.com

[quoted text, click to view]
Well yes i could but i simplifyed the real case since there are so many
columns. In some cases also address is the same. and the only variation
could appear in any of the remaining 90 columns.
So i thin i have to solve it using the accountnr.
Best regards
Trond

[quoted text, click to view]


Re: duplicate number John Bell
1/7/2006 12:34:37 PM
Hi

If you had a method of uniquely identifying each row then you could do
something like say an id column

SELECT A.accountnr, A.customername, A.address
FROM accounts A
JOIN ( SELECT MIN(B.id), B.accountnr, B.customername
FROM Accounts B
GROUP BY B.accountnr, B.customername ) D ON A.accountnr =
D.accountnr AND A.customername = D.customername

Alternatively you can just use (say) MIN for each column that you are not
grouping by:

SELECT A.accountnr, A.customername, MIN(A.address)
FROM accounts A
GROUP BY A.accountnr, A.customername

John

[quoted text, click to view]

duplicate number Trond Hoiberg
1/7/2006 1:18:22 PM
I have a table with customers that is imported from Axapta 3.0. I have
imported them into a table that is a clone of Axapta table.

I see that i have rows where account number is the same. So it appears that
they have inserted same customer with same accountnumber several times.

So lets say the columns are:
accountnr (varchar), customername (varchar), address(varchar)

Typical data is then:

1000, companyA, streetA
1000, companyA, streetB
1001, companyB, streetC
1001, companyB, streetE

So how can i select only one of them so that i have
1000, companyA, streetA
1001, companyB, streetC

DISTINCT wont do it since street is all different. In real table there are
some 93 columns so i simplifyed a lil. It appears to me that in duplicate
company registrations there are only small changes.

It does not matter wich one that gets selected in the SQl expression i am
looking for.

I have an SQL 2000 std server.

best regards
Trond




Re: duplicate number Trond Hoiberg
1/7/2006 1:36:57 PM
Well yes i could but i simplifyed the real case since there are so many
columns. In some cases also address is the same. and the only variation
could appear in any of the remaining 90 columns.
So i thin i have to solve it using the accountnr.
Best regards
Trond

[quoted text, click to view]

Re: duplicate number John Bell
1/7/2006 1:50:07 PM
Hi

It does not matter if the values in the columns are the same as the minimum
of two equal values will be that value. As account number is always constant
(between rows you wish to differentiate) you will either have to use a
different column e.g. something like the id column in my previous post or
generate the means to differentiate the columns such as inserting all the
values into a temporary table that has an id column.

John

[quoted text, click to view]

Re: duplicate number John Bell
1/7/2006 9:45:03 PM
Hi

You may want to look at Itzik's article on assigning Row numbers to
non-unique rows
http://www.windowsitpro.com/Articles/ArticleID/45828/45828.html

John

[quoted text, click to view]

Re: duplicate number Erland Sommarskog
1/7/2006 11:33:57 PM
Trond Hoiberg (trond@montanis.com) writes:
[quoted text, click to view]

Are you sure of that? What if some information is old and no longer
current and some is new?

From the technical presumptions you have given you can do this:

ALTER TABLE tbl ADD ident int IDENTITY

Then:

SELECT a.accountnr, a.customername, ...
FROM tbl a
JOIN (SELECT accountnr, ident = MAX(ident)
FROM tbl
GROUP BY accountnr) AS b ON a.accountnr = b.accountnr
AND a.ident = b.ident

But whether this really is right from a business perspective, I am not
so sure.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
AddThis Social Bookmark Button