all groups > sql server programming > january 2006 >
You're in the

sql server programming

group:

ignore duplicates from joined tables


ignore duplicates from joined tables enrique
1/8/2006 11:17:02 PM
sql server programming:
Hi,

I'm trying to SELECT from two tables with a LEFT OUTER JOIN but I can't seem
to figure out how to return the query without the duplicates.
I know DISTICT works when querying one table but how do I ignore duplicates,
specifically in the primary key (ID) from joined tables?

Here is my basic query:

SELECT t1.nameID,
t1.name,
t2.name
FROM t1 LEFT OUTER JOIN
t2 ON t1.nameID = t2.nameID

Results:
nameID t1.name t2.name
====================================
9 name1 name1
9 name1 name2
9 name1 name3
10 name2 name1
10 name2 name2

How do I ignore the duplicates in the "nameID" column?

Thanks,
Henry
Re: ignore duplicates from joined tables Jens
1/8/2006 11:47:31 PM
Strange how you get this result ?! Normally if you join on equality you
should only get back these rows which are equal. If you use a left join
you will also get the NULL (non matching back). But in this example you
have name1 =3D name2 which couldn=B4t evalute to true. I thikn the query
or the result you posted is not corresponding to each other. Perhaps
you should post some DDL and sample (insert script) data.

http://www.aspfaq.com/5006

HTH, Jens Suessmeyer.
Re: ignore duplicates from joined tables Prospero via SQLMonster.com
1/9/2006 8:18:30 AM
Based on your DDL and the results, it is not possible that nameID is the
primary key on BOTH tables. on table 1 it is but on t2 has 3 names for
nameID 9. With that in mind the result looks like exactly what you asked for.
It gives you all the nameIDs on table 1 that have some relationship on t2.
For example, if t1 is customers and t2 is sales, then this table gives u all
the customers who have made a sale. Its a many to many relationship so one
customer who made multiple sales would be mentioned multiple times on the
sales table and therefore on the join. u can use select distinct if u want
but then it makes me wonder why u wanted a join in the 1st place. or u can
add a count() in the select and a group by if u want to return just how many
for each person in a different column. Whatever I think the problem is that
you arent really wanting a join, u just think you do.

--
Message posted via SQLMonster.com
Re: ignore duplicates from joined tables Uri Dimant
1/9/2006 9:59:45 AM
I have tried to guess what is your tables' structure since yiou have not
provide DDL+ sample data

CREATE TABLE #t1 (col1 INT NOT NULL PRIMARY KEY,col2 INT,col3 CHAR(1))
CREATE TABLE #t2 (col1 INT NOT NULL PRIMARY KEY,col3 CHAR(1))

INSERT INTO #t1 VALUES (1,1,'A')
INSERT INTO #t1 VALUES (2,1,'A')
INSERT INTO #t1 VALUES (3,2,'B')
INSERT INTO #t1 VALUES (4,2,'B')

INSERT INTO #t2 VALUES (1,'A')
INSERT INTO #t2 VALUES (2,'B')
INSERT INTO #t2 VALUES (3,'C')



SELECT #t1.col1,#t1.col2,#t1.col3 as col3_1,#t2.col3 as col3_2
FROM #t1 JOIN #t2
ON #t1.col1=#t2.col1
WHERE #t1.col1 IN
(SELECT B.col1 FROM #t1 A JOIN #t1 B
ON A.col1 > B.col1
AND A.col2 = B.col2
AND A.col3 = B.col3
)

DROP TABLE #t1,#t2



[quoted text, click to view]

Re: ignore duplicates from joined tables Jim Underwood
1/9/2006 11:07:55 AM
I am curious as to what you are trying to do here. t2.name has multiple
values for each t2.nameid, so you will always get back more than one row.
You can't do a distinct, because the values are not distinct. You don't
actually have duplicates, just different rows with the same high level key
(meaning part of the key). Exactly what results do you want to get from
this query?

If you post an example of your data (6 rows from each table) and show what
you want the results to look like, we may be able to help. As it is, we can
only guess at what you are trying to accomplish and probably won't be able
to provide any useful information.

Your query is specifically returning rows showing the discrepencies between
values in the name field based on the NameID field. Without knowing what
you are trying to accomplish, this query looks right.


[quoted text, click to view]

Re: ignore duplicates from joined tables enrique
1/9/2006 11:09:03 AM
My apologies if i wasn't too clear, it was late and I was trying to simplify
my question. :)

Anyhow here's a link for the ddl:
http://www.hrm3.com/example/ddl.txt

And the actual script:
SELECT DL_assets.assetID,
DL_assets.unitNumber,
DL_files.fileTypeID
FROM DL_assets LEFT OUTER JOIN
DL_files ON DL_assets.assetID = DL_files.assetID

And here's a screen shot of my script and results:
http://www.hrm3.com/example/script.jpg

I understand that my query is returning, correctly, what I've scripted but
what I am looking for is a modification that would dislpay one 1 instance of
the assetID.
I'll eventually be joining a few more tables but I wanted to resolve my
duplicates first.

Problem:
Displaying the duplicates of the "assetID" (primary key) column

Solution:
Hide/Ignore duplicates of the "assetID" column in the DL_assets table

Let me know if you need more details.

Thanks for the help!
Henry








[quoted text, click to view]
Re: ignore duplicates from joined tables enrique
1/9/2006 12:10:03 PM
Hi Jens, I ellaborate further in a reply to Jim below.

[quoted text, click to view]
Re: ignore duplicates from joined tables enrique
1/9/2006 12:40:02 PM
Hello Prospero,

I've ellaborated in a reply to Jim below.
The necessity of the join should make more sense.
I'll look into the Count() you suggested. (not sure, yet, how to write it out)

Thanks,
Henry

[quoted text, click to view]
Re: ignore duplicates from joined tables enrique
1/9/2006 3:05:18 PM

I would like this...
assetID unitNumber fileTypeID
--------------------------------------------------------
1 EXH0000001 5
1 EXH0000001 3
1 EXH0000001 4
1 EXH0000001 1
1 EXH0000001 6
1 EXH0000001 8
1 EXH0000001 6
1 EXH0000001 6
465 EXH0000465 8
465 EXH0000465 4
466 ASN0000466
467 EXH0000467 8
467 EXH0000467 1
468 ASN0000468
469 EXH0000469
470 EXH0000470 6
470 EXH0000470 8
471 ASN0000471 8
472 EXH0000472
473 EXH0000473
474 EXH0000474 1
475 EXH0000475
476 EXH0000476
477 EXH0000477
478 ASN0000478
479 ASN0000479
480 EXH0000480

To look like this:
assetID unitNumber fileTypeID
--------------------------------------------------------
1 EXH0000001 5
465 EXH0000465 8
466 ASN0000466
467 EXH0000467 8
468 ASN0000468
469 EXH0000469
470 EXH0000470 8
471 ASN0000471 8
472 EXH0000472
473 EXH0000473
474 EXH0000474 1
475 EXH0000475
476 EXH0000476
477 EXH0000477
478 ASN0000478
479 ASN0000479
480 EXH0000480


I think I know why I'm causing confusion...

I may be approaching this wrong.

Here is my scenario:
I have a Search that needs to query the "assets" table along with the
related tables, categories, files, etc (via the joins). So in essence my
query is exactly where it needs to be I simply need a way to find a way of
hiding the duplicates when I display the results. My app is asp.net/vb.net.

So my first guess to solve this is to handle this from the app side instead
of sql or is there a way to query (during search) and then display the
returned records ignoring the duplicate primary ID's.

Again, I must apologize for my newbie help request.

[quoted text, click to view]
Re: ignore duplicates from joined tables Jim Underwood
1/9/2006 3:29:44 PM
I am still unsure as to what you are trying to get in your result set. The
original result set for assetID = 1 is:

assetID unitNumber fileTypeID
1 EXH0000001 5
1 EXH0000001 2
1 EXH0000001 4
1 EXH0000001 1
1 EXH0000001 6
1 EXH0000001 8
1 EXH0000001 6
1 EXH0000001 6

Now I assume you are trying to get results that look line one fo the
following? If so, let us know which one you want to see in your results, or
give us an example if none of these are correct.

assetID unitNumber fileTypeID
1 EXH0000001 5
1 EXH0000001 2
1 EXH0000001 4
1 EXH0000001 1
1 EXH0000001 6
1 EXH0000001 8

assetID unitNumber fileTypeID
1 EXH0000001 5
2
4
1
6
8

assetID unitNumber fileTypeID
1 EXH0000001 5

[quoted text, click to view]

Re: ignore duplicates from joined tables Jim Underwood
1/10/2006 9:10:45 AM
If I am understanding, all you need is one join from the DL_files table, if
one exists, and you dont care which row gets returned? I think what you
want is is a top rows subquery on the DL_files table, to return only the
first row.

Try something along these lines, and look up Top in SQL Server help to get
more information. Note, I have not tested this, but at least it should give
you an idea of where to start.

SELECT DL_assets.assetID,
DLAssets.unitNumber,
DL_files.fileTypeID
FROM DL_assets LEFT OUTER JOIN
(select top 1 from DL_files where DL_assets.assetID = DL_files.assetID) as
DLAssets
ON DL_assets.assetID = DL_files.assetID

You could also use a view that only selects the max or min fileTypeID and
groups by AssetID, then join to this view instead of the table. Basically
filter the data up before you do any joins.


[quoted text, click to view]
AddThis Social Bookmark Button