Groups | Blog | Home
all groups > sql server data mining > april 2004 >

sql server data mining : Multiple tests against the same subquery.


Eric
4/12/2004 5:48:52 PM
Hi All,

I have a need for a query where the WHERE clause checks multiple fields
against a subquery. The following example returns the results I want:

SELECT
*
FROM
MyTable
WHERE
Column1 IN (subquery) OR
Column2 IN (subquery) OR
Column3 IN (subquery)

In reality, I actually have eight separate tests (rather than three, as in
the example) and "subquery" is a very complicated query (it is the same
every time, however). I would really, really like to be able to do this in
one statement, but I would like to avoid running the same (very complicated)
subquery eight separate times. Because of the nature of the subquery,
placing it in a view is not an option (I'm trying to keep this question
simple so I won't waste your time with an explanation as to why using a view
won't work).

Here's my actual question... in "pseudo SQL", I need something like this:

SELECT
*
FROM
MyTable
WHERE
((Column1 OR Column2 OR Column3) IN (subquery))

If any of you know of a way of doing this, I would be extremely happy! If
not, I suppose I'll have to put the whole thing in a sproc where I would do
the subquery first, place the results in a temp table, and reference the
temp table in my WHERE clause. However, this is a last resort beause I'd
really like to do this in one statement.

Thank you for your help!

Eric

Eric
4/13/2004 8:23:37 AM
Woo hoo! As soon as I read what you posted I slapped my forehead. I *knew*
it had to be possible. I changed what you did just a little (because the
change would work for what I was doing and might be a little more
efficient), but you put the magic in the join clause - and that was the key.
I couldn't quite figure out how to go about doing that, and now I feel
stupid for not seeing it in the first place. If your curious, here's what I
wound up doing:

SELECT

DISTINCT
mt.ID

FROM

MyTable mt
INNER JOIN (subquery) sq
ON
(
mt.Column1 = sq.ID OR
mt.Column2 = sq.ID OR
mt.Column3 = sq.ID OR
....[etc]
)

I'm only selecting the ID column because this whole query is actually a
subquery in yet another query. It works perfectly, I am extremely grateful
for your expertise!

Eric


[quoted text, click to view]

Hugo Kornelis
4/13/2004 9:49:13 AM
[quoted text, click to view]

Eric,

Try it with:

SELECT
DISTINCT MyTable.*
FROM
MyTable
INNER JOIN
(SELECT (...) AS ResultColumn
FROM (rest of subquery) ...) AS sq
ON
sq.ResultColumn IN (Column1, Column2, Column3)

Note: don't forget to add MyTable in the select clause, or else you'll
get the ResultColumn from the subquery as well. Also, don't forget the
DISTINCT, to prevent getting the same row from MyTable twice if the
result from the subquery matches agains two of the three columns.

Best, Hugo
--

Eric
4/13/2004 1:50:39 PM
Yup - I had just worked out that the DISTINCT was no longer necessary and
have since removed it. Once again, thank you for your help!!

Eric

[quoted text, click to view]

Hugo Kornelis
4/13/2004 10:20:04 PM
[quoted text, click to view]

Eric,

Glad I could be of assistance.

Regarding the quote above - if it is a subquery with (NOT) IN, you
don't need the DISTINCT. My guess is that removing distinct might
improve performance (but with performance, the truth is alway in the
testing); if it is a subquery with (NOT) EXISTS, you could also use
SELECT * instead of SELECT <somecolumn>. I don't think it changes
performance on this one, but it often does, so it's a good habit to
always use SELECT * with (NOT) EXISTS.

Best, Hugo
--

AddThis Social Bookmark Button