[quoted text, click to view] On 26 Aug 2005 12:24:43 -0700, stoppal@hotmail.com wrote:
>What is the difference between "IN" and "exists".
>
>THANKS!!!!
Hi stoppal,
In theory:
* IN will first evaluate the subquery and "remember" the results; for
each row in the main query, the value specified before "IN" is compared
to each value in the results. If a match is found, the IN condition
evaluates to TRUE. If no match is found but the subquery returned at
least one NULL, the IN condition evaluates to UNKNOWN. And if no match
is found and there were no NULLs, the IN evaluates to FALSE.
* EXISTS will evaluate the subquery for each row in the outer query. If
the subquery returns at least one row, the EXISTS condition evaluates to
TRUE; otherwise it evaluates to FALSE.
In practice, the optimizer will change the actual way that the query is
evaluated to get the best possible performance. An obvious optimization
is to stop evaluating the subquery of an EXISTS clause as soon as the
first match is found. In simple cases, the optimizer will in fact often
use the same plan for a query with EXISTS and an equivalent query with
IN.
My reasons for preferring EXISTS over IN are:
SQL Server doesn't support row constructors, so you can't write
(a.col1, a.col2) IN (SELECT b.col1, b.col2 FROM ...)
But you can write
EXISTS (SELECT * FROM ... WHERE a.col1 = b.col1 AND a.col2 = b.col2)
The effects of NULLS in an IN expression are hardly noticeable, but that
changes when you use NOT IN. A query with the expression
WHERE a.col1 NOT IN (1, NULL, 3)
will never return any rows - for 1 and 3, the result is FALSE; for all
other values, the result is UNKNOWN as a result of the NULL value. This
"strange" behaviour can be eaasily avoidded if you use [NOT] EXISTS when
you use a subquery, and use [NOT] IN only with a list of constants.
And last but not least, I have often seen the IN and the EXISTS versions
perform the same; I have also often seen the EXISTS version perfor
better, but I've never yet seen the IN version perform better.
Best, Hugo
--