Groups | Blog | Home
all groups > sql server programming > november 2004 >

sql server programming : Hash/Loop join


David Gugick
11/22/2004 3:29:43 PM
[quoted text, click to view]

It sounds like to intermediate tables are using a hash join to form the
final result set. Intermediate tables won't have indexes on them, so a
hash join could be used. A merge join on pre-sorted results would be a
lot faster, but if that's no possible you may see a hash join.

How many rows in the intermediate tables?


--
David Gugick
Imceda Software
www.imceda.com
David Gugick
11/22/2004 7:42:34 PM
[quoted text, click to view]

No, you should look at the rest of your execution plan and see why SQL
Server is only able to reduce the intermediate results to 15,000. Unless
you are actually returning 15,000 rows from the query... Are you?

If you had clustered indexes on the columsn that are eventually joined
together in the intermediate tables, you could possibly get SQL Server
to use a merge join. SQL Server is just doing what is thinks is the best
method to return the data. That is, you have to large result sets with
upwards of 15,000 rows that must be joined. There are no indexes on
these results (because they are not tables) and they are not sorted by
the join keys, so SQL Server is hashing the keys on both sides and
checking for matches which is time consuming. Sometimes you might see a
sort operation on the intermediate results and see the merge join. merge
joins are very fast with large result sets when they are already sorted
by the join keys. If SQL Server has to sort them, it just adds overhead
and in this case it just feels that a hash join is easier.

So I guess the real question is how many rows are you returning to the
client?


--
David Gugick
Imceda Software
www.imceda.com
Leila
11/22/2004 10:23:53 PM
Hi,
I have a query based on some views(joins 8 tables). The execution plan shows
that each pair of tables are joined using loop join which is fine. Suppose
that there're 4 pair of tables. In the rest of plan, the result of each pair
is joined with result of another pair using hash join.
Are my indexes insufficient or the pairs may not necessarily join each other
using a better operator(loop or merge). It is said that loop joins are
better in performance for joining small tables with large tables, but I
cannot force this query to not to use hash join!
Does my query need more indexes?
Thanks,
Leila


Leila
11/23/2004 12:12:45 AM
Thanks David,
(I didn't check the row count but that's around 15,000 I suppose)
Do you mean I must try to create indexes to make optimizer able to use merge
join for intermediate tables?


[quoted text, click to view]

AddThis Social Bookmark Button