Groups | Blog | Home
all groups > sql server (alternate) > july 2004 >

sql server (alternate) : A Range Query Optimization


anindya.roy NO[at]SPAM rave-tech.com
7/29/2004 12:43:27 AM
Hi,

Need help in optimizing a query in SQL Server.
Following is the problem statement.

There are two tables;
1st table (t1) has a KEY ( char(8) ) column, with a clustered index.
this is not the primary key. The table can have billions of records;
in test environment, we are having 3,000,000 records

2nd table (t2) has two columns a from_Range and to_Range, both
char(8). this table has lesser number of records, in thousands.
Clustered index is on the primary key.

However there is no relation whatsoever between the KEY and the
from/to range.

We need to find matching records where Key is found between the from
and to range :

select t1.id, t2.id from t1, t2
where t1.KEY between t2.from_range and t2.to_range

( The ids form part of primary keys in both tables. )

The plan shows a loop, with t1 using clustered index of KEY and t2
using clustered index of the primary key.

This query is taking around 14 seconds on SQL server 2000 in win2kpro
with P4 and 512 MB RAM.

Is there any way this can be reducd to a subsecond performance ? This
query forms the core of most of the processing, and any reduction here
will have recursive effect all over.

Thanks in advance,
Erland Sommarskog
7/29/2004 8:08:49 AM
aroy (anindya.roy@rave-tech.com) writes:
[quoted text, click to view]

In general, it it best to post the CREATE TABLE and CREATE INDEX statements
for the table, as the narrative easily can be misunderstood.

But it sounds like a problem that Mischa Sandberg ran into recently,
and you can review that thread starting on
http://groups.google.com/groups?hl=sv&lr=&ie=UTF-8&selm=ZeKIc.27357%24Rf.25396%40edtnps84


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server SP3 at
Hugo Kornelis
7/29/2004 11:01:31 AM
[quoted text, click to view]

Hi Roy,

The URL Erland gives has lots of potentially useful info that might help
you. But you might also try if it helps to define an index on (from_range,
to_range) in the second table (t2). Try it with both a clustered and a
non-clustered index - I expect a performance gain in both cases, but it's
hard to predict which will be the fastest.

Best, Hugo
--

Mischa Sandberg
7/30/2004 3:53:01 AM
Funny enough, but this is just what I've been working on,
for lookup of about 50M rows in a table of 10M ranges.
(discussion in another thread, in m.p.s.programming).

(As an aside, "KEY" is a lousy column name; you'll be hearing from J.C.)

Given:
T1(id INT, "Key" CHAR(8), ...)
T2(id INT, from_Range CHAR(8), to_Range CHAR(8))

Gert-Jan Strik came up with the single-query answer
(paraphrased from the query framed for my problem)

SELECT T1.id, T2.id
FROM T1
JOIN T2
ON T2.from_Range= (
SELECT MAX(from_Range) FROM T2
WHERE T1.Key BETWEEN T2 .from_Range AND T2 .to_Range
)

[quoted text, click to view]

AddThis Social Bookmark Button