all groups > sql server misc > september 2003 >
You're in the

sql server misc

group:

How do I make this as fast as possible?


How do I make this as fast as possible? Jim Hubbard
9/25/2003 1:16:17 AM
sql server misc: I have a test project that I am trying to code to be as fast as possible.

The project is a lottery speculation project that uses .Net web services to
return an XML recordset.

The goal is to accept any number of 5 possible balls, and return a recordset
of all past draws and possible draws that contain the passed in numbers.

For example, there are 56 balls. The user of the web service should be able
to pass in 1 or more ball numbers (from 1 to 56), that have already been
drawn and get back an XML representation of all past draws that contain the
numbers passed in by the user.

As the number of possible 5-ball combinations in the database is 3,819,816
(some of which will have been chosen before and will have a number
indicating the number of times it has been chosen in the past), and the time
limit to respond is 15 seconds, making the most of the database is an
absolute must.

I have begun by creating a database that has 5 columns (1 for each ball)
plus one numeric column to track the number of times the lottery number in
the record has been chosen previously. What I need to do is accept from 1
to 5 ball numbers and retrieve the records that contain all of the ball
numbers passed in.

I am not certain that using an individual column for each ball is even the
best way to go. After all, a ball number may be drawn in any column which
leads to a rather costly string like "IF BallIn1 = Ball1 or BallIn1 =
Ball2..." etc..

You could do something like "If Ball1 in (BallIn1, BallIn2, BallIn3,
BallIn4, BallIn5) or Ball2 in (BallIn1, BallIn2, BallIn3, BallIn4,
BallIn5)...." and so on.

Concatenation of the ball numbers into a string is also a possibility, but
to check for each ball number passed in would probably involve a "LIKE % ?
%" type search that is very resource intensive and probably not as fast as
it could be.

The goal for this application is to make queries like this as fast as
possible in SQL Server 2000. I can write stored procs and such, but I am
not sure about the fastest search to use to search a database with data such
as this.

Any suggestions you have on the db layout or search queries would be greatly
appreciated.

Thanks!

Jim

Re: How do I make this as fast as possible? Steve Kass
9/25/2003 2:05:43 AM
Jim,

Have you tried indexing each of the BallN columns and testing

....
where Pick1 in (Ball1, Ball2, Ball3, Ball4, Ball5)
and Pick2 in (Ball1, Ball2, Ball3, Ball4, Ball5)
....
-- I'm not sure why you have OR, since you say you want
-- "records that contain all of the ball numbers passed in"

If that doesn't work, two other possibilities would be
worth considering if you can normalize the table:

create table T (
DrawNo int not null,
BallDrawn tinyint not null check (BallDrawn >= 1 and BallDrawn <= 56),
primary key (DrawNo, BallDrawn)
)
....

declare @userPicks varchar(10)
set @userPicks = '0924284351'
select DrawNo
from T, (
select 1 i union all select 3 union all select 5 union all select 7
union all select 9
) N
where i < len(@userPicks)
and BallDrawn = substring(@userPicks,i,2)
group by DrawNo
having count(DrawNo) = len(@userPicks)/2

or

declare @userPicks varchar(10)
set @userPicks = '0924284351'
select DrawNo
from T
where not exists (
select * from (
select 1 i union all select 3 union all select 5 union all select 7
union all select 9
) N
where i < len(@userPicks)
and not exists (
select * from T2
where T2.DrawNo = T.DrawNo
and T2.BallDrawn = substring(@userPicks,i,2)
)
)

-- Steve Kass
-- Drew University
-- Ref: CACC598C-CF57-4A15-A0C1-496CFFB2BD24


[quoted text, click to view]
Re: How do I make this as fast as possible? Jim Hubbard
9/25/2003 2:24:59 AM
Steve,

You're right, I should've said AND instead of OR. And, I forgot to say
that the current ball columns are text columns. This can be easily changed.

However, I think I have a good select statement worked out. When
passing in 2 ball numbers, it returns all requested records in 4 seconds
from SQL Query Analyzer. It should be a bit faster when compiled as a
stored proc.

Using a single ball number, I got back a huge number of records in 13
seconds. (These results are on a 950Mhz processor and 512Mb RAM. This
machine will be upgraded to 2.4Ghz processor and 1Gb RAM next week - so that
should help a little too.)

It looks like...

"select * from drawings where BallIn1 in (Ball1, Ball2, Ball3,
Ball4, Ball5) and BallIn2 in (Ball1, Ball2, Ball3, Ball4, Ball5)"

Looks like this will meet my goals as far as getting the query back, now
I have to get it across the web as XML without breaking the 20 second
barrier.

Each BallN column has been indexed and all BallN columns have been
indexed as a clustered index.

I will also try out your suggestions and get back to you tomorrow with
the results.

Thanks for the quick response!

Jim

[quoted text, click to view]

Re: How do I make this as fast as possible? Wangkhar NO[at]SPAM yahoo.com
10/1/2003 4:08:17 AM
You could try having it ordered, so that you have smaller numbers on
the left.

So you have 5 'slots' (not balls), always in number order from the
left.

Stick em as primary clustered key ints.

write a loop to populate it and do so carefully...

Then take your parameters and order them up the same way. That'll
make it nice n swift i think.

Then you just look for
select x from y where slot1 = param1 and slot2=param2 etc.

Maybe im missing something, but thats what I think you are after...?






[quoted text, click to view]
Re: How do I make this as fast as possible? Jim Hubbard
10/1/2003 7:33:29 PM
I had to abandon this idea.

The number of records being returned cannot be retrieved and sent across the
web in under 20 seconds. Some queries would bring back almost 360,000
records. People with a modem would just be out of luck if they had to wait
on all 360,000 records to be returned in an XML dataset.

The 20 second time limit is a real pain in the *. But, you gotta do what
the user needs. Now, if I can just figure out how.....

Thanks for the input!

Jim


[quoted text, click to view]
Re: How do I make this as fast as possible? jag NO[at]SPAM acm.org
10/3/2003 6:47:01 AM
Here's a solution that represents the balls of each lottery draw in
separate table rows as opposed to separate columns. So all draws of N
lottery balls are represented by a (ball, draw_number) table, as
opposed to a (ball_1, ..., ball_N) table, with all balls in a given
draw having the same unique draw_number value. This makes for a more
scalable normalized solution, e.g., it's straightforward to represent
all 6-ball combinations with no schema change. The key, however, with
this approach is to generate the unique draw_number values. The
solution is implemented by a UDF which takes as an argument the number
of balls to find combinations of, that is, finding all N-ball
combinations from the set of all balls not already drawn is
accomplished by specifying N as the argument to the function.

Let me motivate with an example and follow with the code.

/*
Create the set of lottery balls. Note that this is not hardwired.
In your case, to insert 56 lottery balls, call
EXEC InsertLotteryBalls 56
*/
-- Insert 6 lottery balls labelled 1 through 6
EXEC InsertLotteryBalls 6

-- Insert lottery balls already drawn
-- So the set of lottery balls to find combinations of is {2, 4, 5, 6}
INSERT INTO LotteryBallsDrawn (ball)
VALUES ('1')
INSERT INTO LotteryBallsDrawn (ball)
VALUES ('3')

-- All combinations of 2 lottery balls from those not drawn
-- All balls in the same sequence, or draw, have the same draw number
SELECT number_balls_drawn, ball, draw_number
FROM LotteryBallCombinations(2)
ORDER BY draw_number, ball

number_balls_drawn ball draw_number
2 2 0
2 4 0
2 2 1
2 5 1
2 4 2
2 5 2
2 2 3
2 6 3
2 4 4
2 6 4
2 5 5
2 6 5

/*
Here are some interesting uses of a combinations table
*/
-- Total number of N-ball combinations
DECLARE @n INT
SET @n = 2
SELECT COALESCE(MAX(draw_number) + 1, 0) AS total
FROM LotteryBallCombinations(@n)

total
6

-- All N-ball combinations that include a specific ball
DECLARE @ball CHAR(2), @n INT
SET @ball = '4' -- combos that include this ball
SET @n = 2
SELECT C.number_balls_drawn, C.ball, C.draw_number
FROM (SELECT draw_number
FROM LotteryBallCombinations(@n)
GROUP BY draw_number
HAVING SUM(CASE WHEN ball = @ball THEN 1 ELSE 0 END) = 1) AS Draws
INNER JOIN
LotteryBallCombinations(@n) AS C
ON C.draw_number = Draws.draw_number
ORDER BY C.draw_number, C.ball

number_balls_drawn ball draw_number
2 2 0
2 4 0
2 4 2
2 5 2
2 4 4
2 6 4

-- Code

-- All lottery balls
CREATE TABLE LotteryBalls
(
ball CHAR(2) NOT NULL PRIMARY KEY
)

-- Insert N lottery balls labelled from 1 to N
CREATE PROCEDURE InsertLotteryBalls @number_of_balls INT = 56
AS
DELETE FROM LotteryBalls
DECLARE @ball INT
SET @ball = 1
WHILE @ball <= @number_of_balls
BEGIN
INSERT INTO LotteryBalls (ball)
VALUES (CAST(@ball AS CHAR(2)))
SET @ball = @ball + 1
END

-- Lottery balls that have been drawn
CREATE TABLE LotteryBallsDrawn
(
ball CHAR(2) NOT NULL PRIMARY KEY REFERENCES LotteryBalls (ball)
)

/*
Candidate balls are balls that can be drawn, i.e., balls that haven't
been drawn.
Candidate balls in ascending order are mapped to consecutive natural
numbers.
*/
CREATE VIEW CandidateLotteryBalls (ball, ball_number)
AS
SELECT B1.ball, COUNT(*)
FROM LotteryBalls AS B1
INNER JOIN
LotteryBalls AS B2
ON B2.ball <= B1.ball
WHERE NOT EXISTS (SELECT *
FROM LotteryBallsDrawn
WHERE ball IN (B1.ball, B2.ball))
GROUP BY B1.ball

/*
The total number of combinations of N things taken R at a time is
N!/R!(N-R)!. For example, the total number of combinations of 5 balls
taken 3 at a time is 5!/3!(2!) = 10. So if the 5 balls are named 1,
2, 3, 4, and 5, the combinations of these balls taken 3 at a time, and
listed lexicographically with zero-based sequence numbers, are:
[0] 3, 2, 1
[1] 4, 2, 1
[2] 4, 3, 1
[3] 4, 3, 2
[4] 5, 2, 1
[5] 5, 3, 1
[6] 5, 3, 2
[7] 5, 4, 1
[8] 5, 4, 2
[9] 5, 4, 3

The following view generates unique nonnegative integral numbers to be
associated with each draw of R balls from a set of N balls.
*/
CREATE VIEW CombinationSequenceNumbers (n, r, sequence_number)
AS
SELECT B1.ball_number AS n,
B2.ball_number AS r,
CAST(
ROUND(EXP(SUM(LOG(
CASE WHEN B3.ball_number >
CASE WHEN B2.ball_number <= B1.ball_number / 2
THEN B1.ball_number - B2.ball_number
ELSE B2.ball_number
END
THEN B3.ball_number
ELSE 1
END))) /
EXP(SUM(LOG(
CASE WHEN B3.ball_number <=
CASE WHEN B2.ball_number <= B1.ball_number / 2
THEN B2.ball_number
ELSE B1.ball_number - B2.ball_number
END
THEN B3.ball_number
ELSE 1
END))), 0) AS INT)
FROM CandidateLotteryBalls AS B1
INNER JOIN
CandidateLotteryBalls AS B2
ON B1.ball_number <
(SELECT MAX(ball_number) FROM CandidateLotteryBalls) AND
B2.ball_number <= B1.ball_number
INNER JOIN
CandidateLotteryBalls AS B3
ON B3.ball_number <= B1.ball_number
GROUP BY B1.ball_number, B2.ball_number

/*
The following is a UDF to return all combinations of R lottery balls
from the N lottery balls not previously drawn. Therefore, 0<=R<=N.
Each combination of a given length is assigned a unique nonnegative
integral identifier.
*/
CREATE FUNCTION LotteryBallCombinationsUpToR (@number_balls_to_draw INT)
RETURNS @ball_drawings TABLE
(number_balls_drawn INT NOT NULL,
ball CHAR(2) NOT NULL,
ball_number INT NOT NULL,
draw_number INT NOT NULL,
prev_draw_number INT NOT NULL,
PRIMARY KEY (number_balls_drawn, draw_number, ball_number))
AS
BEGIN
IF @number_balls_to_draw >= 1
BEGIN
DECLARE @ball_n INT, @prev_ball_n INT
SET @ball_n = 1

DECLARE @candidate_lottery_balls TABLE
(ball CHAR(2) NOT NULL,
ball_number INT NOT NULL PRIMARY KEY)
INSERT INTO @candidate_lottery_balls (ball, ball_number)
SELECT ball, ball_number
FROM CandidateLotteryBalls

DECLARE @combination_sequence_numbers TABLE
(n INT NOT NULL,
r INT NOT NULL,
sequence_number INT NOT NULL,
PRIMARY KEY (n, r))
INSERT INTO @combination_sequence_numbers (n, r, sequence_number)
SELECT n, r, sequence_number
FROM CombinationSequenceNumbers
WHERE r <= @number_balls_to_draw

-- Draw first of R balls from candidate balls
INSERT INTO @ball_drawings
(number_balls_drawn, ball, ball_number, draw_number, prev_draw_number)
SELECT @ball_n, ball, ball_number,
COALESCE(S.sequence_number, 0),
COALESCE(S.sequence_number, 0)
FROM @candidate_lottery_balls AS B
LEFT OUTER JOIN
AddThis Social Bookmark Button