Groups | Blog | Home
all groups > sql server programming > july 2003 >

sql server programming : Query, Join on nearest


Ray Higdon
7/9/2003 7:26:46 AM
I would do it using a full outer join and then specify
the "closeness" in the where statement using datediff.

example where datediff(hh,datecolumn1,datecolumn2) <= 1

HTH

Ray Higdon MCSE, MCDBA, CCNA


[quoted text, click to view]
David Portas
7/9/2003 1:31:38 PM
Please post DDL (CREATE TABLE statements) and sample data (INSERT
statements).

--
David Portas
------------
Please reply only to the newsgroup
--


Tommy
7/9/2003 9:57:54 PM
Hello, I am in a position where i need to create a query where i must join
two tables on the nearest value.

Imagine i have 2 tables, tbl_Trade and tbl_Time. For each Time in tbl_Time,
I need to locate the row in tbl_Trade where tbl_Time.LastUpdate is closest
to, but less than the value of tbl_Time.myTime.

I have been using the followng query

Select Avg(s.Price) AS Price , t.myTime
From tbl_Trade s
Inner Join tbl_Times t
On (t.myDate + '2003-07-08') > s.LastUpdate
AND (t.myDate + '2003-07-08' - '1:00 AM') < s.LastUpdate
Inner Join tbl_StockList l
On l.ID = s.StockID
Where l.Code = 'ORI'
Group By t.myTime

But this relies on using an average of all trades in the time period (which
in this case is 1 hour).

Essentially the problem is that each trade can occur at any time, so there
is not necessarily a trade that occurs in that period, nor is an average of
the value of the average of the prices necessarily indicative of the value
we wish to retrieve.

So basically what i want to be able to do is to generate a recordset of the
value of the price of the last trade at each hour O, O is specified in a
table. Is such a join possible? Or is there a work around aside from using
triggers to populate a table of prices for each hour for each commodity?

Cheers

John Gilson
7/10/2003 12:50:23 AM
[quoted text, click to view]

Here's a stab at what I believe you're asking, however, without DDL,
sample data, and an example of what the query should return, this is
just a guess.

-- For each stock, give the most recent trade before a given
-- reference time
CREATE VIEW MostRecentTradeAtTime (stock_id, ref_time, actual_time)
AS
SELECT ST.StockID, T.myTime, ST.LastUpdate
FROM tbl_Time AS T
INNER JOIN
tbl_Trade AS ST
ON ST.LastUpdate < T.myTime AND
NOT EXISTS (SELECT *
FROM tbl_Trade
WHERE StockID = ST.StockID AND
LastUpdate > ST.LastUpdate AND
LastUpdate < T.myTime)

Regards,
jag

Tommy
7/10/2003 4:31:16 PM
I have spent the better part of the afternoon pissfarting around with this
problem and have come up with 2 different solutions. One uses cursors and
is incredibly fast, the other doesnt and is very slow.

I created them as stored procedures so that i could stress test each more
easily.

The first solution, and nicest looking one is....
---------------------------------------------------------------------------
Create procedure Tester2 (@Stockid as int)
As
Select tbl_Stocks.Price, tbl_Stocks.LastUpdate
From tbl_Stocks
Inner Join (
Select Max(s.ID) AS Closest --, l.Code, s.LastUpdate
From tbl_Stocks s
Inner Join tbl_Times t
On s.LastUpdate < (t.myDate + '2003-07-08')
Where s.StockID=@StockID
Group By t.myDate
) dt_Updates
On dt_Updates.Closest = tbl_Stocks.ID

The second, uglier one is...
---------------------------------------------------------------------------
Create procedure Tester1 (@Stockid as int)
As

declare @myTime as smalldatetime
declare @Result Table ( Price int, LastUpdate smalldatetime )


Declare Time_Cursor cursor
For
Select myDate + '2003-07-08' From tbl_Times

Open Time_Cursor

Fetch Next From Time_Cursor Into @myTime
While @@Fetch_Status = 0
Begin
Insert Into @Result
Select Top 1 Price, LastUpdate
From tbl_Stocks
Where StockID=@StockID
And LastUpdate < @myTime
Order By LastUpdate Desc

Fetch Next From Time_Cursor Into @myTime
End
Close Time_Cursor
Deallocate Time_Cursor
Select * From @Result
---------------------------------------------------------------------------

And the results?

I used a silly little bit of SQL to run each 100 times.
Tester1 1 Second
Tester2 148 Seconds

Running Tester1 1000 Times however took 14 seconds, however i think most of
this was spent in network congestion as the CPU's on the server didnt go
over 30%. Tester2 however was using over 80%. So it seems that Tester1
around 100 times faster (which is a pretty nice speed increase really!)...
Also of note, my earlier incarnations of Tester2 (ie the single query
approach) were taking over 20 seconds per SINGLE query.

In this test case, tbl_Stocks has over 2 million records and tbl_Times has
9. The server is a Dual P4 2.4 Xeon, 1GB Ram. The data created from the
queries is going to form the basis for charts.

I hope this is of interest to some of you.

Cheers!


[quoted text, click to view]

Gert-Jan Strik
7/10/2003 10:26:36 PM
Tommy,

I assume that table tblStocks has a clustered unique index on (StockID,
LastUpdate). If not, then that could help. I also assume that LastUpdate
is of datatype datetime (not smalldatetime). Make sure that @StockID has
the same datatype as column StockID.

How does the query below perform?

Declare @d datetime
Set @d='2003-07-08'

SELECT DateAdd(day,myDate,@d) AS ProbeDate,(
SELECT S2.Price
FROM tblStocks S2
WHERE S2.StockID = @StockID
AND S2.LastUpdate = (
SELECT MAX(S1.LastUpdate)
FROM tblStocks S1
WHERE S1.StockID = @StockID
AND S1.LastUpdate < DateAdd(day,myDate,@d)
)
) AS Price,(
SELECT MAX(LastUpdate)
FROM tblStocks S3
WHERE S3.StockID = @StockID
AND S3.LastUpdate < DateAdd(day,myDate,@d)
) AS LastUpdate
FROM tbl_Times

Gert-Jan


[quoted text, click to view]
AddThis Social Bookmark Button