all groups > sql server programming > may 2007 >
You're in the

sql server programming

group:

Complex query question


Complex query question nisheeth29
5/11/2007 4:07:01 PM
sql server programming:
So here's my question:

I have a table like this:

Id OpportunityID TimeStamp Amount
1 A 2007-05-11 15:30 $10
2 A 2007-05-11 12:30 $20
3 A 2007-05-11 10:30 $15
4 A 2007-05-10 18:06 $11
5 A 2007-05-10 10:00 $16
6 A 2007-05-09 23:00 $25
7 B 2007-04-25 15:00 $20
8 B 2007-04-25 10:00 $36
9 B 2007-04-01 12:00 $94
10 B 2007-04-01 10:00 $510
11 B 2007-03-31 09:00 $14
12 C 2007-05-01 23:59 $19
13 C 2007-05-01 00:00 $87
14 C 2007-04-07 12:00 $46
15 C 2007-04-07 00:00 $30
16 C 2007-04-03 00:00 $25
17 D 2006-05-10 00:00 $2
18 D 2006-05-07 13:00 $81
19 D 2006-05-07 00:00 $84
20 D 2006-05-02 00:00 $20


This represents a table in our sales data where it is tracking the dollar
value of a sales opportunity over time.

The Id is the primary key, the OpportunityID defines an opportunity and is a
foreign key in this table. The TimeStamp refers to the actual time where
there was a change in the Amount field of the Opportunity (in another table)
as a result of which it wrote a record to the above table with the TimeStamp
and the new Amount.

Here's my business issue that I'm trying solve – the sum of amounts on all
opportunities at any given point of time – that's we call our pipeline. I
want to be able to write a query on this dataset that can give me snapshots
at different points of time. The trick is to do this in a way that it
reflects the latest timestamp record for each opportunity that is less than
or equal to the snapshot time that I require. The big issue is that I want a
single query (or a combination of a few) to return to me this result not for
a single snapshot at a time but for a series of snapshots – for example,
every Monday for the past 5 years.

So for a snapshot of Jan 1, 2007 – this query should pick up only record 17.
Snapshot of Apr 2, 2007 should return records 17 and 9. And a snapshot for
Apr 29, 2007 should return records 7, 14, and 17.

I'm open to all kinds of suggestions.
Re: Complex query question --CELKO--
5/11/2007 5:13:09 PM
[quoted text, click to view]

Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications. It is very hard to debug code when you do not let us
see it.

Since you have a timestamp (a reserved word that is never used for a
column name) the magical, vague "id" column is less than useless.
Let's use relational keys.

CREATE TABLE OpportunityHistory
(opportunity_id CHAR(3) NOT NULL
REFERENCES Opportunities (opportunity_id),
opportunity_starttime DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
opportunity_endtime DATETIME, -- null means current
CHECK (opportunity_starttime < opportunity_endtime),
PRIMARY KEY (opportunity_id, opportunity_time,
opportunity_amt DECIMAL (12,4) NOT NULL
CHECK (opportunity_amt >= 0.0000) ,
etc.
);

[quoted text, click to view]

That is a classic history table. You need to show durations and not
Chronons of time; how long was the "window of opportunity" in the real
world?

[quoted text, click to view]

No, it is not. It is a sequential numbering that is out of
chronological order, used for PHYSICAL access and has nothing to do
with the data model.

[quoted text, click to view]

We need DRI, then ..

[quoted text, click to view]
and the new Amount. <<

you are mimicking a sheet of paper with a log on it, and not using a
relational design. The sign-in/sign-out sheet has "half a fact" on
each line of that paper form. We want a complete fact in each row.

[quoted text, click to view]

You need to construct a Calendar table for reporting and many other
things. With the DDL I have given you, just use a BETWEEN predicate :

WHERE @my_date BETWEEN opportunity_starttime
AND COALESCE (opportunity_endtime,
CURRENT_TIMESTAMP)

[quoted text, click to view]

The most current row is defined by (opportunity_endtime IS NULL) and I
would put that in a VIEW to make life easy

[quoted text, click to view]

Now you need to do some work. Create a table of reporting periods
(period_name, cal_date) or
(period_name, start_date, end_date) so can have things like
('Mondays', ' '2007-05-07') or ('Bikini Season' , '2007-05-01',
'2007-08-30'). You can tune the reporting periods to the exact fit
you need.

SQL is a **data** language and not a procedural or computational
language.
Re: Complex query question TheSQLGuru
5/11/2007 9:17:34 PM
Celko, once again you waste an inordinate amount of bandwidth while
completely failing to address the OP. The ID column is not less than
useless, the OP did not ask you to redesign his table, the ID column IS the
primary key if the table or the user defines it to be so, yada-yada-yada.

--
TheSQLGuru
President
Indicium Resources, Inc.

[quoted text, click to view]

Re: Complex query question TheSQLGuru
5/11/2007 9:48:04 PM
I agree with Celko that you really should post a create table statement and
inserts of the sample data so we can help you more efficiently.

Given the data listed, there isn't likely to be an efficient way to do what
you need. Here is a quickie I slapped together that may or may not be right
(getting late), but should get you pointed in the right direction. It
includes a teensy (and not robust) time table. NOTE that I changed your
data to reduce the number of rows necessary in the time table.

create table #tmp (id smallint, OppID char(1), ModDate smalldatetime, Amount
money)

insert #tmp values (1 ,'A', '2007-05-11 15:30', 10 )
insert #tmp values (2 ,'A', '2007-05-11 12:30', 20 )
insert #tmp values (3 ,'A', '2007-05-11 10:30', 15 )
insert #tmp values (4 ,'A', '2007-05-10 18:06', 11 )
insert #tmp values (5 ,'A', '2007-05-10 10:00', 16 )
insert #tmp values (6 ,'A', '2007-05-09 23:00', 25 )
insert #tmp values (7 ,'B', '2007-05-11 15:00', 20 )
insert #tmp values (8 ,'B', '2007-05-11 10:00', 36 )
insert #tmp values (9 ,'B', '2007-05-02 12:00', 94 )
insert #tmp values (10 ,'B', '2007-05-01 10:00', 510 )
insert #tmp values (11 ,'B', '2007-05-01 09:00', 14 )
insert #tmp values (12 ,'C', '2007-05-01 23:59', 19 )
insert #tmp values (13 ,'C', '2007-05-01 00:00', 87 )
insert #tmp values (14 ,'C', '2007-05-07 12:00', 46 )
insert #tmp values (15 ,'C', '2007-05-07 00:00', 30 )
insert #tmp values (16 ,'C', '2007-05-03 00:00', 25 )
insert #tmp values (17 ,'D', '2007-05-10 00:00', 2 )
insert #tmp values (18 ,'D', '2007-05-07 13:00', 81 )
insert #tmp values (19 ,'D', '2007-05-07 00:00', 84 )
insert #tmp values (20 ,'D', '2007-05-02 00:00', 20 )


CREATE TABLE #Time (
Date smalldatetime NOT NULL ,
Year smallint NOT NULL ,
Month char (3) NOT NULL ,
DayOfWeek tinyint NOT NULL
)

insert #time values ('2007-05-01', 2007, 'May', 3)
insert #time values ('2007-05-02', 2007, 'May', 4)
insert #time values ('2007-05-03', 2007, 'May', 5)
insert #time values ('2007-05-04', 2007, 'May', 6)
insert #time values ('2007-05-05', 2007, 'May', 7)
insert #time values ('2007-05-06', 2007, 'May', 1)
insert #time values ('2007-05-07', 2007, 'May', 2)
insert #time values ('2007-05-08', 2007, 'May', 3)
insert #time values ('2007-05-09', 2007, 'May', 4)
insert #time values ('2007-05-10', 2007, 'May', 5)
insert #time values ('2007-05-11', 2007, 'May', 6)


select tm.date, sum(amount)
--select *
from #tmp t1 inner join #time tm on t1.moddate <= tm.date
where tm.dayofweek = 3
and t1.moddate = (select max(t2.moddate) from #tmp t2 where t2.oppid =
t1.oppid and t2.moddate <= tm.date)
group by tm.date

Here is the select * output
id OppID ModDate Amount Date
Year Month DayOfWeek
------ ----- ----------------------- --------------------- -----------------------
------ ----- ---------
13 C 2007-05-01 00:00:00 87.00 2007-05-01
00:00:00 2007 May 3
9 B 2007-05-02 12:00:00 94.00 2007-05-08
00:00:00 2007 May 3
14 C 2007-05-07 12:00:00 46.00 2007-05-08
00:00:00 2007 May 3
18 D 2007-05-07 13:00:00 81.00 2007-05-08
00:00:00 2007 May 3

and here is the aggregate output

date
----------------------- ---------------------
2007-05-01 00:00:00 87.00
2007-05-08 00:00:00 221.00


--
TheSQLGuru
President
Indicium Resources, Inc.

[quoted text, click to view]

Re: Complex query question Hugo Kornelis
5/13/2007 12:08:28 AM
[quoted text, click to view]
(snip)

Hi nisheeth29,

Why did you repost the question? I just spent a considerate amount of
time replying to your first question, and after reading on I find that
you reposted it and already had an answer from the SQL Guru.

--
Hugo Kornelis, SQL Server MVP
Re: Complex query question nisheeth29
5/15/2007 3:02:02 PM
Thanks for all your responses and apologies if I offended anyone - it was my
first ever post. Plus, I did not realize that posting to 2 different sites
would end up in the same place.

Nevertheless, I had some followup questions. Here is an actual snapshot of
the data. I understood the part about creating a calendar table so that's
good. Now I'm trying to find a more efficient way of writing my queries.

Right now, if I want to get the record with the latest timestamp, I use this
query:

SELECT OpportunityId, MAX(CreatedDate) AS CreatedDate
FROM dbo.OpportunityHistory
GROUP BY OpportunityId

Then, I write another query referencing the results of the one above and
joining it to the original table on the OpportunityId and the CreatedDate to
return the Amount corresponding to the latest timestamp for each
OpportunityId. My question is: Can I achieve this in a single query instead
of two and not by using a sub-query?

Thanks!!

OpportunityId CreatedDate Amount
0060000000000A8AAI 15-Sep-99 9000
0060000000000A8AAI 05-Nov-99 9000
0060000000000AKAAY 15-Sep-99 9000
0060000000000AKAAY 15-Sep-99 9000
0060000000000AKAAY 16-Nov-99 9000
0060000000000CCAAY 09-Dec-99 10000
0060000000000CCAAY 27-Jan-00 18000
0060000000000DCAAY 15-Sep-99 9000
0060000000000DCAAY 05-Jul-00 9000
0060000000000H5AAI 16-Sep-99
0060000000000H5AAI 04-Oct-99 0
0060000000000H5AAI 06-Dec-99 6000
0060000000000H5AAI 21-Dec-99 6000
0060000000000H5AAI 22-Mar-00 3000
0060000000000H8AAI 16-Sep-99
0060000000000H8AAI 16-Sep-99 170000
0060000000000H8AAI 12-Jan-00 170000
0060000000000OJAAY 20-Sep-99
0060000000000OJAAY 20-Sep-99 0
0060000000000OJAAY 13-Oct-99 50000
0060000000000OJAAY 22-Oct-99 50000
0060000000000OJAAY 22-Oct-99 0
0060000000000OJAAY 09-Dec-99 0
0060000000000OJAAY 14-Jan-00 6900
0060000000000OJAAY 27-Mar-00 26400
0060000000000OJAAY 30-Mar-00 26400
0060000000000Q8AAI 20-Sep-99
0060000000000Q8AAI 11-Jan-00 0
0060000000000Q8AAI 11-Jan-00 0
0060000000000Q8AAI 11-Feb-00 0
0060000000000Q8AAI 15-Feb-00 0
0060000000000Q8AAI 17-Feb-00 0
AddThis Social Bookmark Button