sql server data warehouse:
I run a fairly large SQL 2000 db (50GB), processing between 1MM-2MM new
event records (time series) per day....spread out evenly throughout the day.
With this application once I roll to a new month, the data from the prior
month is useless and I need to get it out of the db quickly and efficiently
and without any downtime.
To make matters worse, the main fact table (Events) is moderately
indexed...making DELETEs impractical (it takes about 1 minute to delete
10minutes of data...so it takes many *days* to delete an entire month this
way, as well as severly impacting query perf)....and I can't drop/delete old
recs/re-add indexes as that would basically make the system unusable during
the delete process.
What I'd like to do is have a seperate event table for each month (e.g.
Events200311, Events200312, etc.) so that shortly after a new month I can
simply drop the old table. My users are allowed to submit events as old as
72 hours, so my I can't drop the prior month table until the 4th of each
month. I've been looking at partioned views, but given the insert volume
(50-100 inserts/sec, 24x7), I'm concerned about negatively impacting insert
performance. I have a *lot* of code that does queries against the 'Events'
table, but only 3 storedprocs that actually do the inserting.
Is it feasible to define a view called 'Events' and variable point it at the
current EventsYYYYMM table....queries only need to access the current months
data...then I presume I could add conditional logic to my insert code to
direct the inserts to the correct table.
Ideas on the best way to implement something like this?
Lawrence Baldwin
myNetWatchman.com
Atlanta, GA