all groups > sql server data warehouse > december 2003 >
You're in the

sql server data warehouse

group:

Partioning data by Month


Partioning data by Month Lawrence Baldwin
12/13/2003 11:10:38 AM
sql server data warehouse:
I run a fairly large SQL 2000 db (50GB), processing between 1MM-2MM new
event records (time series) per day....spread out evenly throughout the day.
With this application once I roll to a new month, the data from the prior
month is useless and I need to get it out of the db quickly and efficiently
and without any downtime.

To make matters worse, the main fact table (Events) is moderately
indexed...making DELETEs impractical (it takes about 1 minute to delete
10minutes of data...so it takes many *days* to delete an entire month this
way, as well as severly impacting query perf)....and I can't drop/delete old
recs/re-add indexes as that would basically make the system unusable during
the delete process.

What I'd like to do is have a seperate event table for each month (e.g.
Events200311, Events200312, etc.) so that shortly after a new month I can
simply drop the old table. My users are allowed to submit events as old as
72 hours, so my I can't drop the prior month table until the 4th of each
month. I've been looking at partioned views, but given the insert volume
(50-100 inserts/sec, 24x7), I'm concerned about negatively impacting insert
performance. I have a *lot* of code that does queries against the 'Events'
table, but only 3 storedprocs that actually do the inserting.

Is it feasible to define a view called 'Events' and variable point it at the
current EventsYYYYMM table....queries only need to access the current months
data...then I presume I could add conditional logic to my insert code to
direct the inserts to the correct table.

Ideas on the best way to implement something like this?

Lawrence Baldwin
myNetWatchman.com
Atlanta, GA

Re: Partioning data by Month Kevin
12/13/2003 5:18:35 PM
You're on the right track. There is an excellent white paper describing to
do exactly this, and I use the technique described there every day on
databases much much larger:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/partitionsindw.asp

I would note that I have best luck by creating a non clustered index on the
partitioning key as well as the contraint. Otherwise the optimizer tends to
still consider non-applicable partitions in a query plan.


[quoted text, click to view]

AddThis Social Bookmark Button