all groups > sql server data warehouse > july 2005 >
You're in the

sql server data warehouse

group:

Suggestion


Suggestion Ajay
7/5/2005 7:58:03 AM
sql server data warehouse:
Hi,

We have an application that access multiple databases for generating reports
and running various queries. These databases are segregated as follows:

Server1 DB1 1999-2001
Server2 DB2 2002-2004
Server3 DB3 2005-present

Data is sync-ed up using replication. Database grows at an average 4 million
records per year. The database is not a Warehouse. Database schema is
normalized. Need to keep historical data for generating baseline reports.

This approach has various drawbacks including the major one that if any
report has to span for the entire duration than data has to retrieved from
all databases and then grouped up as summary. This takes lot of time. Also,
over a period of time, we can't just keep adding more hardware and escalate
the current problems.

So, I am looking at a new database architecture which solves the current
problem and is scalable also.

Here are the considerations:
1. New architecture should support fast report generation
2. Be scalable to support new report
3. Easy to Manage and Maintain

Looking forward to suggestions:

Re: Suggestion JT
7/5/2005 1:19:19 PM
Keeping the historical vs. current data partitioned by database is probably
a good idea, because it reduces the seek time for the most frequently used
data, allows more flexibility in placement of data files on disks, and
reduces the size of full database backups, if records for historical years
are static. However, if historical 2004 - 1999 data is rarely used, then
would there be a cost / maintenace justification to keep it hosted on
seperate servers? What percentage of the time would Server1 be sitting idle
waiting for a user to request a report that includes historical data?

As for spanning historical and current records into one query, read up on
"partitioned views" in SQL Server Books Online. For example, the following
implements a view of the Customers table, which is split across 3 servers.

--Partitioned view as defined on Server1
CREATE VIEW Customers
AS
--Select from local member table
SELECT *
FROM CompanyData.dbo.Customers_33
UNION ALL
--Select from member table on Server2
SELECT *
FROM Server2.CompanyData.dbo.Customers_66
UNION ALL
--Select from mmeber table on Server3
SELECT *
FROM Server3.CompanyData.dbo.Customers_99



[quoted text, click to view]

Re: Suggestion Ajay
7/6/2005 6:42:03 AM
i think around 50% of the time the historcal server are sittng idle.

Anyways tell me this if i have to propose a new architecture for ths
requirement then how should it be done. One of my suggeston s lets seperate
OLTP and OLAP and the have reports run on OLAP. What do you thin? What do you
suggest?

How are big databases designed? What are the fundamentals for arhitecting
such database?

apprecate your help

thans
[quoted text, click to view]
Re: Suggestion JT
7/6/2005 10:36:58 AM
To reduce the processing time of your non-OLAP reports, you may want to
implement summary tables with the records aggregated based on the usage
needs of your queries. For example:

insert into SALES_SUMMARY
select
period,
quarter,
region,
category,
sum(sales) as sum_sales
from
Server1.SALES..SALES
union all
Server2.SALES..SALES
union all
Server3.SALES..SALES
group by
period,
quarter,
region,
category


[quoted text, click to view]

Re: Suggestion Jéjé
7/10/2005 10:56:34 AM
you are right,
you have to separate the OLTP system from the OLAP/Reporting system.
you can create some datamarts to solve your queries. To improve the loading
process (and query process), keep historical data into separate tables, but
you can do this on 1 server (histo + current data).

Datamarts allow you to create summarized data and improve the query response
time.
and/or use OLAP cubes to have sub-second response time.
You have to choose the better source regarding the report you have to
generate.

if your most requested report need to summarized information about a
specific customer (total sales, total complaints, total email sent...), so
create a summarized table with preaggregated information then 1 simple query
return all your information.
Detailed information can query your OLTP data sources.

There is many way to create a good responsive solution with and without a
data warehouse. This depends of your knowledge, the budget and time you
have.

[quoted text, click to view]

Re: Suggestion Peter Nolan
7/12/2005 5:20:17 AM
Hi Ajay,

"How are big databases designed? What are the fundamentals for
arhitecting such database?"

Well, what you are dealing with is not a 'big' database any more and
would not have been called so for a while......

Producing reports off operational systems has always been a 'dead end'
path for all but the very smallest of organisations....and maybe not
even then.....

There are plenty of materials around for 'newbies' now. Lots of books,
web sites etc......

I have made an effort to link some of the better sources I know about
on my web site so feel free to go to my web site www.peternolan.com. I
have also published a lot of papers, presentations, code etc...and I
have also linked a lot of information for 'newbies'.....

I am building my site as a 'useful place' for people new to BI/DW to
go......feel free to comment as well... :-)

Best Regards

Peter
AddThis Social Bookmark Button