all groups > sql server data warehouse > september 2007 >
You're in the

sql server data warehouse

group:

Issues with BI DW and Virtualization


Issues with BI DW and Virtualization Nickg
9/14/2007 6:52:03 AM
sql server data warehouse:
We are creating a new environment for our BI DW. This includes servers for
SSIS and one for DW and disks on a SAN. Because of licensing the current
environment uses virtualization.

My question is, what are the drawbacks of using virtualization with MS SQL
server for BI DW?

Re: Issues with BI DW and Virtualization Marco Russo
9/14/2007 3:20:33 PM
[quoted text, click to view]

The big issue are performance, particularly when you process data (ETL
and cube processing are CPU-critical activities). When you virtualize,
you define limits for RAM and CPU usage that limit the scalability of
your solution. Also query are affected (both relational and
multidimensional) when you don't have RAM available for your VM. Today
it's crazy having a SSAS or SQL machine with less than 2 Gb of RAM. If
you install both, a 64 bit server with 8Gb of RAM should be the
minimum for small data warehouses (20-30 Gb or less) build using SQL
Server products.
Another issue is I/O, but using a SAN should already improve that.

This is just my opinion based on experience on data warehouse of very
different size...

Marco Russo
http://www.sqlbi.eu
http://sqlblog.com/blogs/marco_russo
Re: Issues with BI DW and Virtualization Jeje
9/15/2007 12:00:00 AM
Virtualization become better and better everyday.
the performance impact in a virtual machine is minimal, exists, but minimal.
(I'm talking about CPU processing speed and IO access which are near the
same as the host system, the overhead of the virtual env. is really small)

the only big issue today, there is no support of x64 VM in virtual server.
so running large databases in a x32 virtual machine is not so good due to
lack of memory available in the VM.

Win2008 will improve this and in fact will provide more interesting options,
like adding CPU and memory in a VM without stopping it. So you can add cpu
and memory, do you ETL process then remove the added cpu/memory to provide
more power to other virtual machines.

in your scenario, you want to push SSIS in a VM and the database in another.
I think its not practical. SSIS doesn't take any resources until you run
ETLs and communicating between the 2 VM will go through the network instead
of using the shared memory, this will impact the database access
performance.
The only advantage is: you'll have 4gb a ram for SSIS and 4gb of RAM for SQL
Server instead of 4gb of shared RAM (remember the x32 limit)
but in this case, use the host server only and you'll have 8gb of ram all
the time and your SQL server server will have more power when you don't run
the ETLs (8gb memory instead of 4gb will certainly improve the SQL Server
performance)

and... again I'll repeat always the same thing: SAN is NOT A performance
solution until you setup it correctly.
we see a lot of customers which don't understand why their SAN is slower
then direct attached storage for a DW usage.
"I have paid thousand $ for my SAN and the vendor says SAN is the best
performance and now you says to me it's too slow?"
yes...
in a DW you have to use high performance connectivitIES between the server
and the SAN (note the S for the number of connections...)
the customer setup only 1*2Gbps connection, which limit the SAN access to
170mbps only or the equivalent of 3 SAS drives. for a DW usage it's really a
low value. but if you setup 2 or 3*4Gbps channels between the server and the
SAN can provide more than 1GBPS of sequential access.
so having a SAN can be good... but make sure you use more than 1 connection
to make sure you'll have enough performance.


[quoted text, click to view]
Re: Issues with BI DW and Virtualization Marco Russo
9/15/2007 3:43:33 PM
[quoted text, click to view]

Jeje, I completely agree with you for the SAN issues - I've seen the
same, to not mention people who wanted to say that their RAID 5
solution was performant almost the same as a RAID 10. Of course, when
our SSIS start and the SAN cache (2 or 4Gb) is full, their performance
are almost like my desktop with SATA disks.

Regarding VMs, two more things:
1) VMware doesn't have MS Virtual Server limitations, and you can use
it today. Moreover, Windows Server 2008 will not have the have all the
originally planned features in its first version. That said, I see a
cost/benefit issue. There is some advantage in segmenting a physical
server into several virtual servers especially in consolidation
environment: I have a scenario where a single 2 x dual core server
with 8Gb RAM runs 10/12 VMs ranging from 256Mb to 1500Mb of RAM each
one.
2) But what server are you buying to get a single VM with 8Gb of RAM?
I hope at least 32Gb... A better solution I've found is having the
physical server running SQL Server that runs also VMs with other
services (like front ends). In this case, you don't pay the
virtualization cost for SQL Server, but still use server resources for
other purposes when you are not running ETLs.

Marco Russo
http://www.sqlbi.eu
http://sqlblog.com/blogs/marco_russo
AddThis Social Bookmark Button