all groups > sql server programming > september 2007 >
You're in the

sql server programming

group:

SCD Wizard Question


SCD Wizard Question Mike C#
9/8/2007 7:41:59 PM
sql server programming:
Hi all,

I've been playing around with the SSIS SCD Wizard. I noticed that I seem to
be getting better performance updating my SCD's with stored procedures. Is
there a list of best practices concerning the SCD Wizard or something else I
might be missing? Also, I have about 30 Dimension tables, of Types 1 and 2;
running the SCD Wizard 30 times seems a bit unruly to update all my
Dimension tables. Is there a more efficient method or something else I'm
missing? Anyone else encounter this, or am I just missing some key pieces
of information? If someone else is having these problems, how did you
resolve them?

TIA

Re: SCD Wizard Question Mike C#
9/8/2007 10:19:00 PM
[quoted text, click to view]

Thanks for the info, I'm pretty familiar with the benefits of reusing cached
query plans (a benefit you can get from parameterized/prepared statements as
well). I ran across the pages you posted in my previous searches but I
didn't see anything that really addresses my core concerns, however, which
are:

1. The SCD Wizard generates individual SQL statements for each
insert/update instead of doing it in one batch using a set-based approach.
The general idea is that a single set-based statement to update 50,000 rows
is significantly faster than 50,000 individual UPDATE statements.

2. The SCD apparently sends one SELECT query per row for lookup purposes
also, bypassing any memory-based lookup advantages or basic set-based
querying advantages in general.

3. Running the SCD Wizard 30+ times seems like quite a hassle. I also hear
that you lose customized outputs if you re-run the wizard to make changes
later. I haven't tried this yet, but based on what I've read I'm not too
eager go back and redo a bunch of customized outputs via the SCD Wizard
interface, over and over again, during development.

Finally, I'm told that the SCD Wizard is basically a beginner's tool for
people who are unfamiliar with how to load SCD's, and that to get any kind
of performance out of it requires quite a bit of custom coding. My question
is this: if this is true, then why would anyone use the SCD instead of
creating a custom stored procedure from the start? It seems you could get
really bogged down with a lot of additional custom coding to work around the
SCD Wizard's limitations and to make it work as efficiently as it should...
Are there any advantages to using the SCD Wizard as opposed to custom SP's?
What method do you all use, and why did you decide to go that way?

Thanks in advance.

Re: SCD Wizard Question Jeje
9/9/2007 12:00:00 AM
Hi,

Personally I don't use it.
Part of the best pratices...
instead of updating the target table, write the updates into another table:
mytable_UPD
then do a single update statement:
update mytable
set col1 = mytable_UPD.col1
....
from mytable_UPD
where mytable.id = mytable_UPD.id

so you'll do only 1 update instead of 50 000.
Like Marco says, I'm using the tabledifference in few cases.

but if your source data and the destination are on the same server (I'm
using a local staging area which is stored on the same server in my case),
then you can create views to identify new or updated records.

Part of your setup... think SQL 2008...
in SQL 2008 there is a new SQL command designed to do inserts, updates and
deletes in 1 query.

good luck


[quoted text, click to view]
Re: SCD Wizard Question EMartinez
9/9/2007 12:27:26 AM
[quoted text, click to view]


These links might be helpful.
http://sqljunkies.com/WebLog/tpagel/archive/2005/07/24/16195.aspx
http://msdn2.microsoft.com/en-us/library/ms189681.aspx
Also, keep in mind that the use of stored procedures is always better
than queries since they store their query plans in SQL Server,
thereby, producing faster results. Hope this helps.

Regards,

Enrique Martinez
Sr. Software Consultant
Re: SCD Wizard Question Marco Russo
9/9/2007 12:47:31 AM
[quoted text, click to view]

I suggest you to take a look at the Table Difference tool. It is a
fundamental component if you want to achieve high performance with
large volumes of data when you need to mantain a SCD type II (but also
type I) dimension.
This is the link to the free component and its documentation:
http://www.sqlbi.eu/tabledifference.aspx

Marco Russo
http://www.sqlbi.eu
http://sqlblog.com/blogs/marco_russo
AddThis Social Bookmark Button