8GB RAM & 3disk RAID0). Granted that the data we used is artificially
"Raymond Mak [MSFT]" <rmak@online.microsoft.com> wrote in message
news:uc4Q3wSiHHA.1708@TK2MSFTNGP03.phx.gbl...
> Blake, I am probably not reading you previous post (the one before you
> latest response) correctly but it seems to me that doing a (full table)
> bulk update on the pk or uq will incur rather significant resource\locking
> contention at the publisher database, and the logreader\distribution agent
> will need to transfer almost twice the amount of data (delete\insert +
> command formatting overhead) thereby putting further strain on the network
> and the cleanup agent. That seems to be a bit of an extreme measure to
> avoid replication snapshot processing. And as I have mentioned before,
> locking overhead for snapshot generation is much reduced in SQL2005 by
> default and you can use sync_method = 'database snapshot' to reduce that
> further. Paul Ibison also has a nice article up on his website
> (
http://www.replicationanswers.com/BCPPartitioning.asp) talking about an
> enhancement in SQL2005 that allows a large table to be bulk-copied by the
> snapshot agent in parallel. That said, it is not my intention to be pushy
> about this and there are indeed cases where all the enhancement in SQL2005
> snapshot processing will be inadequate to scale (it is not a pretty sight
> but I have seen people managed 100GBs snapshot on SQL2000). I am merely
> ... intrigued... by your comments and I very much appreciate your
> (extremely detailed) feedback.
>
> -Ryamond
>
> "Cqlboy" <Cqlboy@discussions.microsoft.com> wrote in message
> news:51FEF6A7-F5E5-4D54-BB9E-542EE39D5010@microsoft.com...
>> Raymond, thank you for the response and tip regarding snapshot
>> processing.
>> I need to investigate/experiment more with this. I am admittingly not
>> familiar yet with all of SQL 2005's offerings. Also, let me add, I'm not
>> at
>> all averse to replication and used SQL 2000 Transactional replication
>> extensively at my former employer, Experian Corp.
>>
>> Key things noted from my replication experience is this:
>>
>> 1) Taking an initial snap-shot against several large and important
>> order-related tables incurring heavy insert/update/delete activity
>> effectively took the company down until the snap-shot completed. This
>> could
>> be hours, something upper management would never tolerate. Heaven forbid
>> if
>> we lost sync' !
>>
>> 2) Publications containing hot articles presented significant loads to
>> the
>> disk IO subsystem, something that very few DBA's are aware of or
>> understand.
>> For example, if your publication contained articles responsible for much
>> of
>> the server IO you need to be aware that this resource drain will
>> effectively
>> double via the Distributor when replicated. This was a surprize to me
>> when I
>> first discovered it but it makes perfect sense in hindsight.
>>
>> 3) The last issue came from the Distributor clean up job. Whenever this
>> executed, it exercised a massive IO spike against its raid group, thus
>> affecting everything associated with those disks. Yep, I tinkered with
>> the
>> Clean-Up job schedule and tweaked the agent profiles but could never seem
>> to
>> alleviate these massive spikes. I never made it this far, but I was
>> seriously contemplating customizing the code in the Distributor Clean-up
>> proc' to break up and spread the IO demand over a longer interval,
>> anything
>> to avoid or reduce this sharp massive hits against the disks.
>>
>> I learned and became more sensitized to these issues when tasked to
>> re-architect SQL Server on our EMC Clariion CX500 and CX700 SAN. I
>> captured
>> file level IO stats using the little understood/known SQL Server function
>> fn_virtualstats as a source, sampling @ 1 minute intervals. Querying
>> against
>> one weeks data was very enlightening.
>>
>> I am aware of moving the Distributor to its own dedicated box/disks but
>> could never convince management to do so. BTW... it was a heavy bang
>> against
>> the SAN cache and it seemed we had evidence that the Distributor Clean-Up
>> job
>> cause global issues via the SAN. Evidence we out grew our SAN, I guess.
>>
>> I apoligize for this lengthy response but the key answers I am looking
>> for
>> is how to initialize replication for very large and critical,
>> order-related
>> articles without taking the company down. Secondly, is their a practical
>> means of determing the additional burden to the TempDB ? Log ? ... and
>> the
>> underlying disk subsystem. i.e. can I handle the load? I actually have
>> a
>> rough idea on this last point but if you have anything to share I'd love
>> to
>> hear it.
>>
>> Thanks again for taking the time to respond.
>>
>> Blake Colson a.k.a CqlBoy
>> Lead DBA of Operations
>> RealtyTrac, Inc.
>> Irvine, CA
>>
>>
>> "Raymond Mak [MSFT]" wrote:
>>
>>> Hi Blake, being the person responsible for most of snapshot processing
>>> in
>>> transactional\snapshot replication, I must admit that it is a bit
>>> difficult
>>> for me to read your posts. That said, in the interest of product
>>> improvement, I must ask what sort of problems you have experienced in
>>> SQL2000 (I can tell locking being one of them) that causes you to
>>> develop
>>> such an intense aversion against using replication snapshot processing
>>> at
>>> all.
>>>
>>> As Hilary had mentioned, locking during snapshot generation for
>>> transactional replication is much reduced by default in SQL2005, and you
>>> can
>>> further reduce it by using the new 'database snapshot' sync_method if
>>> you
>>> are running Enterprise Edition. In addition, I had also put in quite a
>>> few
>>> performance improvements for snapshot processing in SQL2005 which should
>>> hopefully make replication snapshot processing less painful to use. As
>>> such,
>>> I would really appreciate if you can give replication snapshot
>>> processing a
>>> try in SQL2005.
>>>
>>> Thanks much,
>>>
>>> -Raymond
>>>
>>> "Cqlboy" <Cqlboy@discussions.microsoft.com> wrote in message
>>> news:CC9DE155-8328-4CBF-8BF5-EE68900A59B1@microsoft.com...
>>> > Is it also possible to force at the Publisher like in SQL 2000 to send
>>> > a
>>> > pair
>>> > of DELETE/INSERT statements for an entire row by performing an UPDATE
>>> > on a
>>> > primary key or column that was unique? This was an issue in SQL Server
>>> > 2000
>>> > when the PK was an identiy column, I couldn't update the PK and force
>>> > the