
sql server replication : Initial SnapShot failure....



rob lynch
3/14/2005 5:57:02 AM
I have a series of NT4 systems that I need to set up replication on.
They have SP 6a installed on them.
I have done a clean install of SQL 2k Server, then added sp3a.

Setting up the publication provides no problems, but when I try to run the
snapshot agent, I have been receiving problems (different on different
systems :) even though all have exactly the same hardware/software.

I should also mention that I have it running well on 2 of 5.

The issues seem to be either that I have an "incompatible stub"... or just a
generic "suspect agent" error.

Neither error points to a specific file or stored proc.

Any ideas welcome!!!!

Thanks Rob
Raymond Mak [MSFT]
3/14/2005 10:43:25 AM
Hi Rob,

What you described sounds like some sort of DCOM
configuration problem. Either the replication agent COM
servers are not registered properly on some of your
machines or DCOM is disabled\improperly configured on
them. To diagnose the issue further, you may want to try
the following:

1) Try running snapshot.exe manually from the command
line; snapshot.exe can be found at
%ProgramFiles%\Microsoft SQL Server\80\COM. If running
snapshot.exe without arguments displays the usage screen,
chances are good that it is a DCOM activation issue. To further
confirm that this is indeed the case, you can try running
the other replication agents such as distrib.exe,
logread.exe, qrdrsvc.exe, and replmerg.exe from the same
location.
2) Try starting the other replication agents from
enterprise manager and see if they fail with the same
symptoms as the snapshot agent. If so, chances are real
good that you have a general DCOM activation problem.
3) If you think that you have a general DCOM activation
problem, check out the system\application event log and
see if you can find any messages that will point you in
the right direction of what the problem may be.
4) As a way to give you a better feel of what the problem
may be, you should open taskmanager when you try to start
any replication agent from enterprise manager and see if
the corresponding replication agent process
(snapshot.exe,distrib.exe, ...etc.) is started.

If you suspect that the replication agents are not
registered properly, you can manually register them by
doing the following from
%ProgramFiles%\Microsoft SQL Server\80\COM
(you may want to try it anyway):

regsvr32 replagnt.dll
snapshot.exe /RegServer
distrib.exe /RegServer
replmerg.exe /RegServer
logread.exe /RegServer
qrdrsvc.exe /RegServer
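
The registration commands above can be collected into a list for a .cmd file. This is a minimal Python sketch, assuming the agents live in the 80\COM directory named in step 1. The function name `registration_commands` and the default path are illustrative assumptions; nothing is executed, the commands are only printed.

```python
import ntpath  # Windows path rules, regardless of where this script runs

# The DLL and agent executables named in the post above.
AGENT_DLL = "replagnt.dll"
AGENT_EXES = ["snapshot.exe", "distrib.exe", "replmerg.exe",
              "logread.exe", "qrdrsvc.exe"]

# Assumption: default SQL Server 2000 install path; cmd.exe expands
# %ProgramFiles% when the generated lines are run from a .cmd file.
DEFAULT_COM_DIR = r"%ProgramFiles%\Microsoft SQL Server\80\COM"


def registration_commands(com_dir=DEFAULT_COM_DIR):
    """Return the registration command lines with quoted full paths."""
    commands = ['regsvr32 "{}"'.format(ntpath.join(com_dir, AGENT_DLL))]
    for exe in AGENT_EXES:
        commands.append('"{}" /RegServer'.format(ntpath.join(com_dir, exe)))
    return commands


if __name__ == "__main__":
    print("\n".join(registration_commands()))
```

Quoting the full path means the lines work from any current directory, which matters because the install path contains spaces.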

If nothing that I have suggested so far works for you, you
should check out the Microsoft KB for some of the more
arcane DCOM activation issues.

HTH

-Raymond

rob lynch
3/14/2005 11:01:05 AM
Raymond, Thanks, I will try what you suggest.. (it was hitting me as dcom or
a dll issue, but I hate to do a file compare on all the dlls :)


Hilary, no error numbers :) it would be nice, but the errors are too
generic and have no additional information.

I will post back and let you know how it pans out..

Rob

PS After running @@version on 4 of the affected servers, I got the
results below. (SVR005 is the only one that is working :( ? )

No difference in the install process.


Running command on svr005
;
Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 4.0 (Build 1381: Service Pack 6)


Running command on svr001
;
Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 4.0 (Build 1381: Service Pack 6)


Running command on svr010
;
Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 4.0 (Build 1381: Service Pack 6)


Running command on svr083
;
Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 4.0 (Build 1381: Service Pack 6)
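
Comparing @@version output by eye gets tedious across many servers. This is a minimal Python sketch that pulls the build number out of each output and groups servers by it. The function names are illustrative assumptions, and the sample text is the output posted above. Note that @@version reports only the SQL Server engine build and OS level, not the versions of individual system or COM DLLs, so identical output does not rule out a DLL mismatch.

```python
import re

# Matches e.g. "Microsoft SQL Server 2000 - 8.00.760 (Intel X86)"
BUILD = re.compile(r"Microsoft SQL Server\s+\d+\s+-\s+(\d+\.\d+\.\d+)")


def build_of(version_text):
    """Return the build string (e.g. '8.00.760'), or None if not found."""
    match = BUILD.search(version_text)
    return match.group(1) if match else None


def group_by_build(outputs):
    """Map build string -> list of server names reporting that build."""
    groups = {}
    for server, text in outputs.items():
        groups.setdefault(build_of(text), []).append(server)
    return groups


SAMPLE = """Microsoft SQL Server 2000 - 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 4.0 (Build 1381: Service Pack 6)"""

if __name__ == "__main__":
    servers = ("svr005", "svr001", "svr010", "svr083")
    outputs = {name: SAMPLE for name in servers}
    for build, names in group_by_build(outputs).items():
        print(build, names)
```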




rob lynch
3/14/2005 12:17:03 PM
Ok, maybe getting further....
(I registered all exe's and dll's in the COM directory - dug into dcomcnfg
to no real value - really didn't know what to look for there but the settings
seemed to be the same as where things are working)

Now I am not seeing the RPC stub error..

What seems to happen now is that I can start the Snapshot agent, but it
times out (after an interval of 4 min)

I have just cranked the failure time up to 15 min to see if this helps..

I will post back on success or failure.
Any other ideas welcome (I really don't want to do the replication manually,
but I need to institute a replicating subscriber model on 156 servers from 21
other servers in the next week and at this point using sql's replication
engine is being a ^*^*&^* pain.)

rob lynch
3/14/2005 12:47:03 PM
Ok things are getting more interesting....

When I run the Snapshot from the command prompt I get an instant snapshot
(Using the systemadmin sql login - but running from em gives me a stub error
- after a good long timeout)

I am about to go home and bring down the bumps with some good cold beer..
Let you know how things go tomorrow

Rob
MCSE (nt4) MCSD v6 and net MCDBA 7 & 2k
Hilary Cotter
3/14/2005 12:50:09 PM
Can you give me the error message and error number associated with the
incompatible stub message?

The suspect agent problem can be fixed by right-clicking on Replication
Monitor and selecting Refresh Rate and Settings, then setting Inactivity
Threshold to something larger. You MUST monitor your agents to ensure that
they are doing something on the subscriber - as they could be hung.

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com


rob lynch
3/14/2005 1:01:10 PM
Getting Closer (no beer yet!)

EM fails but Command line always works....

Suggestions?

TIA

Raymond Mak [MSFT]
3/14/2005 8:34:37 PM
Rob, did you see anything interesting in the NT event log?
Did the snapshot agent ever pop up in task manager when
you start it from EM anyway?

One cheap workaround that you can use is to run the
replication agent from the CMD subsystem of SQLServerAgent
but that is strictly the last resort.

HTH
Raymond Mak [MSFT]
3/14/2005 8:44:36 PM
Rob, there are a few more things that you can try:

1) If you haven't done this after you re-registered the
dlls, stop and restart SQLServerAgent.
2) What is the service account of SQLServerAgent for the
machines that are giving you trouble?
3) Do you have any kind of firewall running that may
affect DCOM\RPC traffic?

rob lynch
3/15/2005 6:09:05 AM
Ok... (sorry went home had a beer and did some sleeping.. Didn't help the
servers, but I feel better :)

Raymond,

I have tried stopping and restarting the sqlserveragent and that had no
effect, the event log shows nothing special - just the incompatible stub
message (message at the bottom).

I don't think it could be a firewall issue as the problem shows up when you
log on locally (but there is no firewall there anyway.)

The accounts used by SQLServer and SQLServerAgent are both members of the
domain/administrators group.

I am wondering if it could be an NT4 with SQL2K issue, but both are running
the latest sp's, and of course 1 is working fine :)

Do you know where or how sql runs its replication tasks? Is it via a stored
proc call or does it run via some internal dll code?

By the way, thanks for all your help so far. It is nice to have a fresh set
of brains working on this issue.

rob lynch
3/15/2005 6:09:09 AM
Forgot the message :)

SubSystem Message - Job 'SVR083-Northwind-083LN01E-1'
(0xE090DB0AB794D911AC8700E0B8151568), step 2 - Incompatible version of the
rob lynch
3/15/2005 7:09:03 AM
Ok so now it is getting ugly :)

I compared the files in two of the servers' Binn and COM directories and
have found some small differences.. Wondering if this could be the issue..

There are also some files on the one that doesn't work that are missing on
the one that does, but I doubt that will have much of an impact...

At this moment I have tried moving cmdwrap.exe over and also the contents of
the COM dir (and reregistered + stop and start service) still no go...

Rob

(server that works)
(Binn)
04/17/01 11:22p 21,056 cmdwrap.exe
04/27/00 07:25p 4,516 sqlctr.h
04/27/00 07:25p 19,984 sqlctr.ini
08/05/00 08:51p 32,825 sqlboot.dll
12/17/02 05:24p 25,152 opends60.dll
12/17/02 05:25p 33,336 ssmsad70.dll
12/17/02 05:25p 33,336 SSmsVI70.dll
12/17/02 05:25p 33,344 SSmsSH70.dll
12/17/02 05:26p 25,152 xpsqlbot.dll
(Com)
08/06/00 11:17a 91,648 replerrx.dll
08/06/00 01:51a 98,376 repldts.dll
08/06/00 01:50a 28,745 repldsui.dll

(server that doesn't)
(BINN)
04/28/00 02:25a 4,516 sqlctr.h
04/28/00 02:25a 19,984 sqlctr.ini
04/28/00 12:06a 4,576 schema.txt
08/06/00 01:50a 20,545 cmdwrap.exe
08/06/00 01:50a 20,555 ftsetup.exe
08/06/00 01:50a 24,639 opends60.dll
08/06/00 01:51a 24,643 xpsqlbot.dll
08/06/00 01:51a 32,823 ssmsad70.dll
08/06/00 01:51a 32,823 ssmsvi70.dll
08/06/00 01:51a 32,825 sqlboot.dll
08/06/00 01:51a 32,830 ssmssh70.dll
12/17/02 05:25p 197,196 sqlftqry.dll
(COM)
12/17/02 05:24p 33,352 repldsui.dll
12/17/02 05:24p 98,888 repldts.dll
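
A comparison like the one above can be scripted rather than done by eye. This is a minimal Python sketch that parses two directory listings in the format shown and reports files present on only one side or differing in size. The function names and the short sample listings (a few lines taken from the Binn listings above) are illustrative assumptions. It compares sizes only and ignores timestamps, on the assumption that timestamps may shift with time zone settings; file names are lowercased because the listings differ in case (SSmsVI70.dll vs ssmsvi70.dll).

```python
import re

# Matches e.g. "04/17/01 11:22p 21,056 cmdwrap.exe"
LINE = re.compile(r"^(\d\d/\d\d/\d\d)\s+(\d\d:\d\d[ap])\s+([\d,]+)\s+(\S+)$")


def parse_listing(text):
    """Map lowercased file name -> size in bytes for each listing line."""
    files = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            _date, _time, size, name = match.groups()
            files[name.lower()] = int(size.replace(",", ""))
    return files


def compare(working, failing):
    """Return (only on working, only on failing, present on both but size differs)."""
    only_working = sorted(set(working) - set(failing))
    only_failing = sorted(set(failing) - set(working))
    size_differs = sorted(name for name in set(working) & set(failing)
                          if working[name] != failing[name])
    return only_working, only_failing, size_differs


WORKING = """04/17/01 11:22p 21,056 cmdwrap.exe
08/05/00 08:51p 32,825 sqlboot.dll
12/17/02 05:24p 25,152 opends60.dll"""

FAILING = """08/06/00 01:50a 20,545 cmdwrap.exe
08/06/00 01:50a 20,555 ftsetup.exe
08/06/00 01:50a 24,639 opends60.dll
08/06/00 01:51a 32,825 sqlboot.dll"""

if __name__ == "__main__":
    print(compare(parse_listing(WORKING), parse_listing(FAILING)))
```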
rob lynch
3/15/2005 8:47:05 AM
Well that probably falls under the category of nice try :(

I went through all the sql directories and looked for file date differences
and copied them over, then registered what I could, then tried again..

Same result!

Latest thinking...
If the builtin command fails (replication command) but the command shell
version works, then I will change all the replication tasks to use
xp_cmdshell..

YUK...
will post back results....
Raymond Mak [MSFT]
3/15/2005 10:02:36 AM
Rob, a search for the error message on the microsoft web
site yields some very interesting results:

http://search.microsoft.com/search/results.aspx?st=b&na=88&View=en-us&qu=Incompatible+version+of+the+RPC+stub

Most of the links returned mention something along the
line of mismatched OLE\COM dlls installed by various
programs. I would strongly encourage you to check those
out and see if they help.

rob lynch
3/15/2005 12:41:01 PM
Final result / solution...
Chasing down COM dll errors with nonspecific pointers/errors was a hugely
frustrating waste of time. (Never did find it; reinstalled sp3, upgraded to
MDAC 2.8... Bottom line, a bust!)

What worked (and seems quite reliable - but I will continue to watch it)

Step1. Setup Distributor
Step2. Setup Publisher
Step3. Setup Publication
Step4. (critical) - run exec sp_changepublication
@publication='northwind', @property='independent_agent', @value='TRUE'
(Where northwind is the name of your publication)
Step5. Setup your subscribers - don't sync or initialize the publication
--- Below here is where the automatic agent would fail ----
Step6. Create a task for (or execute in QA) the following SQL statement
(replace your servername etc.; all following examples use a publication of
northwind and everything else about it is northwind...)

exec master..xp_cmdshell '"c:\Program Files\Microsoft SQL Server\80\COM\SNAPSHOT.EXE" -Publisher yourPublicationServerName -PublisherDB Northwind -Publication northwind -Distributor yourDistributionServerName -DistributorSecurityMode 1'

Note: -DistributorSecurityMode 1 sets up Integrated Security.
This will successfully create the initial Snapshot..

Step7. Publish your Snapshot (make sure the database exists first on the
subscriber; this process won't create a db, just the tables and data)

exec master..xp_cmdshell '"c:\Program Files\Microsoft SQL Server\80\COM\Distrib.EXE" -SubscriberDB northwind -Subscriber YourSubscriber -Publisher YourPublisher -PublisherDB Northwind -Publication northwind -Distributor YourDistributer -DistributorSecurityMode 1'

Now the data will exist in the remote location (you will need 1 statement
per subscriber - a good thing to script)

Step8. Run the log reader to see if there is any new data to move...
exec master..xp_cmdshell '"c:\Program Files\Microsoft SQL Server\80\COM\LogRead.EXE" -Publisher YourPublisher -PublisherDB Northwind -Distributor YourDistributer -DistributorSecurityMode 1'

At that point you just set up tasks to alternate between steps 8 and 7.
(You can read the return of 8 and, based on transactions existing, execute 7.)
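
Since step 7 needs one statement per subscriber, the statements can be generated rather than typed. This is a minimal Python sketch, assuming the same install path and Distrib.EXE parameters used in the steps above. The function names and the server names PUB01 and DIST01 are hypothetical placeholders. Values containing spaces are not quoted here, so it only suits simple server and database names.

```python
# Assumption: default install path, as used in the statements above.
COM_DIR = r"c:\Program Files\Microsoft SQL Server\80\COM"


def xp_cmdshell(exe, args):
    """Wrap an agent command line in an xp_cmdshell statement."""
    flags = " ".join("-{} {}".format(name, value) for name, value in args)
    command = '"' + COM_DIR + "\\" + exe + '" ' + flags
    # Double any single quotes so the command stays one T-SQL string literal.
    return "exec master..xp_cmdshell '" + command.replace("'", "''") + "'"


def distrib_statements(subscribers, publisher, distributor,
                       database="Northwind", publication="northwind"):
    """Build one Distrib.EXE statement per subscriber (step 7)."""
    return [
        xp_cmdshell("Distrib.EXE", [
            ("SubscriberDB", database),
            ("Subscriber", subscriber),
            ("Publisher", publisher),
            ("PublisherDB", database),
            ("Publication", publication),
            ("Distributor", distributor),
            ("DistributorSecurityMode", 1),  # 1 = integrated security
        ])
        for subscriber in subscribers
    ]


if __name__ == "__main__":
    # Hypothetical publisher/distributor names; subscribers as an example list.
    for statement in distrib_statements(["SVR001", "SVR010"], "PUB01", "DIST01"):
        print(statement)
```

Each statement is emitted on a single line on purpose: a line break inside the quoted string would be passed to the command shell as part of the command.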

Phew!L@O&*^&^$

What a long 4 days!!

Rob

PS gratitude goes to

fmpia@hotmail.com who has a great resource on replication
http://www.geocities.com/priyadarshu/tech/Repl.htm
also
SQL Server 2000 Security - Part 8 - Introduction to Replication Security
By Marcin Policht
http://www.databasejournal.com/features/mssql/article.php/3383221

Brad Syputa - MS
http://www.windowsforumz.com/-incompatible-rpc-stub-ftopict188785.html
who started my thinking into xp_Cmdshell...

And last, but certainly NOT LEAST Raymond Mak.. Thanks for pointing me at
the Agents in the COM directory. This was the Breakthrough that enabled me
to finally put it all together!!!!


rob lynch
3/15/2005 12:45:03 PM
OOps
1 last thing on reread..

step 4 should be run in the context of the database being replicated...

ie use Northwind
go
xp_cmdshell 'C:\Prog.......'

rob lynch
3/16/2005 5:57:04 AM
Raymond Mak...

YOU DA MAN!!

Raymond, I got a good night sleep and then chased a lead you gave me
yesterday. It pointed to a file called mcrepair.exe..

It came from one of the MSDN articles and strangely enough is a "Microsoft
Money" tool. I ran it this morning and Bingo!! Replace oleaut and a couple
more and we were good to go.

Of course now I need to work out what I broke when I replaced those updated
and later files :)

But I don't care. Now I have replication working!!!
