Groups | Blog | Home
all groups > sql server replication > august 2006 >

sql server replication : Snapshot failed during merge replication


Kath
8/16/2006 9:16:41 AM
Hi,

I have merge replication set up with two databases - one is the publisher
and is setup on a server with broadband connection, the other is a subscriber
on a Tablet machine (which uses a 3G/GPRS card and has an intermittent
connection, which is why we chose merge rep). The publisher is the
distributor. We are using SQL Server 2000.

The snaphot agent on the publisher is scheduled to create a snapshot every
week, on Saturday's at 22:50. On Sat 12th, it did successfully create the
snapshot.

The Tablet was connected temporarily to the publisher over the next couple
of days, but didn't attempt to apply the snaphot until 15th (Tues). The
Tablet got as far as dropping and recreating the tables, and bulk copied data
into two system tables before failing. My client then realised he couldn't
log into our software.

The next day, the Tablet was taken into the client's office and they noticed
that one of our database tables was empty, which was why he couldn't log on
to our SW (and presumably, all the tables were empty, as the snapshot never
copied the data). The client manually started synchronisation and the log
shows the snapshot was applied again, and this time appears to have completed
successfully.

However, some data has been lost during this process (although I don't know
exactly when this lost data was applied, it would have been some time before
on on the morning of the 15th). Edits to the data that would have been
applied on the subscriber mahcine appear to have been overwritten by the data
existing on the publisher machine - I presume this is down to the timing of
the snapshot being created, data being edited, and then the snapshot being
applied?

My questions are:

1. What are the main reasons for the snapshot agent to run? What are the
advantages of it? Presumably it isn't a good idea to disable it? If you have
a lot of data in the database (one table has 1 million records), is it better
to run the snapshot less often than one week?

2. Does the snapshot agent work by creating the snapshot and then the
subscriber applies that snapshot at the next available time? It appears that
the Tablet may have synched with the publisher between 12th (snapshot
creation) and 15th (snapshot applied), but didn't attempt to apply the
snapshot until the 15th. This seems a bad idea, as any data changes to the
subscriber would be lost??? Any changes made, at either end, must be stored
and then applied after the snapshot??

3. Why would the snapshot have failed? The log doesn't seem to give me any
hints, it just says it failed. No talk of network error (which would suggest
the connection dropped).

The verbose log from the Tablet at the time of the snapshot failure is:

[...Log shows tables dropped and recreated...]

Bulk copying data into table 'MSmerge_contents'
select * from "dbo"."MSmerge_contents" where 1 = 2
[8/15/2006 11:23:07 AM]RNA004.GlimpseSub3: select * from
"dbo"."MSmerge_contents" where 1 = 2

Percent Complete: 19
Bulk copied data into table 'MSmerge_contents' (19326 rows)
select * from "dbo"."MSmerge_tombstone" where 1 = 2
[8/15/2006 11:39:13 AM]RNA004.GlimpseSub3: select * from
"dbo"."MSmerge_tombstone" where 1 = 2

The process could not deliver the snapshot to the Subscriber.
[8/15/2006 11:39:15 AM]RNA004.GlimpseSub3: {call
sp_reinitmergepullsubscription (N'RNASERVER', N'GlimpsePub2', N'GlimpsePub2',
'FALSE')}
Reinitialized pull subscription to publication 'GlimpsePub2' from publisher
'RNASERVER:GlimpsePub2'Percent Complete: 0
The subscription has been marked for reinitialization. Reinitialization will
occur the next time you synchronize this subscription.
Percent Complete: 0
The process could not deliver the snapshot to the Subscriber.
Percent Complete: 0
Category:NULL
Source: Merge Replication Provider
Number: -2147201001
Message: The process could not deliver the snapshot to the Subscriber.
Percent Complete: 0
The process could not bulk copy into table '"dbo"."MSmerge_tombstone"'.
Percent Complete: 0
Category:AGENT
Source: RNA004
Number: 20037
Message: The process could not bulk copy into table
'"dbo"."MSmerge_tombstone"'.
Percent Complete: 0
Unable to open BCP host data-file
Percent Complete: 0
Category:ODBC
Source: RNA004
Number: 0
Message: Unable to open BCP host data-file
Disconnecting from Subscriber 'RNA004'
Disconnecting from Subscriber 'RNA004'
Disconnecting from Publisher 'RNASERVER'
Disconnecting from Publisher 'RNASERVER'
Disconnecting from Distributor 'RNASERVER'
Microsoft SQL Server Merge Agent 8.00.2039
Copyright (c) 2000 Microsoft Corporation
Microsoft SQL Server Replication Agent: subjob

4. When trying to find out more about why the snapshot application failed,
what agent/job/etc is it on the tablet (subscriber) that applies the
snapshot? The agent which creates the snapshot runs on the publisher, and I
can see the history etc. If I could see what resource attempted to apply the
snapshot, maybe there would be info there about why it failed?

5. As an added complexity, the battery on the motherboard of the Tablet has
failed and so the time has to be reset every time the Tablet is turned on. I
can see in the log that the DT is reporting incorrectly for a while before
being corrected. Could this datetime error possible cause or contribute to a
problem in SQL Server? The time was correct at the time of the snapshot being
applied, but I thought it may be worth mentioning.

Any ideas would be greatly appreciated!!

Thanks,
Paul Ibison
8/16/2006 5:41:20 PM
Kath,
in transactional and merge replication, after the initialization I disable
the snapshot agents. In those cases where you have anonymous subscribers
coming online and you don't control the schedule, I can understand the need
for up to date snapshots, but most of us work with named subscribers and
running the snapshot agents is usually redundant after the initialization.
You can always run a manual snapshot if reinitialization is required.
My next question is why are you reinitializing? Is this being done manually
or is the subscription expiration period being exceeded. What is this value?
Perhaps it needs to be increased?
Normally when reinitializing merge, the option of uploading changes made on
the subscriber is presented, which will avoid the issue you're seeing with
loss of data.
Full logging of the merge agent should present more data about why the
process was truncated (http://support.microsoft.com/?id=312292).
Cheers,
Paul Ibison SQL Server MVP, www.replicationanswers.com

Kath
8/17/2006 1:44:02 AM

Hi Paul,

Thanks for your reply.

From what you say, it's ok for me to disable the snapshot agent. That seems
the best solution for us, as we only have one named subscription, and I am
concerned about the data transfer loads of having to apply a snapshot every
week.

To answer your question of why am I reintializing, I am not! The is just
something SQL Server did, set the subscription as neefing reinitialization. I
thought it must have done this because there was an error applying the
snapshot. The subscription expiration period is set to 'never expire', as
even if the subscription is offline for a week or more we don't want it to
expire.

I do have the verbose logging for the merge agent set to 2 (that's where I
got the log from that I posted a snippet of). Unfortunately it didn't give a
reason fo rthe snapshot failure, just reported that it had failed. I suppose
the log does say 'Unable to open BCP host data-file', this could be the
reason, although I don't know what caused it to not be able to find the data
folder.

So, I think the solution to the problem I had is to disable the snapshot
agent.

Thanks for your help,
Kath

[quoted text, click to view]
Paul Ibison
8/17/2006 9:37:46 PM
Hi Kath,
I am not too sure about your retention period. Please look at
sp_helpmergepublication and check what the value is - it might need
increasing to avoid forcing reinitialization.
Cheers,
Paul Ibison SQL Server MVP, www.replicationanswers.com



Kath
8/18/2006 1:24:02 AM
Hi Paul,

sp_helpmergereplication shows a retention period of 0 - I believe this
corresponds with my publisher setting that 'subscriptions never expire', and
therefore SQL Server shouldn't ever force reinitialization on my subscription.

When I looked into this, an msdn article said it's strongly recommended we
don't use this setting (i.e. never expire), as metadata is never cleaned up,
so perhaps I should change it to something like 30 days? (i.e. a length of
time I don't think a subscription will ever be off-line).

Thanks,
Kath

[quoted text, click to view]
changliw NO[at]SPAM online.microsoft.com
8/22/2006 12:00:00 AM
Hi,

Most publications in production have a retention period of less than 30
days. A period of 10-14 days is recommended.

For more information, you can refer to:
Merge Replication Performance Tuning and Optimization
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/mergperf.mspx

How Merge Replication Works
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/replsql/rep
ltypes_30z7.asp

Sincerely yours,
Charles Wang
Microsoft Online Community Support


AddThis Social Bookmark Button