
sql server replication : distribution agent keeps failing with "general network error". verbose logging doesn't help much.



Combfilter
10/14/2004 11:48:11 AM
Here are the logs from this error. This is an 8 GB
table I am trying to replicate across a WAN. It
usually gets about halfway done, then fails
and starts over. Our connection is very
stable, and so is the subscriber's. I cannot figure
out why I get this.

Table "dpvHstGndItem": 100000 row(s) copied.
Total: 100000
[10/14/2004 11:15:15 AM]DFW1-REPL02.vsql16_dist:
{call sp_MSadd_distribution_history(5, 3, ?, ?,
0, 0, 0.00, 0x00, 1, ?, 0, 0x01, 0x01)}
Table "dpvHstGndItem": 100000 row(s) copied.
Total: 200000
[10/14/2004 11:19:25 AM]DFW1-REPL02.vsql16_dist:
{call sp_MSadd_distribution_history(5, 3, ?, ?,
0, 0, 0.00, 0x00, 1, ?, 0, 0x01, 0x01)}
Table "dpvHstGndItem": 100000 row(s) copied.
Total: 300000
[10/14/2004 11:23:00 AM]DFW1-REPL02.vsql16_dist:
{call sp_MSadd_distribution_history(5, 3, ?, ?,
0, 0, 0.00, 0x00, 1, ?, 0, 0x01, 0x01)}
Table "dpvHstGndItem": 100000 row(s) copied.
Total: 400000
[10/14/2004 11:26:49 AM]DFW1-REPL02.vsql16_dist:
{call sp_MSadd_distribution_history(5, 3, ?, ?,
0, 0, 0.00, 0x00, 1, ?, 0, 0x01, 0x01)}
Agent message code 20037. The process could not
bulk copy into table '"dpvHstGndItem"'.
[10/14/2004 11:28:44 AM]DFW1-REPL02.vsql16_dist:
{call sp_MSadd_distribution_history(5, 5, ?, ?,
0, 0, 0.00, 0x01, 1, ?, 6, 0x01, 0x01)}
Adding alert to msdb..sysreplicationalerts:
ErrorId = 1503,
Transaction Seqno = 008ec4ec00000017000100000001,
Command ID = 6
Message: Replication-Replication Distribution
Subsystem: agent VSQL16\ENTERPRISE02-mam01-
MAINSTREET-5 scheduled for retry. The process
could not bulk copy into table '"dpvHstGndItem"'.
[10/14/2004 11:28:44 AM]DFW1-REPL02.vsql16_dist:
{call sp_MSadd_repl_alert(3, 5, 1503, 14152, ?,
6, N'VSQL16\ENTERPRISE02', N'mam01',
N'MAINSTREET', N'alohadb', ?)}
ErrorId = 1503, SourceTypeId = 5
ErrorCode = '11'
ErrorText = 'General network error. Check your
network documentation.'
[10/14/2004 11:28:44 AM]DFW1-REPL02.vsql16_dist:
{call sp_MSadd_repl_error(1503, 0, 5, ?, N'11',
?)}

Category:SQLSERVER
Source: MAINSTREET
Number: 11
Message: General network error. Check your
network documentation.
[10/14/2004 11:28:44 AM]MAINSTREET.alohadb: exec
dbo.sp_MSupdatelastsyncinfo N'VSQL16
\ENTERPRISE02',N'mam01', N'', 0, 5, N'The process
could not bulk copy into table
''"dpvHstGndItem"''.'
Disconnecting from Subscriber 'MAINSTREET'
Disconnecting from Distributor 'DFW1-REPL02'
Disconnecting from Distributor History 'DFW1-
REPL02'
The agent failed with a 'Retry' status. Try to
run the agent at a later time.
Microsoft SQL Server Distribution Agent 8.00.760
Copyright (c) 2000 Microsoft Corporation
Microsoft SQL Server Replication Agent: VSQL16
\ENTERPRISE02-mam01-MAINSTREET-5

Startup Delay: 4371 (msecs)
Connecting to Distributor 'DFW1-REPL02'
Connecting to Distributor 'DFW1-REPL02.'
[10/14/2004 11:29:49 AM]DFW1-REPL02.: exec
sp_helpdistpublisher N'VSQL16\ENTERPRISE02'
[10/14/2004 11:29:49 AM]DFW1-REPL02.vsql16_dist:
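
One way to get more out of a log like the one above is to pair each "Total: N" line with the timestamp that follows it and see how long each 100,000-row batch takes. Below is a minimal Python sketch, assuming the log layout shown in this post; the function names batch_progress and seconds_per_batch are made up for illustration and are not part of any SQL Server tool.

```python
import re
from datetime import datetime

# Excerpt of the verbose log from the post, batch lines only.
LOG = """\
Table "dpvHstGndItem": 100000 row(s) copied.
Total: 100000
[10/14/2004 11:15:15 AM]DFW1-REPL02.vsql16_dist:
Table "dpvHstGndItem": 100000 row(s) copied.
Total: 200000
[10/14/2004 11:19:25 AM]DFW1-REPL02.vsql16_dist:
Table "dpvHstGndItem": 100000 row(s) copied.
Total: 300000
[10/14/2004 11:23:00 AM]DFW1-REPL02.vsql16_dist:
Table "dpvHstGndItem": 100000 row(s) copied.
Total: 400000
[10/14/2004 11:26:49 AM]DFW1-REPL02.vsql16_dist:
"""

TOTAL_RE = re.compile(r"^Total: (\d+)$")
STAMP_RE = re.compile(r"^\[(\d+/\d+/\d+ \d+:\d+:\d+ [AP]M)\]")


def batch_progress(log_text):
    """Pair each 'Total: N' line with the next timestamped line."""
    points = []
    pending_total = None
    for line in log_text.splitlines():
        total = TOTAL_RE.match(line)
        if total:
            pending_total = int(total.group(1))
            continue
        stamp = STAMP_RE.match(line)
        if stamp and pending_total is not None:
            when = datetime.strptime(stamp.group(1), "%m/%d/%Y %I:%M:%S %p")
            points.append((when, pending_total))
            pending_total = None
    return points


def seconds_per_batch(points):
    """Elapsed seconds between consecutive batch timestamps."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(points, points[1:])]


if __name__ == "__main__":
    print(seconds_per_batch(batch_progress(LOG)))
```

For the excerpt above the gaps come out to 250, 215 and 229 seconds, so roughly four minutes per 100,000 rows over this link. That is only a reading of the timestamps, not a diagnosis of the network error.
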
Paul Ibison
10/15/2004 4:40:25 AM
Try setting -QueryTimeOut to a large value before
synchronizing. There is some disagreement about whether
setting it to zero means wait indefinitely, but I have
used it assuming that's how it works, so that is another
option.
HTH,
Paul Ibison (SQL Server MVP)

Paul Ibison
10/15/2004 8:12:58 AM
3600 is one hour, and I distributed 18GB in less than
that, so it's your choice. In a sense you're trying to
avoid timeouts being an issue, so set it as high as you
want, e.g. a day (or 0 for infinity, although as I say
there is some dispute about this).
HTH,
Paul Ibison (SQL Server MVP)

(recommended sql server 2000 replication book:
http://www.nwsu.com/0974973602p.html)
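
To make the suggestion concrete, here is a small Python sketch that assembles a Distribution Agent argument list with -QueryTimeOut set to the 3600 seconds discussed above. It only builds the list and does not run anything. The helper name distrib_args is hypothetical, and the server and database names are inferred from the log in the original post (publisher VSQL16\ENTERPRISE02, publication database mam01, subscriber MAINSTREET, subscription database alohadb, distributor DFW1-REPL02). In practice the parameter is typically added to the agent's job step or profile, so treat this as an illustration of the parameters rather than the way to launch the agent.

```python
def distrib_args(publisher, publisher_db, subscriber, subscriber_db,
                 distributor, query_timeout=3600, login_timeout=None):
    """Return an argument list for the Distribution Agent (distrib.exe).

    query_timeout and login_timeout are in seconds. The thread notes
    that the meaning of 0 is disputed, so it is passed through as-is.
    """
    if query_timeout < 0:
        raise ValueError("query_timeout must be 0 or a positive number of seconds")
    args = [
        "distrib.exe",
        "-Publisher", publisher,
        "-PublisherDB", publisher_db,
        "-Subscriber", subscriber,
        "-SubscriberDB", subscriber_db,
        "-Distributor", distributor,
        "-QueryTimeOut", str(query_timeout),
    ]
    if login_timeout is not None:
        args += ["-LoginTimeOut", str(login_timeout)]
    return args


if __name__ == "__main__":
    # Names below are taken from the log in the original post (assumed mapping).
    args = distrib_args(r"VSQL16\ENTERPRISE02", "mam01",
                        "MAINSTREET", "alohadb", "DFW1-REPL02",
                        query_timeout=3600)
    print(" ".join(args))
```
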
Combfilter
10/15/2004 9:54:00 AM
In article <14b601c4b2ab$c34f1c70
$a601280a@phx.gbl>, Paul.Ibison@Pygmalion.Com
says...
What would be a good starting value to raise it
to?

Something like 1200?

Thanks Paul.

Combfilter
10/15/2004 12:44:51 PM
In article <042301c4b2c9$747546b0
$a501280a@phx.gbl>, Paul.Ibison@Pygmalion.Com
says...
OK thanks Paul for your advice.

Hilary Cotter
10/16/2004 11:49:44 AM
Setting LoginTimeOut to 0 will cause the connection to time out at the ODBC
connection timeout value, which is 20 s. Setting QueryTimeout to a value of 0
is IMHO not a good idea, as it decreases the visibility of the agent failure.

For example, there is a heartbeat setting (right-click Replication
Monitor and select Refresh Rate and Settings). This setting will mark your
agent as suspect, which tells you that the agent is taking a long time. This
is not always a bad thing; it just means your agent is taking a long time.
Setting this value to something large will prevent this error from occurring
and eventually will fail your agent.

This setting is really independent of the QueryTimeout setting, which is for
long-running queries as opposed to the length of time an agent runs.
Normally the agent will be marked suspect before you get query timeouts, but
not always. Setting QueryTimeout to 0 will decrease the visibility of
agents hanging, especially if you have upped the inactivity level. However,
just as the ODBC connection timeout value overrides the LoginTimeOut
setting, if you set QueryTimeout to 0 you will eventually get a message
(after 5 minutes) saying "The process is running and is waiting for a
response from a backend connection."

So I like to set QueryTimeout to 300 as opposed to 0. Your results may vary.



Combfilter
10/18/2004 11:08:55 AM
In article <O8YjCf5sEHA.2196
@TK2MSFTNGP14.phx.gbl>, hilary.cotter@gmail.com
says...
Thanks all for the advice.
