SQL 2000 Cluster Not Failing over


SQL 2000 Cluster Not Failing over RichardT
8/22/2007 6:00:02 PM
Hi,

We are running clustered SQL 2000 SP4 on Win2003 SP1 with 4 instances
(DB2, DB3 and DB4 on the problem node).
At the time of the events there was a large amount of processing being
undertaken on instance DB03.

On this occasion the server hung but didn't fail over.
It had to be failed over manually.
I am trying to analyse why.


The sequence of events was:

Event Type: Error
Event Source: NETLOGON
Event Category: None
Event ID: 5783
Date: 20/08/2007
Time: 11:22:15 AM
User: N/A
Computer: SRVCLBPR01
Description:
The session setup to the Windows NT or Windows 2000 Domain Controller
\\GSOCDC1.gso.internal for the domain GSO is not responsive. The current RPC
call from Netlogon on \\SRVCLBPR01 to \\GSOCDC1.gso.internal has been
cancelled.

For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.

Event Type: Error
Event Source: MSSQL$DB04
Event Category: (3)
Event ID: 17052
Date: 20/08/2007
Time: 11:22:26 AM
User: N/A
Computer: SRVCLBPR01
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 9c 42 00 40 01 00 00 00 œB.@....
0008: 0b 00 00 00 53 00 52 00 ....S.R.
0010: 56 00 53 00 51 00 4c 00 V.S.Q.L.
0018: 50 00 52 00 30 00 34 00 P.R.0.4.
0020: 00 00 00 00 00 00 ......

These are generated when the cluster service fails to connect to SQL Server
during its check of the resource.
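
As a rough aid for reading the Data: blocks attached to these events, the bytes can be decoded with a short Python script. This is only a sketch: the layout it assumes (three little-endian 32-bit values followed by a UTF-16 string) is inferred from the dump itself and not taken from any documentation, and the name decode_event_data is made up for illustration.

```python
import struct

# Bytes copied from the "Data:" block of the Event ID 17052 entries.
HEX_DUMP = """
9c 42 00 40 01 00 00 00
0b 00 00 00 53 00 52 00
56 00 53 00 51 00 4c 00
50 00 52 00 30 00 34 00
00 00 00 00 00 00
"""


def decode_event_data(hex_dump):
    """Decode an event Data block (hypothetical helper, assumed layout)."""
    # Strip all whitespace so the dump can be pasted as-is.
    raw = bytes.fromhex("".join(hex_dump.split()))

    # ASSUMPTION: the block starts with three little-endian DWORDs.
    code, second, length = struct.unpack_from("<III", raw, 0)

    # ASSUMPTION: the rest is a null-terminated UTF-16LE string.
    text = raw[12:].decode("utf-16-le").split("\x00", 1)[0]

    return {
        "code": code,                # full 32-bit value
        "event_id": code & 0xFFFF,   # low word of the first DWORD
        "second": second,
        "length": length,            # appears to count the terminator too
        "name": text,
    }


if __name__ == "__main__":
    for key, value in decode_event_data(HEX_DUMP).items():
        print(key, "=", value)
```

For the dump shown, the low word of the first value is 17052, which matches the Event ID, and the string decodes to SRVSQLPR04, the same name reported as Computer in the SQLAgent$DB04 event.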


Event Type: Error
Event Source: MSSQL$DB04
Event Category: (3)
Event ID: 17052
Date: 20/08/2007
Time: 11:22:26 AM
User: N/A
Computer: SRVCLBPR01
Description:
[sqsrvres] printODBCError: sqlstate = 01000; native error = 2746; message =
[Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionRead (recv()).


For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 9c 42 00 40 01 00 00 00 œB.@....
0008: 0b 00 00 00 53 00 52 00 ....S.R.
0010: 56 00 53 00 51 00 4c 00 V.S.Q.L.
0018: 50 00 52 00 30 00 34 00 P.R.0.4.
0020: 00 00 00 00 00 00 ......


Event Type: Error
Event Source: MSSQL$DB04
Event Category: (3)
Event ID: 17052
Date: 20/08/2007
Time: 11:22:26 AM
User: N/A
Computer: SRVCLBPR01
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = b; message =
[Microsoft][ODBC SQL Server Driver][DBNETLIB]General network error. Check
your network documentation.


For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 9c 42 00 40 01 00 00 00 œB.@....
0008: 0b 00 00 00 53 00 52 00 ....S.R.
0010: 56 00 53 00 51 00 4c 00 V.S.Q.L.
0018: 50 00 52 00 30 00 34 00 P.R.0.4.
0020: 00 00 00 00 00 00 ......


Event Type: Error
Event Source: MSSQL$DB04
Event Category: (3)
Event ID: 17052
Date: 20/08/2007
Time: 11:22:26 AM
User: N/A
Computer: SRVCLBPR01
Description:
[sqsrvres] OnlineThread: QP is not online.


For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 9c 42 00 40 01 00 00 00 œB.@....
0008: 0b 00 00 00 53 00 52 00 ....S.R.
0010: 56 00 53 00 51 00 4c 00 V.S.Q.L.
0018: 50 00 52 00 30 00 34 00 P.R.0.4.
0020: 00 00 00 00 00 00 ......


Event Type: Error
Event Source: MSSQL$DB04
Event Category: (3)
Event ID: 17052
Date: 20/08/2007
Time: 11:22:26 AM
User: N/A
Computer: SRVCLBPR01
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 9c 42 00 40 01 00 00 00 œB.@....
0008: 0b 00 00 00 53 00 52 00 ....S.R.
0010: 56 00 53 00 51 00 4c 00 V.S.Q.L.
0018: 50 00 52 00 30 00 34 00 P.R.0.4.
0020: 00 00 00 00 00 00 ......

Event Type: Error
Event Source: MSSQL$DB04
Event Category: (3)
Event ID: 17052
Date: 20/08/2007
Time: 11:22:26 AM
User: N/A
Computer: SRVCLBPR01
Description:
[sqsrvres] printODBCError: sqlstate = 08S01; native error = 0; message =
[Microsoft][ODBC SQL Server Driver]Communication link failure


For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 9c 42 00 40 01 00 00 00 œB.@....
0008: 0b 00 00 00 53 00 52 00 ....S.R.
0010: 56 00 53 00 51 00 4c 00 V.S.Q.L.
0018: 50 00 52 00 30 00 34 00 P.R.0.4.
0020: 00 00 00 00 00 00 ......


Event Type: Error
Event Source: MSSQL$DB04
Event Category: (3)
Event ID: 17052
Date: 20/08/2007
Time: 11:22:26 AM
User: N/A
Computer: SRVCLBPR01
Description:
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed


For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 9c 42 00 40 01 00 00 00 œB.@....
0008: 0b 00 00 00 53 00 52 00 ....S.R.
0010: 56 00 53 00 51 00 4c 00 V.S.Q.L.
0018: 50 00 52 00 30 00 34 00 P.R.0.4.
0020: 00 00 00 00 00 00 ......

and in the SYSTEM log:

Event Type: Warning
Event Source: SQLAgent$DB04
Event Category: Job Engine
Event ID: 208
Date: 20/08/2007
Time: 11:32:47 AM
User: N/A
Computer: SRVSQLPR04
Description:
SQL Server Scheduled Job 'Transaction Log Backup Job for DB Maintenance Plan
'DB Maintenance Plan for PAYROLLINCPROD''
(0xE37FBBB9AA61E64CACB457FC4146EDD8) - Status: Failed - Invoked on:
2007-08-20 11:30:42 - Message: The job failed. Unable to determine if the
owner (QSUPER\naira) of job Transaction Log Backup Job for DB Maintenance
Plan 'DB Maintenance Plan for PAYROLLINCPROD' has server access (reason:
Unable to connect to server - check SQL Server and SQL Server Agent
errorlogs).

For more information, see Help and Support Center at
http://go.microsoft.com/fwlink/events.asp.


Any assistance would be appreciated.

RE: SQL 2000 Cluster Not Failing over RichardT
8/22/2007 6:04:01 PM
Additional SQL Server Logs:

DB04

2007-08-20 11:15:03.04 backup Log backed up: Database: PAYROLLINCPROD,
creation date(time): 2007/04/13(19:11:24), first LSN:
36363:2236:1, last LSN: 36363:2290:1, number of dump devices: 1, device
information: (FILE=1, TYPE=DISK: {'P:\SQL Server
Backup\PAYROLLINCPROD\PAYROLLINCPROD_tlog_200708201115.TRN'}).
2007-08-20 11:23:58.79 logon Login failed for user '(null)'. Reason: Not
associated with a trusted SQL Server connection.
2007-08-20 11:27:07.56 logon Login failed for user '(null)'. Reason: Not
associated with a trusted SQL Server connection.


DB03

2007-08-20 10:46:03.00 spid58 Process ID 60 killed by hostname WS16736,
host process ID 3080.
2007-08-20 11:42:38.68 spid58 Process ID 65 killed by hostname WS16736,
host process ID 3080.


DB04

2007-08-20 11:15:00.63 backup Log backed up: Database: RECPROD, creation
date(time): 2007/03/30(19:16:18), first LSN:
16564:12537:1, last LSN: 16564:13949:1, number of dump devices: 1, device
information: (FILE=1, TYPE=DISK: {'J:\SQL Server
Backup\RECPROD\RECPROD_tlog_200708201115.TRN'}).
2007-08-20 11:25:46.69 logon Login failed for user '(null)'. Reason: Not
associated with a trusted SQL Server connection.
2007-08-20 11:26:52.66 logon Login failed for user '(null)'. Reason: Not
RE: SQL 2000 Cluster Not Failing over (JR) Jorge Rivera
8/23/2007 9:10:03 AM
Hi,

Do you see any errors in the cluster log from around the time of the
problem? (The cluster log is in GMT.)
Did the domain controllers log any errors or problems at the same time?
Can you capture the basic OS counters (memory, CPU, disk)? Does this occur
at random times, or always at the same hours and days?
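
Because the cluster log is in GMT while the event log entries show local time, the timestamps need converting before the two can be lined up. A minimal Python sketch follows. The UTC+10 offset is purely an assumption for illustration (the thread does not state the server's time zone), and event_time_to_gmt is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone

# ASSUMPTION: the server's local time is UTC+10. The thread does not say
# which time zone the nodes are in, so replace this with the real offset.
LOCAL_OFFSET = timezone(timedelta(hours=10))


def event_time_to_gmt(date_text, time_text, local_tz=LOCAL_OFFSET):
    """Convert an event log Date/Time pair (local time) to GMT.

    Hypothetical helper: expects the dd/mm/yyyy and 12-hour clock formats
    used in the pasted events, e.g. "20/08/2007" and "11:22:26 AM".
    """
    local = datetime.strptime(
        f"{date_text} {time_text}", "%d/%m/%Y %I:%M:%S %p"
    )
    # Attach the assumed local offset, then shift to UTC (GMT).
    return local.replace(tzinfo=local_tz).astimezone(timezone.utc)


if __name__ == "__main__":
    gmt = event_time_to_gmt("20/08/2007", "11:22:26 AM")
    print(gmt.strftime("%Y-%m-%d %H:%M:%S"), "GMT")
```

With the converted time in hand, search the cluster log for entries a few minutes either side of it, since the resource checks and the event log writes will not land on exactly the same second.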
