Hi
I have SQL Server 2000 SP3 on Windows 2000 SP4 in an active/passive
cluster. Recently the cluster has either been failing over, or
attempting to fail over and then hanging in an inaccessible state. The
following errors appear in the Windows 2000 Application log...
[sqsrvres] CheckQueryProcessorAlive: sqlexecdirect failed
[sqsrvres] printODBCError: sqlstate = HYT00; native error = 0; message = [Microsoft][ODBC SQL Server Driver]Timeout expired
[sqsrvres] OnlineThread: QP is not online.
The above are repeated for a while and then...
[sqsrvres] ODBC sqldriverconnect failed
[sqsrvres] checkODBCConnectError: sqlstate = 08001; native error = b; message = [Microsoft][ODBC SQL Server Driver][DBNETLIB]General network error. Check your network documentation.
The above are repeated for a while and then various other errors
including...
[sqsrvres] CheckServiceAlive: Service is dead
In the SQL Server error log I am also seeing the following repeated
many times in the run-up to the shutdown/failover...
2004-02-22 11:53:58.56 spid73 WARNING: EC 1aefd588, 0 waited 300 sec. on latch 1a1426c0. Not a BUF latch.
2004-02-22 11:53:58.56 spid73 Waiting for type 0x3, current count 0xa, current owning EC 0x1A197588.
The SPIDs and ECs differ from one occurrence to the next. The errors
occur at different times of the day, and I have not noticed any
scheduled jobs or maintenance that they have in common.
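
For anyone wanting to check for a time-of-day pattern more
systematically, here is a rough sketch that tallies the latch warnings
by hour and by latch address. It assumes each warning sits on a single
line in the error log, in the format quoted above; the function name
tally_latch_warnings and the sample data are made up for illustration,
and the pattern may need adjusting for other log layouts.

```python
import re
from collections import Counter

# Pattern for the first line of a latch-timeout warning, based only on
# the sample quoted in this post (an assumption, not an official format).
LATCH_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2}\.\d{2}\s+(spid\d+)\s+"
    r"WARNING: EC (\w+), \d+ waited (\d+) sec\. on latch (\w+)\."
)


def tally_latch_warnings(lines):
    """Count latch warnings per hour of day and per latch address."""
    by_hour = Counter()
    by_latch = Counter()
    for line in lines:
        m = LATCH_RE.match(line.strip())
        if m:
            _date, hour, _spid, _ec, _secs, latch = m.groups()
            by_hour[hour] += 1
            by_latch[latch] += 1
    return by_hour, by_latch


# Sample input: the two log lines from this post, unwrapped.
sample = [
    "2004-02-22 11:53:58.56 spid73 WARNING: EC 1aefd588, 0 waited 300 "
    "sec. on latch 1a1426c0. Not a BUF latch.",
    "2004-02-22 11:53:58.56 spid73 Waiting for type 0x3, current count "
    "0xa, current owning EC 0x1A197588.",
]

by_hour, by_latch = tally_latch_warnings(sample)
print(dict(by_hour))
print(dict(by_latch))
```

Only the "WARNING: EC ..." line is counted; the "Waiting for type ..."
line that follows it is skipped so each timeout is tallied once.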
Having searched, I have found many posts from people with similar
problems but have not managed to find a solution. I would really
appreciate any advice anyone can offer because this has been going on
for a while and it's really starting to bug me.
Thanks