
sql server clustering : Good cluster gone bad


JinHe
2/3/2004 12:11:11 PM

1. Do you mean that no group can fail over to the passive node, or just the group with SQL Server?
You could try moving a group that contains a disk resource to the passive node and see what happens.
If it cannot start on the passive node, the problem is probably not related to SQL Server but lies in the cluster or the cluster configuration.
Also, take the mssqlserver/agent/fulltext resources offline, then try to move the group to the passive node and see what happens.


2. Stop the cluster service on the good node.
What happens to the cluster? Can it start up on the passive node?
You could also leave the good node powered down and restart the passive node to see whether the cluster service starts up fine with all the resources.

3. Check the System/Application event logs during all the previous steps for any apparent problems.
If needed, check %clusterlog% on both nodes to see why clussvc cannot start up or why the SQL resource cannot start up on the passive node.
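
For the cluster log check in step 3, a small filter can save some scrolling. This is only a rough sketch in Python: it assumes each log line looks like "<pid>.<tid>::<timestamp> <LEVEL> <message>" with a level of ERR, WARN or INFO, so adjust the pattern if your log is laid out differently. The names find_problems and scan_file are made up for this example.

```python
import re

# Assumed line shape (adjust to your log):
#   <hex pid>.<hex tid>::<timestamp> <LEVEL> <message>
LINE_RE = re.compile(
    r"^[0-9a-fA-F]+\.[0-9a-fA-F]+::(?P<ts>\S+)\s+"
    r"(?P<level>ERR|WARN|INFO)\s+(?P<msg>.*)$"
)


def find_problems(lines, levels=("ERR", "WARN")):
    """Return (timestamp, level, message) tuples for lines at the given levels."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\r\n"))
        if match and match.group("level") in levels:
            hits.append((match.group("ts"), match.group("level"), match.group("msg")))
    return hits


def scan_file(path, levels=("ERR", "WARN")):
    """Read a cluster log from disk and return the matching entries."""
    # errors="replace" keeps the scan going if the file has odd bytes in it.
    with open(path, errors="replace") as handle:
        return find_problems(handle, levels)
```

Run scan_file on a copy of the log from each node and compare the entries around the time of the failed move. Lines that do not fit the assumed shape are skipped, so an empty result does not prove the log is clean.
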
Jim Rackley
2/3/2004 12:48:26 PM
First, thanks in advance for any help you can provide!

SITUATION:
Two Win 2003 Enterprise Servers, fully updated
Active/Passive SQL Server 2000 Cluster
Shared RAID device
System has been running without error for 4 months

PROBLEM:
I can no longer fail over to my passive server, nor can I move my resource
group to the passive server (amounts to the same thing).

SYMPTOMS:
I can see the server through Cluster Admin and everything seems to be
running fine
All services are running properly
I did receive an RPC error yesterday, but after a reboot it did not return
(therefore I cannot say exactly what the error was)

CORRECTIVE PROCEDURES ATTEMPTED:
All servers have been rebooted, including the shared RAID device
Searched log files, but nothing was forthcoming
Opened Cluster Admin on passive server and forced failure of active server.
Active server failed and cluster became unreachable. Once active server
rebooted, I had to force Cluster Service to start from Cluster Admin
Banged head repeatedly against rack in hopes that causing myself pain would
miraculously fix problem ;o)


Any ideas?

Jim

Jim Rackley
2/4/2004 8:51:16 AM
FYI,

It turned out to be a hardware issue and not a cluster problem. One of the
SCSI controllers in the RAID was partially failing and not allowing my
passive server to access the drive with my SQL Server files on it.

Thanks again for the response.

Jim
