
sql server clustering : Cluster resource SQL Server (InstanceName) failed to come offline


BB
10/31/2003 9:46:32 AM
I have a similar situation on a cluster with 8 GB of RAM per server.
2 instances, each configured to use 3.5 GB (/PAE, /3GB, AWE enabled).
There are ~350 databases on each instance.
When I take an instance offline it gives me the same message.

I believe it's related to the Pending Timeout interval set in Cluster
Administrator. My understanding is that it just takes longer for SQL Server
to shut down than the cluster is willing to wait for it... One time I
detached most of the databases and the problem disappeared. It returned
when I re-attached the DBs. Unfortunately, it's a production environment, so
I can't experiment much with it. :-(

B.
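
As a rough illustration of the behaviour BB describes, here is a minimal Python sketch of the timing logic only. It is not a cluster API: the function name, the outcome strings, and the 180-second default are assumptions made for the example, so check the actual Pending Timeout value on your own resource.

```python
# Hypothetical model of the offline/timeout interaction; nothing here talks to a real cluster.
ASSUMED_PENDING_TIMEOUT_S = 180  # assumption for illustration; verify on your resource


def offline_outcome(shutdown_seconds, pending_timeout_seconds=ASSUMED_PENDING_TIMEOUT_S):
    """Return the state the resource ends in under this simplified model.

    If SQL Server finishes shutting down within the pending timeout, the
    resource goes Offline; otherwise the cluster gives up and marks it Failed.
    """
    if shutdown_seconds <= pending_timeout_seconds:
        return "Offline"
    return "Failed"


if __name__ == "__main__":
    # Example shutdown durations in seconds (made-up numbers).
    for seconds in (60, 400):
        print(seconds, offline_outcome(seconds))
```

Under this model, a shutdown that takes longer than the timeout is reported as Failed even though the service does eventually stop, which would match the symptom in this thread.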


JBailey
10/31/2003 11:05:25 AM
I have a 2-node cluster with 3 instances of SQL Server. One specific
instance, when I attempt to take it offline, produces the following error:

In Cluster Administrator it shows the SQL Server resource as Failed.

Event Viewer produces the following:

Source:ClusSvc
Event ID: 1117
Description: Cluster resource SQL Server (InstanceName) failed to come offline

The SQL service for this instance does stop, and the resource also has no
problems coming back online.
This SQL instance is also non-production, and hasn't had much traffic at all.

Any ideas why this is happening?

Thanks,
JBailey

Geoff N. Hiten
10/31/2003 2:33:54 PM
I have a similar problem with both a clustered and a non-clustered system.
Both have 32GB of RAM and most of it is allocated for SQL. The timeout
definitely needs extending for large memory systems.

--
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com





Allan Hirt
11/9/2003 5:09:49 PM
It is not recommended to change the timeout without
consulting PSS first. Things should work without having
to modify it. If you're having problems, contact support.

The delays are due to the nature of AWE/PAE since you are
grabbing the memory and have to allocate or deallocate
it. It's not dynamic. So especially in a failover, you
need to wait for SQL Server to grab the memory.