
sql server clustering : Win 2k AE, SQL 2k EE, Clustering after 1 node failed after a hardware death


Philippe Carignan
1/14/2004 12:21:41 PM
Hi,

I have a problem with a SQL 2000 EE cluster.

I had two servers running Win 2k Adv. Ed., and one of them died.
Failover to node 2 worked fine.

I bought some new hardware and installed Win 2k Adv. Ed., SP4,
and MSCS.

I EVICTED the old server from the cluster, using the MSCS
console on node 2.

I installed the new server with the same computer name
and IP addresses as the old, now dead, server.

When I run SQL 2k EE setup, I can't seem to install my
node. All seems well in the Cluster Admin console: I see my
two nodes, and the resources have all been active on node 2
since the failover.

When I start SQL 2k EE setup, it asks for my Virtual
Server, to which I answer with the name of the virtual
server on node 2 (is that what I should be doing?)

My questions:

1- I should see the Quorum and my Data partition from my
Fiber Array on my new server, right? Problem is, I don't
see the partitions. That's a problem, right?

2- Can someone please tell me which choices I need to
make during the SQL 2000 EE installation?

First - Virtual Server as the choice, and the name of my
clustered virtual server for the answer. NEXT

Second - Upgrade, Remove or Add? Or ADVANCED?

Third - Going for ADVANCED at #2, only choice here is
MAINTAINS A VS FOR FAILOVER CLUSTERING

Fourth - It asks for an IP. I have one choice in the bottom box,
which is the IP of my virtual server. Do I need to enter
anything, or just click NEXT?

Fifth - In the Cluster Management window, I see my node 2
as a configured node, and I need to add some more. The new
node, the server I am installing, is "UNAVAILABLE"...
What's wrong? I never got further than this.


Thanks for the help.

P. Carignan
Geoff N. Hiten
1/14/2004 4:16:53 PM
Answers Inline
Did you run the SQL 2000 EE installation CD and remove the dead node from
SQL? That is also a necessary step in addition to evicting the node.

This is where you need to remove the dead node. Only after it is removed
from SQL can you add it back.



--
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com


Philippe Carignan
1/15/2004 3:40:43 PM
Hmmm, I ran the setup on my Node2 (since Node1 failed),
and I don't see anything besides the Node2 in
the "Configured Nodes".

So my old Node1 should be properly evicted from SQL and
from MSCS.

Problem is, I still only see my newly built Node1
as "UNAVAILABLE".

I removed MSCS on Node1, reinstalled, to no avail. I added
my Node1 to my cluster's "Group", still, the new Node 1 is
always "Unavailable" when I run the SQL 2000 EE setup.

The only question I have left, since everything else seems fine:
should I physically see the two shared drives I have on my
Fiber Array in the Windows Explorer of both Node1 and Node2
at the same time? Doesn't the cluster prevent the servers from
actually having the volumes mounted at the same time?

TIA.


P. Carignan


Geoff N. Hiten
1/16/2004 12:20:25 PM
You should see the clustered drives in Disk Manager on both systems. MSCS
will arbitrate ownership and only allow you to access a drive from its
current host node.
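
As a rough illustration of that rule (visible on both nodes, usable only on the owner), here is a toy Python model. It is only a sketch of the behaviour described above, not how MSCS is implemented; the class and method names are made up for the example.

```python
class SharedDisk:
    """Toy model of a clustered disk: visible to every node, usable by one."""

    def __init__(self, name, nodes, owner):
        if owner not in nodes:
            raise ValueError("owner must be one of the cluster nodes")
        self.name = name
        self.nodes = set(nodes)
        self.owner = owner

    def visible_on(self, node):
        # Every cluster member should list the disk in Disk Manager.
        return node in self.nodes

    def can_access(self, node):
        # Only the current owner may actually read or write the volume.
        return node == self.owner

    def move_to(self, node):
        # Stand-in for a failover or a manual move of the owning group.
        if node not in self.nodes:
            raise ValueError(f"{node} is not a cluster member")
        self.owner = node


# Hypothetical example matching the thread: Node2 owns everything after failover.
quorum = SharedDisk("Quorum", nodes=["Node1", "Node2"], owner="Node2")
for node in ("Node1", "Node2"):
    print(node, "visible:", quorum.visible_on(node),
          "accessible:", quorum.can_access(node))
```

If a rebuilt node does not list the shared disks at all, that points to a storage or zoning problem rather than to cluster arbitration.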

--
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com



P. Carignan
1/20/2004 1:54:12 PM
Hmmm, got a bit further in my tests, but still no go for
SQL.

If I remove MSCS on my replacement node (node 1's hardware
died), and I shut down Node 2 (which took over when node 1
died), I see my disks on Node 1.

When I reboot Node 2, it takes back the two shared disks I
got on an FC array.

Now I install MSCS on node 1.

All seems fine.

Trying to run the setup of SQL on node 2 (the up and
running node), I still fail to have my Node 1 as
an "available" server to add to the SQL cluster.

I see my rebuilt node 1 as "unavailable".

Should I have named my new node 1 something other than what
it was named originally?

I did Evict it from MSCS. And the SQL setup doesn't see my
old Node 1's name in the server list that would be
configured for my SQL cluster.

Anything I am missing?


TIA


P. Carignan

Geoff N. Hiten
1/21/2004 10:23:46 AM
When you nuked the original Node1, did you remove the tombstone entry from
Active Directory?


P. Carignan
1/21/2004 1:06:12 PM
Hmmm, I didn't do that, no.

You mean the computer account from AD?

I did rejoin the new Node1, with the same name, and all is
well with AD, I have normal access to shares and resources.


P. Carignan

Geoff N. Hiten
1/21/2004 4:36:02 PM
Inline
Yep. Sometimes AD can be a bit flaky when a member server dies and comes
back under the same name but with a different SID.

Can you create a disk resource and move it back and forth between the nodes?

--
Geoff N. Hiten
Microsoft SQL Server MVP
Senior Database Administrator
Careerbuilder.com



P. Carignan
1/25/2004 9:39:10 PM
Hi,

I can't create a new disk resource, because I do not have
any spare disks on the cluster. I tried moving a
group, but to no avail. I tried moving the Quorum disk,
and I got an error. It seems the dependencies are all tangled
up, and everything depends on everything else. And since I
do not have SQL on the new Node1, I don't want to move
everything and risk breaking something and causing
downtime.

I'll try renaming the Node1 server and joining the
cluster again. Maybe it's an old conflict with the
ex-Node1's name?


P. Carignan



P. Carignan
1/28/2004 1:26:09 PM
Resolution to my problem:

It seems my SQL cluster still remembered the old Node1
that had crashed. I uninstalled MSCS from the new Node1,
renamed the server to "Node3" (instead of reusing the old
Node1's server name), and reinstalled MSCS. After that,
SQL setup let me add the node to the SQL cluster.
All has been fine since then.
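
For anyone landing here later, the sequence that came out of this thread can be written down as a small Python checklist. This is only a summary of what worked in this one case, not an official procedure; the step wording and the helper name next_step are my own.

```python
# Steps collected from this thread, in the order they ended up working.
REBUILD_STEPS = [
    "Evict the dead node from MSCS, working from the surviving node",
    "Run SQL 2000 EE setup on the surviving node and remove the dead node",
    "Build the replacement server under a NEW computer name (here: Node3)",
    "Install MSCS on the replacement server and join it to the cluster",
    "Check that the shared disks show up in Disk Manager on the new node",
    "Run SQL 2000 EE setup on the surviving node and add the new node",
]


def next_step(completed):
    """Return the first step not yet done, or None when all are done."""
    if completed < 0:
        raise ValueError("completed must be zero or more")
    if completed >= len(REBUILD_STEPS):
        return None
    return REBUILD_STEPS[completed]


# Print the checklist as a numbered list.
for number, step in enumerate(REBUILD_STEPS, start=1):
    print(f"{number}. {step}")
```

The new computer name in step 3 is the part that made the difference here; reusing the old name left the node "Unavailable" in SQL setup.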


Thanks for the tips that led me to the solution.


P. Carignan
