Groups | Blog | Home
all groups > sql server clustering > august 2006 >

sql server clustering : N+1 Configuration



DTully
8/30/2006 3:49:02 PM
Well - first TIA!

We have an Active/Passive cluster running W2K3 Enterprise X64 Sp1, 32GB RAM
and SQL Server 2005. We're moving to a N+1 configuration (and then maybe N+M
later), but I'm running into an error.

After I cluster the third server and try adding the failover node (it's the
same failover node for the existing cluster), I get the following error in
the log:

2006-08-30 15:04:30.432 [ERR ] SJCFODBMS016M: This node is a member of a
different cluster: "SJCADDBMS014V.corp.shop.com". (hr=0x8007000d,
{EBC8AEFF-10C3-4D5B-AC17-FC0F4C3871B7},
{0B5C1D35-1CA5-4597-B1A7-DCD4D64F55C6}, 1, 1, 1), (null)
2006-08-30 15:04:30.495 [ERR ] SJCKWDBMS011V: Checking cluster membership...
(hr=0x8007000d, {EBC8AEFF-10C3-4D5B-AC17-FC0F4C3871B7},
{06AB21DD-D8C7-453D-845E-50423BEACA38}, 0, 1, 1), (null)
2006-08-30 15:04:30.557 [ERR ] SJCKWDBMS011V: (null) (hr=0x8007000d,
{FDC75680-7DBB-42CA-8003-E4CDB01FF062},
{EBC8AEFF-10C3-4D5B-AC17-FC0F4C3871B7}, 0, 8, 8), (null)
2006-08-30 15:04:30.604 [INFO] [MT] [CTaskAnalyzeClusterBase] Exiting task.
The task was not cancelled.

In the wizard I get the same error (that the other node is a member of a
different cluster) and the install then fails.

I also tried to install the new cluster directly on the failover server
(cleaned it off the third node first) and it won't even istall and returns
the same error.

This can't be right. How could a server not run two clusters in a N+1
configuation. I'm really aiming for N+M with 6 active and 2 dedicated FO
servers. But I'm stuck at the moment.

Thanks and again - TIA!!
Linchi Shea
8/30/2006 4:08:01 PM
[quoted text, click to view]

If I understand this statement correctly, we have some fundamental
problem/misconception here. A node can't be in two different clusters
simultaneously. N+1 simply means that this particular cluster would have as
many as N+1 nodes, and N of those will be configured to have apps (i.e. SQL
isntances in this cases) running in the normal operating mode, and the other
one node will be a spare, to which the other nodes may failover. So, we
definitely do not have two clusters here.

Linchi

[quoted text, click to view]
DTully
8/30/2006 4:52:02 PM
Thanks Linchi!

I think I see the light now. So one clustered OS instance with multiple
nodes, with multiple SQL installations, etc, etc.

Only problem I see then is each SQL named instance would have to use
different drives letters for the SQL's data files, log files, etc since if
two instances failed over to the same failover server the drives would clash.
Correct? Wow, not sure there are enough drive letters for 6 SQL instances
using 3-4 drive letters each.

OK - thanks for getting my head on straight.
Darin

[quoted text, click to view]
Geoff N. Hiten
8/30/2006 7:58:00 PM
You have it exactly right. All clustered instances are "shared nothing"
where each instance is completely independent of each other or of any
specific host node resource.

SQL 2005 can use mount points to "aggregate" clustered disks under a single
drive letter.

--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP




[quoted text, click to view]

Linchi Shea
8/30/2006 8:29:02 PM
To add to what Geoff said about mount points.

If you use one drive letter per SQL instance and mount all other drives for
that instance under that drive letter, you can technically have 20+ instances
in the cluster. But hopefully, you are not considering that many instances in
a single cluster.

I have tested various scenarios of using mounted volumes in a SQL2005
cluster. There is no major issue with using mounted volumes in SQL 2005. But
note that some of the disk space related utilities are not 'mount point
aware', thus won't give you the total space info for the entire drive tree or
may not be aware that there is a separate drive mounted underneath a drive
letter. So you may have to revise your space monitoring setup.

Linchi



[quoted text, click to view]
DTully
8/31/2006 8:45:02 AM
Wow! Thank you both!

The team brought up the idea of using NTFS junction points (the same
thing?). Is there any good documentation of this? So it does appear that no
two instances of SQL can be using the same drive letter though (even if, as
Linchi suggested, mounting all other drives under that drive letter). We have
consistency between servers that all data is on drive F, etc. But that's a
trade off for the ability to have fewer failover servers per grouping of
active servers.

Thanks!
Darin

[quoted text, click to view]
DTully
8/31/2006 1:02:01 PM
Thanks Jasper. I found that doc and it helps.

So, I'm all straight on using mount points and now only a single root LUN
will need to be presented by each new node.

I do have one additional question. The current cluster is Active/Passive.
I'm adding a new third server that has four LUNS presented to only it so far
-not to the failover server at this moment. I've clustered this third server
without any issues. But looking in the cluster admin tool, it did not add
it's LUNs to the cluster. And if I try to add a new physical disk resource,
none of its LUNs are shown in the drop down box to select from. Might this be
because the failover node cannot see it? Do I have to paused (or shut down)
the other nodes first? Even maybe the nodes that will never help in failover
(hope not as those might be in production later)?

I've added LUNS before but only in an Active/Passive configuration and once
it was partitioned I could add it as a clustered resource with no problems.

Can't say thank you enough for all the help.
Darin

[quoted text, click to view]
Geoff N. Hiten
8/31/2006 4:24:10 PM
Any new LUNs that you want to make into clustered resources must be visible
cluster-wide.

--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP


[quoted text, click to view]
Linchi Shea
8/31/2006 6:35:01 PM
There is a dependency issue that is worth pointing out. While you need to
make a mounted volume dependent on its root disk drive, there is no
requirement to have any dependency between the SQL Server resource and a
mounted volume disk resource that you'll place database files on. This is
different from using a drive letter, in which case the SQL Server resource
must depend on the disk resource.

What this means is that you can have a mounted volume that is offline with
some database files on it, and the SQL Server instance can still come online
unless you specify the dependency yourself.

The feedback I got from MS is that they may consider enforcing that
dependency in a future release.

Linchi

[quoted text, click to view]
Jasper Smith
8/31/2006 7:17:45 PM
Every clustered instance will require it's own drive letter but off that you
can hang a bunch of mount points. We tend to have

Mounts\SQLData
Mounts\SQLLogs
Mounts\SQLDumps
Mounts\SQLTemp

We have tested upto 16 instances on a 4 node cluster all using a single
drive letter and then mount points as above for each and not had any
problems

How to configure volume mount points on a Microsoft Cluster Server
http://support.microsoft.com/kb/280297/

--
HTH,
Jasper Smith (SQL Server MVP)
http://www.sqldbatips.com


[quoted text, click to view]

Jasper Smith
9/1/2006 12:00:00 AM
Good point. In some initial testing we forgot to do that and found that SQL
Server would try and start before the Tempdb mount point came fully online.
You get a bunch of IO errors in the errorlog during startup. Interestingly
SQL actually seemed to cope pretty well with it and retried it's operations
and came up normally once the mount point was fully online without any
manual intervention. This is obviously not ideal though so we enforce the
dependencies as part of our install scripts.

--
HTH,
Jasper Smith (SQL Server MVP)
http://www.sqldbatips.com


[quoted text, click to view]
Linchi Shea
9/1/2006 5:13:02 AM
Good point on the impact on tempdb, Jasper!

[quoted text, click to view]

I wonder if that's the result of the built-in read retries or something
special to the startup process. If it's the built-in read retries, it's
conceivable that if the tempdb volume comes up very late, the startup may not
succeed every time. Did it always succeed in the cases you have seen?

Linchi

[quoted text, click to view]
Jasper Smith
9/1/2006 6:16:38 PM
We never had a SQL instance fail to come up and we must have done probably
100 failovers across the 16 instances before we noticed it :-)

--
HTH,
Jasper Smith (SQL Server MVP)
http://www.sqldbatips.com


[quoted text, click to view]
AddThis Social Bookmark Button