Be careful adjusting those numbers. Setting them too high can cause just as many
problems as setting them too low. Given the frequency of the network outage and the fact
anything. The cluster failover is designed to reduce the typical 30-45
minute human response time for a down server. You shouldn't expect the
clustering software to deal with anything beyond that scope. Adjusting the
else. Just document a cluster check as part of your network failure
Geoff N. Hiten
"Dan" <Dan@discussions.microsoft.com> wrote in message
news:34AB2749-41BF-4A68-8E55-04AEAAE75C38@microsoft.com...
> Geoff,
>
> Thanks for the post! I guess I'll just have to make sure the retry &
> timeout are set high.
>
> "Geoff N. Hiten" wrote:
>
> > Nope, that is pretty much expected behavior. The cluster manager will try
> > to restart the resources on each possible node until the retry count is
> > exhausted. Unfortunately, until the network resource is restored, no node
> > has the ability to run the SQL group. With the physical network port
> > offline, the IP address(es) will not come online. Nothing dependent on them
> > will come online, including the Network Name and the SQL Server. If the
> > network comes back before the retry timeout and count are exhausted, the
> > cluster will bring the system online. Otherwise it stays down.
> >
> > --
> > Geoff N. Hiten
> > Microsoft SQL Server MVP
> > Senior Database Administrator
> > Careerbuilder.com
> >
> > I support the Professional Association for SQL Server
> >
> > www.sqlpass.org
> >
> > "Dan" <Dan@discussions.microsoft.com> wrote in message
> > news:74592016-0B61-4834-8C28-1AD1B864B688@microsoft.com...
> > > All,
> > >
> > > We have just rebuilt a SQL 7.0/NT cluster with Windows 2003/SQL2000 in an
> > > active/passive configuration using 2 nodes. During the course of testing it
> > > we had a general network failure in which the network was unavailable. The
> > > virtual SQL and Windows IP address resources went down and did not come up
> > > automatically once the network was available again. The nodes are configured
> > > for automatic failback.
> > > I can't imagine that in the 2 1/2 years the original cluster was running
> > > that we never once had the network go down, but I do know that during that
> > > time I never had an outage where I had to manually move the cluster group
> > > (which causes the cluster to re-initialize both resources and brings
> > > everything back to normal).
> > >
> > > I'm thinking that maybe I'm missing a dependency somewhere or something's
> > > changed between NT and 2003 that I'm not accounting for. Anyone seen this or
> > > have any tips? Thanks in advance!
> > >
> > > -Dan
> >
> >
> >
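
For anyone reading this thread later, the behavior described above (restart
attempts continue until the retry count is exhausted, and the group only comes
online if the network returns first) can be sketched as a toy simulation. This
is not the cluster service's actual algorithm. The function name, its
parameters, and the fixed interval between attempts are all assumptions made
for illustration.

```python
def simulate_restarts(retry_limit, retry_interval_s, network_restored_at_s):
    """Toy model of restart attempts during a network outage.

    All names and the fixed-interval timing are hypothetical; the real
    cluster service has its own restart policy settings.
    Returns a (final_state, attempts_used) tuple.
    """
    for attempt in range(1, retry_limit + 1):
        # Assume one restart attempt per interval, starting after the outage.
        now_s = attempt * retry_interval_s
        if now_s >= network_restored_at_s:
            # Network is back, so the IP address and everything that
            # depends on it can come online on this attempt.
            return "online", attempt
    # Retry count exhausted while the network was still down.
    return "failed", retry_limit


if __name__ == "__main__":
    # A 10-minute outage with only 3 retries, one minute apart.
    print(simulate_restarts(3, 60, 600))   # ('failed', 3)
    # Same outage with 20 retries allowed.
    print(simulate_restarts(20, 60, 600))  # ('online', 10)
```

In this model a larger retry count lets the group ride out a longer outage,
but it also means a resource that is broken for some other reason keeps
retrying for longer before it is marked failed, which is one way that setting
the numbers too high can hurt.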