We are running the cluster on two PowerEdge 6650 servers, each with quad
processors and 8 GB of RAM. They both connect to a Dell PowerVault 220S.
"Geoff N. Hiten" <SRDBA@Careerbuilder.com> wrote in message
news:e7BteLc4DHA.2136@TK2MSFTNGP12.phx.gbl...
> What is your hardware and OS config? I am aware of at least one
> combination that has problems releasing the SCSI reservation when there
> are communication issues.
>
> --
> Geoff N. Hiten
> Microsoft SQL Server MVP
> Senior Database Administrator
> Careerbuilder.com
>
>
>
>
> "Chris Carmichael" <cc@someisp.com> wrote in message
> news:1010tp39i5pp57b@corp.supernews.com...
> > Thanks for the input. I looked at these scenarios and nothing seems to
> > help. In looking at the event logs, it looks as though the second node
> > tried unsuccessfully to take control of the disk array 6 times, then
> > turned off its cluster service. Since the IP address had already failed
> > over, that left the server in a half failover state. This does not
> > happen if we just unplug the network cable though?!?! Any other ideas?
> >
> > Thank You again,
> >
> > Chris
> >
> > "Geoff N. Hiten" <SRDBA@Careerbuilder.com> wrote in message
> > news:%23RB37xS4DHA.1948@TK2MSFTNGP12.phx.gbl...
> > > Do any of the following help?
> > >
> > > http://support.microsoft.com/default.aspx?scid=kb;EN-US;242600
> > >
> > > http://support.microsoft.com/default.aspx?scid=kb;EN-US;176320
> > >
> > > http://support.microsoft.com/default.aspx?scid=kb;en-us;286342&Product=winsvr2003
> > >
> > > You didn't specify the host OS, so I included links for 2000 and 2003.
> > > If you are on NT4, upgrade it now. :)
> > >
> > > My take is that the LooksAlive and IsAlive tests fail, forcing the
> > > failover, but the IP address is still alive on the net, preventing the
> > > Virtual server from coming up on the second node. If this is the case,
> > > you should see duplicate IP address errors in the second node's event
> > > log.
> > >
> > > --
> > > Geoff N. Hiten
> > > Microsoft SQL Server MVP
> > > Senior Database Administrator
> > > Careerbuilder.com
> > >
> > >
> > >
> > >
> > > "Chris Carmichael" <cc@someisp.com> wrote in message
> > > news:1010amhmati54e0@corp.supernews.com...
> > > > Hello All,
> > > >
> > > > We have a SQL cluster in active/passive configuration.
> > > > Unfortunately, our network has been having issues. More
> > > > specifically, the line protocol on the Cisco 3550 that the primary
> > > > node on this cluster is connected to seems to drop off temporarily
> > > > from time to time.
> > > >
> > > > So the problem is that when this happens, failover does not work. It
> > > > hangs up on the first node. When this happens, you can't connect to
> > > > Cluster Administrator to see what is going on. Yet if you physically
> > > > turn off the primary node, then failover will complete. We tried
> > > > testing failover by unplugging the network cable from the primary
> > > > node. When you do this, the failover happens without incident.
> > > >
> > > > Our take on it is that clustering will only work if you have a
> > > > failure on layer 1 of the OSI model?!?! That hardly seems right for
> > > > a system as robust as SQL. Yet a layer 1 failure kicks off the
> > > > failover perfectly. Then if we lose the line protocol on the switch
> > > > (which would be layer 3-4), it doesn't. So it appears as though the
> > > > network connection is up to the server, but you can't communicate
> > > > with it on the public side. Our heartbeat connection is simply a
> > > > crossover cable, so I don't think that is the problem.
> > > >
> > > > We have been pulling out our hair on this for 2 weeks. Does anyone
> > > > here have any suggestions? Or, is there a way to force the cluster
> > > > to fail over EVERYTHING if any one resource dies?
> > > >
> > > > Thank you!
> > > >
> > > > Chris
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>
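
For anyone trying to reproduce the situation discussed in this thread (the
link to the server looks up, but the node cannot be reached on the public
side), here is a minimal Python sketch of a TCP reachability probe. It is
not part of the cluster service and does not come from any of the posts
above. The function names tcp_reachable and should_alert, the threshold of
three failures, and the host name in the usage note are assumptions made
purely for illustration.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds in time.

    A physically connected cable does not guarantee this succeeds, which
    is the gap between a link-level check and a higher-layer check.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def should_alert(results, threshold=3):
    """Return True if the last `threshold` probe results all failed.

    `results` is a list of booleans, oldest first. The threshold is an
    arbitrary illustrative value, not a cluster setting.
    """
    if len(results) < threshold:
        return False
    return not any(results[-threshold:])
```

A possible use would be to call tcp_reachable("sqlvirtual.example.com", 1433)
from a machine on the public network at a fixed interval, append each result
to a list, and raise an alarm when should_alert returns True. The host name
is hypothetical, and 1433 is only the default SQL Server port, so a given
installation may listen elsewhere.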