Groups | Blog | Home
all groups > sql server clustering > december 2005 >

sql server clustering : Cluster will not fail over.


Admiral
12/20/2005 2:34:47 PM
We had an error over the weekend of mass porportions(Sunday 3pm PST). =
Long story short; the model database was detached and the SQL Server was =
stopped, with it still detached. This happened to happen on our primary =
Production Database Clustered Server which is the bread-n-butter of the =
compay. (OUCH!)

It was time for some fast actions. We started the re-install SQL =
Server. In order to do so, the previous install had to be uninstalled. =
This seemed to go smoothly enough, but when re-applying the SP3a, we =
encountered an error. After researching the error, apparently in a =
clustered environment this will occur since the SP3a files still reside =
on the node(s). Microsoft states that if within a particular log file =
it results with an 'Installation was Successful', to disregard the =
error. I double checked the log file and sure enough the error was =
disregarded.=20

We moved along with the installation. We were able to restore all the =
user databases and all system databases with the exception of the master =
database. Unfortunately, even with starting SQL Server in single-user =
mode, the restore of the master database would not take. So it was not =
restored, but all other databases were. Fortunately, I ran a quick =
script to recover all the user logins previous to the disaster, which I =
reapplied to the new installation of SQL Server. Everything came back =
up and the QA Team successfully tested the production Application =
(Monday 4am PST). (Fhweeh)

After the succesful testing of the production environment, we tested the =
fail-over which resulted in SQL Server not starting on the secondary =
node. All the resources came right up on it, but not SQL Server. The =
only error that was that it was not able to locate the file on =
'O\logs\mastlog.ldf'. This error did not make sense since SQL Server =
uses the same file for the primary node. We were pressed for time since =
it was closing to start of business East Coast time, so we left the =
server as is. =20

Throughout the day there were other issues that arose, one in particular =
was certain systems were not able to connect to the server via TCP/IP. =
In order to have them connect they needed to create an alias of the =
server and use Name Pipes. This seems to be a rising concern because =
there are users who need to connect via ODBC to a widely used particular =
Access Application, which seems to only like the TCP/IP route. I am =
somewhat sure this is related to the cluster failure.

Anyway, this is the first time I've had to take a breathe to revisit the =
problem at hand. We have been dealing with another server that crashed =
on the same day, resulting in a brand new build of a SQL Server Cluster =
environment (completely non related to the issue at hand).

I'm sorry for the long winded story. Would you have any idea as to why =
the cluster would fail on failover along with the TCP/IP issue?

Admiral
12/20/2005 2:41:14 PM
95% of the systems that were not able to connect to the server via TCP/IP
were Windows 2003 Server systems.


[quoted text, click to view]
We had an error over the weekend of mass porportions(Sunday 3pm PST). Long
story short; the model database was detached and the SQL Server was stopped,
with it still detached. This happened to happen on our primary Production
Database Clustered Server which is the bread-n-butter of the compay.
(OUCH!)

It was time for some fast actions. We started the re-install SQL Server.
In order to do so, the previous install had to be uninstalled. This seemed
to go smoothly enough, but when re-applying the SP3a, we encountered an
error. After researching the error, apparently in a clustered environment
this will occur since the SP3a files still reside on the node(s).
Microsoft states that if within a particular log file it results with an
'Installation was Successful', to disregard the error. I double checked the
log file and sure enough the error was disregarded.

We moved along with the installation. We were able to restore all the user
databases and all system databases with the exception of the master
database. Unfortunately, even with starting SQL Server in single-user mode,
the restore of the master database would not take. So it was not restored,
but all other databases were. Fortunately, I ran a quick script to recover
all the user logins previous to the disaster, which I reapplied to the new
installation of SQL Server. Everything came back up and the QA Team
successfully tested the production Application (Monday 4am PST). (Fhweeh)

After the succesful testing of the production environment, we tested the
fail-over which resulted in SQL Server not starting on the secondary node.
All the resources came right up on it, but not SQL Server. The only error
that was that it was not able to locate the file on 'O\logs\mastlog.ldf'.
This error did not make sense since SQL Server uses the same file for the
primary node. We were pressed for time since it was closing to start of
business East Coast time, so we left the server as is.

Throughout the day there were other issues that arose, one in particular was
certain systems were not able to connect to the server via TCP/IP. In order
to have them connect they needed to create an alias of the server and use
Name Pipes. This seems to be a rising concern because there are users who
need to connect via ODBC to a widely used particular Access Application,
which seems to only like the TCP/IP route. I am somewhat sure this is
related to the cluster failure.

Anyway, this is the first time I've had to take a breathe to revisit the
problem at hand. We have been dealing with another server that crashed on
the same day, resulting in a brand new build of a SQL Server Cluster
environment (completely non related to the issue at hand).

I'm sorry for the long winded story. Would you have any idea as to why the
cluster would fail on failover along with the TCP/IP issue?

Thanks in Advanced..




Admiral
12/20/2005 2:52:20 PM
Continued research..

I failed to connect thru isql + IP connection via command line, which leads
me to believe that TCP/IP is not correct.

Thanks again and please bare with me, I haven't slept since Saturday night.



[quoted text, click to view]

Geoff N. Hiten
12/20/2005 5:54:55 PM
First, what you should have done with a blown model DB:

Start SQL Server in single user mode with trace flag -T3608. This stops =
SQL from recovering anything except the master database. Reattach =
Model. If necessary, use files copied from another installation at the =
exact same SP and Hotfix level. Stop SQL Server and restart normally. =
Sorry, but it really is that simple. Oh, and lock whoever detached =
"model" out of the system. HE is too dangerous to allow near your =
system.

You didn't mention whether you blew the cluster away or not or just =
rebuilt SQL. If you blew the cluster away, make sure that each disk =
resource has the same drive letter on all nodes and the disk resources =
fail over correctly from node to node. Stop the resource group, move =
it, and start each disk resource independently to test.

The Named Pipes only issue sounds like an incomplete SP3a install. =
Windows 2003 will prevent TCP/IP access if it detects a pre-SP3a SQL =
installation. Follow this article and re-apply SP3a.

http://support.microsoft.com/default.aspx?scid=3Dkb;en-us;815431


--=20
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP




[quoted text, click to view]
We had an error over the weekend of mass porportions(Sunday 3pm PST). =
Long story short; the model database was detached and the SQL Server was =
stopped, with it still detached. This happened to happen on our primary =
Production Database Clustered Server which is the bread-n-butter of the =
compay. (OUCH!)

It was time for some fast actions. We started the re-install SQL =
Server. In order to do so, the previous install had to be uninstalled. =
This seemed to go smoothly enough, but when re-applying the SP3a, we =
encountered an error. After researching the error, apparently in a =
clustered environment this will occur since the SP3a files still reside =
on the node(s). Microsoft states that if within a particular log file =
it results with an 'Installation was Successful', to disregard the =
error. I double checked the log file and sure enough the error was =
disregarded.=20

We moved along with the installation. We were able to restore all the =
user databases and all system databases with the exception of the master =
database. Unfortunately, even with starting SQL Server in single-user =
mode, the restore of the master database would not take. So it was not =
restored, but all other databases were. Fortunately, I ran a quick =
script to recover all the user logins previous to the disaster, which I =
reapplied to the new installation of SQL Server. Everything came back =
up and the QA Team successfully tested the production Application =
(Monday 4am PST). (Fhweeh)

After the succesful testing of the production environment, we tested =
the fail-over which resulted in SQL Server not starting on the secondary =
node. All the resources came right up on it, but not SQL Server. The =
only error that was that it was not able to locate the file on =
'O\logs\mastlog.ldf'. This error did not make sense since SQL Server =
uses the same file for the primary node. We were pressed for time since =
it was closing to start of business East Coast time, so we left the =
server as is. =20

Throughout the day there were other issues that arose, one in =
particular was certain systems were not able to connect to the server =
via TCP/IP. In order to have them connect they needed to create an =
alias of the server and use Name Pipes. This seems to be a rising =
concern because there are users who need to connect via ODBC to a widely =
used particular Access Application, which seems to only like the TCP/IP =
route. I am somewhat sure this is related to the cluster failure.

Anyway, this is the first time I've had to take a breathe to revisit =
the problem at hand. We have been dealing with another server that =
crashed on the same day, resulting in a brand new build of a SQL Server =
Cluster environment (completely non related to the issue at hand).

I'm sorry for the long winded story. Would you have any idea as to =
why the cluster would fail on failover along with the TCP/IP issue?

Geoff N. Hiten
12/20/2005 5:56:17 PM
This information helps.

Try this article:
http://support.microsoft.com/default.aspx?scid=kb;en-us;555017

--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP


[quoted text, click to view]

Admiral
12/21/2005 7:11:18 AM
Geoff,

I truly appreciate your response. I actually did try to reattach the model
using this method, unfortunately everytime I would attemp to attach the
model db, SQL Server would immediately turn off. At the time I was really
pressed for time, which led to the decision for a re-install. I do believe
there is more to the story that was not told to me.

Also, I do apologize if I led you to believe that the cluster is on a
Windows 2003 platform. It actually resides on a Windows 2000 Advanced
Server. The link you provided me, should still work for W2k? Like I
mentioned when it rains it pours and I've been putting out too many fires
for a Christmas week. I can't thank you enough for the response.




[quoted text, click to view]
First, what you should have done with a blown model DB:

Start SQL Server in single user mode with trace flag -T3608. This stops SQL
from recovering anything except the master database. Reattach Model. If
necessary, use files copied from another installation at the exact same SP
and Hotfix level. Stop SQL Server and restart normally. Sorry, but it
really is that simple. Oh, and lock whoever detached "model" out of the
system. HE is too dangerous to allow near your system.

You didn't mention whether you blew the cluster away or not or just rebuilt
SQL. If you blew the cluster away, make sure that each disk resource has
the same drive letter on all nodes and the disk resources fail over
correctly from node to node. Stop the resource group, move it, and start
each disk resource independently to test.

The Named Pipes only issue sounds like an incomplete SP3a install. Windows
2003 will prevent TCP/IP access if it detects a pre-SP3a SQL installation.
Follow this article and re-apply SP3a.

http://support.microsoft.com/default.aspx?scid=kb;en-us;815431


--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP




[quoted text, click to view]
We had an error over the weekend of mass porportions(Sunday 3pm PST).
Long story short; the model database was detached and the SQL Server was
stopped, with it still detached. This happened to happen on our primary
Production Database Clustered Server which is the bread-n-butter of the
compay. (OUCH!)

It was time for some fast actions. We started the re-install SQL Server.
In order to do so, the previous install had to be uninstalled. This seemed
to go smoothly enough, but when re-applying the SP3a, we encountered an
error. After researching the error, apparently in a clustered environment
this will occur since the SP3a files still reside on the node(s).
Microsoft states that if within a particular log file it results with an
'Installation was Successful', to disregard the error. I double checked the
log file and sure enough the error was disregarded.

We moved along with the installation. We were able to restore all the
user databases and all system databases with the exception of the master
database. Unfortunately, even with starting SQL Server in single-user mode,
the restore of the master database would not take. So it was not restored,
but all other databases were. Fortunately, I ran a quick script to recover
all the user logins previous to the disaster, which I reapplied to the new
installation of SQL Server. Everything came back up and the QA Team
successfully tested the production Application (Monday 4am PST). (Fhweeh)

After the succesful testing of the production environment, we tested the
fail-over which resulted in SQL Server not starting on the secondary node.
All the resources came right up on it, but not SQL Server. The only error
that was that it was not able to locate the file on 'O\logs\mastlog.ldf'.
This error did not make sense since SQL Server uses the same file for the
primary node. We were pressed for time since it was closing to start of
business East Coast time, so we left the server as is.

Throughout the day there were other issues that arose, one in particular
was certain systems were not able to connect to the server via TCP/IP. In
order to have them connect they needed to create an alias of the server and
use Name Pipes. This seems to be a rising concern because there are users
who need to connect via ODBC to a widely used particular Access Application,
which seems to only like the TCP/IP route. I am somewhat sure this is
related to the cluster failure.

Anyway, this is the first time I've had to take a breathe to revisit the
problem at hand. We have been dealing with another server that crashed on
the same day, resulting in a brand new build of a SQL Server Cluster
environment (completely non related to the issue at hand).

I'm sorry for the long winded story. Would you have any idea as to why
the cluster would fail on failover along with the TCP/IP issue?

Thanks in Advanced..




Admiral
12/21/2005 8:41:23 AM
I will give comclust.exe an attempt. I am looking into a solution for this
and of course if needed we will open a PSS case. I do have the impression
that SP3 was not fully installed even though it stated on the Microsoft site
otherwise. Is it worth a try to use the suggested stated on the link you
provided?

I honestly did not have time to investigate as to why the cluster would not
fail over. I quickly did give it a try to make sure the disks failed over
successfully, but will do so again tonight.

I thank you again for your continued help.



[quoted text, click to view]

Geoff N. Hiten
12/21/2005 10:45:10 AM
The process is unnecessary for Windows 2000. The enhanced security of
Windows 2003 requires the extra steps.

Try and run comclust.exe on each node to set MSDTC to cluster mode. It
can't hurt and it may help. It definitely sounds like a "I didn't touch
anything" situation where you aren't getting the full story.

This one may be worth opening a PSS case. They can send you a diagnostic
package that will tell them exactly what is broken and they can walk you
through fixing it.

--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP



[quoted text, click to view]

Geoff N. Hiten
12/21/2005 12:57:02 PM
The named pipes alias is necessary for Windows 2003. It is not necessary
for Windows 2000. I would definitely try and reinstall SP3a and see if it
helps.

--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP




[quoted text, click to view]
john clarke
12/22/2005 2:46:53 AM
ouch...

this situation happened a while ago for us. one of our techs dettached both
the model AND msdb database (whilst in single user mode) and then sql server
was shut down. Well it would not start up.

I managed to execute the following command on the server itself:

sqlservr -c -f -s <instancename> /T3608

This worked bringing up the sql server in minimal mode. Thereupon i used
query analyser to connect and reattached both the msdb and model databases.

All was back to normal

Best of luck!

john

[quoted text, click to view]

Anthony Thomas
12/24/2005 9:25:35 AM
Yeah, that's where I was going, you can't even detach "accidentally," or
otherwise, the model database unless you are in single user mode or you've
started the instance with the 3608 trace flag (by the way, just for grins,
there is also 3607 and 3609, which behave the same way, but recover more or
fewer databases).

As far as the TCP/IP issue goes, you had to rebuild the cluster and were not
able to restore the master database. That tells me that you are NOT at the
same service pack level as the master backup you were trying to use.

Since you went ahead without the master database restored, go ahead and try
to reapply the SP3a service pack, or, better yet, apply SP4. If you are
using AWE, you will need to apply the post-SP4, 2040 AWE hotfix as well.
Also, SP4 breaks the SQLDiag (and it sounds like you will need this); so,
you'll want to contact Microsoft Customer Support Services (yes, they've
changed their name from MS PSS to MS CSS) to get the private 2148 SQLDiag
hotfix.

Since you are technically on a brand new cluster (with a lot of old cluster
settings laying around), I'd approach this as a fresh build and go through
the entire installation configuration just as if nothing was running on it.

Geoff mentioned a few things, like comclust for MSDTC and validating the
shared clustered disk configuration, but there are several other items after
a cluster installation you'll have to revisit. The new installation means
that all of the registry settings have reverted back to a fresh
installation. The fact you are on a new master database means a lot of the
sp_configure settings have reverted back to a fresh installation.

You need to check two things, the version of the engine (sqlservr.exe) and
the version of the master, model, and msdb databases. You can execute
SELECT @@VERSION to get the prior and sp_helpdb master to get the later.

Good luck. Keep us up to date with your progress.

Sincerely,


Anthony Thomas


--

[quoted text, click to view]
Admiral
12/27/2005 7:25:09 AM
Anthony,

I thank you for your response and have gone through the majority of the
steps you've provided. I also almost applied SP4, but with the type of
front end application we have that could have been disasterous if not tested
properly. Considering we are at month end, the powers above have blocked me
from attempting any further fixes.

I do believe I found the TCP/IP issue/fix, but to be safe it was time to
call in Microsoft. I gathered all the information I had and have sent it
their way. I will definitely keep you all posted on what was found and
solution. I do agree and feel that this clustered environment needs a
complete overhaul. It has had way too many cooks in the kitchen. For the
time being, I am covering every avenue by keeping a warm standby along with
increasing my incrimentals throughout the day.

Thanks Again and Happy Holidays!


[quoted text, click to view]
Admiral
12/27/2005 8:48:49 AM
Below is the response from Microsoft as promised. I haven't had a chance to
look into their findings, but one thing is for certain, I did not install
SP4 for SQL, which they seem to think I have. I will be scheduling to apply
these fixes for this upcoming Friday night and let you guys know how things
turn out.

Jeff, the current growth of our company has been extremely fast. We are
definitely changing from a (mom-n-pop) setup to corporate. I will be the
first to tell you that I am no-seasoned SQL veteran, but have been doing my
best to keep things afloat. I truly thank you all for your responses.



ANALYSIS OF MPS REPORTS

=========================

Node 1

======

Node Name --------- EMPOWER01-SAN

Node OS Version --- 2195 Windows 2000 Advanced Server

Node Service Pack - Service Pack 4



Node 2

======

Node Name --------- EMPOWER02-SAN

Node OS Version --- 2195 Windows 2000 Advanced Server

Node Service Pack - Service Pack 4



Default instance

SQL Network Name(EMPOWER01)



-- PAE enabled.



NODE 1

=======



Network Name Network Priority

============ ================

Encore LAN GIG 1

Heartbeat 2



1. ACTION -- REVERSE THE NETWORK PRIORITY. First Heartbeat followed by
Encore.



==========================================================================================

Network Interfaces Binding Order -
NO CHANGE REQUIRED.

==========================================================================================



1. Encore LAN GIG

2. Heartbeat

3. NdisWanIp







NODE 2

=======





Network Name Network Priority

============ ================

Encore LAN GIG 1

Heartbeat 2





==========================================================================================

Network Interfaces Binding Order --
NO CHANGE REQUIRED.

==========================================================================================



1. Encore LAN GIG

2. Heartbeat

3. NdisWanIp





-- There are some hotfixes difference between the nodes. - We recommend
same hotfixes on both the nodes. There might be unexpected outages if we
have these differences.



KB32911 -- missing from EMPOWER01-SAN_SQL

KB82283 -- missing from EMPOWER02-SAN_SQL



-- Found some files missing or version mismatches. - We recommend same
version of system files on both the nodes. There might be unexpected outages
if we have these differences.



C:\WINNT\SYSTEM32\DLLCACHE\OLEDB32R.DLL

has different versions on both the nodes

On node 1.. file version is (2.70:9001.0)

On node 2.. file version is (2.60:6526.0)



C:\WINNT\SYSTEM32\DRIVERS\B57W2K.SYS

On node 1.. file version is (7.43:0.0)

On node 2.. file version is (6.34:4.0)



C:\WINNT\SYSTEM32\MACROMED\FLASH\GETFLASH.EXE

On node 1.. FILE NOT FOUND.

On node 2.. file version is (7.0:19.0)



C:\WINNT\SYSTEM32\NEWDEV.DLL

On node 1.. file version is(5.0:2195.6754)

On node 2.. file version is (5.0:2195.6666)



C:\WINNT\SYSTEM32\SEAREACH4.DLL

On node 1.. file version is (4.15:1022.0)

On node 2.. FILE NOT FOUND.



C:\WINNT\SYSTEM32\SKYLON.DLL

On node 1.. file version is

On node 2.. FILE NOT FOUND.



C:\WINNT\SYSTEM32\SKYLON2.DLL

On node 1.. file version is (2.11:1020.0)

On node 2.. FILE NOT FOUND.



C:\WINNT\SYSTEM32\WRITERPDF.DLL

On node 1.. file version is (1.20:1047.0)

On node 2.. FILE NOT FOUND.



From SQL server logs

log 6, 5,4,3,2,1

-------------------

2005-12-19 01:19:43.26 server Failed to obtain
TransactionDispenserInterface: Result Code = 0x8004d01b

2005-12-19 01:19:43.26 spid3 Starting up database 'master'.

2005-12-19 01:19:43.28 spid3 Error: 9003, Severity: 20, State: 1.

2005-12-19 01:19:43.28 spid3 Cannot recover the master database.
Exiting.



Current SQL logs

-----------------------

2005-12-19 01:22:34.53 server Microsoft SQL Server 2000 - 8.00..760
(Intel X86)

Enterprise Edition on Windows NT 5.0 (Build 2195: Service Pack
4)

2005-12-19 01:22:34.54 server Logging SQL Server messages in file
'o:\logs\ERRORLOG'.

2005-12-19 01:22:34.54 server SQL Server is starting at priority class
'normal'(8 CPUs detected).

2005-12-19 01:22:34.98 server Using 'SSNETLIB.DLL' version '8.0.766'..

2005-12-19 01:22:34.98 spid3 Server name is 'EMPOWER01'.

2005-12-19 01:22:35.03 server SuperSocket Info: Bind failed on TCP port
1433. --SQL is not listening on TCP --

2005-12-19 01:22:35.54 server SQL server listening on Shared Memory,
Named Pipes.



SP4 setup

19:59:25 Installation Succeeded



From Application and system logs.



Application 12/19/2005 1:16 error 17052 MSSQLSERVER
"Failover" EMPOWER02-SAN [sqsrvres] OnlineThread:
Error 435 bringing resource online.\r\n\r\n

Application 12/19/2005 1:16 error 17052 MSSQLSERVER
"Failover" EMPOWER02-SAN [sqsrvres] OnlineThread:
ResUtilsStartResourceService failed (status 435)\r\n\r\n

Application 12/19/2005 1:16 error 17052 MSSQLSERVER
"Failover" EMPOWER02-SAN [sqsrvres]
StartResourceService: Failed to start MSSQLSERVER service. CurrentState:
1\r\n\r\n

Application 12/19/2005 1:16 error 17055 MSSQLSERVER
"Server" EMPOWER01 18052 :\r\nError: 5105,
Severity: 16, State: 4.\r\n\r\n

Application 12/19/2005 1:16 error 17055 MSSQLSERVER
"Server" EMPOWER01 17204 :\r\nFCB::Open
failed: Could not open device o\logs\mastlog.ldf for virtual device number
(VDN) 2.\r\n\r\n

Application 12/19/2005 1:16 error 17055 MSSQLSERVER
"Server" EMPOWER01 17207 :\r\nudopen:
Operating system error 3(The system cannot find the path specified.) during
the creation/opening of physical device o\logs\mastlog.ldf.\r\n\r\n



From node 2

==========

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSQLServer\MSSQLServer\Parameters

SQLArg0 REG_SZ -dn:\data\master.mdf

SQLArg1 REG_SZ -eo:\logs\ERRORLOG

Geoff N. Hiten
12/27/2005 10:56:54 AM
Comments Inline

[quoted text, click to view]

Definitely the right idea. Service Packs generally help with stability, but
sometimes they break applications. Testing is important.

[quoted text, click to view]
Your experience illustrates exactly where most organizations fall down when
it comes to High Availability. They think a cluster does all the work and
they can ignore the People and Process elements. I have found that strict
procedures that place specific responsibilities on specific individuals are
key to keeping user-generated problems from killing a system. Proper
training and selection of these individuals is also important. Database
operations isn't magic, we need to quit treating it like it is. Develop
procedures that support your operational goals. Make sure the people are
familiar with these procedures and are competent to implement them. Without
that, the best technology in the world is no more reliable than the cheapest
piece of junk.

--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP



[quoted text, click to view]
Geoff N. Hiten
12/27/2005 12:10:52 PM
The SP4 information refers to the OS, not SQL.

The differences appear to be related to different patches on the OS and to a
posible mismatch in MDAC binaries. I would download MDAC 2.8 and install it
on each node. MDAC installs are not cluster aware.

Procedures actually help with growth. They are a way of sharing
institutional knowledge without long, mistake-filled training periods.
Start with the biggest pain points and work outward from there. They don't
have to be complex or involved. They should help guide a professional
towards the correct solution for a given situation. The entire concept
needs high level approval and buy-in. Believe me, a few more episodes like
this and top management will buy into the idea.


--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP



[quoted text, click to view]
Anthony Thomas
12/27/2005 6:21:47 PM
You certainly need to look into their findings.

1. Could not recover the master database! The log startup parameter has a
missing colon ":" in the registry settings that locates the master
transaction log file.

2. TCP Bind error. This usually occurs because the IP was not reset on
startup; however, in this case, it may just be an artifact of the
non-recovery of the master database.

3. Geoff is right about the SP level: it is OS not the SQL Server that they
were referring to. However, you do show that you are only running SS SP3.
If you intend to stay on SP3 (for SQL Server), you should at least apply the
latest publicly available build, 878. Just ignore that the KB was directed
at the SQL Server supporting BizTalk Server.


You must install the SQL Server 2000 update that KB article 831950 describes
to run BizTalk Server 2004
http://support.microsoft.com/?kbid=838166


Also since I last responded, it looks like Microsoft has posted a Cumulative
post-SP4 hotfix that is publicly available:


A cumulative hotfix package is available for SQL Server 2000 Service Pack 4
build 2162
http://support.microsoft.com/?Kbid=904660


Sincerely,


Anthony Thomas


--

[quoted text, click to view]
============================================================================
==============
[quoted text, click to view]
============================================================================
==============
[quoted text, click to view]
============================================================================
==============
[quoted text, click to view]
============================================================================
==============
[quoted text, click to view]
AddThis Social Bookmark Button