all groups > sql server (alternate) > august 2007 >
You're in the

sql server (alternate)

group:

Apparent DB engine bug in SQL Server 2005


Apparent DB engine bug in SQL Server 2005 Dimitri Furman
8/31/2007 11:03:27 PM
sql server (alternate):
SQL Server 2005 SP2 (build 3054)

Consider the following scenario:

- A complex multi-statement table valued function is created. Let's call
it dbo.tfFunc(@Param1, @Param2)
- A SELECT statement is executed, that calls the above function twice,
each time with a different set of parameters. In pseudocode:

SELECT <column list>
FROM dbo.tfFunc(1, 2) AS f1
<some JOIN operator> dbo.tfFunc(3, 4) AS f2
ON f1.col = f2.col
INNER JOIN dbo.Table1 AS t1
ON ...
etc.

The exact statement is probably irrelevant, as long as the same table-
valued function is called twice (I have observed the issue in two very
different statements calling the same function). The statement is
executed in a SNAPSHOT isolation level transaction, although this may
also be irrelevant.

- The statement continues executing for a long time. If sp_who2 is run at
that time, the following row is returned for the statement connection
(only relevant columns are shown):

SPID Status BlkBy Command CPUTime DiskIO LastBatch
63 SUSPENDED 63 SELECT 29282 683 08/31 18:17:37

The statement appears to be blocked by itself. If sp_lock is run at that
time, the following rows are returned:

spid dbid ObjId IndId Type Resource Mode Status
63 2 1316624641 0 TAB Sch-S GRANT
63 2 1316624641 0 TAB Sch-M WAIT

It appears that SQL Server waits indefinitely trying to obtain a schema-
modification lock on a resource which already has a schema-stability lock
placed on it by the same connection.

The following is pure speculation, but it seems reasonable to assume that
the server has materialized the result of the first call to the function
using a temporary table in tempdb, and is trying to materialize the
result of the second call using the same temporary table (same ObjId in
sp_lock results).

I do not know why this does not cause a deadlock error.

Unfortunately, I do not have a simple repro script for this. The actual
code is rather complex. While I can devise a workaround, this does look
like a bug. I am posting it here before submitting a bug on Connect, in
case anyone can shed some light. Thanks.

--
Re: Apparent DB engine bug in SQL Server 2005 Erland Sommarskog
9/1/2007 12:00:00 AM
Dimitri Furman (dfurman@cloud99.net) writes:
[quoted text, click to view]

Long time? But does it ever complete?

[quoted text, click to view]

Is this a parallel plan? In that case different threads could be
blocking each other.

[quoted text, click to view]

The table in question is likely to be the return table for the UDF.
You should be able to find out more about this table by looking in
sys.objects and sys.columns.

[quoted text, click to view]

Without a repro it will of course be difficult to address the issue.
I would suggest that when you file the bug that you include:

1) The query.
2) The code for the UDF.
3) If possible also table definitions.
4) The XML showplan. (You can save this from the graphical plan in Mgmt
Studio.)
5) The output from sys.dm_os_waiting_tasks and sys.tran_locks.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
Re: Apparent DB engine bug in SQL Server 2005 Dan Guzman
9/1/2007 12:00:00 AM
Regarding Erland's comment about a parallel plan, try running the query with
an OPTION (MAXDOP 1) hint if you see parallelism. That might provide an
easier workaround and/or provide additional info for the Connect bug report.

--
Hope this helps.

Dan Guzman
SQL Server MVP

[quoted text, click to view]
Re: Apparent DB engine bug in SQL Server 2005 Dimitri Furman
9/2/2007 3:55:41 PM
[quoted text, click to view]
in news:Xns999E95B819D01Yazorman@127.0.0.1:

[quoted text, click to view]

The longest time I let it run for is 40 minutes. Considering that it
usually runs in less than 10 seconds, the likely answer is no.

[quoted text, click to view]

Hard to tell. I forgot to mention that the problem is intermittent. When
the statement completes successfully, there is no indication of parallelism
in the actual plan. When it does not, there is obviously no plan to look at
(in fact, the only way to kill the connection in that case is to restart
the server). The estimated plan doesn't show any parallelism either. I am
talking here about the plan for the statement, not the plan for the called
function, which I apparently cannot see.

I did try OPTION (MAXDOP 1) in both the statement and the function, and
have not been able to reproduce the issue so far. But this is inconclusive,
sometimes it works for days without a problem.

[quoted text, click to view]

I did, and this is where it gets a bit interesting. The UDF in question
includes a table variable, and it turns out that the mentioned schema locks
are placed on the table in tempdb corresponding to that table variable, not
the return table for the UDF. I am not sure if this makes any substantive
difference though.

[quoted text, click to view]

I'm not sure how I could save the plan if the statement never completes...

--
Re: Apparent DB engine bug in SQL Server 2005 Erland Sommarskog
9/2/2007 5:39:14 PM
Dimitri Furman (dfurman@cloud99.net) writes:
[quoted text, click to view]

If you have to restart the server to resolve the situation, it certainly
sounds that the prospects for completion are utterly bleak.

[quoted text, click to view]

If you run the function alone, you should see its plan I think.

But I was mainly interested in whether the main query had any parallelism.
In that case it could be one thread blocking another. Hm, then again,
if the UDF causes parallelism, I guess that could also be an issue.
But I don't think this is likely, since if you insert into a table
variable, there cannot be parallelism. And it's difficult to anything
in a UDF without modifying a table variable.

Anyway, you can easily examine this next time it happens by running

SELECT * FROM sys.dm_os_tasks WHERE session_id = <trouble spid>

If there are rows with non-zero exec_context_id, there are parallel
threads.

The output from sys.dm_os_waiting_tasks would also be interesting.

[quoted text, click to view]

At least it is a clue for anyone who is trying to produce a repro.
Given that you say it's intermittent, I am not going to try.

[quoted text, click to view]

It's also available in sys.dm_exec_text_query_plan. A way to get the
plan, sys.os_waiting_tasks and more packaged into one result set, is
to use my beta_lockinfo, available at
http://www.sommarskog.se/sqlutil/beta_lockinfo.html


There have been some bugs around temp-table caching, I don't if they
could be related to what you see. There is a Cumultative Update, including
these two bugfixes at http://support.microsoft.com/kb/939537.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
Re: Apparent DB engine bug in SQL Server 2005 Dimitri Furman
9/3/2007 3:44:23 AM
[quoted text, click to view]
in news:Xns999FC8952760AYazorman@127.0.0.1:

[quoted text, click to view]

There are not, so I guess we can rule out parallelism.

[quoted text, click to view]

This should be handy. Thanks.

[quoted text, click to view]

I'll get that and watch how it goes for a few days. If it still happens,
will try to find some time to work on a repro. Will follow-up with any
news.

--
Re: Apparent DB engine bug in SQL Server 2005 Dimitri Furman
9/27/2007 9:25:20 PM
On Sep 02 2007, 11:44 pm, Dimitri Furman <dfurman@cloud99.net> wrote in
news:Xns999FF19D6AD96dfurmancloud99@127.0.0.1:

[quoted text, click to view]

Submitted feedback on Connect that includes a repro:
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?
FeedbackID=300465

--
Re: Apparent DB engine bug in SQL Server 2005 Erland Sommarskog
9/27/2007 10:01:17 PM
Dimitri Furman (dfurman@cloud99.net) writes:
[quoted text, click to view]

Thanks Dimitri. Looks like an excellent bug report. I hope that it will
be sufficient for the SQL Server people to track down the bug.

Unfortunately, it is not possible to access attachments on Connect, so
I cannot try the repro. I tried to compose my own from your description,
but it was not really that simple. Given the trouble you had in recreating
it, I wasn't suprised.

If it's possible for you to post the repro files here, I'd be interested.



--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
Re: Apparent DB engine bug in SQL Server 2005 Dimitri Furman
9/28/2007 2:05:11 AM
On Sep 27 2007, 06:01 pm, Erland Sommarskog <esquel@sommarskog.se> wrote in
news:Xns99B9FF37989Yazorman@127.0.0.1:

[quoted text, click to view]

Here it is:
http://iridule.net/cu/files/SS2005LockingBugRepro1.zip

Thanks for helping me nail it down.

--
Re: Apparent DB engine bug in SQL Server 2005 Erland Sommarskog
9/28/2007 9:28:55 PM
Dimitri Furman (dfurman@cloud99.net) writes:
[quoted text, click to view]

Got it, and indeed I had to reboot myserver. What was missing from your
description on Connect was the RECOMPILE hint. When I remove it, the
procedure completes.

I looked in the SQL Server error log, and I found that there is a
stack dump for an unresolved deadlock.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
Re: Apparent DB engine bug in SQL Server 2005 Erland Sommarskog
9/28/2007 9:42:03 PM
By the way, I tried the repro on the July CTP of SQL 2008, and the error
does not occur there, but the procedure completes successfully.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
Re: Apparent DB engine bug in SQL Server 2005 Dimitri Furman
9/29/2007 2:58:31 AM
[quoted text, click to view]
in news:Xns99B9EFAA31154Yazorman@127.0.0.1:

[quoted text, click to view]

The RECOMPILE hint is there to make it consistently reproducible. Before I
added that, the issue only occurred intermittently (apparently, when the
function's plan had to be recompiled).

[quoted text, click to view]

I wish I had seen that in the beginning, at least I could be certain it's a
real problem.

I updated the bug report to include these two bits of information.

--
Re: Apparent DB engine bug in SQL Server 2005 Dimitri Furman
9/29/2007 3:00:21 AM
On Sep 28 2007, 05:42 pm, Erland Sommarskog <esquel@sommarskog.se> wrote in
news:Xns99B9F1E3FC5AEYazorman@127.0.0.1:

[quoted text, click to view]

I hope it does not mean that they won't bother fixing it in 2005...

--
Re: Apparent DB engine bug in SQL Server 2005 Erland Sommarskog
9/29/2007 4:13:25 PM
Dimitri Furman (dfurman@cloud99.net) writes:
[quoted text, click to view]

I could be that it's fixed in Katmai "by accident", but it could also
be the case that the issue is known, but the fix is deemed to risky to
implement in SQL 2005.

I found this KB Article: http://support.microsoft.com/kb/923605, but this
appears to relate to a fix for SP1. Nevertheless, the Symptoms section
says "This problem occurs if table variables are used in a transaction."


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
AddThis Social Bookmark Button