sql server (alternate):
SQL Server 2005 SP2 (build 3054) Consider the following scenario: - A complex multi-statement table valued function is created. Let's call it dbo.tfFunc(@Param1, @Param2) - A SELECT statement is executed, that calls the above function twice, each time with a different set of parameters. In pseudocode: SELECT <column list> FROM dbo.tfFunc(1, 2) AS f1 <some JOIN operator> dbo.tfFunc(3, 4) AS f2 ON f1.col = f2.col INNER JOIN dbo.Table1 AS t1 ON ... etc. The exact statement is probably irrelevant, as long as the same table- valued function is called twice (I have observed the issue in two very different statements calling the same function). The statement is executed in a SNAPSHOT isolation level transaction, although this may also be irrelevant. - The statement continues executing for a long time. If sp_who2 is run at that time, the following row is returned for the statement connection (only relevant columns are shown): SPID Status BlkBy Command CPUTime DiskIO LastBatch 63 SUSPENDED 63 SELECT 29282 683 08/31 18:17:37 The statement appears to be blocked by itself. If sp_lock is run at that time, the following rows are returned: spid dbid ObjId IndId Type Resource Mode Status 63 2 1316624641 0 TAB Sch-S GRANT 63 2 1316624641 0 TAB Sch-M WAIT It appears that SQL Server waits indefinitely trying to obtain a schema- modification lock on a resource which already has a schema-stability lock placed on it by the same connection. The following is pure speculation, but it seems reasonable to assume that the server has materialized the result of the first call to the function using a temporary table in tempdb, and is trying to materialize the result of the second call using the same temporary table (same ObjId in sp_lock results). I do not know why this does not cause a deadlock error. Unfortunately, I do not have a simple repro script for this. The actual code is rather complex. While I can devise a workaround, this does look like a bug. I am posting it here before submitting a bug on Connect, in case anyone can shed some light. Thanks. --
Dimitri Furman (dfurman@cloud99.net) writes: [quoted text, click to view] > - The statement continues executing for a long time. If sp_who2 is run at > that time, the following row is returned for the statement connection
Long time? But does it ever complete? [quoted text, click to view] > SPID Status BlkBy Command CPUTime DiskIO LastBatch > 63 SUSPENDED 63 SELECT 29282 683 08/31 18:17:37 > > The statement appears to be blocked by itself. If sp_lock is run at that > time, the following rows are returned: > > spid dbid ObjId IndId Type Resource Mode > Status > 63 2 1316624641 0 TAB Sch-S GRANT > 63 2 1316624641 0 TAB Sch-M WAIT > > It appears that SQL Server waits indefinitely trying to obtain a schema- > modification lock on a resource which already has a schema-stability lock > placed on it by the same connection.
Is this a parallel plan? In that case different threads could be blocking each other. [quoted text, click to view] > The following is pure speculation, but it seems reasonable to assume that > the server has materialized the result of the first call to the function > using a temporary table in tempdb, and is trying to materialize the > result of the second call using the same temporary table (same ObjId in > sp_lock results).
The table in question is likely to be the return table for the UDF. You should be able to find out more about this table by looking in sys.objects and sys.columns. [quoted text, click to view] > Unfortunately, I do not have a simple repro script for this. The actual > code is rather complex. While I can devise a workaround, this does look > like a bug. I am posting it here before submitting a bug on Connect, in > case anyone can shed some light. Thanks.
Without a repro it will of course be difficult to address the issue. I would suggest that when you file the bug that you include: 1) The query. 2) The code for the UDF. 3) If possible also table definitions. 4) The XML showplan. (You can save this from the graphical plan in Mgmt Studio.) 5) The output from sys.dm_os_waiting_tasks and sys.tran_locks. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at
Regarding Erland's comment about a parallel plan, try running the query with an OPTION (MAXDOP 1) hint if you see parallelism. That might provide an easier workaround and/or provide additional info for the Connect bug report. -- Hope this helps. Dan Guzman SQL Server MVP [quoted text, click to view] "Dimitri Furman" <dfurman@cloud99.net> wrote in message news:Xns999DC1CC07A1Fdfurmancloud99@127.0.0.1... > SQL Server 2005 SP2 (build 3054) > > Consider the following scenario: > > - A complex multi-statement table valued function is created. Let's call > it dbo.tfFunc(@Param1, @Param2) > - A SELECT statement is executed, that calls the above function twice, > each time with a different set of parameters. In pseudocode: > > SELECT <column list> > FROM dbo.tfFunc(1, 2) AS f1 > <some JOIN operator> dbo.tfFunc(3, 4) AS f2 > ON f1.col = f2.col > INNER JOIN dbo.Table1 AS t1 > ON ... > etc. > > The exact statement is probably irrelevant, as long as the same table- > valued function is called twice (I have observed the issue in two very > different statements calling the same function). The statement is > executed in a SNAPSHOT isolation level transaction, although this may > also be irrelevant. > > - The statement continues executing for a long time. If sp_who2 is run at > that time, the following row is returned for the statement connection > (only relevant columns are shown): > > SPID Status BlkBy Command CPUTime DiskIO LastBatch > 63 SUSPENDED 63 SELECT 29282 683 08/31 18:17:37 > > The statement appears to be blocked by itself. If sp_lock is run at that > time, the following rows are returned: > > spid dbid ObjId IndId Type Resource Mode Status > 63 2 1316624641 0 TAB Sch-S GRANT > 63 2 1316624641 0 TAB Sch-M WAIT > > It appears that SQL Server waits indefinitely trying to obtain a schema- > modification lock on a resource which already has a schema-stability lock > placed on it by the same connection. > > The following is pure speculation, but it seems reasonable to assume that > the server has materialized the result of the first call to the function > using a temporary table in tempdb, and is trying to materialize the > result of the second call using the same temporary table (same ObjId in > sp_lock results). > > I do not know why this does not cause a deadlock error. > > Unfortunately, I do not have a simple repro script for this. The actual > code is rather complex. While I can devise a workaround, this does look > like a bug. I am posting it here before submitting a bug on Connect, in > case anyone can shed some light. Thanks. > > -- > remove a 9 to reply by email
[quoted text, click to view] On Sep 01 2007, 08:39 am, Erland Sommarskog <esquel@sommarskog.se> wrote
in news:Xns999E95B819D01Yazorman@127.0.0.1: [quoted text, click to view] > Dimitri Furman (dfurman@cloud99.net) writes: >> - The statement continues executing for a long time. > > Long time? But does it ever complete?
The longest time I let it run for is 40 minutes. Considering that it usually runs in less than 10 seconds, the likely answer is no. [quoted text, click to view] > Is this a parallel plan?
Hard to tell. I forgot to mention that the problem is intermittent. When the statement completes successfully, there is no indication of parallelism in the actual plan. When it does not, there is obviously no plan to look at (in fact, the only way to kill the connection in that case is to restart the server). The estimated plan doesn't show any parallelism either. I am talking here about the plan for the statement, not the plan for the called function, which I apparently cannot see. I did try OPTION (MAXDOP 1) in both the statement and the function, and have not been able to reproduce the issue so far. But this is inconclusive, sometimes it works for days without a problem. [quoted text, click to view] > The table in question is likely to be the return table for the UDF. > You should be able to find out more about this table by looking in > sys.objects and sys.columns.
I did, and this is where it gets a bit interesting. The UDF in question includes a table variable, and it turns out that the mentioned schema locks are placed on the table in tempdb corresponding to that table variable, not the return table for the UDF. I am not sure if this makes any substantive difference though. [quoted text, click to view] > 4) The XML showplan. (You can save this from the graphical plan in > Mgmt > Studio.)
I'm not sure how I could save the plan if the statement never completes... --
Dimitri Furman (dfurman@cloud99.net) writes: [quoted text, click to view] > The longest time I let it run for is 40 minutes. Considering that it > usually runs in less than 10 seconds, the likely answer is no.
If you have to restart the server to resolve the situation, it certainly sounds that the prospects for completion are utterly bleak. [quoted text, click to view] > Hard to tell. I forgot to mention that the problem is intermittent. When > the statement completes successfully, there is no indication of > parallelism in the actual plan. When it does not, there is obviously no > plan to look at (in fact, the only way to kill the connection in that > case is to restart the server). The estimated plan doesn't show any > parallelism either. I am talking here about the plan for the statement, > not the plan for the called function, which I apparently cannot see.
If you run the function alone, you should see its plan I think. But I was mainly interested in whether the main query had any parallelism. In that case it could be one thread blocking another. Hm, then again, if the UDF causes parallelism, I guess that could also be an issue. But I don't think this is likely, since if you insert into a table variable, there cannot be parallelism. And it's difficult to anything in a UDF without modifying a table variable. Anyway, you can easily examine this next time it happens by running SELECT * FROM sys.dm_os_tasks WHERE session_id = <trouble spid> If there are rows with non-zero exec_context_id, there are parallel threads. The output from sys.dm_os_waiting_tasks would also be interesting. [quoted text, click to view] > I did, and this is where it gets a bit interesting. The UDF in question > includes a table variable, and it turns out that the mentioned schema > locks are placed on the table in tempdb corresponding to that table > variable, not the return table for the UDF. I am not sure if this makes > any substantive difference though.
At least it is a clue for anyone who is trying to produce a repro. Given that you say it's intermittent, I am not going to try. [quoted text, click to view] >> 4) The XML showplan. (You can save this from the graphical plan in >> Mgmt Studio.) > > I'm not sure how I could save the plan if the statement never completes...
It's also available in sys.dm_exec_text_query_plan. A way to get the plan, sys.os_waiting_tasks and more packaged into one result set, is to use my beta_lockinfo, available at http://www.sommarskog.se/sqlutil/beta_lockinfo.html There have been some bugs around temp-table caching, I don't if they could be related to what you see. There is a Cumultative Update, including these two bugfixes at http://support.microsoft.com/kb/939537. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at
[quoted text, click to view] On Sep 02 2007, 01:39 pm, Erland Sommarskog <esquel@sommarskog.se> wrote
in news:Xns999FC8952760AYazorman@127.0.0.1: [quoted text, click to view] > Anyway, you can easily examine this next time it happens by running > > SELECT * FROM sys.dm_os_tasks WHERE session_id = <trouble spid> > > If there are rows with non-zero exec_context_id, there are parallel > threads.
There are not, so I guess we can rule out parallelism. [quoted text, click to view] > It's also available in sys.dm_exec_text_query_plan. A way to get the > plan, sys.os_waiting_tasks and more packaged into one result set, is > to use my beta_lockinfo, available at > http://www.sommarskog.se/sqlutil/beta_lockinfo.html This should be handy. Thanks. [quoted text, click to view] > There have been some bugs around temp-table caching, I don't if they > could be related to what you see. There is a Cumultative Update, > including these two bugfixes at > http://support.microsoft.com/kb/939537. I'll get that and watch how it goes for a few days. If it still happens, will try to find some time to work on a repro. Will follow-up with any news. --
Dimitri Furman (dfurman@cloud99.net) writes: [quoted text, click to view] > On Sep 02 2007, 11:44 pm, Dimitri Furman <dfurman@cloud99.net> wrote in > news:Xns999FF19D6AD96dfurmancloud99@127.0.0.1: > >> If it still happens, >> will try to find some time to work on a repro. Will follow-up with any >> news. > > Submitted feedback on Connect that includes a repro: > https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx? > FeedbackID=300465 Thanks Dimitri. Looks like an excellent bug report. I hope that it will be sufficient for the SQL Server people to track down the bug. Unfortunately, it is not possible to access attachments on Connect, so I cannot try the repro. I tried to compose my own from your description, but it was not really that simple. Given the trouble you had in recreating it, I wasn't suprised. If it's possible for you to post the repro files here, I'd be interested. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at
Dimitri Furman (dfurman@cloud99.net) writes: [quoted text, click to view] > On Sep 27 2007, 06:01 pm, Erland Sommarskog <esquel@sommarskog.se> wrote > in news:Xns99B9FF37989Yazorman@127.0.0.1: >> If it's possible for you to post the repro files here, I'd be interested. > > Here it is: > http://iridule.net/cu/files/SS2005LockingBugRepro1.zip Got it, and indeed I had to reboot myserver. What was missing from your description on Connect was the RECOMPILE hint. When I remove it, the procedure completes. I looked in the SQL Server error log, and I found that there is a stack dump for an unresolved deadlock. -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at
[quoted text, click to view] On Sep 28 2007, 05:28 pm, Erland Sommarskog <esquel@sommarskog.se> wrote
in news:Xns99B9EFAA31154Yazorman@127.0.0.1: [quoted text, click to view] > Got it, and indeed I had to reboot myserver. What was missing from > your description on Connect was the RECOMPILE hint. When I remove it, > the procedure completes.
The RECOMPILE hint is there to make it consistently reproducible. Before I added that, the issue only occurred intermittently (apparently, when the function's plan had to be recompiled). [quoted text, click to view] > I looked in the SQL Server error log, and I found that there is a > stack dump for an unresolved deadlock.
I wish I had seen that in the beginning, at least I could be certain it's a real problem. I updated the bug report to include these two bits of information. --
On Sep 28 2007, 05:42 pm, Erland Sommarskog <esquel@sommarskog.se> wrote in news:Xns99B9F1E3FC5AEYazorman@127.0.0.1: [quoted text, click to view] > By the way, I tried the repro on the July CTP of SQL 2008, and the error > does not occur there, but the procedure completes successfully.
I hope it does not mean that they won't bother fixing it in 2005... --
Dimitri Furman (dfurman@cloud99.net) writes: [quoted text, click to view] > On Sep 28 2007, 05:42 pm, Erland Sommarskog <esquel@sommarskog.se> wrote > in news:Xns99B9F1E3FC5AEYazorman@127.0.0.1: > >> By the way, I tried the repro on the July CTP of SQL 2008, and the error >> does not occur there, but the procedure completes successfully. > > I hope it does not mean that they won't bother fixing it in 2005...
I could be that it's fixed in Katmai "by accident", but it could also be the case that the issue is known, but the fix is deemed to risky to implement in SQL 2005. I found this KB Article: http://support.microsoft.com/kb/923605, but this appears to relate to a fix for SP1. Nevertheless, the Symptoms section says "This problem occurs if table variables are used in a transaction." -- Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at
Don't see what you're looking for? Try a search.
|