Groups | Blog | Home
all groups > sql server programming > october 2004 >

sql server programming : CHARINDEX in user-defined function


Aaron [SQL Server MVP]
10/2/2004 5:45:06 PM
[quoted text, click to view]

The view workaround is suspect also. See http://www.aspfaq.com/2439 for
other workarounds and a link to a discussion about the inconsistency of the
results from a view...

A

Agoston Bejo
10/2/2004 9:39:44 PM
Hi,
I'm trying to use the CHARINDEX built-in function in my user-defined
function, but it doesn't seem to work.
Right now I got down to only write a test-function that doesn't do anything
else but call charindex:

CREATE FUNCTION X(@p_str1 VARCHAR, @p_str2 VARCHAR)
RETURNS INT AS
BEGIN
RETURN CHARINDEX(@p_str1, @p_str2);
END


Now, when I do a simple select like this:

SELECT NAME, dbo.X('Smith', NAME) AS Expr1, CHARINDEX('Smith', NAME) AS
Expr2 FROM PERSONS

The second column will be 0 everywhere, so rows like this occur:

John Smith, 0, 6

I've read that non-deterministic functions are not allowed to be used in
user-defined functions, and I suspect that this is what causes this.
In this case:

1. Why can't non-deterministic functions be used?
2. Shouldn't there be some error thrown in dbo.X() instead of it simply
returning 0?
3. Is there some similar deterministic function that can be used? I really
need this function for doing some basic parsing, I'm don't want write my own
in vain.
4. How on Earth can such a simple function be non-deterministic?

Thx


Hugo Kornelis
10/2/2004 10:11:08 PM
[quoted text, click to view]

The default length for a varchar is 1 byte. You might want to change the
above line to something like:

CREATE FUNCTION X(@p_str1 VARCHAR(20), @p_str2 VARCHAR(50))

(20 and 50 used as example - se proper judgement to find the best length
for your function). This will solve your problem.


Now on to your questions:

[quoted text, click to view]

Because (unlike views) a user-defined function gets called multiple times
during the execution of one query; if the results change from call to
call, all kinds of nasty side-effects will kick in,


[quoted text, click to view]

If you attempt to create a user-defined function with a non-deterministic
function, you'll get an error message.

[quoted text, click to view]

A workaround if you need to use a non-deterministic function in your
userdefined function is to put the nondeterministic function in a view and
reference the view from within the function. This can not be used for all
non-deterministic functions, but it's a common workaround for including
the current date and time (getdate()) in a UDF.

[quoted text, click to view]

It isn't. :-)


Best, Hugo
--

Agoston Bejo
10/3/2004 11:12:17 AM
Hi!
What you told solved the problem completely. I have some minor questions
left, however:

"Hugo Kornelis" <hugo@pe_NO_rFact.in_SPAM_fo> az alábbiakat írta a következo
hírüzenetben: dc2ul09hsjru7d0der6jpqoe2cbics6e2m@4ax.com...
[quoted text, click to view]

I see, only to be clear on the case: does 'non-deterministic' in this
terminology also mean that the function has side effects? It is possible
that a 'non-deterministic' function (in the classical sense) may not change
anything in the database.

[quoted text, click to view]

From the MSDN:

"All built-in string functions, except for CHARINDEX and PATINDEX, are
deterministic."

(URL:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_
fa-fz_7oqb.asp - Transact-SQL Reference / Built-in functions)

But I didn't get any error message, and CHARINDEX worked indeed in my
function.
Is the MSDN lying? :)



[quoted text, click to view]

Steve Kass
10/3/2004 6:15:15 PM
For fun, give this a try (using the framework of Tibor's example):

SELECT foo, count(*) as ct
FROM (
SELECT dbo.foo1() as foo
FROM Northwind.dbo."Order Details" AS od
INNER JOIN Northwind.dbo.Orders AS o
ON o.OrderId = od.OrderID
) T
GROUP BY foo

You are likely to get a result set containing several rows, all of which
have the same value of foo. Even adding DISTINCT to the outer query
does not guarantee a result set with distinct rows.

SK

[quoted text, click to view]
Hugo Kornelis
10/3/2004 10:53:54 PM
[quoted text, click to view]

Hi Aaron,

Thanks for the heads-up!

Good article - and a very interesting script in Tibor's proof. I ran it
several times and got 5, 6, 7, 8 or 9 results - and often, they were not
even distinct at all!!

I wish I had encountered this link before I tried to explain to the OP why
UDF's can't contain non-deterministic functions...

Best, Hugo
--

Hugo Kornelis
10/3/2004 11:19:16 PM
[quoted text, click to view]

Hi Agoston,

I hope you also read the article posted by Aaron (in reply to my message)
which shows why the workaround with a view has some problems as well.
Please follow his advice instead of my original advice.

I'll type the answers to your minor questions inline.


[quoted text, click to view]

No, 'non-deterministic' in this terminology means *only* whether the
function is deterministic or not - in other words: if the result of the
function (with the same arguments) will always be the same or not.

Not having side effects is *another* requirement for user-defined
functions (UDFs). I've seen some workarounds to this requirement posted in
these newsgroups as well, but I'd recommend against using them - there is
some solid reasoning behind the requirement of not having side effects.
The main reason is the fact that the optimizer is free to choose different
query plans, that might involve more or less calls to the UDFs used in the
query, and (if columns are used as parameters) different sequences. The
result of the query becomes inpredictable.


[quoted text, click to view]

It isn't lying. Unclear, yes. But not lying.


First, let me give you a quote that is more relevant for you. The URL is
http://msdn.microsoft.com/library/en-us/createdb/cm_8_des_08_460j.asp.

(start quote)

Built-in functions that can return different data on each call are not
allowed in user-defined functions. The built-in functions not allowed in
user-defined functions are:

@@CONNECTIONS @PACK_SENT GETDATE
@@CPU_BUSY @PACKET_ERRORS GetUTCDate
@@IDLE @TIMETICKS NEWID
@@IO_BUSY @TOTAL_ERRORS RAND
@@MAX_CONNECTIONS @@TOTAL_READ TEXTPTR
@@PACK_RECEIVED @@TOTAL_WRITE

(end quote)

The quote you gave relates to determining whether your UDF will be
considered deterministic or non-deterministic. Since it contains a call to
CHARINDEX (which is considerde non-determinstic), your UDS will be
considered non-deterministic as well. This means that you can't create an
index on a view or computed column that is creating using this UDF. As
long as that is not your intention, it should not bother you.

(By the way - I do admit I often wondered WHY SQL Server regards CHARINDEX
and PATINDEX to be nondeterministic).

Best, Hugo
--

skissane NO[at]SPAM gmail.com
10/16/2004 3:54:58 PM
Hi,

I wrote the following replacement for CHARINDEX, which is
deterministic, albeit slower than the builtin CHARINDEX function. But
since it is deterministic, I can use it in indexed views, etc....

create function dbo.fn_CHARINDEX(@ch char(1), @s varchar(8000))
returns int with schemabinding as
begin

declare @i int

set @i = 1

while @i <= len(@s) begin
if substring(@s,@i,1) = @ch
return @i
set @i = @i + 1
end

return NULL

end

Cheers
Simon Kissane
AddThis Social Bookmark Button