sql server programming : Question for Joe Celko - re-post


Michael C
12/12/2004 10:58:00 PM
Hi Mr. Celko,

A few questions on your improved Soundex from SQL For Smarties, which I'm
trying to implement on SQL Server 2K and .NET:

1) In step 2 you say to replace all non-leading vowels with "A". Does that
include "Y"?

2) At one step (5.0) you say to "perform cleanup functions"... is there
anything specific that needs to be done here?

3) In step (5.1) I'm to "drop all terminal A and S characters"... does
that mean all A's and S's on the end, in any combination? I.e., "SS", "AA",
"AS", "SASSA", etc.? Or just all A's *or* all S's occurring at the very
end? If I'm reading correctly, we eliminate all A's (except leading A's) in
step 5.3 - so is it even necessary to first strip out all terminal A
characters before this?

4) Finally, in step (5.4) "strip all but the first of repeating adjacent
character substrings"... Does that mean keeping the first copy of every
repeating set wherever it occurs, or keeping the repeating characters only
when they occur at the beginning of the string?
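
To make the competing readings in questions 3 and 4 concrete, here is a minimal Python sketch. It is not taken from SQL For Smarties; the function names are hypothetical, and each function implements only one possible interpretation of the quoted step, so it should not be read as the book's algorithm. How the leading character is treated is not addressed here.

```python
import re

# Hypothetical helpers; each encodes ONE reading of an ambiguous step.

def strip_terminal_mixed(code: str) -> str:
    """Step 5.1, reading A: drop a trailing run of A's and S's
    in any combination ("SS", "AS", "SASSA", ...)."""
    return re.sub(r"[AS]+$", "", code)

def strip_terminal_single(code: str) -> str:
    """Step 5.1, reading B: drop a trailing run made of a single
    letter only (all A's or all S's at the very end)."""
    return re.sub(r"(A+|S+)$", "", code)

def collapse_adjacent_repeats(code: str) -> str:
    """Step 5.4, one reading: wherever a substring repeats back to back,
    keep only its first copy. This is a single pass, so the result
    may still contain repeats that only appear after collapsing."""
    return re.sub(r"(.+?)\1+", r"\1", code)

if __name__ == "__main__":
    for sample in ("BSASSA", "BAA", "BAS"):
        print(sample, strip_terminal_mixed(sample), strip_terminal_single(sample))
    for sample in ("MSSP", "BNBNBN"):
        print(sample, collapse_adjacent_repeats(sample))
```

The two step 5.1 readings diverge on input such as "BSASSA": reading A leaves "B", while reading B removes only the final "A" and leaves "BSASS".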

Sorry for all the questions, but I want to make sure I implement this
correctly.

Thanks,
Michael C.

Adam Machanic
12/12/2004 11:07:58 PM
There are much better algorithms than SOUNDEX available; have you seen this
article, on implementing Double Metaphone in T-SQL?

http://www.winnetmag.com/SQLServer/Article/ArticleID/26094/SQLServer_26094.html

I'm betting you'll have a lot better luck with it than with any SOUNDEX
implementation, but YMMV as always :)


--
Adam Machanic
SQL Server MVP
http://www.sqljunkies.com/weblog/amachanic
--


Michael C
12/13/2004 7:19:54 AM
Thanks Adam, I appreciate the suggestion. I'm actually doing a comparison
of various Soundex-type algorithms. I'll check out the Double Metaphone
link; it may be one I can use in the comparison.

Thanks again,
Michael C.