
sql server full text search : unicode full text catalog


itamar
12/4/2003 2:35:36 AM
Hi all,
I am developing on a SQL Server 2000 / Windows 2000 platform.
I have built a table that contains data in Unicode
format, e.g. "екдоты..."
(this specific data is in Russian). I have full-text
indexed the table (with Neutral as the "Language for Word Breaker"). I
have noticed very odd search results. For example:
i) When I search for "зн&#1072" I get results
that contain this sequence, but also results that have
the sequence separated by a space, e.g. "зн ака&#1085".
ii) When I search for "знак&#1072"
or for "знак" I get similar
results.
John Kane
12/4/2003 9:21:27 AM
itamar,
While I can't read Russian, and the Russian text you copied and pasted into
this posting got corrupted, perhaps you could save your Russian Unicode text
to a .txt file using "Encoding: Unicode" and attach that to your reply?

Are you storing this Russian Unicode text in a column defined with a
Unicode-based datatype, specifically nchar, nvarchar or ntext? When you
query the Russian text, are you using the "N" prefix? See KB article 239530
(Q239530) "INF: Unicode String Constants in SQL Server Require N Prefix" at
http://support.microsoft.com/default.aspx?scid=kb;en-us;239530

Regards,
John




itamar zik
12/6/2003 11:22:54 PM

Hi John,
Thank you for your reply. I am using an nvarchar field, but
somehow the data in it, which is fed in through ASP forms and a VB DLL, is
being stored as numbers: &#num;&#num;&#num;... The numbers are the
Unicode values of the letters. When I query the field with a
character string (which also ends up looking like "&#num;&#num;"...), I
get the funny results, and they do not change if I prefix the string
with "N". Maybe I have got it all wrong?
Thanks, itamar
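
To make the stored form described above concrete, here is a minimal Python sketch (not the ASP/VB code from this thread) that reproduces the &#num; pattern. It assumes, without the thread confirming it, that the text is being converted to numeric character references somewhere before it reaches SQL Server; browsers commonly do this when a form is submitted in a character set that cannot represent the typed characters. The function name to_ncr is hypothetical.

```python
def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric
    character reference (&#NNNN;), leaving ASCII untouched."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# The Russian search term from the first post, as it would be stored
# under the assumption above.
stored = to_ncr("знак")
print(stored)  # &#1079;&#1085;&#1072;&#1082;
```

If this is what ends up in the nvarchar column, the full-text index is built over ASCII digits and punctuation rather than Cyrillic letters, and the "N" prefix cannot help because the search string is itself plain ASCII by the time it reaches the query.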

John Kane
12/7/2003 8:00:55 AM
itamar,
Considering that you are using nvarchar and that the Unicode characters
coming from your ASP forms are being stored as numeric character codes
(&#num;), I'd recommend that you look at your ASP forms and how they send the
Unicode text to SQL Server for storage in the nvarchar column, as this is
where your problem resides. Without an actual Russian Unicode text
file, saved as a .txt file with "Encoding: Unicode" and attached to
your reply, it is difficult to provide more help.

Regards,
John
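
One way to act on this advice, sketched in Python rather than ASP and under the unconfirmed assumption that the form values arrive as numeric character references: decode them back to real Unicode characters before they are written to the nvarchar column. decode_form_value is a hypothetical helper; html.unescape is from the Python standard library.

```python
import html


def decode_form_value(raw: str) -> str:
    """Turn character references such as &#1079; back into the
    characters they stand for. Note that html.unescape also decodes
    named references such as &amp;."""
    return html.unescape(raw)


# A value shaped like the ones described in the thread.
decoded = decode_form_value("&#1079;&#1085;&#1072;&#1082;")
print(decoded)  # знак
```

The more robust fix is probably to stop the conversion at the source, for example by having the form page and the ASP code agree on a Unicode encoding, so that no decoding step is needed at all.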


