Does Full Text search function support Japanese characters


Does Full Text search function support Japanese characters traceycui332 NO[at]SPAM hotmail.com
7/23/2003 4:48:20 PM
Hi there,

I want to search Japanese characters with Full Text search function. I
created a new database on SQL 2000, English Win2k Server. The
database's collation is "Japanese-bin" and contains one table that
contains 4 fields, which are ID -- int, primary key, identity;
PageName -- nvarchar; Contents -- ntext; Keywords -- ntext.
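For reference, the table described above might be declared roughly as follows. This is only a sketch: the nvarchar length (255), the nullability, and the constraint name PK_tbSearch are assumptions that do not appear in the post; the columns inherit the database's collation.

```sql
-- Sketch of the table from the description above (SQL Server 2000 syntax).
-- Assumed, not from the post: nvarchar(255), NULL-ability, and the name PK_tbSearch.
CREATE TABLE tbSearch (
    ID       int IDENTITY(1,1) NOT NULL
             CONSTRAINT PK_tbSearch PRIMARY KEY,  -- single-column unique key, required for full-text indexing
    PageName nvarchar(255) NULL,
    Contents ntext NULL,
    Keywords ntext NULL
)
```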

Defined a Full Text index and Full Text catalog in the Full Text Index
Wizards and selected "Neutral" from "Language for Word Breaker", the
table fields to be indexed were PageName, Contents and Keywords, then
ran a full population.

I executed some SQL statements as follows:
1. insert into tbSearch
VALUES(N'甲南大学理学部応用数学科',N'甲南大学理学部応用数学科',N'甲南大学理学部応用数学科')
(ran a full population)

2. select N'PageName' from tbSearch where
FREETEXT(*,N'甲南大学理学部応用数学科')

3. select N'PageName' from tbproduct where
CONTAINS(*,N'甲南大学理学部応用数学科')

I got "no result" from point 2 and 3.

Please give me some ideas.

Thank you very much!

Re: Does Full Text search function support Japanese characters John Kane
7/23/2003 5:29:06 PM
Tracey,
Why did you select "Neutral" for "Language for Word Breaker" when Japanese
is a valid option?
The "Neutral" word breaker splits text into words at the white space between
them. Japanese is normally written without spaces between words, so this is
not what you want when Japanese is the only language in your FT-indexed columns.

I'd recommend that you drop your FT Catalog, re-create it with "Japanese" for
the Language for Word Breaker, run a Full Population, and then re-test your
query. There may be other issues here as well, such as changes to the
MSSQLServer startup account made from Component Services, but let's get the
language for word breaker set to the correct language first.
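The drop-and-re-create steps above could also be scripted with the SQL Server 2000 full-text stored procedures instead of the wizard. This is a sketch under assumptions: the catalog name ftSearchCatalog and the key index name PK_tbSearch are hypothetical, and 1041 is the Windows locale ID for Japanese, passed as the word breaker language.

```sql
-- Sketch only: ftSearchCatalog and PK_tbSearch are hypothetical names.
-- Run in the database that holds tbSearch, with full-text already enabled.

-- Remove the existing full-text index and catalog.
EXEC sp_fulltext_table   'tbSearch', 'drop'
EXEC sp_fulltext_catalog 'ftSearchCatalog', 'drop'

-- Re-create the catalog and register the table with its unique key index.
EXEC sp_fulltext_catalog 'ftSearchCatalog', 'create'
EXEC sp_fulltext_table   'tbSearch', 'create', 'ftSearchCatalog', 'PK_tbSearch'

-- Add each column with the Japanese word breaker (LCID 1041).
EXEC sp_fulltext_column 'tbSearch', 'PageName', 'add', 1041
EXEC sp_fulltext_column 'tbSearch', 'Contents', 'add', 1041
EXEC sp_fulltext_column 'tbSearch', 'Keywords', 'add', 1041

-- Activate the index and start a full population.
EXEC sp_fulltext_table   'tbSearch', 'activate'
EXEC sp_fulltext_catalog 'ftSearchCatalog', 'start_full'

-- Population runs asynchronously; 0 means idle (finished).
SELECT FULLTEXTCATALOGPROPERTY('ftSearchCatalog', 'PopulateStatus') AS PopulateStatus
```

Queries issued before the population finishes may return no rows, so it is worth checking PopulateStatus before re-testing.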

Regards,
John




Re: Does Full Text search function support Japanese characters traceycui332 NO[at]SPAM hotmail.com
7/27/2003 5:23:02 PM
Hi John,

Thanks for your reply.

I have changed 'Neutral' to 'Japanese' for "Language for Word
Breaker", but it is still not working properly: some results are
not related to the search string. However, if I use 'like' instead of
'FREETEXT', I get all the correct results. The SQL statement is:
select * from tbSearch where PageName like N'甲南大学理学部応用数学科'

The default language is 'en_english' (select @@language); unfortunately,
I cannot change that setting.

Do I need to change any settings to make the Full-text search function
work?

Regards,

Tracey


Re: Does Full Text search function support Japanese characters John Kane
7/28/2003 7:28:09 AM
Tracey,
You should keep in mind that Full-Text Search (FTS) is "word based", whereas
the T-SQL LIKE operator is "string based", so you will often get
different results from the two methods. While I can neither read nor
write Japanese, if you translate these Japanese "words" into English words,
perhaps I could help more. However, because Japanese is a double-byte
language and characters can have more than one meaning, even direct
translation to English may be difficult.

You also say that some "results are not related"; this could be
due to the use of FREETEXT, as it is a more "relaxed" FTS method and will
generally give you more results than CONTAINS. That, together with the fact
that you're using Japanese, can also explain why you're getting different
"word" results. I'd suggest testing FTS with a "simple" Japanese word that is
grammatically easy to translate into English and has only one meaning, to
simplify the issues.
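To compare the three methods on a single simple word, queries along these lines could be run side by side. This is only a sketch: the sample term 大学 ("university") is chosen here purely for illustration, and what the first two queries return depends on how the Japanese word breaker split the indexed text.

```sql
-- Sketch: compare word-based and string-based matching on one sample term.
-- The term N'大学' is an illustrative choice, not taken from the thread's test data.

-- CONTAINS: matches rows where the word breaker produced exactly this term.
SELECT ID, PageName FROM tbSearch
WHERE CONTAINS(PageName, N'"大学"')

-- FREETEXT: breaks the search string into words and matches more loosely,
-- so it usually returns at least as many rows as CONTAINS.
SELECT ID, PageName FROM tbSearch
WHERE FREETEXT(PageName, N'大学')

-- LIKE: plain substring match on the stored string, no word breaking involved.
-- Without the % wildcards, LIKE only matches rows equal to the whole pattern.
SELECT ID, PageName FROM tbSearch
WHERE PageName LIKE N'%大学%'
```

Rows returned by LIKE but not by CONTAINS would suggest that the word breaker did not index that substring as a word of its own.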

While your default language is US_ENGLISH (right?), in SQL 2000 you can
specify a different language at the database and table column level, and you
can use the correct "language for word breaker" for most, but not all, of the
languages supported by SQL Server 2000.
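The settings mentioned above can be inspected separately, which may help confirm that the session language and the full-text column language are independent of each other. A sketch, assuming the table name tbSearch from the original post:

```sql
-- Sketch: inspect the session language, database collation, and full-text column settings.

-- Session language (affects messages and date formats, not the word breaker).
SELECT @@LANGUAGE AS SessionLanguage

-- Default collation of the current database.
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS DatabaseCollation

-- 1 if the column is part of the full-text index.
SELECT COLUMNPROPERTY(OBJECT_ID('tbSearch'), 'PageName', 'IsFullTextIndexed') AS IsFullTextIndexed

-- Lists the full-text indexed columns; the FULLTEXT_LANGUAGE column
-- shows the word breaker language ID (1041 would be Japanese).
EXEC sp_help_fulltext_columns 'tbSearch'
```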

Regards,
John


