Ignored words with Hebrew


Re: Ignored words with Hebrew John Kane
9/20/2003 1:14:35 AM
Tamir,
While SQL Server (7.0 and 2000) supports the Hebrew language, Full-Text
Search (FTS) supports only a smaller subset of the languages that SQL Server
supports, and unfortunately Hebrew is not one of them. See the SQL Server
2000 BOL topic "Column-Level Linguistic Analysis" for more details.

What you can do is drop and re-create your existing FT Catalog using the
Neutral "Language for Word Breaker", then run a Full Population and re-test
your FTS query. Note that with the Neutral word breaker you lose the ability
to use the INFLECTIONAL predicate for your search terms. This is because the
Neutral word breaker "breaks" the words during the FT Indexing/Population
process using the "white space" between the words, so in effect searching
for "rock" will not find "rocker": with the Neutral word breaker, "rock" and
"rocker" are different words.
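
As a rough sketch of the steps above on SQL Server 2000: the catalog name
DocsCatalog, the table dbo.Documents, its unique key index PK_Documents and
the column Body are all hypothetical names used only for illustration, and
the database is assumed to be full-text enabled already. Passing 0 as
@language is meant to select the Neutral word breaker; please check BOL for
your version before relying on it.

```sql
-- Hypothetical names: DocsCatalog, dbo.Documents, PK_Documents, Body.

-- 1. Drop the existing full-text index on the table, then the catalog.
EXEC sp_fulltext_table   @tabname = 'dbo.Documents', @action = 'drop'
EXEC sp_fulltext_catalog @ftcat = 'DocsCatalog', @action = 'drop'

-- 2. Re-create the catalog and register the table with its key index.
EXEC sp_fulltext_catalog @ftcat = 'DocsCatalog', @action = 'create'
EXEC sp_fulltext_table   @tabname = 'dbo.Documents', @action = 'create',
                         @ftcat = 'DocsCatalog', @keyname = 'PK_Documents'

-- 3. Add the column with language 0 (assumed: Neutral word breaker).
EXEC sp_fulltext_column  @tabname = 'dbo.Documents', @colname = 'Body',
                         @action = 'add', @language = 0

-- 4. Activate the index and start a Full Population.
EXEC sp_fulltext_table   @tabname = 'dbo.Documents', @action = 'activate'
EXEC sp_fulltext_catalog @ftcat = 'DocsCatalog', @action = 'start_full'

-- 5. Once population has finished, re-test with a plain term search.
--    With the Neutral word breaker this matches the exact word only.
SELECT * FROM dbo.Documents WHERE CONTAINS(Body, '"rock"')
```

Population runs asynchronously, so the final query may return nothing until
the Full Population has completed.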

For some good news about Hebrew FT searches, you might want to review
http://www.melingo.com/morfix_data.htm#sql and their Data Morfix product as
"Morfix will Plug-In to your SQL Server database, to enable effective
searching of textual elements of the database." You should contact them for
more information. After you test it, let others on this newsgroup know
whether it meets your expectations!

Regards,
John



Ignored words with Hebrew Tamir Kamara
9/20/2003 10:34:52 AM
Hi,

I'm having a weird problem: I'm trying to search the Indexing Service
catalog with Hebrew words via SQL Server (linked to the catalog), but
I'm getting "... only ignored words".
The weird thing is that when I query the catalog directly from an ASP
page (with an OLE connection) it works fine and I get results. I use
"Locale Identifier=1037" in the linked server and in the OLE connection
string, but somehow it doesn't work with SQL Server. Furthermore, the
problem occurs only when the server is logged out; when a user is logged in
to the server there is no problem and the query works fine. This seems to be
related to the regional settings, but I checked everything and the default
language is what it should be: Hebrew.

Any ideas on how to get around this?

Re: Ignored words with Hebrew John Kane
9/23/2003 2:11:19 PM
You're welcome, Tamir,
Hmm, this is the SQL Server Full-Text Search newsgroup
(microsoft.public.sqlserver.fulltext), so I just assumed that you were using
SQL FTS... <G>.
Could you post your OpenQuery statement as well as how you have the
MSIDX linked server defined? Does the error "...only ignored words" occur
ONLY when no user is logged on to the server, that is, the server where IS
is located (is that also the "SQL Server" machine)?

I don't believe that SQL Server supports adding "locale identifier=1037" to
the connection string, and you might consider cross-posting this question to
the IS newsgroup at microsoft.public.inetserver.indexserver

Regards,
John



Re: Ignored words with Hebrew Tamir Kamara
9/23/2003 7:43:49 PM
John,

Thank you for responding, but I don't think you have really understood the
situation:
I'm not talking about full-text search on databases but about using Indexing
Service as a linked server in SQL Server. I'm using the OpenQuery
method.
My problem is that when no user is logged on to the server, IS returns
"...only ignored words". I tried adding "locale identifier=1037" to the
connection string (when adding the linked server), but SQL Server apparently
doesn't pass it through.
And to comment on something you've written: IS supports Hebrew fully (as
far as I've seen).




Re: Ignored words with Hebrew John Kane
9/25/2003 3:02:44 PM
Hi Tamir,
Thanks for the additional info! I've done a bit of research, and while there
are very few postings that reference Hebrew in the IS newsgroup (only 2), a
quick check of the web (via Google :-) for Hebrew and "Indexing Service", as
well as for SQL Server 2000, turns up more references. First of all, you may
be using the wrong value in "Locale Identifier=1037"; I *think* you should
be using 1255, i.e. "Locale Identifier=1255", as the SQL 2000 BOL topic
"Collations" lists 1255 as the code page for Hebrew. Additionally, the MSDN
article "Microsoft Index Server Tips and Tricks" at:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnindex/html/msdn_istips.asp
also has 1255 as the base charset. You may also want to review
"Indexing Service Fails to Find DBCS Characters Update" at
http://www.microsoft.com/windows2000/downloads/recommended/q286221/default.asp
(Google found this hit because "Hebrew" appears in the HTML source code),
since based upon your OpenQuery I suspect that you are searching MS Word
documents that contain Hebrew... If not, let me know and I can research this
further.
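
For anyone comparing the two numbers: as far as Windows is concerned, 1037
(0x040D) is the locale identifier (LCID) for Hebrew, while 1255 is the
Windows code page for Hebrew, so they answer different questions. A small
sketch for checking this from SQL Server 2000, assuming the standard
Hebrew_CI_AS Windows collation is available on the server:

```sql
-- COLLATIONPROPERTY reports the LCID and the code page separately
-- for a given collation name (Hebrew_CI_AS is assumed to exist).
SELECT COLLATIONPROPERTY('Hebrew_CI_AS', 'LCID')     AS HebrewLcid,
       COLLATIONPROPERTY('Hebrew_CI_AS', 'CodePage') AS HebrewCodePage
```

A "Locale Identifier" setting would normally expect the LCID rather than the
code page, but whether the linked server honors it at all is a separate
question.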

Regards,
John



Re: Ignored words with Hebrew Tamir Kamara
9/25/2003 7:50:04 PM
Hi,

I have a server with SQL Server 2000 and IIS installed, which also runs IS.

I use this to add the linked server:
EXEC sp_addlinkedserver
    @server = 'MyLinkedServer',
    @srvproduct = 'Index Server',
    @provider = 'MSIDXS',
    @datasrc = 'Web',
    @provstr = 'Locale Identifier=1037'

And this to query the catalog:
SELECT *
FROM OpenQuery(MyLinkedServer,
'SELECT FileName, Size, DocAuthor
FROM SCOPE('' "D:\" '')
WHERE CONTAINS(''"SomeHebrewWord"'')')

And yes, the error only occurs when no user is logged into the server.




Re: Ignored words with Hebrew Tamir Kamara
9/27/2003 10:39:50 AM
Hi,

I need to try this at work, but I think the correct number is 1037,
because I also tried querying IS directly from the ASP page (using the
OLE connection) and again had a problem with the ignored words, but then I
added "locale...1037" to the connection string and it did the trick.
I'll try it and be in touch.



Re: Ignored words with Hebrew John Kane
10/1/2003 4:30:52 PM
Tamir,
I'm assuming that you're using Windows 2000 Server. Correct? If so, what is
the service pack (SP) level for your Win2K server? If SP3, did you download
and apply this Windows 2000 SP3 hotfix?

http://www.microsoft.com/windows2000/downloads/recommended/q286221/default.asp

If not, you may want to do so as well as consider upgrading to Windows 2000
SP4, as the hotfix should also be included in SP4.
Regards,
John



Re: Ignored words with Hebrew Tamir Kamara
10/1/2003 8:28:57 PM
John,

I've checked the 1255 code, but it doesn't improve the situation: still the
same error. Any ideas?


Re: Ignored words with Hebrew John Kane
10/5/2003 7:18:38 AM
You are correct about SP3 as the hotfix is listed as "Windows 2000 Hotfix
(Pre-SP3)", but then I didn't ask for your full output of SELECT @@version
either... my bad...

My gut reaction is that this is less of a SQL FTS problem and more of either
an Indexing Service problem/issue or something related to the Hebrew
language/collation. You might want to post again to the IS newsgroup
(microsoft.public.inetserver.indexserver) with more details of the exact
problem, as well as a sample doc/file for them to test with IS, since IS and
SQL FTS use the same basic MS Search technology...

Regards,
John



Re: Ignored words with Hebrew Tamir Kamara
10/5/2003 11:13:54 AM
I have SP3 installed, meaning I don't need to install the hotfix separately,
because Windows 2000 Server SP3 isn't listed as an affected system.

