
sql server full text search : PDF Search Engine


shajutc NO[at]SPAM yahoo.co.in
1/26/2004 8:04:23 AM
Hi,

I am building a search engine for PDF documents for a website.
I have both English and Arabic PDF files mixed together.
Here is what I have done so far:

1. Created a table with an image column and a doctype column.
2. Saved all PDF documents in binary format in the image column (Arabic
and English mixed in the same table), with .pdf in the doctype column.
3. Created a full-text index with the word breaker set to neutral and
'doctype' as the document type column.
4. Populated the full-text index.
5. Created an ASP page which has a text box and a submit button. This
ASP page redirects to a search page with the search text as a
querystring; the search page queries the full-text index and returns
the results.
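
For step 5, here is a minimal sketch of how a search page might turn the querystring text into a CONTAINS search condition. It is written in Python rather than ASP, and the table and column names (pdf_docs, id, content) are hypothetical. It also assumes the database driver accepts a "?" parameter marker inside CONTAINS. The idea is that the search text stays a Unicode string end to end and is passed as a parameter rather than pasted into the SQL.

```python
def build_contains_condition(user_text):
    """Turn free text into a CONTAINS search condition such as '"a" AND "b"'."""
    # Drop double quotes so user input cannot break out of the quoted terms.
    words = user_text.replace('"', " ").split()
    if not words:
        raise ValueError("empty search text")
    return " AND ".join('"%s"' % w for w in words)


def build_search_query(user_text):
    """Return (sql, params) for a parameterized full-text query."""
    # Hypothetical table/column names; "?" is the ODBC-style parameter marker.
    sql = "SELECT id FROM pdf_docs WHERE CONTAINS(content, ?)"
    return sql, (build_contains_condition(user_text),)
```

Calling build_search_query with an Arabic phrase produces the same SQL text as with an English one; only the parameter value differs, so nothing in the query itself depends on the code page of the page that built it.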

I found that it works properly for English search words. But no
results are displayed for Arabic words, even when the Arabic words are
present in the PDF document.

I am using Windows 2000 and SQL Server 2000 with SP3.

Can anybody point me to a solution? Am I going about this the
right way?
Please help.

John Kane
1/26/2004 1:24:02 PM
Thomas,
While SQL Server (both 7.0 and 2000) fully supports the Arabic language,
Full-Text Search supports only a subset of the languages SQL Server supports,
and Arabic is not one of them. However, you can use the
"Neutral - Language for Word Breaker" when you set up the FT Catalog via the
FT Indexing Wizard under SQL Server 2000. In addition to the BOL
documentation, there is now on MSDN - "Arabic Language Support in Microsoft
SQL Server 2000" at:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_arabicsupport.asp
that might also be helpful to you. It is also recommended that you not mix
the languages and store only one language per column and then specify the
correct language for that column's "Language for Word Breaker".
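
One way to follow the one-language-per-column advice is to decide at upload time which column a document goes into. Here is a rough sketch in Python, assuming you have some plain text for each document (a title, say) to inspect. The column names content_ar and content_en are made up, and the check only looks for characters in the basic Arabic Unicode block, so treat it as a starting point rather than a real language detector.

```python
def has_arabic(text):
    """True if any character falls in the basic Arabic block (U+0600 to U+06FF)."""
    return any("\u0600" <= ch <= "\u06ff" for ch in text)


def target_column(sample_text):
    """Pick the (hypothetical) image column a document should be stored in."""
    # content_ar would be indexed with the Neutral word breaker,
    # content_en with the English one.
    return "content_ar" if has_arabic(sample_text) else "content_en"
```

A title with any Arabic letters sends the document to content_ar; everything else goes to content_en.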

Regards,
John


