Groups | Blog | Home
all groups > sql server full text search > december 2003 >

sql server full text search : Full-text indexing returns zero row



Sang Nahm
12/4/2003 8:29:43 AM
Hi,

I would like to use SQL Server 2000 Full-text indexing on
various file extension types such as XLS, DOC, HTM and
PDF, stored in a table. I have done the following:

1. Created a table as the following:
CREATE TABLE [dbo].[WPFTImages] (
[WPFTImageId] [int] IDENTITY (1, 1) NOT FOR
REPLICATION NOT NULL ,
[WorkPaperId] [int] NOT NULL ,
[ImageContent] [image] NOT NULL ,
[FileExt] [varchar] (5) COLLATE
SQL_Latin1_General_CP1_CI_AS NOT NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
2. We use the ColdFusion to load the binary documents
into the image column:

<cfquery name="insertBinaryData" datasource="regulatory"
dbtype="odbc">
INSERT INTO WPFTImages (WorkPaperId, FileExt,
ImageContent)

VALUES
(
#getworkpapers.workpaperid#,
'#getworkpapers.fileextension#',
<cfqueryparam value="#binData#"
cfsqltype="cf_sql_blob">
)
</cfquery>

My sense is that this is effective, because the documents
can be retrieved and reconstructed (e.g., as Word
documents) without error.
3. Used Full-text Index Wizard to create catalog
using dbo.WPFTImages table on ImageContent column using
FileExt column (doc, xls, htm, pdf) as FileType binding
column.
4. Performed full population of the catalog.

When I use the following query with a word that I know for
sure exists in the document stored in ImageContent field,
I get 0 row returned:

select WPFTImageId
from WPFTImages
WHERE Contains (ImageContent, '"bank"')

Please advise. Also, I did not see PDF files listed as
Full-Text index supporting file type in BOL. Is it
supported please?

Much thanks in advance.

Sang Nahm
John Kane
12/4/2003 9:30:49 AM
Sang,
In order for SQL Server 2000 FT Indexing of various file types to work
correctly, you must change the your "file extension" column "FileExt" from
"varchar(5)" to varchar(4) or sysname. You will need to drop the existing FT
Catalog and then alter the table and then re-create the FT Catalog and run a
Full Population.

Secondly, using ColdFusion to load the document files, may or may not be a
factor as it's also important that when the document files are
uploaded/inserted into the IMAGE column, that all internal binary data be
un-altered, so that the MSSearch engine can properly recognize the document
type.

Thirdly, for Adobe's PDF files, you will need to download Adobe's PDF
IFilter (Indexing Filter) from their web site at
http://www.adobe.com/support/techdocs/165c6.htm.

Regards,
John



[quoted text, click to view]

AddThis Social Bookmark Button