Sang,
In order for SQL Server 2000 FT Indexing of various file types to work
correctly, you must change the your "file extension" column "FileExt" from
"varchar(5)" to varchar(4) or sysname. You will need to drop the existing FT
Catalog and then alter the table and then re-create the FT Catalog and run a
Full Population.
Secondly, using ColdFusion to load the document files, may or may not be a
factor as it's also important that when the document files are
uploaded/inserted into the IMAGE column, that all internal binary data be
un-altered, so that the MSSearch engine can properly recognize the document
type.
Thirdly, for Adobe's PDF files, you will need to download Adobe's PDF
IFilter (Indexing Filter) from their web site at
http://www.adobe.com/support/techdocs/165c6.htm. Regards,
John
[quoted text, click to view] "Sang Nahm" <sang.nahm@ots.treas.gov> wrote in message
news:06ef01c3ba83$d283a1f0$a001280a@phx.gbl...
> Hi,
>
> I would like to use SQL Server 2000 Full-text indexing on
> various file extension types such as XLS, DOC, HTM and
> PDF, stored in a table. I have done the following:
>
> 1. Created a table as the following:
> CREATE TABLE [dbo].[WPFTImages] (
> [WPFTImageId] [int] IDENTITY (1, 1) NOT FOR
> REPLICATION NOT NULL ,
> [WorkPaperId] [int] NOT NULL ,
> [ImageContent] [image] NOT NULL ,
> [FileExt] [varchar] (5) COLLATE
> SQL_Latin1_General_CP1_CI_AS NOT NULL
> ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
> GO
> 2. We use the ColdFusion to load the binary documents
> into the image column:
>
> <cfquery name="insertBinaryData" datasource="regulatory"
> dbtype="odbc">
> INSERT INTO WPFTImages (WorkPaperId, FileExt,
> ImageContent)
>
> VALUES
> (
> #getworkpapers.workpaperid#,
> '#getworkpapers.fileextension#',
> <cfqueryparam value="#binData#"
> cfsqltype="cf_sql_blob">
> )
> </cfquery>
>
> My sense is that this is effective, because the documents
> can be retrieved and reconstructed (e.g., as Word
> documents) without error.
> 3. Used Full-text Index Wizard to create catalog
> using dbo.WPFTImages table on ImageContent column using
> FileExt column (doc, xls, htm, pdf) as FileType binding
> column.
> 4. Performed full population of the catalog.
>
> When I use the following query with a word that I know for
> sure exists in the document stored in ImageContent field,
> I get 0 row returned:
>
> select WPFTImageId
> from WPFTImages
> WHERE Contains (ImageContent, '"bank"')
>
> Please advise. Also, I did not see PDF files listed as
> Full-Text index supporting file type in BOL. Is it
> supported please?
>
> Much thanks in advance.
>
> Sang Nahm
>