Groups | Blog | Home
all groups > sql server programming > august 2004 >

sql server programming : storing and searching office docs in SQL


John Kane
8/18/2004 7:23:29 PM
aussie rules,
Vishal is incorrect.

You CAN both store and search (as in Full-text) the contents of the MS Word
files stored in an SQL Table's FT-enable IMAGE column, IF you are using SQL
Sever 2000 and add a "file extension" column!!! Search SQL Server 2000 Books
online (BOL) for title "Filtering Supported File Types" and
sp_fulltext_column. Specifically:

[@type_colname =] 'type_column_name'

Is the name of a column in qualified_table_name that holds the document type
of column_name. This column must be char, nchar, varchar, or nvarchar. It is
only used when the data type of column_name is an image. type_column_name is
sysname, with no default.

Additionally, below is a SQL script example of using TextCopy.exe to import
MS Word documents into SQL Sever table and IMAGE column and then using SQL
FTS CONTAINS or FREETEXT to search the contents of that MS word document:

use pubs
go
if exists (select * from sysobjects where id = object_id('FTSTable'))
drop table FTSTable
go
CREATE TABLE FTSTable (
KeyCol int IDENTITY (1,1) NOT NULL
CONSTRAINT FTSTable_IDX PRIMARY KEY CLUSTERED,
TextCol text NULL,
ImageCol image NULL,
ExtCol char(3) NULL, -- can be either sysname or char(3)
TimeStampCol timestamp NULL
) ON [PRIMARY]
go
-- Insert data... (Note: Initalizing IMAGE column with 0xFFFFFFFF for use
with TextCopy.exe)
INSERT FTSTable values('Test TEXT Data for row 1', 0xFFFFFFFF, 'doc', NULL)
go
-- Select data
SELECT * from FTSTable
go
declare @query varchar(200)
-- Insert MS_Word_document.doc into Row 5 !!
-- NOTE: Ensure the correct path for textcopy.exe!!
set @query = 'D:\MSSQL80\MSSQL$SQL80\Binn\textcopy /s '+@@servername+' /u sa
/p<password> /d pubs /t FTSTable /c ImageCol /f
D:\SQLFiles\Shiloh\<MS_Word>.doc /i /k 5000 /w "where KeyCol=1"'
print @query
exec master..xp_cmdshell @query
go
-- Select data
SELECT * from FTSTable
go
-- FTI
use pubs
go
exec sp_fulltext_database 'enable' -- only do this once!
go
-- Drop FTI, if necessary...
-- exec sp_fulltext_table 'FTSTable','drop'
-- exec sp_fulltext_Catalog 'FTSCatalog','drop'

exec sp_fulltext_catalog 'FTSCatalog','create'
exec sp_fulltext_table 'FTSTable','create','FTSCatalog','FTSTable_IDX'
exec sp_fulltext_column 'FTSTable','ImageCol','add', 0x0409, 'ExtCol'
exec sp_fulltext_column 'FTSTable','TextCol','add'
exec sp_fulltext_table 'FTSTable', 'activate'
go
-- Start FT Indexing...
exec sp_fulltext_catalog 'FTSCatalog','start_full'
go
-- Wait for FT Indexing to complete and check NT/Win2K Application log for
success/errors..

-- Search for search_word_here in MS_Word
select KeyCol, ImageCol from FTSTable where
contains(*,'<search_word_in_MS_Word_here>') order by KeyCol
go
-- Search for search_word_here in MS_Word file...
select KeyCol, ImageCol from FTSTable where
freetext(*,'<search_word_in_MS_Word_here>') order by KeyCol
go

-- Remove FT Indexes & Catalog & table..
exec sp_fulltext_table 'FTSTable','drop'
exec sp_fulltext_Catalog 'FTSCatalog','drop'
drop table FTSTable

Regards,
John




[quoted text, click to view]

aussie rules
8/18/2004 11:52:18 PM
Hi,

Can you store an office doc, like a word or excel document in a SQL field,
and then somehow search within the contents of the office document for
words.

Is there somewhere i can read up on this ?


Thanks




Vishal Parkar
8/19/2004 6:33:33 AM
[quoted text, click to view]
field,<<

you can store word documents using image datatypes in SQL Server, internally
sql server stores this data into binary format.

[quoted text, click to view]
words.<<

Since content of the image datatype are binary in nature, you can not
straightaway search for the content of word document once you store them
image datatype.

all you can do is ,store documents in one column (image) and have another
column (text/varchar according to the size of the contents of the document)
column which will store the content of the word document (in pure text,
which will not have any kind of word fomatting). You can implement search
functionality in the document by running standard sql statement against this
column.

--
Vishal Parkar
vgparkar@yahoo.co.in | vgparkar@hotmail.com

Vishal Parkar
8/20/2004 12:28:53 AM
I haven't worked much with full-text indexing. Thanks for those inputs.

--
Vishal Parkar
vgparkar@yahoo.co.in | vgparkar@hotmail.com

AddThis Social Bookmark Button