Binder,
Q. What is the relationship between FTS and Indexing Service?
A. While they use the same underlying Microsoft Search technology, they
full-text index different sources. Indexing Service handles the files on the
server's local disk drives, while FTS (or really the "Microsoft Search"
service [mssearch.exe]) full-text indexes textual (char, nvarchar, text, etc.)
columns in SQL Server tables. Yes, it seems to me that using the Indexing
Service should work for you.
What is the name of your app? Does it support SQL Server 2000? If so, does
it support the storage of MS Word documents in columns that are defined with
the IMAGE datatype? Is the feature that is titled "Full-text Querying of
File Data", a feature of your app, or are you referring to the feature of
SQL Server (and if so, which version)?
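If your app can store the documents in SQL Server 2000, the IMAGE column
route could be set up roughly as shown here. This is only a sketch: the table
WordDocs, its columns, the key name PK_WordDocs and the catalog name
OcrCatalog are made-up names for illustration. It also assumes that full-text
indexing is already enabled in the database and that the catalog already
exists, as it would if your OCR table is already full-text indexed.

```sql
-- Hypothetical table: document bytes in an IMAGE column, file extension
-- in a separate type column that tells the filter how to read the bytes.
CREATE TABLE WordDocs
(
   DocID   int     NOT NULL CONSTRAINT PK_WordDocs PRIMARY KEY,
   DocType char(3) NOT NULL,   -- 'doc' for MS Word documents
   DocBody image   NULL
)
GO
-- Register the table in the existing (assumed) catalog, using its unique key.
EXEC sp_fulltext_table 'WordDocs', 'create', 'OcrCatalog', 'PK_WordDocs'
-- Add the IMAGE column and name the column that holds the file extension.
EXEC sp_fulltext_column 'WordDocs', 'DocBody', 'add', @type_colname = 'DocType'
EXEC sp_fulltext_table 'WordDocs', 'activate'
-- Run a Full Population of the catalog.
EXEC sp_fulltext_catalog 'OcrCatalog', 'start_full'
GO
-- Returns rows only after the population has finished.
SELECT DocID
FROM WordDocs
WHERE CONTAINS(DocBody, 'Distributed')
GO
```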
In addition to SQL Server's Full-text Search (FTS) component, you can also
define a "Linked Server" to the Indexing Service using MSIDXS, the "OLE
DB Provider for Microsoft Indexing Service". You would define this linked
server via sp_addlinkedserver. Below is an example from SQL Server 2000
Books Online:
G. Use the Microsoft OLE DB Provider for Indexing Service
This example creates a linked server and uses OPENQUERY to retrieve
information from both the linked server and the file system enabled for
Indexing Service.
EXEC sp_addlinkedserver FileSystem,
   'Index Server',
   'MSIDXS',
   'Web'
GO
USE pubs
GO
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
      WHERE TABLE_NAME = 'yEmployees')
   DROP TABLE yEmployees
GO
CREATE TABLE yEmployees
(
   id       int         NOT NULL,
   lname    varchar(30) NOT NULL,
   fname    varchar(30) NOT NULL,
   salary   money,
   hiredate datetime
)
GO
INSERT yEmployees VALUES
(
   10,
   'Fuller',
   'Andrew',
   $60000,
   '9/12/98'
)
GO
IF EXISTS(SELECT TABLE_NAME FROM INFORMATION_SCHEMA.VIEWS
      WHERE TABLE_NAME = 'DistribFiles')
   DROP VIEW DistribFiles
GO
CREATE VIEW DistribFiles
AS
SELECT *
FROM OPENQUERY(FileSystem,
   'SELECT Directory,
      FileName,
      DocAuthor,
      Size,
      Create,
      Write
   FROM SCOPE('' "c:\My Documents" '')
   WHERE CONTAINS(''Distributed'') > 0
      AND FileName LIKE ''%.doc%'' ')
WHERE DATEPART(yy, Write) = 1998
GO
SELECT *
FROM DistribFiles
GO
SELECT Directory,
   FileName,
   DocAuthor,
   hiredate
FROM DistribFiles D, yEmployees E
WHERE D.DocAuthor = E.FName + ' ' + E.LName
GO
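For your scenario (OCR text in a SQL table, WORD docs in the file warehouse),
the two searches might sit side by side roughly like this. Treat it as a
sketch under assumptions: OcrText, PagePath and PageText are hypothetical
names for your OCR table and its columns, "d:\FileWarehouse" is a stand-in
for wherever your .doc files live, and that directory would have to be
included in the Indexing Service catalog ('Web' above) behind the FileSystem
linked server.

```sql
-- OCR'ed text: SQL Server full-text search on the (hypothetical) OCR table.
SELECT PagePath
FROM OcrText
WHERE CONTAINS(PageText, 'Distributed')
GO
-- WORD docs: Indexing Service query through the FileSystem linked server.
SELECT Directory,
   FileName
FROM OPENQUERY(FileSystem,
   'SELECT Directory,
      FileName
   FROM SCOPE('' "d:\FileWarehouse" '')
   WHERE CONTAINS(''Distributed'') > 0
      AND FileName LIKE ''%.doc%'' ')
GO
```

Your app would run one or both queries depending on which kind of document
the user is searching.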
Regards,
John
"Binder" <rgondzur@NO_SPAM_aicsoft.com> wrote in message
news:OcUOqQGWEHA.1012@TK2MSFTNGP09.phx.gbl...
> John,
>
> What is the relationship between FTS and Indexing Service?
> It looks like the Indexing Service maintains a catalog much the same as
> FTS.
>
> We have support for WORD in our app already by storing the WORD doc in our
> file warehouse on the file system.
> We can display the .doc file in our viewer the same as a .tif image.
> We currently don't have functionality to search for data in the WORD docs,
> only text from the OCR process.
> Since the WORD file is already stored in the file system and referenced by
> our application, I was wondering about the feature that is titled
> "Full-text Querying of File Data"
>
> It looks like it uses the Index Service to allow searching for data in
> files on the file system.
> Wouldn't that work for my scenario?
>
> It appears that when we want to search for data contained in a WORD doc,
> we would use the SCOPE function in our query. Otherwise, we continue to
> search for text from the OCR process.
>
> Can you provide some insight?
>
> Thanks
>
>
>
>
>
> "John Kane" <jt-kane@comcast.net> wrote in message
> news:O7TiL6BWEHA.2928@tk2msftngp13.phx.gbl...
> > Binder,
> > What version of SQL Server (2000 or 7.0) and on what OS platform (NT4.0,
> > Win2K, or Win2003) is it installed? Could you post the full output of
> > SELECT @@version -- as this is helpful to answering your question.
> >
> > If you are using SQL Server 2000, you can use its new feature (this
> > feature is not present in SQL 7.0) - from the SQL Server 2000 BOL title
> > "Filtering Supported File Types". This feature allows you to store the
> > binary version of the MS Word document in your table, define a file
> > extension column and populate it with the correct values ("doc" for an MS
> > Word document), and then run a Full Population. You can then use CONTAINS
> > or FREETEXT queries to full-text search the contents of these files
> > stored in a SQL table.
> >
> > If you are using SQL Server 7.0, you will need to set up a process to
> > extract the MS Word text, store this text in a TEXT column, and then
> > full-text index that column, much as you do for your OCR'ed data.
> >
> > Regards,
> > John
> >
> >
> > "Binder" <rgondzur@NO_SPAM_aicsoft.com> wrote in message
> > news:eIXu546VEHA.2716@tk2msftngp13.phx.gbl...
> > > We currently have an application that OCRs a tif image and places the
> > > recognized text in a SQL table.
> > > The table is then indexed by the FTS service.
> > > The app then allows you to search for any of the text and display the
> > > corresponding tif image in a viewer.
> > >
> > > I would also like to be able to search WORD docs for their contents
> > > using the same catalog.
> > >
> > > What is the proper manner to have the WORD docs indexed by the FTS
> > > service?
> > > Do I need to extract the text from the WORD doc and store it in the
> > > table much like the recognized text from the OCR process?
> > >
> > > Thanks
> > >
> > >
> > >
> >
> >
>
>