Question on IFilters


Question on IFilters (ananthapus NO[at]SPAM hotmail.com)
1/18/2005 12:30:14 PM
sql server full text search:
Hi,

Are IFilters COM/CORBA objects that I can call from my code? I'm trying to find out whether I can call them from my Java code to extract text from various document formats, such as PDF and MS Office, before storing the documents in the database. The documents we want to make full-text searchable average 100 MB in size, and I'm looking for ways to cut down their size before storing them in the SQL Server database.

Appreciate your reply,

Anantha

**********************************************************************
Sent via Fuzzy Software @ http://www.fuzzysoftware.com/
Re: Question on IFilters John Kane
1/18/2005 1:14:46 PM
Anantha,
The best information source for this is the MSDN Platform SDK "Using Custom
Filters with Indexing Service" at:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/indexsrv/html/ixufilt_912d.asp

Specifically, click on "Filter Samples" -> HtmlProp Sample:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/indexsrv/html/ixufilt_0lwl.asp

This provides examples of how to "extract value-type properties. It converts
HTML meta properties to data types other than strings as specified by a
configuration file."

Hope that helps!
John
--
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/



Re: Question on IFilters Hilary Cotter
1/18/2005 4:15:27 PM
They are COM objects, so you can call them from code. Here is an example of
how to call them:

http://sqljunkies.com/HowTo/C4AC6E97-8D84-411D-8551-08CE63EC99B6.scuk
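
Since the question was about Java, here is a rough sketch of the "extract text first, then store it" idea. It is not taken from the linked article and it does not call an IFilter itself: Java cannot talk to COM directly, so TextExtractor is a hypothetical interface standing in for whatever bridge (JNI, a COM-to-Java library, or a small helper process) you end up using. The class and method names are made up for illustration. The sketch only shows picking an extractor by file extension and falling back to plain text.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical interface: stands in for whatever bridge (JNI, a COM-to-Java
// library, or a helper process) actually invokes the IFilter COM object.
interface TextExtractor {
    String extract(byte[] document);
}

class TextExtractionSketch {
    // Extractors keyed by lower-case file extension.
    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    // Fallback for unregistered types: treat the bytes as UTF-8 plain text.
    // (UTF-8 is an assumption; use whatever encoding your files really have.)
    private final TextExtractor plainText =
            doc -> new String(doc, StandardCharsets.UTF_8);

    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    // Returns the lower-case extension without the dot, or "" if there is none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Picks the extractor registered for the file's extension, else plain text.
    String extractText(String fileName, byte[] document) {
        TextExtractor extractor =
                byExtension.getOrDefault(extensionOf(fileName), plainText);
        return extractor.extract(document);
    }

    public static void main(String[] args) {
        TextExtractionSketch sketch = new TextExtractionSketch();
        // Placeholder extractor; a real one would hand the bytes to the PDF IFilter.
        sketch.register("pdf", doc -> "text extracted from " + doc.length + " PDF bytes");

        byte[] txt = "plain text body".getBytes(StandardCharsets.UTF_8);
        System.out.println(sketch.extractText("notes.txt", txt));
        System.out.println(sketch.extractText("report.PDF", new byte[1024]));
    }
}
```

The string returned by extractText is what you would store in the database in place of the original binary document.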

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Re: Question on IFilters (ananthapus NO[at]SPAM hotmail.com)
1/19/2005 12:36:41 PM
Thanks John/Hilary for your reply and links.

I have another question, though. If I store only .txt documents in SQL Server (all other documents are converted to .txt before being stored in the database), does the indexing service still use a filter (in this case the standard filter) to extract textual data for indexing?

Anantha


Re: Question on IFilters Hilary Cotter
1/19/2005 5:16:48 PM
Yes, it uses the default or null IFilter.

--
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html

Re: Question on IFilters John Kane
1/19/2005 6:38:27 PM
You're welcome, Anantha,
Yes, it does. However, keep in mind that how you import or insert the text
can be important, as well as where you store the row text. You may want to
consider using TextCopy.exe, which ships with SQL Server 2000.

Thanks,
John
--
SQL Full Text Search Blog
http://spaces.msn.com/members/jtkane/


