"Oleg Tkachenko" <foo@dummy.com> wrote in message
news:uFklS7yGIHA.1208@TK2MSFTNGP05.phx.gbl...
> clintonG wrote:
>> "Oleg Tkachenko [MVP]" <some@body.com> wrote in message
>> news:%23l24%23kUGIHA.1188@TK2MSFTNGP04.phx.gbl...
>>> clintonG wrote:
>>>> Putting the search textbox on the page is the easy part. What's the
>>>> preferred way to find terms in XML files located on the file system?
>>>> Like finding stuff saved in XML files some of the blogs use these days
>>>> to store their blog items? There can be lots and lots and lots of XML
>>>> files on the file system to search.
>>> Naive implementation: parse each file in turn and search in it.
>>> Better one: implement your own indexing engine, use a full-text indexing
>>> engine such as Lucene, or use an XML-enabled database such as SQL Server
>>> 2005 to store your XML documents.
>>>
>>>
>>> --
>>> Oleg Tkachenko [XML MVP, MCPD]
>>>
>>> http://www.tkachenko.com/blog | http://www.XmlLab.Net
>>
>> That's cool, thanks. Let me confirm what you're saying: I can keep the
>> files in the file system, since there are good reasons to do so, but I
>> should save a redundant copy as XML in SQL Server which can be used for
>> search? If that's the way to work it out, that's fine with me.
>
> The best solution depends on the type of search you need, the size and
> type of the XML documents, etc. SQL Server is a great choice, but data
> duplication is bad, as it requires additional effort to keep the XML data
> in sync. I'd check Lucene too. RSS Bandit uses it to search in feeds and
> it seems to be pretty fast.
>
> --
> oleg
hosted service provider. I should make the time to install, configure and
try it locally. I need to search through blogs, XML files used for feeds, and
likely a wiki at some point, so Lucene is a good solution. WebHost4Life is
pretty good at getting free stuff and making it available for customers. I'm
going to bug them about this.
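
For reference, the two approaches Oleg describes above (parsing each file in turn versus a home-grown index) might look roughly like the sketch below. This is only an illustration in Python, not Lucene and not .NET. The function names (extract_text, naive_search, build_index, indexed_search) are made up for this example, and it assumes every *.xml file in the folder is well-formed and small enough to parse in memory.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

WORD = re.compile(r"\w+")


def extract_text(xml_path):
    """Return all text content of one XML file as a single lowercase string."""
    root = ET.parse(xml_path).getroot()
    return " ".join(root.itertext()).lower()


def naive_search(folder, term):
    """Naive approach: re-parse every file on every query (substring match)."""
    term = term.lower()
    return sorted(
        p.name for p in Path(folder).glob("*.xml") if term in extract_text(p)
    )


def build_index(folder):
    """Home-grown index: map each word to the set of files containing it."""
    index = defaultdict(set)
    for p in Path(folder).glob("*.xml"):
        for word in WORD.findall(extract_text(p)):
            index[word].add(p.name)
    return index


def indexed_search(index, term):
    """Whole-word lookup that does not touch the file system at query time."""
    return sorted(index.get(term.lower(), ()))
```

The naive version costs one full parse of every file per query, while the index pays that cost once up front and then has to be rebuilt or updated whenever a file changes. That is the same sync problem mentioned for the SQL Server copy, just on a smaller scale.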