How to search against XML files in the file system?



How to search against XML files in the file system? clintonG
10/24/2007 7:51:21 PM
Putting the search textbox on the page is the easy part. What's the preferred
way to find terms in XML files located on the file system, like the XML files
some blogs use these days to store their posts? There can be a great many XML
files on the file system to search.

--
<%= Clinton Gallagher
NET csgallagher AT metromilwaukee.com
URL http://clintongallagher.metromilwaukee.com/


Re: How to search against XML files in the file system? Oleg Tkachenko [MVP]
10/28/2007 12:00:00 AM

Naive implementation: parse each file in turn and search it.
Better: implement your own indexing engine, use an existing indexing
engine such as Lucene, or use an XML-enabled database such as SQL Server
2005 to store your XML documents.
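
A minimal sketch of the naive approach, shown in Python for brevity (the same shape carries over to .NET). The function name search_xml_files and the "*.xml" file pattern are assumptions made for illustration, not something from this thread. It parses every file on every search, so it is only reasonable for small collections.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def search_xml_files(root_dir, term):
    """Return paths of XML files under root_dir whose text content contains term.

    Hypothetical helper: matching is a case-insensitive substring test on
    element text only; tag names and attribute values are ignored.
    """
    needle = term.lower()
    matches = []
    for path in sorted(Path(root_dir).rglob("*.xml")):
        try:
            tree = ET.parse(path)
        except ET.ParseError:
            continue  # skip malformed files rather than abort the whole search
        # itertext() walks the tree and yields the text of every element
        text = " ".join(tree.getroot().itertext()).lower()
        if needle in text:
            matches.append(path)
    return matches
```

The cost grows with the total size of all files for each query, which is the reason to move to an index once there are many files.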


--
Oleg Tkachenko [XML MVP, MCPD]

Re: How to search against XML files in the file system? clintonG
10/30/2007 12:17:45 PM


That's cool, thanks. Let me confirm what you're saying: I can keep the files
in the file system, where there are good reasons to keep them, but I should
also save a redundant copy as XML in SQL Server to use for search? If
that's the way to work it out, that's fine with me.

<%= Clinton

Re: How to search against XML files in the file system? clintonG
10/30/2007 7:54:11 PM


I'm familiar with Lucene but can't deploy it --yet-- on WebHost4Life, my
hosting provider. I should make the time to install, configure and
try it locally. I need to search through blogs and XML files used for feeds, and
likely a wiki at some point, so Lucene is a good solution. WebHost4Life is
pretty good at getting free stuff and making it available to customers. I'm
going to bug them about this.

Re: How to search against XML files in the file system? Oleg Tkachenko
10/30/2007 9:55:41 PM

The best solution depends on the type of search you need, the size and
type of your XML documents, etc. SQL Server is a great choice, but data
duplication is bad because it takes additional effort to keep the XML data in
sync. I'd check Lucene too. RSS Bandit uses it to search feeds
and it seems to be pretty fast.
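
To illustrate the indexing idea and the sync effort mentioned above, here is a toy inverted index, again in Python. It is not Lucene's API and not how RSS Bandit works; the class XmlIndex and its methods are hypothetical names for this sketch. The index is derived data, so refresh() compares file modification times and re-reads only the files that are new or changed, and drops files that were deleted.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

WORD = re.compile(r"\w+")


class XmlIndex:
    """Toy inverted index (hypothetical): word -> set of files containing it."""

    def __init__(self, root_dir):
        self.root_dir = Path(root_dir)
        self.postings = defaultdict(set)  # word -> {path, ...}
        self.mtimes = {}                  # path -> mtime seen at last indexing

    def refresh(self):
        """Re-index new or changed files and forget deleted ones."""
        current = {p: p.stat().st_mtime_ns for p in self.root_dir.rglob("*.xml")}
        # A file is stale if it disappeared or its modification time changed
        stale = [p for p in self.mtimes if current.get(p) != self.mtimes[p]]
        for path in stale:
            for paths in self.postings.values():
                paths.discard(path)
            del self.mtimes[path]
        for path, mtime in current.items():
            if path in self.mtimes:
                continue  # unchanged since the last refresh
            try:
                root = ET.parse(path).getroot()
            except ET.ParseError:
                continue  # malformed file: leave it out of the index
            for word in WORD.findall(" ".join(root.itertext()).lower()):
                self.postings[word].add(path)
            self.mtimes[path] = mtime

    def search(self, query):
        """Return the set of files that contain every word of the query."""
        words = WORD.findall(query.lower())
        if not words:
            return set()
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result
```

Usage would be to call refresh() on a schedule or before a search, then search("some words"). A query then costs a few set lookups instead of a parse of every file, and the XML files on disk stay the only copy of the content.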

--
oleg