Large XML file and some kind of indexing?


Large XML file and some kind of indexing? Andrew Brook
6/25/2007 3:38:08 PM
Hi Everyone,

I have a very large XML file (~1GB). I would like to essentially
pre-navigate the entire structure using an XmlReader and somehow index the
positions of important elements.

I suppose I'd hoped that I could get the exact file position of an XML
element, store that position against a unique id in a hashtable, and then
quickly seek back to that position later to get the data out.
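
A rough sketch of that idea, written in Python rather than .NET purely for illustration. It assumes the elements of interest carry a unique id attribute; the names build_offset_index, element_name and id_attribute are hypothetical. It relies on the expat parser's CurrentByteIndex, which during a callback reports the byte offset at which the current event starts, and it feeds the file in chunks so the document is never held in memory.

```python
import io
import xml.parsers.expat


def build_offset_index(stream, element_name, id_attribute, chunk_size=64 * 1024):
    """Map each matching element's id to the byte offset of its start tag.

    Hypothetical helper: `stream` must be opened in binary mode.
    """
    index = {}
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        if name == element_name and id_attribute in attrs:
            # Inside a callback, CurrentByteIndex is the offset of the
            # first byte of the event, i.e. the '<' of this start tag.
            index[attrs[id_attribute]] = parser.CurrentByteIndex

    parser.StartElementHandler = on_start

    # Feed the parser chunk by chunk; offsets stay relative to the
    # start of the whole input, not to the current chunk.
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)
    parser.Parse(b"", True)
    return index


if __name__ == "__main__":
    data = b'<root><item id="a">one</item><item id="b">two</item></root>'
    stream = io.BytesIO(data)
    index = build_offset_index(stream, "item", "id")
    stream.seek(index["b"])   # jump straight to the second item
    print(stream.read(13))
```

Only the start offset is stored here. Reading the element back out still needs a parser started at that offset, or an end offset stored alongside it.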

The only way I can think of to achieve this is to implement my own
StreamReader-style class that allows seeking and also returns accurate
position information (accounting for buffering etc).

I've seen documentation about special XPath Indexed Navigators but these
only work when the XML is in memory and that's definitely something I need
to avoid.

I don't suppose anyone has come across a problem like this, or perhaps has
some ideas about solving it?

Thanks,
Andrew

Re: Large XML file and some kind of indexing? Angel_J._Hernández_M.
6/29/2007 11:52:57 AM
Hi there,

What about inserting that file into an XML column in a SQL Server 2005
table? SQL Server 2005 supports XML indexes.

Best regards,


--
Angel J. Hernández M.
MCP,MCAD,MCSD,MCDBA,MCT
Microsoft MVP
http://msmvps.com/blogs/angelhernandez
angeljesus14@hotmail.com



Re: Large XML file and some kind of indexing? Andrew Brook
7/2/2007 12:00:00 AM
Thanks, I'll look into that.

At the moment I'm implementing a text reader that buffers like a
StreamReader but still exposes the correct position in the base stream. I'm
not sure where I'll head after that, but I may use it as the basis for an
XmlTextReader.
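
A minimal sketch of what such a reader might look like, again in Python for illustration only and not the implementation described above. The class name PositionTrackingReader and its members are hypothetical. It assumes UTF-8 input without a byte order mark, because it works out the byte position by re-encoding the characters it hands out.

```python
import codecs
import io


class PositionTrackingReader:
    """Buffered character reader that reports the byte offset of the
    next unread character in the underlying stream.

    Assumption: the input is UTF-8 without a BOM, so re-encoding the
    consumed characters yields exactly the bytes that were read.
    """

    def __init__(self, raw, encoding="utf-8", block_size=4096):
        self._raw = raw
        self._encoding = encoding
        self._decoder = codecs.getincrementaldecoder(encoding)()
        self._block_size = block_size
        self._chars = ""     # decoded but not yet consumed characters
        self._position = 0   # byte offset of the first char in _chars

    @property
    def position(self):
        return self._position

    def read(self, count):
        # Fill the character buffer in blocks; the incremental decoder
        # holds back any multi-byte character split across two blocks.
        while len(self._chars) < count:
            block = self._raw.read(self._block_size)
            if not block:
                self._chars += self._decoder.decode(b"", final=True)
                break
            self._chars += self._decoder.decode(block)
        taken, self._chars = self._chars[:count], self._chars[count:]
        # Advance by the encoded size of what was handed out, not by
        # how far the buffer has read ahead in the base stream.
        self._position += len(taken.encode(self._encoding))
        return taken


if __name__ == "__main__":
    reader = PositionTrackingReader(io.BytesIO(b"<a>text</a>"), block_size=4)
    reader.read(3)
    print(reader.position)
```

The point of the design is that the reported position tracks the characters consumed, while the base stream itself may already be several blocks further on.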

Andrew


Re: Large XML file and some kind of indexing? Andrew Brook
7/2/2007 12:00:00 AM
Just as an update, I'm giving up on the custom indexing approach. The best
index I managed to produce was a fifth of the size of my original data (a
10 MB test file). Unfortunately that's still far too big, especially if my
XML is 1GB. I'll perhaps come back to this when I get time. :)

Andrew

