
dotnet xml : Modifying an XML of more than 10 GB(REALTIME) !!!!


Ross Presser
4/20/2005 2:02:01 PM
[added some newsgroups, as this seems to be a more general idea than
just XML files]

In reply to Harpreet Matharu via DotNetMonster.com
(Wed, 20 Apr 2005 16:28:23 GMT):

Where do you expect the output to be, if you're not creating a new file?
You generally can't rewrite a file in place...

Well, maybe you can. Theoretical idea follows, no code written.

We need to design a ForwardOnlyRewritingStream class that inherits from
Stream (or maybe FileStream). It keeps track of what's been read and
what's been written, pretending in effect to be two separate streams, one a
forward-only read stream, the other a forward-only write stream. Attach
your XmlTextReader to its read stream and your XmlTextWriter to its write
stream.

Internally, the ReadStream will read blocks of the file into memory as
needed, and WriteStream will write blocks back to the same file. If
WriteStream is about to write to a block that ReadStream hasn't read in
yet, it forces ReadStream to read it in. That way the WriteStream can get a
little ahead of the ReadStream.

You could just let the ordinary memory management take care of it, letting
your not-yet-consumed ReadStream blocks swap to disk like any other unused
objects. Or, if you expect your file to grow substantially, say from 10GB
to 11GB, you could write your own swap management: writing blocks out to
disk, then reading them back in and deleting them as ReadStream uses them up.
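
A rough sketch of the bookkeeping described above, in Python rather than C#
purely to keep it short. All names here (ForwardOnlyRewriter,
expand_in_place, block_size) are made up for illustration, and this version
simply keeps the not-yet-consumed blocks in memory, i.e. the "let ordinary
memory management handle it" option. Treat it as an illustration of the
idea, not a tested design for 10GB files.

```python
import os
from collections import deque


class ForwardOnlyRewriter:
    """Hypothetical sketch: one file, a forward-only read cursor and a
    forward-only write cursor that never overwrites unread bytes."""

    def __init__(self, path, block_size=4096):
        self._f = open(path, "r+b")
        self._block_size = block_size
        self._size = os.path.getsize(path)  # original length; never read past it
        self._read_pos = 0                  # next offset to fetch from disk
        self._write_pos = 0                 # next offset to write to
        self._pending = deque()             # fetched but not yet consumed

    def _fetch_block(self):
        """Pull the next unread block into memory; return False at EOF."""
        if self._read_pos >= self._size:
            return False
        self._f.seek(self._read_pos)
        block = self._f.read(min(self._block_size, self._size - self._read_pos))
        self._read_pos += len(block)
        self._pending.append(block)
        return True

    def read(self, n):
        """Return up to n bytes of the original content, in order."""
        out = bytearray()
        while len(out) < n:
            if not self._pending and not self._fetch_block():
                break
            block = self._pending.popleft()
            take = n - len(out)
            out += block[:take]
            if len(block) > take:
                self._pending.appendleft(block[take:])
        return bytes(out)

    def write(self, data):
        """Write data at the write cursor, first rescuing any unread
        bytes that the write would otherwise clobber."""
        while (self._write_pos + len(data) > self._read_pos
               and self._fetch_block()):
            pass
        self._f.seek(self._write_pos)
        self._f.write(data)
        self._write_pos += len(data)

    def close(self):
        # Drop any stale tail left over if the output is shorter.
        self._f.truncate(self._write_pos)
        self._f.close()


def expand_in_place(path, old, new, chunk=8, block_size=16):
    """Replace the single byte `old` with `new` (any length) in place.
    `old` must be one byte so a match can never straddle two chunks."""
    rw = ForwardOnlyRewriter(path, block_size)
    while True:
        data = rw.read(chunk)
        if not data:
            break
        rw.write(data.replace(old, new))
    rw.close()
```

Note the obvious risk: if the process dies halfway through, the file is
left half old and half new, with the in-memory blocks lost. A real
implementation would need some answer to that.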

Searching on the web to see if this has ever been done before, the closest
I came across was the Perl module Tie::File, written by one of my favorite
people (MJD):

http://perl.plover.com/TieFile/

So now you could just use Perl to manipulate your file. :) But I think this
idea has merit.
Harpreet Matharu via DotNetMonster.com
4/20/2005 4:28:23 PM
Hello,

I'm working on a very large XML file in C#. I need to modify or insert
certain data on specified tags in the file, without creating a new file!
How can this be done?
Oleg Tkachenko [MVP]
4/21/2005 12:00:00 AM

Yes, it's quite possible. There is a technique of chaining an
XmlReader to an XmlWriter, which lets you perform XML modifications in a
streaming way. Take a look at
http://blogs.msdn.com/mfussell/archive/2005/02/12/371546.aspx for some
samples.
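
To give a feel for the streaming shape of this, here is a small analogue in
Python using xml.sax, not the .NET XmlReader/XmlWriter code from the linked
post. A filter sits between the parser and a serializer, passes every event
through, and swaps the text of one element. The names TagTextReplacer and
replace_tag_text are invented for this sketch, and it assumes the target
element contains only text.

```python
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class TagTextReplacer(XMLFilterBase):
    """Hypothetical filter: forwards all SAX events unchanged, except
    that the text inside elements named `tag` becomes `new_text`."""

    def __init__(self, parent, tag, new_text):
        super().__init__(parent)
        self._tag = tag
        self._new_text = new_text
        self._inside = False

    def startElement(self, name, attrs):
        super().startElement(name, attrs)
        if name == self._tag:
            self._inside = True
            # Emit the replacement text right after the start tag.
            super().characters(self._new_text)

    def characters(self, content):
        # Swallow the original text while inside the target element.
        if not self._inside:
            super().characters(content)

    def endElement(self, name):
        if name == self._tag:
            self._inside = False
        super().endElement(name)


def replace_tag_text(src, dst, tag, new_text):
    """Stream XML from src (file-like or filename) to the text stream
    dst, never holding the whole document in memory."""
    reader = TagTextReplacer(xml.sax.make_parser(), tag, new_text)
    reader.setContentHandler(XMLGenerator(dst, encoding="utf-8"))
    reader.parse(src)
```

As with the .NET version, the output goes to a separate stream, so memory
use stays flat regardless of file size, but a second file is still needed.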

--
Oleg Tkachenko [XML MVP, MCP]
Ross Presser
4/21/2005 10:12:19 AM

Looked at that link -- the trouble is that the XmlWriter gets attached to a
new stream, not the original file. It seemed to me that the original poster
wanted to rewrite the file in-place, without needing to allocate another