Marc Jennings wrote:
> Hi there,
>
> I have a 600MB xml file that I am trying to pull a small amount of
> data from, using an XMLTextReader in C#.
>
> All works well, until I get an exception thrown in line 4,277,905
> because of an illegal character for the encoding type.
> "There is an invalid character in the given encoding. Line 4277905,
> position 26."
Not a very useful piece of software if it doesn't actually say what
the character is...
> Now, this file is obviously fairly large - too large for a text editor
No, Emacs can easily handle a file this big (assuming you have some
sensible amount of memory). Otherwise use standard text utilities, e.g.
$ head -4277905 myfile.xml | tail -1
These are available for Windows systems if you install Cygwin.
Some people dislike using console utilities, but they should be in the
toolbag of any heavy XML user for the occasions when other methods fail.
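To see the actual bytes on that line rather than just the text, here is a
minimal sketch that wraps the same idea in a shell function. The name
show_line is made up for illustration. It assumes a POSIX sed and od, and
note that the "position 26" in the error message counts characters, which
may not be the same as byte offset 26 in the dump.

```shell
# show_line FILE LINE: dump the bytes of one line of FILE.
# Hypothetical helper for illustration; not part of any standard tool.
show_line() {
    file=$1
    line=$2
    # Print only the requested line, then quit so sed does not scan
    # the remainder of a very large file. LC_ALL=C makes both tools
    # treat the input as raw bytes instead of decoding it.
    LC_ALL=C sed -n "${line}{p;q;}" "$file" | LC_ALL=C od -c
}
```

Something like show_line myfile.xml 4277905 should then list each byte of
the line, with non-printable bytes shown as three-digit octal values.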
[...]
> 2) Is there another way to ignore errors in an element? I know that
> the error at the line mentioned above is not within data that I am
> trying to extract on this run through, so I can safely ignore it.
Not easily. XML processors usually work on the basis that you process the
whole document. But it's often possible to use a stream utility as a
non-XML filter to extract the well-formed subset which contains the data
you are interested in.
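A rough sketch of such a filter, under some strong assumptions about the
data: every start and end tag of the element you want sits on its own
line, and those elements do not nest. The function name extract_elements
and the wrapper element "subset" are invented for this example. It treats
the file purely as lines of text, so it never trips over bad characters
elsewhere in the document.

```shell
# extract_elements FILE TAG: copy every <TAG>...</TAG> block out of FILE
# and wrap the result in a new root element so it can be parsed alone.
# Assumes each start and end tag is on its own line and that TAG
# elements do not nest; these are assumptions about the data.
extract_elements() {
    file=$1
    tag=$2
    echo '<subset>'
    # Print each line range from a start tag to the next matching end tag.
    # [ >] stops "item" from also matching a longer name such as "items".
    LC_ALL=C sed -n "/<${tag}[ >]/,/<\/${tag}>/p" "$file"
    echo '</subset>'
}
```

For example, extract_elements myfile.xml item > subset.xml would write a
much smaller file that an XML reader can then process, provided the bad
character is not inside one of the extracted elements.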
///Peter