dotnet xml : Invalid character in XML


v-kevy NO[at]SPAM online.microsoft.com
10/14/2005 9:17:44 AM
Hi Marc,

As far as I know, we can open the XML file as a stream and specify the
encoding when constructing the StreamReader object. We cannot change the
encoding once the reader has been constructed. We can also specify the
encoding in the XmlTextReader constructor by passing an XmlParserContext.
If you need to ignore the exception, you can just catch it and do nothing.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
Marc Jennings
10/14/2005 9:49:04 AM
Hi there,

I have a 600 MB XML file that I am trying to pull a small amount of
data from, using an XmlTextReader in C#.

All works well until an exception is thrown at line 4,277,905
because of an illegal character for the encoding type:
"There is an invalid character in the given encoding. Line 4277905,
position 26."

Now, this file is obviously fairly large - too large for a text editor
- so I was wondering two things:

1) Is there a way to change the encoding type of the XmlTextReader
object? (I had a quick look, but it seems to be read-only.)
2) Is there another way to ignore errors in an element? I know that
the error at the line mentioned above is not within data that I am
trying to extract on this run through, so I can safely ignore it.

TIA
Peter Flynn
10/14/2005 8:17:45 PM
> "There is an invalid character in the given encoding. Line 4277905,
> position 26."

Not a very useful piece of software if it doesn't actually say what
the character is...

> Now, this file is obviously fairly large - too large for a text editor

No, Emacs can easily handle a file this big (assuming you have a
sensible amount of memory). Otherwise use standard text utilities, e.g.
$ head -n 4277905 myfile.xml | tail -n 1
These are available for Windows systems if you install Cygwin.

Some people dislike using console utilities, but they should be in the
toolbag of any heavy XML user for the occasions when other methods fail.
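
As a rough sketch of that approach (the function names show_line and
dump_line are made up for illustration, and od's column layout differs a
little between systems), something like this prints one line of a large
file and then dumps its bytes so the offending character can be identified:

```shell
#!/bin/sh
# show_line FILE LINE: print a single line of FILE without opening it in an editor.
show_line() {
    # LC_ALL=C makes sed treat the input as raw bytes, so an invalid
    # character cannot upset it; 'q' stops reading once the line is printed.
    LC_ALL=C sed -n "${2}{p;q;}" "$1"
}

# dump_line FILE LINE: the same line as hex bytes with decimal offsets,
# so non-printable or mis-encoded bytes become visible.
dump_line() {
    show_line "$1" "$2" | od -A d -t x1
}
```

Calling dump_line myfile.xml 4277905 would then show the bytes of the line
named in the error message. Note that the reported position 26 is presumably
counted in characters, so the byte offset may be a little higher if multibyte
characters come before it.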

[...]
> 2) Is there another way to ignore errors in an element?

Not easily. XML processors usually work on the basis that you process the
whole document. But it's often possible to use a stream utility as a
non-XML filter to extract the well-formed subset which contains the data
you are interested in.
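
A minimal sketch of such a filter, assuming purely for illustration that
the wanted data sits in <item> elements whose start and end tags are each
on a line of their own and do not nest (the element names item and items
and the function name extract_items are hypothetical, not taken from the
file in question):

```shell
#!/bin/sh
# extract_items FILE: copy only the <item>...</item> blocks of FILE to
# standard output, wrapped in a new root element so the result is a
# single well-formed document. Everything outside those blocks is dropped.
extract_items() {
    printf '<items>\n'
    # LC_ALL=C: treat the input as raw bytes, so a badly encoded
    # character elsewhere in the file does not disturb sed.
    LC_ALL=C sed -n '/<item[ >]/,/<\/item>/p' "$1"
    printf '</items>\n'
}
```

Running extract_items myfile.xml > subset.xml would leave a much smaller
file to feed to the XML reader. The output carries no XML declaration, so
the encoding has to be supplied when reading it if it is not UTF-8, and a
bad character inside one of the wanted elements would of course be copied
along with it.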

///Peter