
dotnet xml : Get 3 chars before <?xml version...


David Thielen
10/7/2005 5:14:01 PM
Hi;

My code is:

XmlDocument doc = new XmlDocument();
doc.AppendChild(doc.CreateXmlDeclaration("1.0", "UTF-8", ""));
....
doc.Save(outStream);

And my saved document has:
0xef 0xbb 0xbf before the <?xml...

What do I have to do to eliminate this? (.net 2.0)

--
David Thielen
10/7/2005 5:50:02 PM
I learn something new every day - I was not aware of this. How long has this
been part of the standard?

--
thanks - dave


Pascal Schmitt
10/8/2005 12:00:00 AM
Hello!

Why do you want to do it?
AFAIK that's the Unicode byte order mark, which every XML parser should
be able to understand.

Maybe a solution would be to switch to another encoding (US-ASCII,...)
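
As an alternative to changing the encoding, the document can probably stay UTF-8 while the 3 leading bytes are suppressed. The following is only a sketch, assuming the document is saved through an XmlWriter rather than directly to the stream; the names SaveWithoutBom and ToUtf8Bytes are made up for illustration and are not from the original post.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

static class SaveWithoutBom
{
    // Hypothetical helper: serializes the document as UTF-8 without the
    // 0xef 0xbb 0xbf signature and returns the resulting bytes.
    public static byte[] ToUtf8Bytes(XmlDocument doc)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        // Passing false asks UTF8Encoding not to emit a byte order mark.
        settings.Encoding = new UTF8Encoding(false);

        using (MemoryStream outStream = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(outStream, settings))
            {
                doc.Save(writer);
            } // disposing the writer flushes it into outStream
            return outStream.ToArray();
        }
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", "UTF-8", null));
        doc.AppendChild(doc.CreateElement("root"));

        byte[] bytes = ToUtf8Bytes(doc);
        // The first byte should now be '<' rather than 0xef.
        Console.WriteLine(bytes[0] == (byte)'<');
    }
}
```

The same settings object should work with any output stream in place of the MemoryStream used here.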


--
Pascal Schmitt
10/8/2005 12:00:00 AM
I don't know...
But it's not directly part of XML - it's part of the Unicode Standard
(and since XML 1.0 is based on Unicode 2.0, it must be older than this...)

The 3 bytes in your document are the byte order mark for UTF-8, which is
optional.

--
Pascal Schmitt
Peter Flynn
10/8/2005 12:14:49 PM
Since the very beginning. The WD-xml-961114 draft says (4.2.3):

"Entities encoded in UCS-2 must begin with the Byte Order Mark
described by ISO 10646 Annex E and Unicode Appendix B (the ZERO
WIDTH NO-BREAK SPACE character, U+FEFF). This is an encoding
signature, not part of either the markup or character data of
the XML document. XML processors must be able to use this
character to differentiate between UTF-8 and UCS-2 encoded
documents." [p.20]
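
To make the "differentiate" part concrete, here is a rough sketch of how a reader might look at the leading bytes of a document. It is a simplification, not how any particular XML processor does it: it ignores UTF-32 and documents without a signature, and the names BomSniffer and Detect are invented for this example.

```csharp
using System;

static class BomSniffer
{
    // Hypothetical helper: names the encoding signature found at the
    // start of the data, if there is one.
    public static string Detect(byte[] data)
    {
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return "UTF-8 signature";
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return "UTF-16 big-endian";
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return "UTF-16 little-endian";
        return "no byte order mark";
    }

    static void Main()
    {
        // The bytes reported in the original question, followed by "<?".
        byte[] saved = { 0xEF, 0xBB, 0xBF, (byte)'<', (byte)'?' };
        Console.WriteLine(Detect(saved));

        // A document that starts directly with "<?".
        byte[] plain = { (byte)'<', (byte)'?' };
        Console.WriteLine(Detect(plain));
    }
}
```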

///Peter
--
Bruno Jouhier
10/9/2005 7:56:23 PM
"Byte Order Mark" makes sense for 16-bit Unicode encodings, where the characters are read and
written as 16-bit quantities and the byte order depends on the
endianness, not for UTF-8, where the data is read and written as a byte
stream.

For UTF-8, this is rather a "signature".
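
A small sketch of that difference, assuming only the standard System.Text encodings: the same character U+FEFF comes out in two byte orders in UTF-16 but as one fixed sequence in UTF-8. The class name and the Encode helper are made up for the example.

```csharp
using System;
using System.Text;

static class SignatureBytes
{
    // Hypothetical helper: encodes U+FEFF with the given encoding and
    // returns the bytes as hyphen-separated hex.
    public static string Encode(Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes("\uFEFF"));
    }

    static void Main()
    {
        // UTF-16: the two bytes swap depending on the byte order.
        Console.WriteLine(Encode(Encoding.BigEndianUnicode)); // FE-FF
        Console.WriteLine(Encode(Encoding.Unicode));          // FF-FE

        // UTF-8: always the same three bytes, so there is no order to mark.
        Console.WriteLine(Encode(Encoding.UTF8));             // EF-BB-BF
    }
}
```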

Bruno
