
dotnet xml : XMLDocument character encoding


Eric Cadwell
5/21/2004 2:10:11 PM
We are encoding strings using XmlElement:

private string XMLEncode(string val)
{
    if (val.Length == 0)
        return string.Empty;
    XmlElement element = xmldoc.CreateElement("E");
    element.InnerText = val;

    return element.InnerXml;
}

The question is: what encoding is being used to translate the string? What is
the default encoding for XmlDocument? Is the resulting string in UTF-8,
Unicode, etc.?
Our server-side components require ISO-8859-1, so I am now trying to convert
from one character set to another like this:

private string XMLEncode(string val)
{
    if (val.Length == 0)
        return string.Empty;
    element = xmldoc.CreateElement("E");
    element.InnerText = val;

    string temp = element.InnerXml;
    byte[] wrong = System.Text.Encoding.Unicode.GetBytes(temp);
    byte[] right = System.Text.Encoding.Convert(Encoding.Unicode,
        Encoding.GetEncoding("ISO-8859-1"), wrong);
    string done = Encoding.GetEncoding("ISO-8859-1").GetString(right);

    return done;
}

Am I correct in assuming that the encoder is Unicode? Is this a time bomb?
It appears to be working correctly - just seems like a hack!

Thanks;
-Eric

Dare Obasanjo [MSFT]
5/21/2004 2:49:13 PM
The encoding of all strings in the .NET Framework is UTF-16. This means that
your XMLEncode method is returning a UTF-16 string. The correct thing to do
in this case would be to save the XmlDocument to a stream and specify the
encoding you want on the stream writer.
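
As a rough illustration of the suggestion above, here is a minimal sketch of saving an XmlDocument through a StreamWriter created with ISO-8859-1. The class name, the sample text, and the use of a MemoryStream are assumptions made for the example and are not taken from the original code. The sample text stays inside ISO-8859-1; characters outside that range would need separate handling, which is not shown here.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class SaveWithEncodingSketch
{
    static void Main()
    {
        // Hypothetical document: the element name "E" mirrors the question,
        // the text is illustrative only.
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("E");
        root.InnerText = "caf\u00E9 < th\u00E9";  // U+00E9 exists in ISO-8859-1
        doc.AppendChild(root);

        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        using (MemoryStream stream = new MemoryStream())
        {
            // The writer's encoding decides how characters become bytes;
            // the in-memory string itself is always UTF-16.
            using (StreamWriter writer = new StreamWriter(stream, latin1))
            {
                doc.Save(writer);
            }

            // ToArray still works after the writer has closed the stream.
            byte[] bytes = stream.ToArray();

            // These bytes are what a server expecting ISO-8859-1 would receive.
            Console.WriteLine(bytes.Length);
            Console.WriteLine(latin1.GetString(bytes));
        }
    }
}
```

The point of the sketch is that the encoding is chosen once, at the moment the text is turned into bytes. Converting a string to ISO-8859-1 bytes and then back into a string, as in the question, just produces another UTF-16 string in memory.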

--
This posting is provided "AS IS" with no warranties, and confers no rights.


Eric Cadwell
5/21/2004 4:08:04 PM
Thanks.
-Eric


