System.Text.Encoder.GetBytes(...) ?


System.Text.Encoder.GetBytes(...) ? Lars-Inge Tønnessen
7/7/2004 1:26:32 AM
In C#:

System.Text.UTF8Encoding utf8 = new System.Text.UTF8Encoding();
System.Text.Encoder encoder = utf8.GetEncoder();
encoder.GetBytes( char[] chars, int charIndex, int charCount, byte[] bytes,
int byteIndex, bool flush )

In J#:
System.Text.UTF8Encoding utf8 = new System.Text.UTF8Encoding();
System.Text.Encoder encoder = utf8.GetEncoder();
encoder.GetBytes( char[] chars, int charIndex, int charCount, ubyte[] bytes,
int byteIndex, boolean flush )

Why ubyte[] in J#?

(I had to implement UTF-8 (RFC 3629) in plain old C++ to make it work with
my Norwegian ØÆÅ. =:o) )
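
For context on the ubyte[] question: J# follows the Java language, where byte is a signed 8-bit type (-128..127), while the .NET System.Byte that GetBytes fills is unsigned (0..255), so J# surfaces it as ubyte. The sketch below is plain Java rather than J#, and the class and helper names are made up for illustration; it only shows why UTF-8 bytes such as 195 do not fit a signed byte without reinterpretation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only (not part of any library).
final class SignedByteSketch {
    // Reinterpret a signed Java byte (-128..127) as an unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // "\u00D8" is the letter Ø; its UTF-8 form is the two bytes 0xC3 0x98.
        byte[] utf8 = "\u00D8".getBytes(StandardCharsets.UTF_8);
        for (byte b : utf8) {
            // Prints "-61 -> 195" and then "-104 -> 152".
            System.out.println(b + " -> " + toUnsigned(b));
        }
    }
}
```

The same bit pattern is stored either way; only the numeric interpretation of the top bit differs.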



Thanks,
Lars-Inge Tønnessen
www.larsinge.com

Re: System.Text.Encoder.GetBytes(...) ? Lars-Inge Tønnessen
7/7/2004 10:09:24 PM
OK, I got it!! =:o)

System.Text.UTF8Encoding does convert it correctly at the bit level.
I have implemented the UTF-8 RFC (in J# too) just to make sure, and we get
the same result.

Source: Ø
00000000 00000000 00000000 11011000 - 216
My J# RFC impl:
00000000 00000000 00000000 11000011 - 195
00000000 00000000 00000000 10011000 - 152
System.Text.UTF8Encoding:
00000000 00000000 00000000 11000011 - 195
00000000 00000000 00000000 10011000 - 152

Source: Æ
00000000 00000000 00000000 11000110 - 198
My J# RFC impl:
00000000 00000000 00000000 11000011 - 195
00000000 00000000 00000000 10000110 - 134
System.Text.UTF8Encoding:
00000000 00000000 00000000 11000011 - 195
00000000 00000000 00000000 10000110 - 134

Source: Å
00000000 00000000 00000000 11000101 - 197
My J# RFC impl:
00000000 00000000 00000000 11000011 - 195
00000000 00000000 00000000 10000101 - 133
System.Text.UTF8Encoding:
00000000 00000000 00000000 11000011 - 195
00000000 00000000 00000000 10000101 - 133
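
The byte pairs above can be reproduced with a few lines of bit arithmetic. What follows is a minimal sketch in plain Java, not the poster's J# implementation; the class and method names are hypothetical. It covers only the two-byte range of RFC 3629 (U+0080 to U+07FF), which is where Ø, Æ and Å fall, and compares the result with Java's built-in UTF-8 encoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch class; handles only the two-byte UTF-8 form.
final class Utf8TwoByteSketch {
    // Encode a code point in U+0080..U+07FF as 110xxxxx 10xxxxxx (RFC 3629).
    static int[] encodeTwoByte(int codePoint) {
        if (codePoint < 0x80 || codePoint > 0x7FF) {
            throw new IllegalArgumentException("outside the two-byte range");
        }
        int lead = 0xC0 | (codePoint >> 6);    // 110 prefix + top 5 bits
        int trail = 0x80 | (codePoint & 0x3F); // 10 prefix + low 6 bits
        return new int[] { lead, trail };
    }

    // Same code point through the platform encoder, as unsigned values.
    static int[] builtIn(int codePoint) {
        byte[] raw = new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF;
        }
        return out;
    }

    public static void main(String[] args) {
        // 216 = Ø, 198 = Æ, 197 = Å
        for (int cp : new int[] { 216, 198, 197 }) {
            System.out.println(cp + ": " + Arrays.toString(encodeTwoByte(cp))
                    + " vs " + Arrays.toString(builtIn(cp)));
        }
    }
}
```

All three letters share the lead byte 195 because their top bits are identical; only the trail byte differs.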


The "problem" was with the last byte of each UTF-8 sequence: I got the same
letter printed for all of ØÆÅ. It looks like every byte with the high bit
(bit 8) set prints as the same letter when shown as text.

But, it's working "under the hood". =:o)
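
One way this symptom can arise is when correct UTF-8 bytes are displayed through a 7-bit charset. This is an assumption, since the post does not say how the bytes were printed. The plain-Java sketch below (hypothetical class and method names) decodes the UTF-8 bytes as US-ASCII, so every byte with the high bit set collapses to the same replacement character even though the underlying bytes differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch class; shows a display problem, not an encoding bug.
final class HighBitDisplaySketch {
    // Encode as UTF-8, then (wrongly) interpret the bytes as 7-bit ASCII.
    static String showAsAscii(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        // Bytes >= 0x80 are not valid ASCII; Java substitutes U+FFFD for each.
        return new String(utf8, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Ø, Æ, Å written as escapes so the source file encoding does not matter.
        for (String s : new String[] { "\u00D8", "\u00C6", "\u00C5" }) {
            String shown = showAsAscii(s);
            System.out.println(shown.length() + " chars, first is U+"
                    + Integer.toHexString(shown.charAt(0)).toUpperCase());
        }
    }
}
```

Decoding the same bytes with the charset they were written in (UTF-8) gives the original letters back.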



Regards,
Lars-Inge Tønnessen
www.larsinge.com
