Thank you for your answer. I know that for this sample the File.Copy() would
"Joerg Jooss" wrote:
> Carlo Marchesoni wrote:
>
> > I really can't manage to copy a simple 'input.txt' with the
> > following content: Jürg (Hex: 4a fc 72 67)
> > to an identical 'output.txt'
> >
> > I do the following (and tried with tons of different encodings):
> > private static void WriteFile() {
> >     StreamWriter sr = File.CreateText("Output.txt");
> >     try
> >     {
> >         using (TextReader tr = new StreamReader(
> >             new FileStream("Input.txt", FileMode.Open), Encoding.ASCII))
> >         {
> >             string iniLine = "";
> >             while ((iniLine = tr.ReadLine()) != null)
> >             {
> >                 if (iniLine.Length > 0)
> >                     sr.WriteLine(iniLine);
> >             }
> >             tr.Close();
> >         }
> >     }
> >     catch
> >     {
> >         sr.Close();
> >     }
> >     sr.Flush();
> >     sr.Close();
> > }
> >
> >
> > But in Output I NEVER have exactly the same Hex values as in Input.
> > Isn't there a way to say "take the same encoding as the input"?
>
> There's no reliable way of identifying a text file's character encoding
> (save for a few exceptions, such as files that start with a byte order
> mark). And regarding your code sample, note that ASCII doesn't include
> umlaut characters. Thus, your StreamReader simply replaces them with
> '?' in this case.
>
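To see the ASCII point in isolation, here is a minimal sketch that decodes the four sample bytes in memory. The class and method names are made up for illustration, and ISO-8859-1 is only an assumption about the real encoding of the file (Windows-1252 maps 0xfc to the same character).

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original code.
public static class AsciiLossDemo
{
    // The four bytes from the question: "Jürg" in an 8-bit Latin encoding.
    public static readonly byte[] Sample = { 0x4a, 0xfc, 0x72, 0x67 };

    // ASCII is a 7-bit encoding; the byte 0xfc is out of range,
    // so the decoder substitutes a question mark for it.
    public static string DecodeAsAscii(byte[] bytes)
    {
        return Encoding.ASCII.GetString(bytes);
    }

    // Assumption: the file is ISO-8859-1, where 0xfc is U+00FC.
    public static string DecodeAsLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
    }
}
```

Calling AsciiLossDemo.DecodeAsAscii(AsciiLossDemo.Sample) should give "J?rg", while DecodeAsLatin1 gives "Jürg". Once the '?' is in the string, no choice of output encoding can bring the original byte back.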
> But the real issue is that File.CreateText() always writes UTF-8,
> whereas your sample text 0x4a 0xfc 0x72 0x67 is in an 8-bit encoding,
> most likely Windows-1252 or ISO-8859-1. Even if you open the source
> file with the correct encoding, the output will always differ at the
> byte level, because UTF-8 encodes umlaut characters differently: 'ü'
> becomes the two bytes 0xc3 0xbc instead of the single byte 0xfc.
>
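If the text really has to go through a reader and a writer (for example to drop the empty lines, as the original loop does), one option is to pass the same Encoding object to both sides. The following is only a sketch: SameEncodingCopy and CopyLines are hypothetical names, and the caller still has to know the encoding of the input, since it cannot be detected from the file.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of the original code.
public static class SameEncodingCopy
{
    // Copies all non-empty lines, decoding and re-encoding
    // with one and the same encoding supplied by the caller.
    public static void CopyLines(string inputPath, string outputPath, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(inputPath, encoding))
        using (StreamWriter writer = new StreamWriter(outputPath, false, encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length > 0)
                    writer.WriteLine(line);
            }
        }
    }
}
```

A call such as SameEncodingCopy.CopyLines("Input.txt", "Output.txt", Encoding.GetEncoding("ISO-8859-1")) should keep 0xfc as 0xfc, assuming the input really is ISO-8859-1. Line endings may still differ from the input, because WriteLine always appends the platform's newline.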
> But why decode and encode anyway? Your code is a simple file copy. If
> that's all you need, File.Copy() or using FileStreams will work just
> fine with all encoding combinations.
>
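For the plain-copy case, a rough sketch of the FileStream variant could look like this. ByteCopy and CopyBytes are illustrative names only; no decoding takes place, so the encoding of the file never matters.

```csharp
using System.IO;

// Hypothetical helper, not part of the original code.
public static class ByteCopy
{
    // Copies the file byte for byte without interpreting it as text.
    public static void CopyBytes(string inputPath, string outputPath)
    {
        using (FileStream input = File.OpenRead(inputPath))
        using (FileStream output = File.Create(outputPath))
        {
            byte[] buffer = new byte[4096];
            int read;
            // Read returns 0 once the end of the file is reached.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
        }
    }
}
```

File.Copy("Input.txt", "Output.txt", true) should give the same result in a single call. Unlike the loop in the question, both variants also keep empty lines.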
> Cheers,
> --
>
> http://www.joergjooss.de
> mailto:news-reply@joergjooss.de