Store in a file a web page written in chinese


Store in a file a web page written in chinese etantonio NO[at]SPAM libero.it
10/25/2004 1:10:45 AM
dotnet internationalization:
Hi,
I want to read an HTML page written in Chinese and store it in a file
with the extension .aspx. I'm not sure where the problem is. I use the
following lines of code:

String sAddress = "http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http://www.etantonio.it/EN/index.aspx";

WebRequest req = WebRequest.Create(sAddress);
WebResponse result = req.GetResponse();
Stream ReceiveStream = result.GetResponseStream();
StreamReader reader = new StreamReader(ReceiveStream, Encoding.UTF8 );
String sHtmlTradotto = reader.ReadToEnd();

StreamWriter writer = new StreamWriter( "prova.aspx" , false,
System.Text.Encoding.UTF8) ;

writer.Write(sHtmlTradotto);
writer.Flush();
writer.Close();

But the file produced doesn't contain the Chinese characters. How
can I solve the problem?

Many thanks in advance ...
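
One thing worth ruling out with code like the above is that the response is not actually UTF-8, since the reader hard-codes Encoding.UTF8. The following is a minimal sketch, not a confirmed fix for this thread: it picks the decoder from a Content-Type header value such as the one exposed by WebResponse.ContentType. The names CharsetHelper and FromContentType are hypothetical, and falling back to UTF-8 when no usable charset is declared is an assumption.

```csharp
using System;
using System.Text;

static class CharsetHelper
{
    // Returns the encoding named by the "charset=" parameter of a
    // Content-Type header value, or UTF-8 (an assumed default) when the
    // parameter is missing or names an encoding this runtime does not know.
    public static Encoding FromContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType))
        {
            return Encoding.UTF8;
        }

        foreach (string part in contentType.Split(';'))
        {
            string p = part.Trim();
            if (p.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
            {
                string name = p.Substring("charset=".Length).Trim().Trim('"');
                try
                {
                    return Encoding.GetEncoding(name);
                }
                catch (ArgumentException)
                {
                    return Encoding.UTF8;
                }
                catch (NotSupportedException)
                {
                    return Encoding.UTF8;
                }
            }
        }

        return Encoding.UTF8;
    }
}
```

With the variables from the post, the reader could then be created as new StreamReader(ReceiveStream, CharsetHelper.FromContentType(result.ContentType)).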

RE: Store in a file a web page written in chinese Nitin
10/29/2004 1:26:02 AM
The file you have might contain the Chinese characters, but you might be seeing
??? instead because of an improper font setting. Select a font for the Chinese
language and then check.
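
A way to check this without depending on fonts is to look at the code points in the saved file. The following is a minimal sketch under stated assumptions: the names CjkCheck, ContainsCjk and FileContainsCjk are hypothetical, and only the CJK Unified Ideographs block (U+4E00 to U+9FFF) is tested, which covers most common Chinese characters but not every one.

```csharp
using System;
using System.IO;
using System.Text;

static class CjkCheck
{
    // True if the text holds at least one character in the
    // CJK Unified Ideographs block (U+4E00..U+9FFF).
    public static bool ContainsCjk(string text)
    {
        foreach (char c in text)
        {
            if (c >= '\u4E00' && c <= '\u9FFF')
            {
                return true;
            }
        }
        return false;
    }

    // Reads the file as UTF-8 and reports whether any such character is present.
    public static bool FileContainsCjk(string path)
    {
        string text = File.ReadAllText(path, Encoding.UTF8);
        return ContainsCjk(text);
    }
}
```

If CjkCheck.FileContainsCjk("prova.aspx") returns false, the characters were lost before or during the write, and the font setting is not the cause.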

Re: Store in a file a web page written in chinese etantonio NO[at]SPAM libero.it
11/2/2004 6:57:51 AM
With the following .aspx page
I try to translate one of my pages from English to Chinese using UTF-8.
The result is that the Chinese characters are not read correctly.
If instead I enter the address
http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=http://www.etantonio.it/en/index.aspx&lp=en_zh
directly into the browser, the page is shown correctly in Chinese. If I
save that page, put it on my site, and use the same script below to
read it and save it, again with UTF-8, the Chinese characters are
saved normally. What do you think the problem is? My goal is
to automatically save, in a file with the extension .aspx, the content of
the page http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=http://www.etantonio.it/en/index.aspx&lp=en_zh

Bye and thanks ....
Antonio D'Ottavio
www.etantonio.it



<%@ Page Language="c#" debug="true" trace="true"%>
<%@ import Namespace="System" %>
<%@ import Namespace="System.IO" %>
<%@ import Namespace="System.Net" %>

<script runat="server">
static string sLanguageSrc = "EN";
static string sLanguageDest = "ZH";
string PathDirectory ;
static FileInfo[] fi ;

void Page_Load(Object Src, EventArgs E )
{
String sAddressEncoded =
HttpUtility.UrlEncode("http://www.etantonio.it/en/index.aspx") ;
String sAddress =
"http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=" +
sAddressEncoded + "&lp=" + sLanguageSrc + "_" + sLanguageDest ;
WebRequest req = WebRequest.Create(sAddress);
WebResponse result = req.GetResponse();
Stream ReceiveStream = result.GetResponseStream();
StreamReader reader = new StreamReader(ReceiveStream, Encoding.UTF8
);
String sHtmlTradotto = reader.ReadToEnd();

String RegStringSymError =
"(?i)<script\\slanguage=\"JavaScript\">\\s*<!--\\s*function\\sSymError\\(\\)\\s*\\{\\s*return\\strue;\\s*\\}\\s*window\\.onerror\\s=\\sSymError;\\s*//-->\\s*</script>";
// \s already matches newlines, so \s* replaces the (\s\n)* and (\s|\n)* groups.
sHtmlTradotto = Regex.Replace(sHtmlTradotto, RegStringSymError,
"");
Trace.Write("sHtmlTradotto", sHtmlTradotto);

StreamWriter writer = new StreamWriter(
Server.MapPath("/Etantonio/EN/ZH_Tradotta.aspx") , false,
System.Text.Encoding.UTF8) ;
writer.Write(sHtmlTradotto);
writer.Flush();
writer.Close();
}


</script>

<html>
<head>
<title>Traduttore Cinese</title>
<meta http-equiv="Content-Type" content="text/html;
charset=iso-8859-1">
<META name="author" content="Antonio DOttavio">
<META name="keywords" content="Motore Ricerca Gif Animate, Animated
Gif, Gif Animate, Gif, Animated, WebMaster, Web, Azioni, Borsa,
Grafici, Criteri, Elettronica, Telecomunicazioni, Informatica,
Università, Economia, Finanza">
<meta name="description" content="Motore Ricerca Gif Animate, Animated
Gif">
<link href="../../Stili.css" rel="stylesheet" type="text/css">
</head>

<body>
</body>
Re: Store in a file a web page written in chinese Sylvain Lafontaine
11/2/2004 12:44:01 PM
Are you trying to display Chinese with the charset iso-8859-1? If you want to
display Chinese, the whole page must be in Unicode, not just the
Chinese part with the Italian part in another encoding.

Replace iso-8859-1 with utf-8 and take a look at the following two articles
(especially the end of the first one). The second one is there in case you
need to know the code page for UTF-8 (65001: Response.CodePage = 65001 or
Session.CodePage = 65001, but Response.CharSet = "UTF-8").

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_dataencoding.asp

http://support.microsoft.com/?kbid=232580

S. L.
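
The point about iso-8859-1 can be illustrated without a web server. The following is a minimal sketch, assuming a runtime where the "iso-8859-1" encoding is available by name; the names EncodingDemo and RoundTrip are hypothetical. It encodes a string to bytes and decodes it again with the same encoding, which is roughly what happens to text that is written out under one charset and read back.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Encodes the text to bytes and decodes it again with the same encoding.
    // Characters the encoding cannot represent do not survive the trip.
    public static string RoundTrip(string text, Encoding enc)
    {
        byte[] bytes = enc.GetBytes(text);
        return enc.GetString(bytes);
    }

    public static void Main()
    {
        string chinese = "\u4E2D\u6587"; // the two characters of "Chinese (language)"
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // UTF-8 can represent every Unicode character, so this prints True.
        Console.WriteLine(RoundTrip(chinese, Encoding.UTF8) == chinese);

        // iso-8859-1 has no Chinese characters, so this prints False.
        Console.WriteLine(RoundTrip(chinese, latin1) == chinese);
    }
}
```

Italian accented letters such as "à" survive in both encodings, which is why a page can look fine in Italian and still lose its Chinese text.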


Re: Store in a file a web page written in chinese etantonio NO[at]SPAM libero.it
11/3/2004 11:58:57 PM
Hi Sylvain,
I did what you suggested. In my page, named TraduttoreCinese, I
changed to utf-8, so now I have "charset=utf-8" and
System.Text.Encoding.UTF8 both for reading the page from the web and
for writing to a file. This is the code:

////////////////////////////////////////////////////////////////////////
<%@ Page Language="c#" debug="true" trace="true"%>
<%@ import Namespace="System" %>
<%@ import Namespace="System.IO" %>
<%@ import Namespace="System.Net" %>

<script runat="server">
static string sLanguageSrc = "EN";
static string sLanguageDest = "ZH";
string PathDirectory ;
static FileInfo[] fi ;

void Page_Load(Object Src, EventArgs E )
{
String sAddressEncoded =
HttpUtility.UrlEncode("http://www.etantonio.it/en/index.aspx") ;
String sAddress =
"http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=" +
sAddressEncoded + "&lp=" + sLanguageSrc + "_" + sLanguageDest ;
WebRequest req = WebRequest.Create(sAddress);
WebResponse result = req.GetResponse();
Stream ReceiveStream = result.GetResponseStream();
StreamReader reader = new StreamReader(ReceiveStream,
Encoding.UTF8 );
String sHtmlTradotto = reader.ReadToEnd();
Trace.Write("sHtmlTradotto", sHtmlTradotto);

StreamWriter writer = new StreamWriter(
Server.MapPath("/Etantonio/EN/ZH_Tradotta.aspx") , false,
System.Text.Encoding.UTF8) ;
writer.Write(sHtmlTradotto);
writer.Flush();
writer.Close();
}


</script>

<html>
<head>
<title>Traduttore Cinese</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
</body>
</html>
///////////////////////////////////////////////////////////////////////////

The result is still not good. The output below shows no
Chinese characters, unlike what I see directly in the browser
at the URL:

http://babelfish.altavista.com/babelfish/trurl_pagecontent?url=http://www.etantonio.it/en/index.aspx&lp=en_zh


This instead is my ugly result:
///////////////////////////////////////////////////////////////////////////
sHtmlTradotto <html><meta http-equiv="content-type"
content="text/html; charset=UTF-8"><base
href="http://www.etantonio.it/en/index.aspx">
<!-- removed --><meta http-equiv="Content-Type" content="text/html ;
CHARSET=UTF-8"><base href="http://www.etantonio.it/EN/index.aspx">
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<head>
<title>Etantonio</title>
<meta name="author" content="Antonio DOttavio">
<meta name="description" content="Etantonio Index">
<link href="Stili.css" rel="stylesheet" type="text/css">
</head>
<body>

<script language=JavaScript src="menu_array.js"
type=text/javascript></script>
<script language=JavaScript src="mmenu.js"
type=text/javascript></script>

<table width="750" height="430" border="0" cellpadding="0"
cellspacing="0" background="/images/EsserSpettatoriNonEstSerioElefante.jpg">
<tr>
<td valign="top">

<table width="90%" border="0" align="center" cellspacing="12">
<tr height="70" valign="top">
<td>&nbsp;</td>
<td width="25%" rowspan="2">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fUniversita%2findex.aspx"
class="testoMedioVerde"></a></p>
<p align="center" class="testoPiccolissimoVerde">, </p>
</td>
<td width="25%" rowspan="2">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fEconomia%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fEconomia%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fEconomia%2findex.aspx"
class="testoMedioVerde"></a> </p>
<p align="center" class="testoPiccolissimoVerde">, , , 1994
</p></td>
<td width="25%">&nbsp;</td>
</tr>
<tr height="140" valign="top">
<td width="25%">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fLavoro%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fLavoro%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fLavoro%2findex.aspx"
class="testoMedioVerde"></a> </p>
<p align="center" class="testoPiccolissimoVerde">, , </p>
</td>
<td width="25%">
<p align="center" ><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fWeb%2fGifAnimate%2findex.aspx"
class="testoMedioVerde"></a> </p>
<p align="center" class="testoPiccolissimoVerde">GIF , </p>
</td>
</tr>
<tr valign="top">
<td width="25%">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fVarie%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fVarie%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fVarie%2findex.aspx"
class="testoMedioVerde"></a> </p>
<p align="center" class="testoPiccolissimoVerde">, , , </p>
</td>
<td width="25%"> <div align="center"></div></td>
<td width="25%"> <div align="center"></div></td>
<td width="25%">
<p align="center"><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fContatti%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fContatti%2findex.aspx"
class="testoMedioVerde"></a><a
href="http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=en_zh&trurl=http%3a%2f%2fwww.etantonio.it%2fEN%2fContatti%2findex.aspx"
class="testoMedioVerde"></a></p>
<p align="center" class="testoPiccolissimoVerde">nel delle </p>
</td>
</tr>
</table>

</td>
</tr>
</table>
<script>InserisciFooter();</script>
<br>

</body>
Re: Store in a file a web page written in chinese Sylvain Lafontaine
11/4/2004 1:03:47 PM
Hi,

I don't have the time to set up a full test on my system right now. However,
I can see this duplicate header:

sHtmlTradotto <html><meta http-equiv="content-type"
content="text/html; charset=UTF-8"><base
href="http://www.etantonio.it/en/index.aspx">
<!-- removed --><meta http-equiv="Content-Type" content="text/html ;
CHARSET=UTF-8"><base href="http://www.etantonio.it/EN/index.aspx">
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

Maybe IE is unable to see that the charset is indeed UTF-8. Have you tried
setting the encoding directly to Unicode (UTF-8) in the IE options?

You should also test your code by first writing only the Chinese page,
without your own content, and also try using an IFrame.

S. L.
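
To test whether the duplicate header matters, the repeated tag could be stripped before the file is written. The following is a minimal sketch, not something confirmed to solve the problem in this thread: the names MetaTags, Count and KeepFirst are hypothetical, and the pattern assumes the http-equiv attribute comes first and is double-quoted, as in the output quoted above.

```csharp
using System;
using System.Text.RegularExpressions;

static class MetaTags
{
    // Matches <meta http-equiv="content-type" ...> tags, ignoring case.
    static readonly Regex ContentTypeMeta = new Regex(
        "<meta\\s+http-equiv\\s*=\\s*\"content-type\"[^>]*>",
        RegexOptions.IgnoreCase);

    // Number of content-type meta tags found in the HTML.
    public static int Count(string html)
    {
        return ContentTypeMeta.Matches(html).Count;
    }

    // Keeps the first content-type meta tag and removes any later ones.
    public static string KeepFirst(string html)
    {
        bool seen = false;
        return ContentTypeMeta.Replace(html, delegate(Match m)
        {
            if (seen)
            {
                return "";
            }
            seen = true;
            return m.Value;
        });
    }
}
```

In the page from the thread, this would be applied as sHtmlTradotto = MetaTags.KeepFirst(sHtmlTradotto); just before writer.Write(sHtmlTradotto).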
