Spanish characters not appearing in vb.net


Spanish characters not appearing in vb.net GeorgeAtkins
3/14/2005 12:29:03 PM
visual studio .net general:
Help!

I am using vb.net with VS 2003 and dotnet framework 1.1

Problem: I am reading a text file, one line at a time, to extract book
titles. Many are in Spanish. For a reason I cannot work out, accented and
other special characters are not read at all; they are silently dropped!

Original: ¡Adiós, pequeño!
Reads in vb.net as : Adis, pequeo!

I am doing no pre-processing of the text. Here is a code snippet of my
project:

Imports System
Imports System.Globalization
Imports System.Collections
Imports System.IO
Imports System.DateTime
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Sub ReadTitles()
    Dim LibCntr, fnd, nf, ctr, y, z, StartPos, TabPos As Integer
    Dim BkTitle As String ' the parsed title from marc file
    Dim arline As String ' AR file
    Dim srAR As System.IO.StreamReader = New StreamReader("c:\\ardata.txt")

    Do Until srAR.Peek() = -1 ' read AR file until EOF
        ' These 3 lines get the book title from the AR file...
        arline = srAR.ReadLine ' get entire line
        TabPos = InStr(9, arline, vbTab) ' look for embedded tab stop
        ARList.Add(Mid(arline, 9, TabPos - 9)) ' assign title substring to arraylist
    Loop
End Sub

So, an original text line will look like this:
30151SP ¡Adiós, pequeño! Janet/Allan Ahlberg 3.4 0.5 47.0

My code is supposed to extract just the title portion: ¡Adiós, pequeño!
But, as I wrote above, it actually captures this: Adis, pequeo!

What am I missing in all of this? String is unicode-aware, after all. But do
I need to declare a specific character set or something?

Thanks for any insights!
RE: Spanish characters not appearing in vb.net GeorgeAtkins
3/14/2005 12:41:09 PM
I want to add that the file I am reading contains both English and
Spanish book titles. I am using the English version of VS/VB.NET.
George

RE: NOW IT DOES GeorgeAtkins
3/14/2005 2:43:06 PM
I FIGURED IT OUT!

It wasn't obvious to me, of course, but after a series of Sherlock
Holmes-style eliminations, I arrived at the StreamReader and discovered that
I needed to specify a code page.

So, here is what I added:

Imports System.Text

Then changed the reader line to:
Dim srAR As System.IO.StreamReader = New StreamReader("c:\\ardata.txt", Encoding.GetEncoding(1252))

And it reads the "high ascii" characters correctly. I would have thought the
standard windows encoding (1252) would be a default. Go figure.

George
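
For anyone who finds this thread later, here is a minimal sketch of the corrected read loop with the code page passed explicitly. It assumes the .NET Framework (where code page 1252 is available without extra setup), a file laid out like the sample line above (title starting at column 9 and ending at the next tab), and the hypothetical names TitleReader, ReadTitles and ExtractTitle, which are not from the original project.

```vbnet
Imports System
Imports System.Collections
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module TitleReader
    ' Reads every line of the file and returns the title field of each one.
    ' The bytes are decoded with the given code page instead of the
    ' StreamReader default (UTF-8).
    Function ReadTitles(ByVal filePath As String, ByVal codePage As Integer) As ArrayList
        Dim titles As New ArrayList()
        Dim reader As New StreamReader(filePath, Encoding.GetEncoding(codePage))
        Try
            Dim line As String = reader.ReadLine()
            Do While Not line Is Nothing
                titles.Add(ExtractTitle(line))
                line = reader.ReadLine()
            Loop
        Finally
            reader.Close() ' release the file even if a read fails
        End Try
        Return titles
    End Function

    ' Assumption: the title starts at column 9 and runs to the next tab.
    ' If no tab is found, the rest of the line is returned.
    Function ExtractTitle(ByVal line As String) As String
        Const TitleStart As Integer = 9
        Dim tabPos As Integer = InStr(TitleStart, line, vbTab)
        If tabPos = 0 Then Return Mid(line, TitleStart)
        Return Mid(line, TitleStart, tabPos - TitleStart)
    End Function

    Sub Main()
        ' The path is the one used earlier in the thread.
        For Each title As String In ReadTitles("c:\ardata.txt", 1252)
            Console.WriteLine(title)
        Next
    End Sub
End Module
```

The tab check is there because the original Mid(arline, 9, TabPos - 9) would be handed a negative length on any line without a tab. On newer runtimes (.NET Core and later), Encoding.GetEncoding(1252) may need the code pages encoding provider registered first; that is outside what this thread covers.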

RE: Spanish characters not appearing in vb.net GeorgeAtkins
3/14/2005 2:47:04 PM
I FIGURED IT OUT!
After a series of Sherlock Holmes-style eliminations, I arrived back at the
StreamReader. Apparently, I needed to specify a code page, like this:

Imports System.Text
....
Dim srAR As System.IO.StreamReader = New StreamReader("artext", Encoding.GetEncoding(1252))


Once I did this, I got proper readings of my "high" ascii characters. I
would have thought 1252 would be a default. Never assume.

George
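
As for why 1252 is not the default: a StreamReader created without an encoding argument decodes the file as UTF-8. In a file saved as Windows-1252, each accented character is a single byte above 127, and those bytes do not form valid UTF-8 sequences on their own, so the decoder cannot turn them into characters. The sketch below reproduces that in memory, without a file. The names EncodingDemo and Decode are hypothetical, and the title is built with ChrW so the result does not depend on how the source file itself is saved.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module EncodingDemo
    ' Decodes raw bytes with the given encoding.
    Function Decode(ByVal raw As Byte(), ByVal enc As Encoding) As String
        Return enc.GetString(raw)
    End Function

    Sub Main()
        Dim cp1252 As Encoding = Encoding.GetEncoding(1252)

        ' The sample title from this thread, built from character codes.
        Dim title As String = ChrW(161) & "Adi" & ChrW(243) & "s, peque" & ChrW(241) & "o!"

        ' These are the bytes a Windows-1252 text file would contain.
        Dim raw As Byte() = cp1252.GetBytes(title)

        ' Decoding as UTF-8 (what a plain StreamReader does) loses the
        ' accented characters; decoding as 1252 restores the title.
        Console.WriteLine("UTF-8 matches original: " & (Decode(raw, Encoding.UTF8) = title))
        Console.WriteLine("1252 matches original:  " & (Decode(raw, cp1252) = title))
    End Sub
End Module
```

Framework 1.1 drops the undecodable bytes, which matches the "Adis, pequeo!" result reported at the top of the thread; later versions of the framework substitute a replacement character instead, so the exact damaged string depends on the runtime.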
