Groups | Blog | Home
all groups > sql server programming > january 2007 >

sql server programming : 俱


Sunil Virmani
1/5/2007 11:04:00 PM
Hi All,

Create a table , and put in some row 子ど and 俱

and fire a query something like SELECT * FROM mytable WHERE
(myColumn = '子ど')
it gives me right result.

But if i fire
SELECT * FROM mytable WHERE (myColumn = '俱')

It does not return any result for the same even if myColumn has '俱'.

Surprisnly if i fire query like "SELECT * FROM mytable" it
correctly dispalys 俱.

What's the reason for the same ? Why does it is not able to search me on
this japanese character(俱).Collation is Japanese_Unicode_CI_AS

Regards,
Razvan Socol
1/6/2007 12:13:55 AM
Hi, Sunil

For unicode characters (stored in nvarchar columns) you should put a N
before the string.
For example, the following works fine:

CREATE TABLE mytable (
ID int IDENTITY PRIMARY KEY,
mycolumn nvarchar(10) COLLATE Japanese_Unicode_CI_AS
)

INSERT INTO mytable VALUES ('abc')
INSERT INTO mytable VALUES (N'def')
INSERT INTO mytable VALUES (N'=E5=AD=90=E3=81=A9')
INSERT INTO mytable VALUES (N'=E4=BF=B1')

SELECT * FROM mytable WHERE mycolumn=3DN'=E5=AD=90=E3=81=A9'
SELECT * FROM mytable WHERE mycolumn=3DN'=E4=BF=B1'
SELECT * FROM mytable WHERE mycolumn=3DN'abc'

Razvan
Sunil Virmani
1/6/2007 2:11:00 AM
Thanks Razvan for your quick response.

I have one query
"why it was working without N in 子ど but not in 俱" ?



Regards,
Sunil


[quoted text, click to view]
Razvan Socol
1/6/2007 3:58:15 AM
[quoted text, click to view]

Try this (preferably with Results To Text, instead of Results To Grid):

SELECT a, CONVERT(varchar(10),a) b,
CONVERT(varchar(10),a COLLATE Japanese_Unicode_CI_AS) c
FROM (
SELECT N'abc' a
UNION ALL SELECT N'=C4=83=C3=AE=C5=9F=C5=A3=C3=A2'
UNION ALL SELECT N'=E5=AD=90=E3=81=A9'
UNION ALL SELECT N'=E4=BF=B1'
UNION ALL SELECT N'=CE=A3=D0=B4=D0=A4'
) u

This will return:
a b c
----- ---------- ----------
abc abc abc
=C4=83=C3=AE=C5=9F=C5=A3=C3=A2 a=C3=AEst=C3=A2 ?i??a
=E5=AD=90=E3=81=A9 ?? =E5=AD=90=E3=81=A9
=E4=BF=B1 ? ?
=CE=A3=D0=B4=D0=A4 S?? =CE=A3=D0=B4=D0=A4

You can see that some characters are replaced with the equivalent
non-accented characters, but some are replaced with question marks.
This depends on the code-page assigned to the expression, which depends
on the collation. It happens that the characters =E5=AD=90 and =E3=81=A9 ar=
e both
present in the 932 code page (Japanese), but the character =E4=BF=B1 is not
present (maybe it's too complex or more rarely used; I don't speak
japanese).

Usually, in each code-page there are at most 256 characters, because a
varchar stores each character on one byte. However, code page 932
(Japanese) is a DBCS code page (some characters are stored on two
bytes). You can see what characters are present in this code page here:
http://www.microsoft.com/globaldev/reference/dbcs/932.mspx
For example, you can find the character =E5=AD=90 by clicking on 8E, at the
position 71.

Razvan
AddThis Social Bookmark Button