Groups | Blog | Home
all groups > sql server programming > november 2004 >

sql server programming : BINARY_CHECKSUM is not reliable


Scott
11/16/2004 10:24:01 PM
The below code inserts 2 different records into a table, but the binary
checksum produces the same number for each insert, thus thinks the records
are the same, this is suppose to be impossible ???

@@version return
Microsoft SQL Server 2000 - 8.00.850 (Intel X86) Aug 7 2003 11:07:42
Copyright (c) 1988-2003 Microsoft Corporation Standard Edition on Windows NT
5.0 (Build 2195: Service Pack 4)

I am using SQL2000 sp3a

Any ideas
Is there another way to check records in tables?

Thanks.............


if exists (select * from dbo.sysobjects where id = object_id(N'[slbin]') and
OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [slbin]

CREATE TABLE [slbin] (
[pri] [int] NOT NULL ,
[FirstName] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[LastName] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[OtherNames] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[PreferredName] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[TradingName] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[BusinessName] [varchar] (100) COLLATE Latin1_General_CI_AS NULL ,
[FullNameTemp] [varchar] (100) COLLATE Latin1_General_CI_AS NULL
) ON [PRIMARY]

insert into slbin (pri,FirstName, LastName, OtherNames, PreferredName,
TradingName, BusinessName, FullNameTemp)
values (1,'Livasi','Sheck',NULL,NULL,NULL,NULL,'Mr Livasi Sheck')

insert into slbin (pri,FirstName, LastName, OtherNames, PreferredName,
TradingName, BusinessName, FullNameTemp)
values (2,'Liuasi','Sheck',NULL,NULL,NULL,NULL,'Mr Liuasi Sheck')



SELECT *
FROM slbin

-- the result in the CheckSumNum column should be different
SELECT pri, BINARY_CHECKSUM(FirstName, LastName, OtherNames,
PreferredName, TradingName, BusinessName, FullNameTemp) as CheckSumNum
FROM slbin




if exists (select * from dbo.sysobjects where id = object_id(N'[slbin]') and
OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [slbin]





Steve Kass
11/17/2004 1:54:00 AM
No, it's not supposed to be impossible, and no such claim is made that
I'm aware of. Books Online makes it clear: "BINARY_CHECKSUM(*) will
return a different value for most, but not all, changes to the row, and
can be used to detect most row modifications."

In any case, it can't possibly be impossible, since there are only 4
billion possible BINARY_CHECKSUM values, and gazillions of possible
combinations of 7 varchar(100)s (4 billion is 2^31, and the number of
distinct rows in your table is 2^5604). It's a mathematical certaintly
that there is some value of BINARY_CHECKSUM shared by at least 2^5573
distinct (pri, FirstName, ..., FullNameTemp) values.

That said, you'd hope that none of the combinations with the same
BINARY_CHECKSUM are too similar, and unfortunately, BINARY_CHECKSUM is
not a good hash function. For a single string column, it does some
simple xor-ing and shifting, and it does shifting and xoring to put
together multiple columns also.

Here's what binary_checksum does with a single varchar string:

create function binary_checksum_varchar (
@t varchar(1000)
) returns int as begin
declare @b bigint set @b = 0
declare @c tinyint
declare @s bit set @s = 0
declare @i int set @i = 1

while @i <= len(@t) begin
set @c = ascii(substring(@t,@i,1))
set @b = @b / 16 * 16 + @b % 16 ^ @c / 16
set @b = @b * 16 + @c % 16
if @c >= 128 begin
set @b = @b ^ 0xFF
set @s = 1 - @s
end
set @b = @b % 0x0100000000 ^ @b / 0x0100000000
set @i = @i + 1
end

if @s = 1 set @b = @b ^ 0xFFFFFFFF
if @b >= 0x80000000 set @b = @b | 0xFFFFFFFF00000000
return @b
end

And this will give you an idea of how it is working when passed more
than one column:
select binary_checksum('gabcdefe','abcdefgG')
select cast(binary_checksum('gabcdefe') as binary(4))
select cast(binary_checksum('abcdefgG') as binary(4))

Steve Kass
Drew University



[quoted text, click to view]
avnrao
11/17/2004 12:23:31 PM
in your case, you are specifying the column names..and these 2 rows have
same values..and hence it returns same CheckSum value.
Select BINARY_CHECKSUM(*) will return different values.

Av.
http://dotnetjunkies.com/WebLog/avnrao
http://www28.brinkster.com/avdotnet

[quoted text, click to view]

AddThis Social Bookmark Button