all groups > sql server misc > september 2003 >
You're in the

sql server misc

group:

advice please: uniqueidentifier vs. identity as PK


advice please: uniqueidentifier vs. identity as PK google NO[at]SPAM sigol.co.uk
9/27/2003 11:17:30 AM
sql server misc:
Hi,

Hope you can help:

I am in the design stage of a database that will increase by around
100,000 records per annum. As well as the main three tables (each
containing 100,000 recors) there will be around 12-15 lookup tables,
each with between 10 and 50 lookup items.

I want to use GUIDs rather than INTS as the PK. My main reasons for
this are:

1. I can create the GUID in my code before the record is inserted
without having to find it after posting the record. This makes it
easier for me to link other tables to the main records.

2. Databases at various locations may be merged at a later date and
therefore GUIDs are a much easier solution because it would otherwise
require that I remap IDs in each database being merged into the master
one, which will be a big headache later.

However, I am looking for advice on this subject from anybody who has
used both GUIDs and INTs as PKs. What are the strengths and weaknesses
of both types and what about performance differences? I want to use
GUIDs for all PKs in all tables (main and lookup tables).

Any help would be great.

Thanks in advance,

Re: advice please: uniqueidentifier vs. identity as PK Brooks Stoner [MSFT]
9/29/2003 11:29:20 AM
Hello Sigol,

You have hit on the main advantages of using a unique identifier for the PK
in a table. The biggest drawbacks I have found are size and working with
GUIDs when querying the tables later (Integers are much easier to type and
remember).

Here is what I usually take into account when making this decision.

Size - a unique identifier is 16 bytes, while a bigint is 8 bytes. This not
only increases the size of your table, but also the index. If you have a
million rows in the table and are carrying an extra 8 bytes per row in both
the table and index (extra 12 bytes if you are using int instead of bigint),
it increases your size requirements by 16 million bytes. Not too bad, but
if you are expecting a table with 100 million rows, then you increase the
size by about 1.6 gb.

Insertion - You should probably not create a clustered index on a GUID if
you are going to be doing a lot of inserts. Since the GUID generation is
pseudo-random, you won't be inserting many records at the end of the table,
which will slow down inserts as well as increase fragmentation. If you have
columns that will be used for a non-unique index, and also match most
lookups, you could cluster on those columns.
Another way to keep insertions relatively fast is to add another column as a
primary key defined as a bigint, and making it the primary key with a
clustered index. Then you would also have a unique index on the unique
identifier column as well. If you define all your foreign keys on the
unique identifier value, you can still easily merge databases. This has the
obvious disadvantage of increased table size, so you have to decide if the
benefits (faster inserts) outweigh the costs (size).

I hope this helps.

Brooks Stoner

--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm.

[quoted text, click to view]

AddThis Social Bookmark Button