all groups > sql server new users > december 2005 >
You're in the

sql server new users

group:

How do you implement a dictionary?


How do you implement a dictionary? Paul
12/27/2005 1:54:15 PM
sql server new users:
OK, I understand that a table with a single row IS a dictionary.
Actually that's how my app stores its globals right now, but
predictably this row is getting unmanagably long.

So I was thinking it's time for a do-it-yourself dictionary table, what
I had in mind was a key column (VARCHAR), value column (TEXT), and
possibly a third column to specifiy the value's 'type' so the client
app knows how parse it. Or better yet define accessor procedures.

Then I got to thinking, this has got to be a well-trodden path, doesn'T
Microsoft define standard stored procedures for this? But I can'T seem
to find anything. Not sure what to look for.

Or is this a bad idea? I'm a programmer so I don't always think
relationally. For this app I'm not worried about stuff like foreign
key constraints.
Re: How do you implement a dictionary? David Portas
12/27/2005 2:52:04 PM
[quoted text, click to view]

I'm not sure what you mean by a "dictionary" in this context. If you do
just mean a global parameters table then I'd go for the single row with
a named column for each distinct parameter. The main advantage over
your loosely typed key-value suggestion is that you can implement the
correct data type and any constraints for each column. If you have some
unwieldy number of parameters then you may need to create some other
tables or columns in other tables - I'd suspect there is a hole in your
data model if you are storing an excessive number of scalar parameters.

--
David Portas
SQL Server MVP
--
Re: How do you implement a dictionary? Paul
12/28/2005 10:34:44 AM
By dictionary I just mean hash table, key/value collection or whatever
you call it in SQL land. I'm aware that this type of container would
not allow constraints but for this app I'm not worried about
constraining the data.

If the "long row" solution is the best practice then I'll stay with
that. It's just a pain because each time I add or remove a scalar,
that means a schema change which forces a code change / recompile in my
..NET client app. That gets old real fast.

It just strikes me as odd that a database can be such a clumsy device
for storing... data. I guess the relational model takes some getting
used to.

Thanks for the help.
Re: How do you implement a dictionary? David Portas
12/28/2005 6:57:04 PM
[quoted text, click to view]

Surely not if you use stored procs to retrieve the data. That's one of the
many advantages of procs. Of course if you need to use the variable
clientside you'll presumably have to release some new code anyway so then
you'd have to expect to recompile.

--
David Portas
SQL Server MVP
--

Re: How do you implement a dictionary? Michael Hotek
12/29/2005 6:44:46 PM
There isn't a "best practice", because this varies on practical
implementation.

There is a "best practice" that is simply relational design. This is done
by creating a table with an ID column, type column, and value column. This
is simply defining your data as actual data. Now you can add rows to a
table as you add parameters. This is something that I'd VERY strongly
suggest doing. Not only does changing schema each time you add a nw
parameter get old, it also limits your application to a total of 1024
parameters. This might seem like a lot, but I've worked with apps that had
thousands of parameters that needed to be tracked.

Here are some possible permutations you might want to look at:

create table tab1
(ID int identity(1,1),
ParmType varchar(30) not null,
ParmValue varchar(30) not null)

create table tab1
(ID int identity(1,1),
ParmType varchar(30) not null,
ParmValue varchar(30) not null,
SortOrder int not null)

create table tab1
(ID int identity(1,1),
ParmType varchar(30) not null,
IntValue int null,
NumValue decimal(10,4) null,
StringValue varchar(30) null)


create table tab1
(ID int identity(1,1),
ParmType varchar(30) not null,
IntValue int null,
NumValue decimal(10,4) null,
StringValue varchar(30) null,
SortOrder int not null)

--
Mike
Mentor
Solid Quality Learning
http://www.solidqualitylearning.com


[quoted text, click to view]

Re: How do you implement a dictionary? Adam Machanic
12/30/2005 10:22:11 PM
[quoted text, click to view]

Mike,

From a strictly relational standpoint, EAV is not a good design
practice; it violates first normal form by allowing an attribute to belong
to more than one domain. While I agree that this is OK and probably a good
flexible solution for small lookup tables such as this one, I've seen this
kind of design flaw cause huge data integrity problems in bigger databases.
It becomes very difficult to write constraints to enforce business and data
rules when your domain is dynamic. For that reason, I steer people away
from this technique in most cases.


--
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
--

Re: How do you implement a dictionary? David Portas
12/31/2005 3:34:56 AM
[quoted text, click to view]

Agreed. Another very practical problem with the type/value structure is
that you'll need special handling to convert each value to its
appropriate datatype - if you store dates and numerics as strings in
the same column for example. That conversion will most likely have to
be done client-side otherwise it could be hard to handle result sets or
output variables of a type that's unkown at design-time.

Doing the type conversion client-side brings its own problems. How do
you standardise the methods used to encode dates or numerics as strings
in an application that spans multiple locale's for example? With all
that extra coding the apparent advantages of this method can dwindle
away pretty quickly I find.

--
David Portas
SQL Server MVP
--
Re: How do you implement a dictionary? Paul Williams
1/3/2006 4:30:45 PM
Well, you guys are gonna hate me for this, but I lost my patience and
here's what I did.



CREATE TABLE [dbo].[LookupTable] (
[ParamKey] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
NULL ,
[ParamValue] [sql_variant] NOT NULL
) ON [PRIMARY]
GO

CREATE PROCEDURE dbo.sp_LTDelete
@paramKey VARCHAR(50)
AS
DELETE FROM LookupTable WHERE ParamKey = @paramKey
GO

CREATE PROCEDURE dbo.sp_LTInsert
@paramKey VARCHAR(50),
@paramValue SQL_VARIANT
AS
INSERT INTO LookupTable(ParamKey, ParamValue) VALUES (@paramKey,
@paramValue)
GO

CREATE PROCEDURE dbo.sp_LTLookup
@paramKey VARCHAR(50)
AS
SELECT ParamValue FROM LookupTable WHERE ParamKey = @paramKey
GO

CREATE PROCEDURE dbo.sp_LTUpdate
@paramKey VARCHAR(50),
@paramValue SQL_VARIANT
AS
UPDATE LookupTable SET ParamValue = @paramValue WHERE ParamKey =
@paramKey
GO



By the way those are the first stored procedures I've ever written.
;-) But since they all turned out to be one-liners I'll probably get
rid of them.


[quoted text, click to view]

Why the ID column? The ParamKey is the primary key. My other tables
all have ID cols but since this one is cheating anyway, why even
pretend... Also I think the type is implicit with SQL_VARIANT.


[quoted text, click to view]

Yeah, I don't remember relational database theory anymore but I'm
pretty sure a dynamically expandable container is going to violate
everything in sight. But if man never played with fire we'd still be
living in caves.


[quoted text, click to view]

Again SQL_VARIANT to the rescue.

But if you were to store it as VARCHARs along with the type, solutions
abound. One solution would be to pick a single date format (100, 109
or 113) and stick with it throughout. The .NET DateTime class has very
flexible parsing capabilities.

But if you wanted to get cute and store dates in various locales,
that's still doable. The type column is really a type hint, it's just
a string. (OK for safety you could also define a table of data types.)
You could define date types that include the locale like DATETIME_113
and the insert accessor could use that to format it.


I've never understood why SQL supports date localization in the first
place. Presenting localized data to the user is the client's job.

Come to think of it, if you wanted to store a foreign key, instead of
INT you could define a type like 'CustomerIdType' and the insert
procedure could use that to validate it. Not exactly a foreign key
constraint but at least a sanity check. Then you could create a
trigger for when customers are deleted.
Re: How do you implement a dictionary? Michael Hotek
1/3/2006 6:44:16 PM
The integer identity column is there to provide an abstraction. Your table
definition now has a primary key with a physical value. This means that if
you decide the ParamKey should be changed or the business forces it to be
changed, you not only have to update the value in your lookup table, but
also cascade the update to every other table in the database where that
value existed. This creates a TREMENDOUS amount of overhead. With it being
a simple integer identity value, you stick the integer into any table that
needs to reference it and then if you need to change the value at a later
date, you simply change the value and nothing else is impacted.

Sure, you can do this as a sql_variant. I personally HATE that datatype and
wish it would be eliminated from the product. Developers are taught to
exlicitly define their datatypes, so that you know how to manipulate them.
You might as well just take this data, stuff it into a flat file in the OS,
and just read in the file when you need it. While the table structure you
have might work, you will eventually wind up with all kinds of conversion
issues and you've done nothing more than turn a database into a dumping
ground for flat files.

You can do whatever you want. You asked for a solution. None of us gave a
solution that included using a physical value as a primary key nor did it
use a sql_variant datatype. That is because we've jumped feet first into
the fire more than once and have also had to clean up systems with
performance issues more than once. I can't tell you how many hundreds of
times I've seen a structure like the one you defined and I can't come up
with a single, solitary case where it worked for more than the initial
application deployment.

--
Mike
Mentor
Solid Quality Learning
http://www.solidqualitylearning.com


[quoted text, click to view]

Re: How do you implement a dictionary? David Portas
1/4/2006 2:20:48 AM
[quoted text, click to view]

Repeat everything Mike said. Double emphasis about SQL_VARIANT - it's a
disaster!

What I'll add is that you shouldn't be using sp_ as a prefix for user
procedures. sp_ is reserved for system procs and will cause recompiles
and poor performance in user procs. Also, the fact that a proc is a
single line is no good reason to eliminate it. Most of the time it pays
to perform all data access through procs and anyway a proc that
modifies data should include error-handling, so it won't be just one
line.

Finally, "LookupTable" is a silly name for a silly concept. See:
http://tonyandrews.blogspot.com/2004/10/otlt-and-eav-two-big-design-mistakes.html

--
David Portas
SQL Server MVP
--
Re: How do you implement a dictionary? Andrew J. Kelly
1/4/2006 8:16:02 AM
I agree with Mike and David and just want to add that as a rule, all stored
procedures should start with SET NOCOUNT ON at the beginning.

--
Andrew J. Kelly SQL MVP


[quoted text, click to view]

Re: How do you implement a dictionary? Paul Williams
1/9/2006 8:26:03 AM
[quoted text, click to view]

I am aware of this and have done that elsewhere. If they want to
change a key, they would have to delete and reinsert. AT PRESENT only
the client refers to the globals, but come to think of it that could
change. That's why I ask.

[quoted text, click to view]

This is actually a seperate topic. Could you expand on that?

The reason I ask is that variant-like data types are used throughout
the Microsoft product line (a .NET Hashtable returns type object, in VB
I think it's even called 'variant', in Win32/COM everything in sight is
a void *), and in scripting language like Perl all scalars are
variants. I'm at peace with them for client programming but databases
bring up other considerations.

[quoted text, click to view]

Global data happens. Just because the data is global doesn't mean it's
of lesser stature. There are still the same requirements for security,
concurrent access and backup. Beyond that I did ask, keeping it all in
one database is a requirement.

[quoted text, click to view]

OK, you've put project specifics on the table, so I will respond. If
the customer wants a slick, highly tuned database app, they will have
to pay for it. On general principle. There is nothing on my resume
that suggests I am capable of producing that, and the customer is fully
aware.

My job is to create a functioning prototype, I will not be sustaining
it. It's due in June and my contract expires in July. Reading between
the lines, I think I see the word - 'Bangalore'. The customer is
notorious.

By the way, how's the job market for DB admins these days?
Re: How do you implement a dictionary? Paul Williams
1/9/2006 9:12:36 AM
[quoted text, click to view]

Once again, not in my experience, but my experience does not include
SQL. Please elaborate.


[quoted text, click to view]

Wow, changing an identifier name alters the behavior of the product?!!
My gast is flabbered. Especially given that Microsoft has been
force-feeding us hungarian notation for lo these many years. Shame on
Microsoft.


[quoted text, click to view]

I probably shouldn't have made the comment, a .NET 1.1 has a lot of
built-in goodies that hide DB details from the client programmer.
Which explains how I find myself in a project that's 75% feature
complete with very limited database knowledge.


[quoted text, click to view]

That's a bold statement which bears explanation.


[quoted text, click to view]

Mr. Andrews gives the example:

"However, consider a simple query: "show the names of all employees
who are clerks and earn less than 2000"."

I have nothing that complex in mind. I would like to transform

SELECT CompanyName FROM AbsurdlyLongGlobalRowOfFixedLength WHERE ID = 1

into

SELECT ParamValue FROM NiceShortGlobalTableOfDynamicLength WHERE
ParamKey = 'CompanyName'

Nothing more complicated than that. I am aware that that rules out the
possibility of constraints and more complex queries. I'm OK with that.

My hesitation with the ID column is that I tend to add features only as
needed. The ability to rename keys is not needed at this time, but if
the ID column is that important, fine, I'll throw it in too. Don't
cost nuthin'.

Now, if I stick with that particular SELECT statement and never change
anything but the ParamKey, what is the specific problem?

I guess what puzzles me is that every other modern programming
environment I'm familiar with has some of hash table, SQL is the odd
man out. Apparently that's beyond the scope of the SQL spec, but
Microsoft could easily have provided one anyway. It is conspicuous in
its absence.
Re: How do you implement a dictionary? Adam Machanic
1/9/2006 11:53:30 AM
[quoted text, click to view]

Hi Paul,

Be careful here. a .NET Object and VB's variant/SQL Server's
sql_variant are entirely different things. Although they look similar from
a development point of view, the important distinction is how they actually
work. .NET's Object is a universal type that any other type can be downcast
to, thereby losing its type identity, enabling containers such as a
Hashtable to deal with instances of any type. Variant datatypes on the
other hand, work by internally defining a field for every type they can
represent -- obviously, this is incredibly inefficient.

There are other implementation issues, as well, having to do with type
safety, which .NET's Object deals with much more nicely than VB/SQL Server.
I don't know how Perl deals with those issues, so I can't comment there.


[quoted text, click to view]

EXACTLY. You should treat "global" data with the same respect, from a
data modeling standpoint, as any other data.


[quoted text, click to view]

Let me rephrase this how I would read it if I were the customer and
happened to see this post:

"Paul is unwilling to create better systems for the app, even if he
knows how to do it; he doesn't think we pay him enough to keep best
practices in mind."

To be honest, based on that paragraph I would fire you if you were
working for me and said something like that.


[quoted text, click to view]

An adage I live by:

"Amateurs write code for themselves. Professionals write code for
others."

I don't care whether it's a prototype or an enterprise solution; I
always strive to be professional if I'm being paid.


[quoted text, click to view]

Excellent, if you know what you're doing. The market (at least in my
area -- Boston) is flooded with people who shouldn't be working as DBAs.
Our user group gets quite a few job openings to post, and very rarely do I
hear that the employers were able to fill them--there simply aren't enough
skilled candidates around.


--
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
--

Re: How do you implement a dictionary? Paul Williams
1/9/2006 12:59:53 PM
[quoted text, click to view]

Ahhh, it's a COM VARIANT, or at least a similar thing. In other words
a C structure containing a type field and a union. Yes, I am in 100%
agreement that this data type should be banned. It'll never happen
though, COM is set in stone

Thanks, that's exactly the answer I was looking for. Curiously
undocumented.
Re: How do you implement a dictionary? David Portas
1/11/2006 4:16:16 AM
[quoted text, click to view]

SQL Server certainly isn't unique among databases or languages in this
respect. It's usual and desirable to reserve certain naming conventions
for system use when there are system and user objects sharing the same
name-space. Common prefixes for user procs are "usp_" or "prc_". Never
"sp_" or "xp_".

[quoted text, click to view]

Didn't I already cover that? Your table is untyped - or rather you are
constrained to using a single type per value. What is the problem with
using columns for the values. The ID is redundant in your first
example. You can constrain the table to a single row and do:

SELECT CompanyName
FROM AbsurdlyLongGlobalRowOfFixedLength

(I'm not endorsing the "absurdly long" comment of course. The table has
as many columns as you need - no more and no less).

[quoted text, click to view]

SQL is more than a "programming environment". SQL is a DATA MODEL.
Tables are the data structure used by SQL and there is only one sort of
table. You haven't given any explanation of what functionality you
think is lacking. Far from it. On the basis of what you've said SQL
seems very suitable.

--
David Portas
SQL Server MVP
--
Re: How do you implement a dictionary? Z
1/11/2006 3:00:51 PM
You can also look at it like this. A database forces structure. That means
a precise definition. An apple doesn't spontaneously change into an orange.
Sure a tadpole turns into a frog and a caterpillar turns into a butterfly,
but the DNA structure is still exactly the same. All a database really does
is to make you think about exactly what you want to do and define very
specific rules to be enforced. Sure, you can stuff everything into a single
column with a text datatype. It works. You then have to write n
permutations of extraction routines in order to work with the different
types of data at which point, you shouldn't even be using a database,
because simply stuffing a flat file on a file system would work better and
provide infinite flexibility.


[quoted text, click to view]

Re: How do you implement a dictionary? Paul Williams
1/11/2006 3:55:59 PM
[quoted text, click to view]

That's the answer. Now I know what to look for.
This post clears up a couple other questions as well, thanks!
AddThis Social Bookmark Button