Marcmc,
this is a very, very standard problem......and no...it is not a small
problem...it points directly to one of the toughest problems in data
warehousing....data integration...
The standard problem more generally stated is:
How do I effectively integrate and maintain relationships between data
that provides real keys that may change or be re-used over time?
For example, telephone numbers get longer, and they are constantly
re-used which is not a problem for the operational system because each
number is unique at a point in time but it is a BIG problem for the EDW
because it must understand that these things change over time....
The very standard answer is...if you are building dimensional DW, all
keys are best as generated meaningless integers.....this means that
somewhere there must be translation tables translating real keys coming
in to meaningless integers and you must have some standardised
mechanism to allocate these keys.....
Where multiple pieces of data can arrive with multiple different
contents but represent the same thing you must have another translation
table converting them to an intermediate meaningless intger key which
is then itself converted to yet another meaningless integer key...of
course, the standard example here is customer keys coming from
different source systems.....since you work in Insurance you will be
well familiar that different policy management systems often allocate
their own customer numbers...
I have published the code to maintain keys in dimension tables in C so
that most people can read it...it's on www.peternolan.com....I have not
published the mechanism of using a cross reference table for things
like customer numbers but it's pretty trivial.....
So, a good question.... ;-)
Peter
www.peternolan.com