Groups | Blog | Home
all groups > sql server data warehouse > february 2005 >

sql server data warehouse : Dimension values without data in a fact table


CLAUDIO
2/5/2005 9:39:03 AM
I have an ODS system and a Data warehouse system
I have a Sales fact table in the ODS system and I have these fields:

SALES
ID_CUSTOMER (PK),
ID_MODEL (PK),
ID_TIME (PK),
SALES,
QUANT_ART,
COST


Then in some records in the fields ID_Time or ID_Model or ID_Customer I
don’t have values (NULL) because in the transactional systems these record
don’t have values (NULL).

The users want to generate aggregate reports with the Sales table...

The question is:
I have to put a “dummy” value in the dimensions Customer, Model and Time
(for example “0”) and put this value in the fact table if the dimensions
fields have NULL values????
Or I have to leave the NULL values?

What is the best choice? Why?
Adam Machanic
2/6/2005 1:20:42 AM
Personally -- and this really does boil down to personal preference -- I
believe that one of the key things that should happen during an ETL is
elimination of all "questionable" data. That includes unknown data -- can
you really report on something that's unknown? At the least, generate
well-known tokens to replace the NULLs with. If possible, get rid of those
rows on the way in (of course, that really depends on context) -- perhaps
they can appear in the aggregate data, but not in the line-level data?
There are various ways of dealing with the problem, but personally I am very
firm when designing data warehouses and ensure that, one way or another,
there will be absolutely no NULLs in the database.


--
Adam Machanic
SQL Server MVP
http://www.sqljunkies.com/weblog/amachanic
--


[quoted text, click to view]

AddThis Social Bookmark Button