Personally -- and this really does boil down to personal preference -- I
believe that one of the key things that should happen during an ETL is
elimination of all "questionable" data. That includes unknown data -- can
you really report on something that's unknown? At the least, generate
well-known tokens to replace the NULLs with. If possible, get rid of those
rows on the way in (of course, that really depends on context) -- perhaps
they can appear in the aggregate data, but not in the line-level data?
There are various ways of dealing with the problem, but personally I am very
firm when designing data warehouses and ensure that, one way or another,
there will be absolutely no NULLs in the database.
--
Adam Machanic
SQL Server MVP
http://www.sqljunkies.com/weblog/amachanic --
[quoted text, click to view] "CLAUDIO" <CLAUDIO@discussions.microsoft.com> wrote in message
news:937E414E-AB9A-4E4D-8A58-D8251139550E@microsoft.com...
> I have an ODS system and a Data warehouse system
> I have a Sales fact table in the ODS system and I have these fields:
>
> SALES
> ID_CUSTOMER (PK),
> ID_MODEL (PK),
> ID_TIME (PK),
> SALES,
> QUANT_ART,
> COST
>
>
> Then in some records in the fields ID_Time or ID_Model or ID_Customer I
> don't have values (NULL) because in the transactional systems these record
> don't have values (NULL).
>
> The users want to generate aggregate reports with the Sales table...
>
> The question is:
> I have to put a "dummy" value in the dimensions Customer, Model and Time
> (for example "0") and put this value in the fact table if the dimensions
> fields have NULL values????
> Or I have to leave the NULL values?
>
> What is the best choice? Why?
>