The issue you are running into is whether "missing" is treated as a value
(a state) in its own right, or as "missing at random", where it makes no
difference.
Your suggestion to use a nested table is a good one - I believe that the
clustering algorithm was written to ignore missing values when there are
only two states. That is, the existence of an attribute will be used to
pull a case into a cluster, but the absence of an attribute is not
considered.
You don't need to create an OLAP Mining Model to accomplish this - simply
create a relational model as usual with the wizard and add nested tables in
the mining editor. You may have to play around a bit with how you specify
columns, etc., to get the desired behavior.
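One way to prepare the data for the nested-table approach is to reshape each
wide record into one row per attribute that actually has a value, so a NULL
becomes an absent row rather than a "missing" state. This is only a rough
sketch in Python under assumed names (the case keys and Field1..Field3 are
hypothetical), and it does not show how the rows are bound to a nested table
in the mining editor.

```python
def to_nested_rows(cases):
    """Turn wide records into (case_id, attribute, value) rows.

    Attributes whose value is None are skipped, so a missing value is
    represented by the absence of a row rather than by a placeholder.
    """
    rows = []
    for case_id, fields in cases.items():
        for name, value in fields.items():
            if value is not None:
                rows.append((case_id, name, value))
    return rows


# Hypothetical sample data: case key -> column values (None stands for NULL).
cases = {
    1: {"Field1": 1, "Field2": 2, "Field3": 3},
    2: {"Field1": 1, "Field2": 7, "Field3": None},
    3: {"Field1": 9, "Field2": 8, "Field3": None},
}

nested = to_nested_rows(cases)
for row in nested:
    print(row)
```

The resulting rows would be the source for the nested table, keyed back to
the case table by the case id.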
--
-Jamie MacLennan
SQL Server Data Mining
This posting is provided "AS IS" with no warranties, and confers no rights.
"Cesar" <anonymous@discussions.microsoft.com> wrote in message
news:505201c42c7a$4021b680$a001280a@phx.gbl...
Thank you for your reply.
I understand your point, however I cannot filter records
that have null fields because I need them.
What I need is to make a segmentation based on the fields
that have values only. I did a segmentation but some of
the clusters have a description that says, for example:
ATTRIBUTE_NAME="Field1 Val" ATTRIBUTE_VALUE="missing"
SUPPORT="24.122610171863009" PROBABILITY="0.89322856501640546"
I need to ignore missing values. For example, if you
have the following vectors:
(1,2,3)
(1,7,null)
(9,8,null)
(1,2,3) and (1,7,null) must be in the same cluster and
(9,8,null) could be in another cluster.
In other words, when a record has a missing value it must
be "projected" across the other dimensions in order to
find the closest cluster.
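The projection described above can be sketched as a distance that compares
only the dimensions a record actually has. This is a minimal illustration in
Python, not a description of how the Microsoft Clustering algorithm works
internally; the function names and the two centroids are hypothetical.

```python
import math


def partial_distance(record, centroid):
    """Euclidean distance over the dimensions where the record has a value."""
    pairs = [(r, c) for r, c in zip(record, centroid) if r is not None]
    if not pairs:
        raise ValueError("record has no observed values")
    return math.sqrt(sum((r - c) ** 2 for r, c in pairs))


def closest_cluster(record, centroids):
    """Index of the centroid nearest to the record on its observed dimensions."""
    return min(
        range(len(centroids)),
        key=lambda i: partial_distance(record, centroids[i]),
    )


# Hypothetical centroids, chosen only to illustrate the example vectors.
centroids = [(1, 2, 3), (9, 8, 3)]
records = [(1, 2, 3), (1, 7, None), (9, 8, None)]

assignments = [closest_cluster(r, centroids) for r in records]
print(assignments)
```

With these assumed centroids, the first two records fall in the same cluster
and the third falls in the other one.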
Do you know if it is possible to do this in a relational
mining model? Or is it better to try with an OLAP or
nested table mining model?
Thanks,
César
>-----Original Message-----
>You can set the column to NOT NULL and it will return an error on
>processing - but that doesn't sound like what you want.
>
>I'm not sure I understand what you're asking for - NULL is missing - the
>only way we could ignore NULL is to either treat it as missing (which we do)
>or throw out the whole case that has the NULL value (which we don't do).
>You can do the latter by filtering the data with a where clause, tho.
>
>--
>
>-Jamie MacLennan
>SQL Server Data Mining
>This posting is provided "AS IS" with no warranties, and confers no rights.
>"Cesar" <anonymous@discussions.microsoft.com> wrote in message
>news:480001c42bf8$212679f0$a601280a@phx.gbl...
>It seems that this clause is not supported in SQL Server 2000.
>
>Do you know how I can ignore null values in a Clustering
>mining model?
>
>Thanks,
>César