Microsoft Decision Tree: how it works exactly?



Microsoft Decision Tree: how it works exactly? Peter
9/26/2003 1:31:57 PM
Hi,

I am using the built-in Microsoft Decision Tree to perform
some data mining tasks on my Analysis Server (SP3). I have
some difficulty understanding how it picks the node to split,
how it splits, and when it terminates. I'd really like to
know how the algorithm works.

Another question: is there any way to control the
splitting and terminating conditions from the Analysis
Manager?

The last question: I read that the Analysis Manager
(SP3) supports third-party algorithm plug-ins. Does
that mean I can develop my own mining algorithm and call
it from the Analysis Manager? If so, where can I find out
how to do that?

It might be a lot to ask. I really appreciate any input.

Thanks,

Peter

Re: Microsoft Decision Tree: how it works exactly? Jamie MacLennan (MS)
9/28/2003 4:52:47 PM
You can find many of your answers in the FAQ at
http://groups.msn.com/AnalysisServicesDataMining

In particular you can control the way a tree splits and how deep it is with
the SPLIT_METHOD and COMPLEXITY_PENALTY parameters. At the above website,
there is a sample AM plug-in that provides a user-interface for setting
algorithm parameters.

I believe you can also find the link to the sample OLE DB for Data Mining
provider that you can use as a basis for your own algorithms.

--
Jamie MacLennan
SQL Server Data Mining

This posting is provided "AS IS" with no warranties, and confers no rights.
-----------------------------------------------------------------


Re: Microsoft Decision Tree: how it works exactly? Peter
9/29/2003 3:11:08 PM
Jamie,

Thanks for the reply. I am aware of that website, but when I
went there, the page could not be accessed. Can you
confirm that it is still there?

http://groups.msn.com/AnalysisServicesDataMining/Documents/Files/FAQ%2Ehtm

Thanks,

Peter

Re: Microsoft Decision Tree: how it works exactly? Jamie MacLennan (MS)
9/30/2003 10:10:59 AM
Try this link http://groups.msn.com/analysisservicesdatamining/faq.msnw

Re: Microsoft Decision Tree: how it works exactly? Peter
9/30/2003 11:36:38 AM
Jamie,

Thanks a lot for the reply. The link works just fine.

I have downloaded the following package

DataMiningAddIns.exe

and unzipped it. According to the readme.txt, I closed the
running application Analysis Manager and ran the

DataMiningAddIn.reg

After that, I got a message saying that the registration
was successful.

Then I started the Analysis Manager again. But here is
the problem: when I right-click the mining models of one
database, say "Mushrooms", I cannot see "Advanced
Model Properties" in the list.

When I right-click the server name and select Properties,
the "Add-ins" tab shows "Mining model properties" as an
available add-in, but there is also a yellow sign
saying "this tab applies to the local computer only".

I am not sure what went wrong. Any suggestions?

I am using SQL Server 2000 (SP3), Analysis Manager SP3 on
Windows 2000, FYI.

Regards,

Peter

Re: Microsoft Decision Tree: how it works exactly? Peter Kim [MS]
9/30/2003 1:38:52 PM
Yes, I just tried and it is still there.

http://www.msnusers.com/AnalysisServicesDataMining/Documents/Files%2FFAQ.htm
or
http://www.msnusers.com/AnalysisServicesDataMining/faq.msnw

--
Peter Kim
This posting is provided "AS IS" with no warranties, and confers no rights.


Re: Microsoft Decision Tree: how it works exactly? Peter
10/1/2003 1:34:58 PM
Jamie and Peter,

Thanks a lot for the reply. The link works just fine.

I have downloaded the following package

DataMiningAddIns.exe

and unzipped it. According to the readme.txt, I closed the
running application Analysis Manager and ran the

DataMiningAddIn.reg

After that, I got a message saying that the registration
was successful.

Then I started the Analysis Manager again. But here is
the problem: when I right-click the mining models of one
database, say "Mushrooms", I cannot see "Advanced
Model Properties" in the list.

When I right-click the server name and select Properties,
the "Add-ins" tab shows "Mining model properties" as an
available add-in, but there is also a yellow sign
saying "this tab applies to the local computer only".

I am not sure what went wrong. Any suggestions?

I am using SQL Server 2000 (SP3), Analysis Manager SP3 on
Windows 2000, FYI.

Regards,

Peter

Re: Microsoft Decision Tree: how it works exactly? Raman Iyer [MS]
10/1/2003 5:05:17 PM
You also need to register the DataMiningAddIns.dll.

--
Raman Iyer
SQL Server Data Mining
[Please do not send email directly to this alias. This alias is for
newsgroup purposes and is intended to prevent automated spam. This posting
is provided "AS IS" with no warranties, and confers no rights.]


Re: Microsoft Decision Tree: how it works exactly? Peter
10/1/2003 6:13:41 PM
It is now working! Thank you all, Peter, Jamie and Raman.

I checked out the two papers (shown below)
listed on the FAQ in answer to the question "Where do I
get the details of the two algorithms?" However, they
don't seem to address very clearly what criteria are
used to stop the splitting, which node to split, and how
continuous values are discretized. Is there any other
document that addresses these issues better? I know there are
many academic papers on these issues, but my
main concern is how they are handled in Microsoft Decision
Tree.

Papers I read:
=====================================================
- Correlation counting:
Surajit Chaudhuri, Usama M. Fayyad, Jeff Bernhardt,
Scalable Classification over SQL Databases. ICDE 1999: 470-479
Found in
http://ftp.research.microsoft.com/Users/surajitc/icde99.pdf

- The default scoring methods (Bayesian Dirichlet
Equivalent with Uniform prior):
David M. Chickering; Dan Geiger; David Heckerman,
Learning Bayesian Networks: The Combination of Knowledge
and Statistical Data, MSR-TR-94-09, 1994
Found in
http://www.research.microsoft.com/scripts/pubdb/pubsasp.asp?recordID=81
=======================================================

Another thing: when I try to get information on how
to plug in a third-party algorithm from the following link

http://www.microsoft.com/sql/techinfo/BI/2000/dmproviderswp.asp

I get a "Page not found" error. Did I miss something?

Thanks for any input.

Peter

Re: Microsoft Decision Tree: how it works exactly? Peter Kim [MS]
10/2/2003 1:49:19 PM
The tree stops splitting when no candidate split gives a better
split score any longer. The research paper describes how we calculate
the split score. As for the split method, we have three different
methods implemented: simple BINARY, COMPLETE, and BOTH.
Simple BINARY produces split conditions like Hobby=golf and
Hobby!=golf, while COMPLETE produces Hobby=golf,
Hobby=tennis, and so on. BOTH takes the better of the
two methods for each split. The split method can be specified
using the SPLIT_METHOD parameter.
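
To make the difference between the split methods concrete, here is a minimal Python sketch that only enumerates the candidate partitions for one discrete input. It is an illustration, not the Analysis Services implementation: the function names (binary_splits, complete_split, all_candidates) and the sample cases are hypothetical, and the scoring step that BOTH would use to pick a winner is left out.

```python
from collections import defaultdict


def binary_splits(cases, attribute):
    """BINARY style: one candidate per value, 'attribute = v' vs. 'attribute != v'."""
    values = sorted({case[attribute] for case in cases})
    candidates = []
    for value in values:
        equal = [c for c in cases if c[attribute] == value]
        not_equal = [c for c in cases if c[attribute] != value]
        candidates.append({
            f"{attribute}={value}": equal,
            f"{attribute}!={value}": not_equal,
        })
    return candidates


def complete_split(cases, attribute):
    """COMPLETE style: a single candidate with one branch per distinct value."""
    branches = defaultdict(list)
    for case in cases:
        branches[f"{attribute}={case[attribute]}"].append(case)
    return dict(branches)


def all_candidates(cases, attribute):
    """BOTH style: the pool a scorer would choose from (scoring not shown)."""
    return binary_splits(cases, attribute) + [complete_split(cases, attribute)]


# Hypothetical sample cases, made up for this illustration.
cases = [
    {"Hobby": "golf", "Buys": "yes"},
    {"Hobby": "golf", "Buys": "yes"},
    {"Hobby": "tennis", "Buys": "no"},
    {"Hobby": "chess", "Buys": "no"},
]

for candidate in all_candidates(cases, "Hobby"):
    print({condition: len(rows) for condition, rows in candidate.items()})
```

With three distinct Hobby values this yields three two-way candidates and one three-way candidate, which is the pool that a BOTH-style search would score.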

There is also a parameter, COMPLEXITY_PENALTY, that
controls the tree depth by penalizing the split score.
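
The interaction between a penalty and the stopping rule can be sketched as follows. This is only an illustration under loud assumptions: it uses entropy-based information gain as a stand-in for the Bayesian Dirichlet score from the paper, and it subtracts a fixed amount per extra branch, which is not necessarily how COMPLEXITY_PENALTY enters the real score. The names penalized_gain and choose_split are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def penalized_gain(parent, branches, penalty):
    """Information gain minus an assumed fixed penalty per extra branch."""
    total = len(parent)
    remainder = sum(len(b) / total * entropy(b) for b in branches if b)
    return entropy(parent) - remainder - penalty * (len(branches) - 1)


def choose_split(parent, candidates, penalty):
    """Return the best candidate, or None when no candidate scores above zero."""
    best = max(candidates, key=lambda b: penalized_gain(parent, b, penalty), default=None)
    if best is None or penalized_gain(parent, best, penalty) <= 0:
        return None  # no split improves the score, so the node becomes a leaf
    return best


parent = ["yes", "yes", "no", "no"]
informative = [["yes", "yes"], ["no", "no"]]    # separates the classes perfectly
uninformative = [["yes", "no"], ["yes", "no"]]  # separates nothing

print(choose_split(parent, [informative, uninformative], penalty=0.1))
print(choose_split(parent, [informative, uninformative], penalty=1.5))
```

In this toy setup a small penalty still allows the informative split, while a large penalty turns the same node into a leaf, which is the sense in which a higher penalty yields a shallower tree.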

Continuous inputs are handled differently from discrete ones.
For each node to split, we collect a sample of cases for the candidate
continuous inputs and find the best cut-points in some way.
So, it is different from a DISCRETIZED attribute.
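
The post does not say how the cut-points are found, so the following Python sketch shows one common approach rather than the product's actual method: take a sample of the cases at the node, sort it by the continuous input, and test the midpoint between each pair of adjacent distinct values, keeping the one with the highest information gain. The function name best_cut_point, the sample_size default, and the use of information gain as the score are all assumptions.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_cut_point(cases, sample_size=1000, seed=0):
    """cases: list of (value, label) pairs. Returns (cut_point, gain) or None."""
    rng = random.Random(seed)
    # Work on a sample when the node holds more cases than sample_size.
    sample = cases if len(cases) <= sample_size else rng.sample(cases, sample_size)
    sample = sorted(sample)
    labels = [label for _, label in sample]
    base = entropy(labels)
    best = None
    for i in range(1, len(sample)):
        low, high = sample[i - 1][0], sample[i][0]
        if low == high:
            continue  # no boundary between identical values
        left, right = labels[:i], labels[i:]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if best is None or gain > best[1]:
            best = ((low + high) / 2, gain)
    return best


# Hypothetical cases: (Age, Buys), made up for this illustration.
ages = [(22, "no"), (25, "no"), (31, "no"), (40, "yes"), (47, "yes"), (52, "yes")]
print(best_cut_point(ages))
```

Because the search is repeated at every node, the chosen cut-points can differ from branch to branch, unlike a DISCRETIZED attribute whose buckets are fixed once for the whole model.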

Let me know if you have more questions.
--
Peter Kim
This posting is provided "AS IS" with no warranties, and confers no rights.
