Hi Wayan,
There is no limit on the number of cases per se, but you need to be very careful with three parameters.
The first, which you have no control over, is the number of unique items in your dataset. You won't be able to do the same type of analysis with 1,000,000 products as you would with 1,000 products in your database.
The second parameter is MINIMUM_SUPPORT. Always start by setting this to a high value (0.50), and then decrease it until you find the right setting for your case.
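To illustrate why starting high and lowering the threshold helps, here is a minimal Python sketch. It is not the server's implementation; the toy baskets and the helper name frequent_itemsets are made up for illustration, and support is assumed to be a fraction of cases, as the 0.50 above suggests.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets; real cases would come from your own tables.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequent_itemsets(baskets, min_support, size):
    """Return itemsets of the given size whose support, measured as the
    fraction of baskets containing them, is at least min_support."""
    counts = Counter()
    for basket in baskets:
        # Sort so the same itemset always produces the same tuple key.
        for itemset in combinations(sorted(basket), size):
            counts[itemset] += 1
    n = len(baskets)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}

# Lowering the threshold lets more itemsets through, never fewer.
for threshold in (0.50, 0.25):
    found = frequent_itemsets(baskets, threshold, 2)
    print(threshold, len(found))
```

Each step down in the threshold can only add itemsets to the result, which is why a setting that is too low on a large dataset can produce far more results than you can use.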
The third parameter is MAXIMUM_ITEMSET_SIZE. If you set this to 2, you will only look at pairwise correlations, which is prudent if you have millions of different products. I think the default is 3, and that may be too high in your case.
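As a rough back-of-the-envelope check, the following Python sketch counts how many distinct itemsets could exist for a given number of unique items and a given maximum itemset size. The function name candidate_itemsets is made up for illustration, and the figure is only an upper bound: a real algorithm prunes by support and never enumerates all of them.

```python
from math import comb

def candidate_itemsets(num_items, max_size):
    """Upper bound on the number of distinct itemsets of size 1..max_size
    that can be formed from num_items unique items."""
    return sum(comb(num_items, k) for k in range(1, max_size + 1))

# Compare a small and a large catalogue at itemset sizes 2 and 3.
for n in (1_000, 1_000_000):
    for k in (2, 3):
        print(f"{n:>9,} items, max size {k}: {candidate_itemsets(n, k):,}")
```

The count grows roughly as the number of items raised to the power of the maximum itemset size, so going from size 2 to size 3 matters far more with a million products than with a thousand.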
Unfortunately, the algorithm is limited by the fact that the results have to fit in main memory. It uses a clever data structure for this, but still, it's not unlimited.
Hope this helps,
Jesper Lind
Microsoft Research
> Hi Jamie and All,
>
>
>
> I have been using SQL Server 2005 data mining for quite a while. They work
> fine until recently when I use a big database. For example for the market
> basket analysis, we have 3.6 million of cases and it failed to process the
> mining model. It ran for 2 days and the server was still processing (but it
> actually looked freezing). Is there any limitation on the number of cases
> for this algorithm? I am sure there is no such kind of limitation. So then
> something wrong with our data or the mining model? The mining structure is
> very simple and it is similar with the sample from Jamie's book on Data
> Mining with SQL Server 2005.
>
>
>
> Thanks,
>
> Wayan
>