all groups > sql server data mining > june 2006 >
You're in the

sql server data mining

group:

Cross Selling and predicting quantity sold?


Re: Cross Selling and predicting quantity sold? Jamie MacLennan (MS)
6/22/2006 3:28:44 PM
sql server data mining:
You need to make the quantity column predictable and not the nested table
(or nested key). Also make it input if you want to use it that way.

I would recommend creating one model to predict what's likely and another
model to predict quantities though. For quantities, you are going to want
to use Decision Trees - be sure to set your feature selection parameters
(MAXIMUM_INPUT ... etc) or many things will be feature selected away.

--

-Jamie MacLennan
SQL Server Data Mining
This posting is provided "AS IS" with no warranties, and confers no rights.
PLEASE POST NEW QUESTIONS AT
http://forums.microsoft.com/msdn/showforum.aspx?forumid=81&siteid=1
[quoted text, click to view]

Cross Selling and predicting quantity sold? anonymous_user NO[at]SPAM sqlserverdatamining.com
6/22/2006 7:20:07 PM
I've done a lot of web research and read a book on analysis services but still haven't found an example of how to develop a cross selling algorithm that can predict not only what is sold, but how much is predicted to be sold.

For instance,
If I know customer A has bought 5 units of item 1, 10 units of item 2, and 20 units of item 3, I may know there is a 50% chance they will also want to but item 4.

Is there any way to determine how much of item 4 they might buy?

For instnace, say Item 4 is batteries and Item1 is a clock radio. We know that if they buy one clock radio they are likley to buy 2 batteries, but if they buy 2 clock radios they are likley to buy 4 batteries, not 2. In each case, there is the same likleyhood that they buy at least some batteries.

If my data set looks like this (in my sales table)
Customer / Item / Quantity

How do I design the data mining model to account for quantity? I tried several approaches to this and can't seem to get it to work correctly when taking quantity into account.

The method I used takes the sale table and makes it a case table and nested table, treats the Customer from the cast table as a key, and the item from the nested table as a key, input, and predict. This is how the books I've read explain how to create a cross selling model. I don't understand how I make it take quantity of each item into account.

Would it be better to do this in a 2nd mining model and not use it in the cross selling model?

Thanks
Tom
Re: Cross Selling and predicting quantity sold? Jamie MacLennan (MS)
6/23/2006 10:51:41 AM
You're close, but you have a little ways to go yet.

First - MAXIMUM_INPUT_ATTRIBUTES sets the number of inputs that are
considered by the algorithm, not how many can be in a single case. For
example, if you use the default - 255 and you have 1000 products, only 255
of those products will be considered by the algorithm - the others won't
have any impact.

Next you have Item and Quantity as independent case level attributes. This
means that you have _one_ quantity attribute for all products. What you
likely meant to do is to have the quantity tied to the customer/item pair.
You could create a composite key which is Customer/Item, however, you are
still looking at a single Quantity measure for all products.

What you want to do is to simply make the quantity attributes in the nested
table predictable. For example, using DMX to describe the model, it would
look like this:

CREATE MINING MODEL QuantityPredict
(
CustomerID LONG KEY,
Products TABLE,
(
Item TEXT KEY,
Quantity LONG REGRESSOR PREDICT // Input and Predictable
)
) USING
Microsoft_Decision_Trees(MAXIMUM_INPUT_ATTRIBUTES=0,MAXIMUM_OUTPUT_ATTRIBUTES=0)

Then to get the quantity of a particular item you would issue a query like
this:

SELECT (SELECT * FROM Predict(Products) WHERE Item='Batteries') FROM
QuantityPredict
NATURAL PREDICTION JOIN
(SELECT (SELECT 'Radio' AS Item, 1 AS Quantity) as Products) as t

--

-Jamie MacLennan
SQL Server Data Mining
This posting is provided "AS IS" with no warranties, and confers no rights.
PLEASE POST NEW QUESTIONS AT
http://forums.microsoft.com/msdn/showforum.aspx?forumid=81&siteid=1
[quoted text, click to view]

Re: Cross Selling and predicting quantity sold? Jamie MacLennan (MS)
6/23/2006 2:57:31 PM
You would have to look at the resultant model - the data might not support
your hypothesis. Another thing to try is to play with the other
parameters - especially complexity penalty.

--

-Jamie MacLennan
SQL Server Data Mining
This posting is provided "AS IS" with no warranties, and confers no rights.
PLEASE POST NEW QUESTIONS AT
http://forums.microsoft.com/msdn/showforum.aspx?forumid=81&siteid=1
[quoted text, click to view]
Re: Cross Selling and predicting quantity sold? anonymous_user NO[at]SPAM sqlserverdatamining.com
6/23/2006 3:20:07 PM
Jamie,

Thanks for the response. I am trying, as you suggested, to create a second model which will predict the quantity of a particular item that someone may purchase. I'm trying to do this with nested tables again, and am coming up with the following model:

Sales table is treated as both a nested and case table:
Case Table:
Customer = Key
Item = Input
Quantity = PredictOnly
Nested Table:
Item = Key, Input
Quantity = Input

The predictions I'm getting don't seem to change based on the quantity of items in the nested table.

Is this the right way to answer the question? In the example I gave, the business question would be:

A customer currently has 2 clock radios in their basket. Our cross selling model (seperate model) tells us that their is a high liklihood they will want to buy batteries. How many batteries are they likely to buy given that we know how many clock radios they've got in their basket?

In this case I would run a prediction query which looks like:

Select FLATTENED
T.[Item],
PredictHistogram([Quantity])
FROM [Model]
NATURAL PREDICTION JOIN
( SELECT
'Batteries' as [Item],
( SELECT 'Clock Radio' AS [Item], 2 as Quantity
) AS [Basket])
AS t

What I would expect is for this to return that it expects 4 batteries to be purchased, since I know people who buy clock radio's normally buy 2x as many batteries as they do radios.

Am I writing this query incorrectly, or is my model wrong, or both? I'm fairly new to using nested tables in analysis.

Maximum Inputs is defaulted to 255, so as long as I don't have more than that many items in my basked none of them will be feature selected away, right?

Thanks,
Tom
[quoted text, click to view]
Re: Cross Selling and predicting quantity sold? anonymous_user NO[at]SPAM sqlserverdatamining.com
6/23/2006 8:20:08 PM
That makees sense, thanks!

I tried implementing as you suggested but these two DMX statements return the same results to me. Notice in one I specify the customer has 1 radio in their basket, adn in the next statement I specify the customer has 1,000 radios in their basket. What I'd expect to see is that the number of batteries they purchase should be 1,000x higher in the second example. The result I get appears to be closer to the average number of batteries that customers who purchase radios purchase, rather than varying based on the number of radios purcahsed.


SELECT (SELECT * FROM Predict(Products) WHERE Item='Batteries') FROM
QuantityPredict
NATURAL PREDICTION JOIN
(SELECT (SELECT 'Radio' AS Item, 1 AS Quantity) as Products) as t

SELECT (SELECT * FROM Predict(Products) WHERE Item='Batteries') FROM
QuantityPredict
NATURAL PREDICTION JOIN
(SELECT (SELECT 'Radio' AS Item, 1000 AS Quantity) as Products) as t


So, the quantity of the items in the basket (the ones in the nested table, in this case the radio) do not appear to be affecting the quantity of each additional item begin predicted, which is really what I need for it to do. In this case I'd want the first statement to predict 2 batteries and the second statement to predict 2,000 batteries.

Any ideas on what I'm missing here?

Thanks,
Tom


[quoted text, click to view]
Re: Cross Selling and predicting quantity sold? Jamie MacLennan (MS)
6/26/2006 10:33:18 AM
As I said in my previous response, you need to look at the model that was
created - due to the data, there may not be any significant relationship.

--

-Jamie MacLennan
SQL Server Data Mining
This posting is provided "AS IS" with no warranties, and confers no rights.
PLEASE POST NEW QUESTIONS AT
http://forums.microsoft.com/msdn/showforum.aspx?forumid=81&siteid=1
[quoted text, click to view]

RE: Cross Selling and predicting quantity sold? anonymous_user NO[at]SPAM sqlserverdatamining.com
6/26/2006 3:40:07 PM
This page does describe the method you suggestedI use, but I'm still having a problem getting the quantity predicted to change based on the quantity of other items purchased. My test data set is very small, could that be causing the problem? Again, what I would expect is that if it's predicting quantity then the predicted quantity should change based on the quantity of other items purchased. But for my example, no matter how many radios I put in the basked the quantity of batteries predicted stays the same and the percent chance of purchasing batteries stays the same too.

http://www.sqlserverdatamining.com/DMCommunity/TipsNTricks/1090.aspx

Any ideas on why this is the case would be appreciated.

Thanks,
Tom



[quoted text, click to view]
AddThis Social Bookmark Button