all groups > sql server data mining > october 2005 >
You're in the

sql server data mining

group:

Mailing list response - data model question


Mailing list response - data model question Richard
10/31/2005 11:50:23 AM
sql server data mining:
Hi All,

I'm trying to create a data mining model to allow me to predict response
rates for a mail shot, based on the response to many other mailshots,
and hence to generate the next mailing list.

My question is about the underlying fact table structure - is the
following structure sensible, and if not why not?

Server: SQL 2005 and AS 2005.
Table: "Events_Fact" - includes orders and mailings
Columns:
Time_ID - The date of the event
Event_ID - Type of event, either an Order, or a mailshot sent
Customer_ID - Customer receiving mail, or placing order
Product_ID - either a real product for order lines,
or a mailshot identifier for mailings
Qty - Qty for Orders, 1 for mailshots
Value - Value for Orders, 0 for mailshots
PrevOrderDays - Number of days since the previous order
placed by the customer
PrevMailDays - Number of days since the last mailshot sent to
the customer.

Dimension tables are time, customer, product (including mailing list
codes) and have the other customer predictive attributes such as
location, gender etc.

What I would like to be able to do is to mine this data to get those
customers who respond in the shortest time, spend the most, repeat
order, etc. etc. We can then tailor the next list for maximum return. We
would also like to analyse by recency of purchase, as this is a known
predictor of response.

Is this a sensible structure? Will it support the analysis I want to do?
I think it will, but although I've done a lot of OLAP analysis over the
years, I'm not so familiar with data mining. I am pretty convinced I
need to create a bespoke cube to include mailshots and orders, otherwise
I never see people who don't respond to a mailshot in a normal orders cube.

If this is not sensible, can anyone suggest something better?

Thanks is advance,


Re: Mailing list response - data model question Jamie MacLennan (MS)
11/1/2005 10:20:08 AM
Actually, for data mining you don't need to create a cube at all. What you
want is a table that has all of the information for a particular customer
for a mailshot. The most important piece of information was that they
responded.

Basically you want to create a table that has each customer id, all
information about the customer including specific information pertinent to
that mailshot, and whether or not they responded. Then create a model to
predict if they respond. This is identical to the Bike Buyer example that
ships with SQL 2005.

For predicting response time, do as above, but only use customers that
actually responded. The predictable variable in this case would be the time
they responded. You might want to try two different scenarios - one where
you predict the continuous amount of time, which will give you a regression
tree with regression formulae, and another where you discretize the time
amount - possibly into three buckets - which will give you a classification
tree seperating quick, medium, and long responses

--

-Jamie MacLennan
SQL Server Data Mining
This posting is provided "AS IS" with no warranties, and confers no rights.
[quoted text, click to view]

Re: Mailing list response - data model question Richard
11/11/2005 10:32:59 PM
Hi Jamie,

Thanks for that. I'd seen the bike buyer example, but not really studied
it before. Now I think I get the picture.

Sorry I didn't reply earlier, I can only read newsgroups from home, and
not had any chance in the last two weeks.

Regards,

Richard.

[quoted text, click to view]
AddThis Social Bookmark Button