Prob 1: Experiment with three different training/test splits and see how each split changes the values of b0 and b1 and the values in the confusion matrix.
- 50 % training data – 50 % test data
- 80 % training data – 20 % test data
- 90 % training data – 10 % test data
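One way to set up the experiment is sketched below. It assumes the model is a one-variable logistic regression (the exercise does not name the method), uses made-up synthetic data instead of the course data set, and fits the coefficients with a plain gradient-descent loop. The helper names `fit_logistic` and `confusion`, the learning rate and the number of epochs are illustrative choices, not part of the exercise.

```python
import math
import random

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.5, epochs=1000):
    """Fit p = sigmoid(b0 + b1*x) by batch gradient descent on the log-loss.
    lr and epochs are untuned, illustrative settings."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

def confusion(xs, ys, b0, b1, threshold=0.5):
    """Return [[TN, FP], [FN, TP]] (rows = actual class, columns = predicted)."""
    m = [[0, 0], [0, 0]]
    for x, y in zip(xs, ys):
        pred = 1 if sigmoid(b0 + b1 * x) >= threshold else 0
        m[y][pred] += 1
    return m

# Synthetic data, made up for illustration: y = 1 becomes more likely as x grows.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(-3, 3)
    y = 1 if rng.random() < sigmoid(0.5 + 1.5 * x) else 0
    data.append((x, y))
rng.shuffle(data)

results = {}
for train_frac in (0.5, 0.8, 0.9):
    cut = round(len(data) * train_frac)
    train, test = data[:cut], data[cut:]
    b0, b1 = fit_logistic([x for x, _ in train], [y for _, y in train])
    cm = confusion([x for x, _ in test], [y for _, y in test], b0, b1)
    results[train_frac] = (b0, b1, cm)
    print(f"train={train_frac:.0%}  b0={b0:.3f}  b1={b1:.3f}  confusion={cm}")
```

With a 90 % / 10 % split the confusion matrix is built from only a tenth of the rows, so its entries tend to vary more from one shuffle to the next than with a 50 % / 50 % split; rerunning with a different seed is a quick way to see this.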
Prob 2: You run a chain of luxury clothing stores and have printed 5000 colourful catalogs, each containing a coupon that gives a 50 € discount on purchases of 200 € or more. To save on expenses, you would like to send the catalogs only to the customers with the highest probability of using the coupon.
- Column Spending: last year’s total spending
- Column Card: whether they have your credit card
- Column Coupon: whether they used the previous promotional coupon they received, coded as promotion responders (1) or non-responders (0).
- Spending and Card: explanatory variables (X)
- Coupon: output variable (y).
a) Find the values of b0, b1 and b2 and write the model in the equation form p̂ = b0 + b1x1 + ... + bqxq. Use the equation to predict which customers should get the catalog offer.
b) Print the confusion matrix and comment on how well the model classifies customers. What percentage of customers does the model misclassify?
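The steps in a) and b) can be sketched as follows once the coefficients are known. This is a minimal illustration, assuming the model is a logistic regression, in which case the linear expression b0 + b1x1 + b2x2 is passed through the logistic function to give a probability (the exercise does not state this explicitly). The coefficients `B0`, `B1`, `B2` and the six customers are hypothetical values chosen for the example, not results from the course data, and `n_catalogs = 2` stands in for the 5000 catalogs.

```python
import math

def coupon_probability(spending, card, b0, b1, b2):
    """Estimated probability of coupon use, assuming a logistic model:
    p-hat = 1 / (1 + exp(-(b0 + b1*spending + b2*card)))."""
    z = b0 + b1 * spending + b2 * card
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for illustration only; NOT fitted to the course data.
B0, B1, B2 = -2.0, 0.0005, 1.0

# Hypothetical customers: (id, Spending in euros, Card 0/1, Coupon 0/1).
customers = [
    ("A", 1000, 0, 0),
    ("B", 4000, 1, 1),
    ("C", 2500, 0, 1),
    ("D", 800, 1, 0),
    ("E", 5000, 0, 1),
    ("F", 300, 0, 0),
]

# a) Score every customer and send catalogs to the highest probabilities.
scored = [(cid, coupon_probability(s, c, B0, B1, B2)) for cid, s, c, _ in customers]
scored.sort(key=lambda pair: pair[1], reverse=True)
n_catalogs = 2  # stand-in for the 5000 catalogs in the problem
recipients = [cid for cid, _ in scored[:n_catalogs]]

# b) Classify at a 0.5 threshold and build [[TN, FP], [FN, TP]]
#    (rows = actual Coupon value, columns = predicted value).
cm = [[0, 0], [0, 0]]
for _, s, c, actual in customers:
    predicted = 1 if coupon_probability(s, c, B0, B1, B2) >= 0.5 else 0
    cm[actual][predicted] += 1

total = sum(cm[0]) + sum(cm[1])
misclassified = cm[0][1] + cm[1][0]  # false positives + false negatives
rate = misclassified / total

print("send catalogs to:", recipients)
print("confusion matrix:", cm)
print(f"misclassified: {rate:.1%}")
```

Ranking by probability and classifying at a fixed threshold answer different questions: the ranking decides who receives the limited number of catalogs, while the 0.5 threshold is only one possible cut-off for judging classification quality in the confusion matrix.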