Business Analytics

Content from my Business Analytics Fall 2018 class at CU Boulder

Lab: Model Performance and Expected Value Framework

Question 1

You are a credit card company. “There are 900 million credit cards in circulation in the United States. According to the FTC September 2003 Identity Theft Survey Report, about 1% (10 million) cards are stolen and fraudulently used each year.”

Assume that you have a model that correctly classifies 90 out of every 100 actual positives but incorrectly flags 10 out of every 100 actual negatives as positive. Create a confusion matrix with values in millions (e.g., if the answer were “900 million”, type “900”). Use 10 million as the number of actual positives and a matrix total i of 900.

i=900	n	p
N
Y

p = actual yes
n = actual no
Y = predicted yes
N = predicted no
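
Each cell of the matrix follows from the class sizes and the two model rates. Here is a minimal sketch of that arithmetic, assuming the 10 million fraudulent cards from the quote (not exactly 1% of 900 million); the function name `confusion_matrix` and its parameters are made up for illustration.

```python
def confusion_matrix(total, positives, tpr, fpr):
    """Return (TP, FN, FP, TN) given class sizes and the model's rates."""
    negatives = total - positives
    tp = tpr * positives   # actual positives the model flags (Y, p)
    fn = positives - tp    # actual positives the model misses (N, p)
    fp = fpr * negatives   # actual negatives wrongly flagged (Y, n)
    tn = negatives - fp    # actual negatives correctly passed (N, n)
    return tp, fn, fp, tn


# Assumption: 10 million stolen cards out of 900 million, values in millions.
tp, fn, fp, tn = confusion_matrix(total=900, positives=10, tpr=0.90, fpr=0.10)

print("i=900\tn\tp")
print(f"N\t{tn:g}\t{fn:g}")
print(f"Y\t{fp:g}\t{tp:g}")
```

The four cells should always sum back to the total, which is a quick way to check a hand-built matrix.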

Question 2

Using the confusion matrix above and the description in the text, match the following metric to the correct value.

TPR (aka recall aka sensitivity)
FPR
PPV (aka precision)
F1-score

A. 0.1112

B. 0.20

C. 0.80

D. 0.10

E. 0.90

F. 0.5

G. 0.0918

H. 0.1666
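
All four metrics are ratios of confusion-matrix cells. The sketch below shows the standard definitions; the helper name `classification_metrics` is hypothetical, and the counts in the usage example are made-up illustrative numbers, not the lab's matrix.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute TPR, FPR, PPV and F1 from confusion-matrix counts."""
    tpr = tp / (tp + fn)               # recall / sensitivity
    fpr = fp / (fp + tn)               # share of actual negatives flagged
    ppv = tp / (tp + fp)               # precision
    f1 = 2 * ppv * tpr / (ppv + tpr)   # harmonic mean of precision and recall
    return {"TPR": tpr, "FPR": fpr, "PPV": ppv, "F1": f1}


# Illustrative counts only (assumed for this example, not taken from the lab).
metrics = classification_metrics(tp=40, fn=10, fp=20, tn=130)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that TPR and FPR are each computed within one actual class, while PPV mixes both classes, so PPV can be low even when TPR is high if negatives vastly outnumber positives.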

Question 3

You are a judge. You would like to use a model to predict whether someone is guilty or not. You have the following cost-benefit matrix:

b(Y,p) = 100
b(Y,n) = -150
b(N,p) = 0
b(N,n) = 0

Where

Y = guilty verdict
N = innocent verdict
p = actually guilty
n = actually innocent

That is to say, society “gains 100” if a guilty person goes to jail, “loses 150” if an innocent person goes to jail, “loses nothing” if a guilty person walks free, and “loses nothing” if an innocent person walks free. (In reality, society does lose something if a guilty person walks free, but we’re making a simplifying assumption for this thought exercise.)

Your model makes a prediction for whether or not someone is guilty. At what minimum probability should you declare someone “guilty”?
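
Under this matrix an innocent verdict always has an expected value of 0, so a guilty verdict is worthwhile only when its own expected value is positive. A rough sketch of that comparison at a few sample probabilities follows; the function name and the sample probabilities are made up for illustration.

```python
B_YP = 100    # b(Y, p): guilty verdict, actually guilty
B_YN = -150   # b(Y, n): guilty verdict, actually innocent


def ev_guilty_verdict(p_guilty):
    """Expected value of a guilty verdict when P(actually guilty) = p_guilty."""
    return p_guilty * B_YP + (1 - p_guilty) * B_YN


# Sample probabilities chosen arbitrarily for illustration.
for p in (0.25, 0.50, 0.75):
    ev = ev_guilty_verdict(p)
    verdict = "guilty" if ev > 0 else "innocent"
    print(f"p_guilty={p:.2f}  EV(guilty verdict)={ev:+.1f}  -> {verdict}")
```

The minimum probability asked for is the point where this expected value crosses zero.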

Question 4

Now make the assumption that there actually is a cost to a guilty person walking free.

b(Y,p) = 100
b(Y,n) = -150
b(N,p) = -100
b(N,n) = 0

Where

Y = guilty verdict
N = innocent verdict
p = actually guilty
n = actually innocent

Before, we only had to consider the expected value of a “guilty” prediction, because an “innocent” prediction always had an expected value of 0. Now, we need to model the expected value of a “guilty” prediction and of an “innocent” prediction separately.

Like this:

EV with "guilty" prediction = p(p) * b(Y, p)  + p(n) * b(Y, n)
EV with "not guilty" prediction = p(p) * b(N, p) + p(n) * b(N, n)

You should return a guilty verdict when “EV with guilty prediction” > “EV with not guilty prediction”.

At what probability of actually-guilty should you make a guilty verdict? In other words, p_guilty > _____?
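
Setting the two expected values equal and solving for the probability gives a break-even point for any cost-benefit matrix. The sketch below assumes p(n) = 1 - p(p), and also assumes that a guilty verdict is the better call for the actually guilty and the worse call for the actually innocent; the function name `break_even_probability` is hypothetical.

```python
def break_even_probability(b_Yp, b_Yn, b_Np, b_Nn):
    """Probability of guilt at which both verdicts have equal expected value.

    Solves p*b_Yp + (1-p)*b_Yn = p*b_Np + (1-p)*b_Nn for p.
    Assumes b_Yp > b_Np and b_Nn > b_Yn, so the denominator is positive.
    """
    gain_if_guilty = b_Yp - b_Np     # benefit of convicting the guilty
    loss_if_innocent = b_Nn - b_Yn   # cost of convicting the innocent
    return loss_if_innocent / (gain_if_guilty + loss_if_innocent)


# Cost-benefit matrix from Question 4.
threshold = break_even_probability(b_Yp=100, b_Yn=-150, b_Np=-100, b_Nn=0)
print(f"Return a guilty verdict when p_guilty > {threshold:.4f}")
```

Adding a cost for letting a guilty person walk free makes the guilty verdict relatively more attractive, so the threshold drops compared with a matrix where b(N,p) = 0.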