
Business Analytics

Content from my Business Analytics Fall 2018 class at CU Boulder

Lab: Model Performance and Expected Value Framework

Question 1


Read this

You are a credit card company. “There are 900 million credit cards in circulation in the United States. According to the FTC September 2003 Identity Theft Survey Report, about 1% (10 million) cards are stolen and fraudulently used each year.”

Assume that you have a model that correctly classifies 90 out of every 100 actual positives but also incorrectly flags 10 out of every 100 actual negatives as positive. Create a confusion matrix, in millions (e.g., if the answer were “900 million”, type “900”). For the matrix, use a total i of 900 and take the 10 million stolen cards as the actual positives.

i=900   n      p
N       ____   ____
Y       ____   ____
p = actual yes
n = actual no
Y = predicted yes
N = predicted no
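
As a minimal sketch of how the four cells follow from a base rate and two error rates, the helper below fills a confusion matrix from the total, the number of actual positives, the true positive rate, and the false positive rate. The function name `confusion_matrix` and the numbers in the example are illustrative assumptions, not the lab's values; substitute the figures from the question to work the exercise.

```python
def confusion_matrix(total, positives, tpr, fpr):
    """Return (TP, FN, FP, TN) from the base rate and the two error rates."""
    negatives = total - positives
    tp = tpr * positives      # actual yes, predicted yes (Y, p)
    fn = positives - tp       # actual yes, predicted no  (N, p)
    fp = fpr * negatives      # actual no,  predicted yes (Y, n)
    tn = negatives - fp       # actual no,  predicted no  (N, n)
    return tp, fn, fp, tn


# Illustrative numbers only (assumed, not taken from the lab):
# 1,000 cases, 200 actual positives, TPR 0.75, FPR 0.25.
tp, fn, fp, tn = confusion_matrix(total=1000, positives=200, tpr=0.75, fpr=0.25)
print("      n      p")
print(f"N  {tn:6.1f} {fn:6.1f}")
print(f"Y  {fp:6.1f} {tp:6.1f}")
```

The four cells always sum to the total, which is a quick sanity check on a hand-built matrix.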

Question 2


Using the confusion matrix above and the description in the text, match the following metric to the correct value.

  1. TPR (aka recall aka sensitivity)
  2. FPR
  3. PPV (aka precision)
  4. F1-score

A. 0.1112

B. 0.20

C. 0.80

D. 0.10

E. 0.90

F. 0.5

G. 0.0918

H. 0.1666
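
The four metrics above can be computed directly from the cells of a confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are assumptions chosen for illustration and are not the lab's matrix.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute TPR, FPR, PPV and F1 from confusion-matrix counts."""
    tpr = tp / (tp + fn)                  # recall / sensitivity
    fpr = fp / (fp + tn)                  # share of actual negatives flagged
    ppv = tp / (tp + fp)                  # precision
    f1 = 2 * ppv * tpr / (ppv + tpr)      # harmonic mean of precision and recall
    return {"TPR": tpr, "FPR": fpr, "PPV": ppv, "F1": f1}


# Illustrative counts only (assumed, not taken from the lab).
metrics = classification_metrics(tp=150, fn=50, fp=200, tn=600)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that precision depends on the base rate: when actual positives are rare, even a small FPR can produce many more false positives than true positives.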

Question 3


You are a judge. You would like to use a model to predict whether someone is guilty or not. You have the following cost-benefit matrix:

b(Y,p) = 100
b(Y,n) = -150
b(N,p) = 0
b(N,n) = 0

Where

Y = guilty verdict
N = innocent verdict
p = actually guilty
n = actually innocent

That is to say, society “gains 100” if a guilty person goes to jail, “loses 150” if an innocent person goes to jail, “loses nothing” if a guilty person walks free, and “loses nothing” if an innocent person walks free. (Note that in reality, society does lose something if a guilty person walks free, but we’re making a simplifying assumption for this thought exercise.)

Your model makes a prediction for whether or not someone is guilty. At what minimum probability should you declare someone “guilty”?
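
One way to approach this kind of question: because both outcomes of an “innocent” verdict are worth zero here, a “guilty” verdict is worthwhile only when its expected value is positive. The sketch below solves for the break-even probability. The function names and the example benefits are hypothetical, chosen so as not to give away the answer; plug in the matrix above to work the exercise.

```python
def ev_guilty(p_guilty, b_yp, b_yn):
    """Expected value of a guilty verdict, given P(actually guilty)."""
    return p_guilty * b_yp + (1 - p_guilty) * b_yn


def break_even(b_yp, b_yn):
    """Probability at which ev_guilty is exactly zero.

    Solves p * b_yp + (1 - p) * b_yn = 0 for p.
    Assumes b_yp > 0 > b_yn, so the result lies between 0 and 1.
    """
    return -b_yn / (b_yp - b_yn)


# Hypothetical benefits (not the lab's): b(Y,p) = 50, b(Y,n) = -200.
threshold = break_even(b_yp=50, b_yn=-200)
print(f"declare guilty when p_guilty > {threshold:.2f}")
```

The more costly a wrongful conviction is relative to the benefit of a correct one, the closer the threshold moves to 1.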

Question 4


Now make the assumption that there actually is a cost to a guilty person walking free.

b(Y,p) = 100
b(Y,n) = -150
b(N,p) = -100
b(N,n) = 0

Where

Y = guilty verdict
N = innocent verdict
p = actually guilty
n = actually innocent

Before, we only had to consider the expected value of a “guilty” prediction, because an “innocent” prediction always had an expected value of zero. Now we need to model the expected value separately for a “guilty” prediction and for an “innocent” prediction.

Like this:

EV with "guilty" prediction = p(p) * b(Y, p)  + p(n) * b(Y, n)
EV with "not guilty" prediction = p(p) * b(N, p) + p(n) * b(N, n)

You should make a guilty verdict when “EV with guilty prediction” > “EV with not guilty prediction”.

At what probability of actually-guilty should you make a guilty verdict? In other words, p_guilty > _____?
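
Setting the two expected values equal and solving for the probability gives a general break-even rule. The sketch below implements it for an arbitrary cost-benefit matrix; the function names and the example matrix are hypothetical and differ from the lab's values, and the formula assumes that b(Y,p) > b(N,p) and b(N,n) > b(Y,n).

```python
def expected_value(p_guilty, b_if_guilty, b_if_innocent):
    """Expected value of one verdict, given P(actually guilty)."""
    return p_guilty * b_if_guilty + (1 - p_guilty) * b_if_innocent


def decision_threshold(b_yp, b_yn, b_np, b_nn):
    """Probability above which a guilty verdict has the higher expected value.

    Derived from p*b_yp + (1-p)*b_yn = p*b_np + (1-p)*b_nn.
    """
    gain_if_guilty = b_yp - b_np      # convicting vs. acquitting a guilty person
    loss_if_innocent = b_nn - b_yn    # acquitting vs. convicting an innocent person
    return loss_if_innocent / (gain_if_guilty + loss_if_innocent)


# Hypothetical matrix (not the lab's values).
b = dict(b_yp=50, b_yn=-200, b_np=-50, b_nn=0)
t = decision_threshold(**b)
print(f"declare guilty when p_guilty > {t:.4f}")

# Check the rule on either side of the threshold.
for p in (0.6, 0.7):
    ev_y = expected_value(p, b["b_yp"], b["b_yn"])
    ev_n = expected_value(p, b["b_np"], b["b_nn"])
    print(p, "guilty" if ev_y > ev_n else "not guilty")
```

Adding a cost for letting a guilty person walk free increases the gain from a correct conviction, so the threshold drops compared with a matrix where that cost is zero.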