For Data Science positions, interviewers often ask probability questions that require some knowledge of statistics. Today, we will work through an example of the Binomial Distribution where we need to solve for the number of trials, n.
Question
Joe is a recreational basketball player. His success rate on three-point shots is 20%. How many three-point shots does he need to attempt so that the probability of scoring at least one is greater than 95%?
Answer
Let \(X\) be the random variable for the number of three-pointers Joe scores in \(n\) attempts. Assuming the attempts are independent, \(X\) follows the Binomial Distribution with success probability \(p=0.2\), i.e. \(X \sim B(n, 0.2)\).
Notice that the pmf of the Binomial Distribution is given by the formula \(P(X=k)={n \choose k}p^k (1-p)^{n-k} \). Thus, we have:
\(P(X \geq 1)>0.95\) or
\(1-P(X =0)>0.95\) or
\(P(X =0)<0.05\) or
\( {n \choose 0} 0.2^0(1-0.2)^n<0.05\) or
\( 0.8^n<0.05\) or
\( n \ln{(0.8)}< \ln{(0.05)}\) or, dividing both sides by \(\ln{(0.8)}\), which is negative and therefore flips the inequality,
\( n >\frac{ \ln{(0.05)} }{ \ln{(0.8)} }\) or
\( n > 13.42\)
Since we want the smallest integer that satisfies this inequality, we take the ceiling, which gives \(n=14 \).
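The closed-form step above can be sketched in a few lines of Python. This is only an illustration: the function name `min_attempts` and its parameters are our own, and it assumes independent attempts with a fixed success probability. It uses floor plus one rather than the ceiling so that the strict inequality still holds in the edge case where the bound happens to be an exact integer; for this problem both give the same answer.

```python
import math


def min_attempts(p: float, target: float) -> int:
    """Smallest n such that 1 - (1 - p)**n > target (hypothetical helper)."""
    # Solve (1 - p)**n < 1 - target for n; both logs are negative,
    # so the ratio is positive and n must exceed it.
    bound = math.log(1 - target) / math.log(1 - p)
    # floor + 1 is the smallest integer strictly greater than the bound.
    return math.floor(bound) + 1


print(min_attempts(0.2, 0.95))  # 14
```

Changing `p` or `target` lets you answer the same question for a different shooter or a different confidence level.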
You can confirm that n=14 is correct by typing the following in Excel:
=1-BINOM.DIST(0,14,0.2,TRUE)
# returns 0.95602, which is greater than 0.95. If Joe attempts 13 shots instead of 14, his probability becomes:
=1-BINOM.DIST(0,13,0.2,TRUE)
# returns 0.94502, which is less than our target of 0.95.
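If you prefer Python to a spreadsheet, the same check can be sketched with the standard library only. The helper `binom_pmf` below is our own illustrative name; it simply implements the pmf formula given earlier, under the same independence assumption.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p), straight from the pmf formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of scoring at least once in 14 and in 13 attempts.
p_at_least_one_14 = 1 - binom_pmf(0, 14, 0.2)
p_at_least_one_13 = 1 - binom_pmf(0, 13, 0.2)

print(round(p_at_least_one_14, 5))  # 0.95602
print(round(p_at_least_one_13, 5))  # 0.94502
```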
Notice that this type of problem could also be solved iteratively, by trial and error: we keep testing increasing values of n until the probability first exceeds the target.
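A minimal sketch of that trial-and-error approach in Python might look like this. The function name `min_attempts_by_search` and the `max_n` safety cap are assumptions made for the example, not part of the original question.

```python
def min_attempts_by_search(p: float, target: float, max_n: int = 10_000) -> int:
    """Try n = 1, 2, 3, ... and return the first n with P(X >= 1) > target."""
    for n in range(1, max_n + 1):
        # P(X >= 1) = 1 - P(X = 0) = 1 - (1 - p)**n
        if 1 - (1 - p) ** n > target:
            return n
    raise ValueError("target not reached within max_n attempts")


print(min_attempts_by_search(0.2, 0.95))  # 14
```

The loop stops at the first n that clears the target, so it returns the smallest such n without needing logarithms.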