You may encounter this type of question during interviews for Data Scientist positions. A typical example looks like this:
Question: Assume that a process follows a normal distribution with mean 50, and that we have observed that the probability of exceeding the value 60 is 5%. What is the standard deviation of the distribution?
Solution:
\(P(X \geq 60) = 0.05\)
\(1- P(X < 60) = 0.05\)
\(P(X < 60) = 0.95\)
\(P(\frac{X-50}{\sigma} < \frac{60-50}{\sigma}) = 0.95\)
\(P(\frac{X-50}{\sigma} < \frac{10}{\sigma}) = 0.95\)
\(\Phi(\frac{10}{\sigma}) = 0.95\), where \(\Phi\) is the cumulative distribution function of the standard normal distribution.
From the standard normal distribution we know that \(\Phi(1.644854) = 0.95\) (in R, qnorm(0.95) = 1.644854). Thus,
\(\frac{10}{\sigma} = 1.644854\)
\(\sigma = 6.079567\)
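The last two steps can also be carried out directly in R. The following is a minimal sketch; the variable names (mu, threshold, p_exceed, z, sigma) are illustrative and not part of the original question:

```r
# Known quantities from the question (names are illustrative)
mu <- 50          # mean of the process
threshold <- 60   # value exceeded with probability p_exceed
p_exceed <- 0.05  # P(X >= threshold)

# Standard normal quantile z such that P(Z < z) = 1 - p_exceed
z <- qnorm(1 - p_exceed)

# Solve (threshold - mu) / sigma = z for sigma
sigma <- (threshold - mu) / z
sigma
```

Because qnorm is evaluated at full precision here, the last printed digit may differ slightly from the hand calculation, which used the rounded quantile 1.644854.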
Hence the standard deviation is approximately 6.079567. We can confirm this in R by simulating draws from a Normal(50, 6.079567) distribution and estimating the probability of exceeding the value 60:
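set.seed(5)  # make the simulation reproducible
sims <- rnorm(10000000, 50, 6.079567)  # 10 million draws from Normal(50, 6.079567)
sum(sims >= 60) / length(sims)  # share of draws that are at least 60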
And we get:
[1] 0.0500667
As expected, the estimated probability that our process exceeds the value 60 is approximately 5%.
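A simulation only approximates the tail probability. As a complementary check, the exact value can be computed from the normal CDF with pnorm. This is a minimal sketch, with sigma and p_tail as illustrative names:

```r
# Standard deviation from the derivation: 10 / z, where z is the 95% quantile
sigma <- 10 / qnorm(0.95)

# Exact upper-tail probability P(X > 60) for X ~ Normal(50, sigma)
p_tail <- pnorm(60, mean = 50, sd = sigma, lower.tail = FALSE)
p_tail
```

Up to floating-point error, p_tail should equal 0.05, since sigma was chosen precisely so that 60 is the 95th percentile of the distribution.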