Predictive Hacks

How to Estimate the Standard Deviation of a Normal Distribution

Questions of this type can come up in interviews for Data Scientist positions. A typical example:

Question: Assume that a process follows a normal distribution with mean 50, and that we have observed that the probability of exceeding the value 60 is 5%. What is the standard deviation of the distribution?

Solution:

\(P(X \geq 60) = 0.05\)

\(1- P(X < 60) = 0.05\)

\(P(X < 60) = 0.95\)

\(P(\frac{X-50}{\sigma} < \frac{60-50}{\sigma}) = 0.95\)

\(P(\frac{X-50}{\sigma} < \frac{10}{\sigma}) = 0.95\)

\(Z(\frac{10}{\sigma})= 0.95\)

But from the standard normal distribution, writing \(Z\) for its cumulative distribution function, we know that \(Z(1.644854)=0.95\) (in R, qnorm(0.95) = 1.644854). Thus,

\(\frac{10}{\sigma} = 1.644854\)

\(\sigma = \frac{10}{1.644854} = 6.079567\)
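The derivation above can be reproduced directly in R. The following is a minimal sketch, not part of the original solution: the helper name `sigma_from_tail` and its argument names are hypothetical, and it assumes the threshold lies above the mean and the tail probability is below 0.5, so that the resulting standard deviation is positive.

```r
# Hypothetical helper (illustrative only): solve for sigma given the mean,
# a threshold above the mean, and the probability of exceeding that threshold.
sigma_from_tail <- function(mu, threshold, tail_prob) {
  # P(X >= threshold) = tail_prob  =>  (threshold - mu) / sigma = qnorm(1 - tail_prob)
  (threshold - mu) / qnorm(1 - tail_prob)
}

sigma <- sigma_from_tail(mu = 50, threshold = 60, tail_prob = 0.05)
print(sigma)

# Exact check without simulation: the upper-tail probability at 60
# under Normal(50, sigma) should come back as 0.05.
print(pnorm(60, mean = 50, sd = sigma, lower.tail = FALSE))
```

Writing the step as a function makes it easy to try other thresholds or tail probabilities, under the same assumptions.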

Hence the standard deviation is approximately 6.079567. We can confirm this with a simulation in R that estimates the probability that a Normal(50, 6.079567) variable exceeds the value 60:

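set.seed(5)  # fix the seed so the run is reproducible
sims <- rnorm(10000000, 50, 6.079567)  # 10 million draws with mean 50 and sd 6.079567
sum(sims >= 60) / length(sims)  # share of draws at or above 60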

And we get:

[1] 0.0500667

As expected, the estimated probability that our process exceeds the value 60 is approximately 5%.
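To judge how close "approximately" is, one can compare the simulated estimate with the exact tail probability and with the Monte Carlo standard error of a proportion. This is a sketch under the assumption that the draws are independent; the variable names are illustrative, and the estimate 0.0500667 is simply the value reported above.

```r
# Exact upper-tail probability at 60 for Normal(50, 6.079567).
p_exact <- pnorm(60, mean = 50, sd = 6.079567, lower.tail = FALSE)

n <- 10000000        # number of simulated draws used above
p_hat <- 0.0500667   # simulated estimate reported above

# Standard error of a simulated proportion: sqrt(p * (1 - p) / n).
mc_se <- sqrt(p_exact * (1 - p_exact) / n)

# Distance between the estimate and the exact value, in standard errors.
z <- (p_hat - p_exact) / mc_se

print(c(p_exact = p_exact, mc_se = mc_se, z = z))
```

With ten million draws the standard error is roughly 0.00007, so a simulated value within about one standard error of 0.05 is consistent with the analytical answer.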
