Predictive Hacks

# Interview questions about Stats and Probabilities

Interviews for Data Science positions commonly include questions about Statistics and Probability. Below are some potential interview questions and their indicative solutions.

Question 1: Assuming that X follows the Normal Distribution with Mean=0.545 and Standard Deviation=0.155, find the probability that X exceeds 0.395.

$$Pr(X>0.395) = 1-Pr(X \leq 0.395) = 1- Pr(\frac{X-0.545}{0.155} \leq \frac{0.395-0.545}{0.155} )=$$

$$=1-Pr\left(Z \leq \frac{0.395-0.545}{0.155}\right)=1-Pr(Z \leq -0.96774) = 1-0.166587 = 0.833$$

where Z denotes a standard Normal variable.

We can also compute this in R.

1-pnorm(0.395, mean=0.545, sd=0.155)

[1] 0.8334134
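
As a cross-check on the standardization step, the following minimal sketch computes the same tail probability two ways: through the z-score and the standard Normal CDF, and directly with `lower.tail = FALSE`. The variable names (`mu`, `sigma`, `z`, and so on) are illustrative and not part of the original solution.

```r
# Parameters from Question 1
mu <- 0.545
sigma <- 0.155

# Route 1: standardize the threshold, then use the standard Normal CDF
z <- (0.395 - mu) / sigma
p_standardized <- 1 - pnorm(z)

# Route 2: ask pnorm for the upper tail directly
p_upper_tail <- pnorm(0.395, mean = mu, sd = sigma, lower.tail = FALSE)

print(z)
print(p_standardized)
print(p_upper_tail)
```

Both routes should agree with the value reported above.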



Question 2: The probability that a patient recovers from a rare blood disease is 0.4.  If 15 patients are known to have contracted the disease, what is the probability that exactly 5 fail to recover?

The probability of recovery is 0.4, so the probability of failing to recover is 0.6. Let X be the number of patients, out of 15, who fail to recover; we want the probability that X is exactly 5. The number of ways to choose which 5 of the 15 patients fail to recover is $${15\choose 5} = 3003$$.

So the probability we would like to calculate is $$Pr(X=5) = 3003 \times 0.6^5 \times 0.4^{10} = 0.024486$$

We can also compute this in R.

dbinom(5,15,0.6)

[1] 0.02448564
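
The binomial formula can also be evaluated by hand in R. The sketch below, with illustrative variable names, computes the probability from `choose()` and compares it with the equivalent view that exactly 10 of the 15 patients recover.

```r
n <- 15
p_fail <- 0.6

# Manual binomial formula: choose(15, 5) * 0.6^5 * 0.4^10
p_manual <- choose(n, 5) * p_fail^5 * (1 - p_fail)^10

# Equivalent event: exactly 10 patients recover, each with probability 0.4
p_recover_view <- dbinom(10, size = n, prob = 0.4)

print(p_manual)
print(p_recover_view)
```

Counting failures with probability 0.6 or recoveries with probability 0.4 describes the same event, so the two numbers should match.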


Question 3: A secretary makes 2 errors per page on average. What is the probability that she makes no more than 3 errors on the next 2 pages?

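Answer 3: We can argue that the number of mistakes per page follows the Poisson distribution with parameter λ=2. Assuming that the mistakes on different pages are independent, the number of mistakes on 2 pages follows a Poisson distribution with parameter λ=4, since the sum of independent Poisson variables is again Poisson with the parameters added. The probability of making no more than 3 errors is: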

$$Pr(X \leq 3) = \sum_{k=0}^{3} \frac{4^k e^{-4}}{k!}=0.43347$$

We can also compute this in R.

ppois(3,4)

[1] 0.4334701
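
To make the cumulative probability explicit, this short sketch sums the Poisson probabilities for 0 through 3 errors and compares the total with `ppois`. The variable names are illustrative, and the rate of 4 relies on the assumption that errors on the two pages are independent.

```r
# Two pages at 2 errors per page on average
lambda_two_pages <- 2 * 2

# Sum of Pr(X = k) for k = 0, 1, 2, 3
p_sum <- sum(dpois(0:3, lambda_two_pages))

# Same quantity from the cumulative distribution function
p_cdf <- ppois(3, lambda_two_pages)

print(p_sum)
print(p_cdf)
```

Note that the sum starts at k = 0: a page pair with no errors at all also counts as "no more than 3 errors".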



Question 4: A homeowner plants 6 flower bulbs selected at random from a box containing 5 tulips and 4 roses. What is the probability that he planted 2 roses and 4 tulips?

Answer 4: The probability is given by: $$\frac{ {4 \choose 2} \times {5 \choose 4} }{9 \choose 6 }=0.3571429$$

We can also compute this in R.

dhyper(x=4,  m=5, n=4, k=6)
[1] 0.3571429
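
The hypergeometric formula can be checked directly with `choose()`. The sketch below, using illustrative variable names, also counts roses instead of tulips, which describes the same event from the other side.

```r
# Manual formula: ways to pick 2 of 4 roses and 4 of 5 tulips,
# divided by the ways to pick any 6 of the 9 bulbs
p_manual <- choose(4, 2) * choose(5, 4) / choose(9, 6)

# Same event, counting roses as the "success" type:
# x = 2 roses drawn, m = 4 roses, n = 5 tulips, k = 6 bulbs planted
p_roses <- dhyper(x = 2, m = 4, n = 5, k = 6)

print(p_manual)
print(p_roses)
```

In `dhyper`, `m` is the number of items of the type being counted and `n` is the number of the other type, so swapping the roles of tulips and roses requires swapping `m` and `n` and changing `x` accordingly.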


