For Data Science positions that require some knowledge of statistics and programming, it is common to be asked questions like those below.

## Question 1

Suppose an urn contains `40 red`, `25 green` and `35 blue` balls. Balls are drawn from the urn one by one, at random and without replacement. Let \(N\) denote the draw at which the first `blue ball` appears, and \(S\) the number of `green balls` drawn before the \(N\)-th draw (i.e. before the first `blue ball` appears). Estimate \(E[N \mid S=2]\) by generating \(10000\) iid copies of \((S,N)\).

## Solution 1

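    # Urn with 40 red, 25 green and 35 blue balls
    urn <- c(rep("red", 40), rep("green", 25), rep("blue", 35))
    v <- c()
    for (i in 1:10000) {
      # A full random permutation is one sequence of draws without replacement
      s <- sample(urn, 100, replace = FALSE)
      # N: position of the first blue ball
      blue_ball <- min(which(s == "blue"))
      # S: number of green balls drawn before the first blue ball
      green_balls <- sum(s[1:blue_ball] == "green")
      # Keep N only for the runs with S = 2
      if (green_balls == 2) {
        v <- c(v, blue_ball)
      }
    }
    mean(v)

The estimate changes from run to run because no seed is set; it should come out close to 5. Note that `min(which(s[1:blue_ball] == "green"))` returns the position of the first green ball, not the count of green balls; conditioning on that position being 2 answers a different question and gives a lower value (`[1] 4.792257` in one run).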
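As a cross-check on the simulation, \(E[N \mid S=2]\) can also be computed exactly, because the first \(n-1\) draws follow a multivariate hypergeometric distribution. The sketch below is one way to do this in R; the variable names (`p_joint`, `p_s2`, `exact_mean`) are illustrative choices rather than part of the original solution.

```r
# Exact E[N | S = 2] for an urn with 40 red, 25 green and 35 blue balls.
n_red <- 40; n_green <- 25; n_blue <- 35
total <- n_red + n_green + n_blue

# With S = 2, the first blue ball can appear at draw 3 (no reds before it)
# up to draw 43 (all 40 reds before it).
n <- 3:(n_red + 3)

# P(N = n, S = 2): the first n - 1 draws contain exactly 2 greens and
# n - 3 reds (in any order), and draw n is blue.
p_joint <- choose(n_green, 2) * choose(n_red, n - 3) / choose(total, n - 1) *
  n_blue / (total - (n - 1))

p_s2 <- sum(p_joint)                    # P(S = 2)
exact_mean <- sum(n * p_joint) / p_s2   # E[N | S = 2]
exact_mean
```

A simulated mean based on a correct count of green balls should agree with `exact_mean` up to Monte Carlo error.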

## Question 2

Suppose that claims are made to an insurance company according to a Poisson process with rate `10 per day`. The amount of a claim is a random variable that has an exponential distribution with mean \(\$1000\). The insurance company receives payments continuously in time at a constant rate of \(\$11000\) per day. Starting with an initial capital of \(\$25000\), use \(10000\) simulations to estimate the probability that the firm’s capital is always positive throughout its first \(365\) days.

## Solution 2

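    n_sims <- 10000
    survived <- logical(n_sims)
    for (i in 1:n_sims) {
      # Number of claims in 365 days: Poisson with mean 10 * 365
      n_claims <- rpois(1, 10 * 365)
      # Given the count, Poisson-process arrival times are sorted uniforms on (0, 365)
      times <- sort(runif(n_claims, 0, 365))
      # Independent claim sizes, exponential with mean 1000
      claims <- rexp(n_claims, 1 / 1000)
      # Capital just after each claim; it can only drop to zero or below at a claim time
      capital <- 25000 + 11000 * times - cumsum(claims)
      survived[i] <- all(capital > 0)
    }
    mean(survived)

The estimate changes from run to run and should come out near 0.91. A day-by-day loop that draws one claim size per day, multiplies it by the daily claim count and tests only the capital at day 365 gives `[1] 0.9644` in one run, but that is the probability of ending the year with positive capital under a different claim model, not the probability of staying positive throughout.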
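As a sanity check on the simulated probability, this setup (Poisson arrivals, exponential claims, constant premium rate) is the classical Cramér–Lundberg model, for which the probability of ever being ruined has a closed form when claims are exponential. Surviving forever is harder than surviving 365 days, so one minus that ruin probability is a lower bound for the quantity asked in Question 2. The sketch below evaluates it; the variable names are illustrative and not from the original solution.

```r
# Infinite-horizon ruin probability in the Cramer-Lundberg model with
# exponential claims: psi(u) = (lambda * mu / c) * exp(-(1 / mu - lambda / c) * u)
lambda  <- 10      # claims per day
mu      <- 1000    # mean claim size in dollars
premium <- 11000   # premium income per day (c in the formula)
u       <- 25000   # initial capital

psi_inf <- (lambda * mu / premium) * exp(-(1 / mu - lambda / premium) * u)

# Lower bound on P(capital stays positive during the first 365 days)
survival_lower_bound <- 1 - psi_inf
survival_lower_bound
```

A simulation that tracks the capital at every claim time should return an estimate at or slightly above `survival_lower_bound`, up to Monte Carlo error.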