Probability for Discrete Random Variable

Binomial Distribution

Concept

The binomial distribution gives the probability of observing a given number of successes in a fixed number of independent trials.

  • Binomial probability: each trial succeeds with probability p or fails with probability 1-p.

  • A binomial random variable is the number of successes x in n repeated trials.

The required inputs are the success probability p and the number of trials n.

Example 1:

example from medium@AerinKim

Using the binomial PMF, what is the probability that 20 people will clap next week?

We need the success probability p and the number of trials n.

From past statistics:

  • number of trials n = 1134 visits/week

  • number of successes = 17 claps/week

    • the rate, or expected value of x, is 17 claps/week.

  • success probability p = 17 claps / 1134 visits ≈ 1.5% per visit

Then what is the probability of x = 20 claps next week? Using the binomial PMF with p = 17/1134, P(X = 20) ≈ 0.0695 (about 0.0696 if p is rounded to 0.015).
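The calculation can be reproduced with a short script. This is a minimal sketch using only Python's standard library; the helper name `binomial_pmf` is illustrative and not part of the original example, and p is taken as 17/1134 rather than the rounded 1.5%.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 1134      # visits per week (number of trials)
p = 17 / n    # success probability per visit, about 1.5%

prob_20_claps = binomial_pmf(20, n, p)
print(f"P(X = 20) = {prob_20_claps:.4f}")
```

The printed value is close to 0.07. Rounding p to 0.015 shifts the result slightly in the fourth decimal place.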

Poisson Distribution

Definition:

Predicts the probability of a given number of events occurring in a fixed interval of time, when the average number of events in that interval is λ (the rate parameter).

Intuition

What is Poisson for? What can Poisson do that Binomial can't?

Binomial Shortcoming

A binomial trial is binary (0 or 1), so it cannot represent more than one event within a single unit of time.

From the previous example, 17 people/week means 17/(7*24) ≈ 0.1 people per hour.

Then one week contains many hourly trials, each a success (S) or a failure (F), so multiple events can occur in different hours.

This multiple-event issue is handled by dividing the time unit into smaller units, from a week to hours, so that each unit holds at most one event.

  • Going one step further, the rate per minute is 0.1/60 ≈ 0.0017 people/min.

With minutes as the time unit, multiple events can now occur within one hour.

We can keep shrinking the time unit, from hours to minutes to seconds and so on.

Concept:

We can make the binomial random variable handle multiple events by dividing a unit of time into smaller units, that is, by making n large.

If we make the time unit infinitesimal, we no longer have to worry about more than one event within the same unit of time.

If the expected rate is fixed (i.e. n*p = constant), then as n --> ∞, the probability p --> 0.

Math

As n --> ∞ for a given k, with the expected rate $$\lambda$$ held fixed (i.e. n*p = $$\lambda$$), the binomial PMF converges to the Poisson PMF: $$P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
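This limit can be checked numerically. The sketch below keeps n*p fixed at λ = 17 while n grows, and compares the binomial probability of k = 20 with the Poisson value λ^k e^(−λ) / k!. The function names and the particular values of n are assumptions chosen for illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


lam, k = 17, 20
target = poisson_pmf(k, lam)

gaps = []
for n in (100, 1_000, 10_000, 100_000):
    p = lam / n  # keep n * p = lam while n grows and p shrinks
    b = binomial_pmf(k, n, p)
    gaps.append(abs(b - target))
    print(f"n = {n:>7}  binomial = {b:.6f}  poisson = {target:.6f}")
```

The gap between the two columns shrinks as n increases, which is the behavior the limit describes.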

Example 1:

From the previous example, λ was 17 people/week.
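With only the rate known, the probability of 20 claps next week can be computed from the Poisson PMF. A minimal sketch, assuming the standard form λ^k e^(−λ) / k!; the helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


prob_20_claps = poisson_pmf(20, 17)       # rate of 17 claps/week
print(f"P(X = 20) = {prob_20_claps:.4f}")  # prints 0.0692
```

Note that neither n nor p appears anywhere in the calculation.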

Unlike the binomial, the Poisson distribution does not require the values of n and p; it only needs the rate λ.

Poisson is usually used for rare events (n is large and p is small), but not always.

As λ becomes larger, the graph looks more like a normal distribution.

The model assumes λ is constant, but in real applications it may not be.

It also assumes the events are independent, which may not hold in real applications.

The Poisson distribution is discrete.

  • For the continuous waiting time between events, use the exponential distribution.

ML example:

(Under construction)

Exponential Distribution

Definition

The exponential distribution is the probability distribution of the time between events in a Poisson process. Its PDF is f(t) = λ * e^(−λt) for t ≥ 0.

It is a continuous distribution, unlike the Poisson, which is discrete.

Intuition

It predicts the amount of waiting time until the next event occurs.

  • Example: the time until the OS fails again.

What does X~Exp(0.25) mean? 0.25 events?

  • 0.25 is not a time duration. It is an event rate.

  • X~Exp(λ), where λ is the same rate parameter as in the Poisson distribution.

    Example: λ = 17 claps/week is a rate per unit time of 1 week.

In terms of time per event, the mean waiting time is 1/λ.

  • λ itself is the decay parameter, or rate.

  • 17 claps/week --> (1/17) week per clap.

A rate of 0.25 means 0.25 events per time unit (e.g. per hour), so on average it takes 4 time units (e.g. hours) until the event occurs.
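The relationship between the rate and the mean waiting time can be seen in a quick simulation. This is a sketch using `random.expovariate` from Python's standard library, which takes the rate λ as its argument; the seed and sample size are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

rate = 0.25  # events per time unit (e.g. per hour)
samples = [random.expovariate(rate) for _ in range(100_000)]

mean_wait = sum(samples) / len(samples)
print(f"rate = {rate}, theoretical mean wait = {1 / rate}")
print(f"simulated mean wait = {mean_wait:.2f}")
```

The simulated mean lands near 4 time units, which is 1/λ, not near 0.25.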

Understanding λ * e^(−λt)

We want to find the time between events in a Poisson process. Waiting until the next event occurs means that no event has happened during the wait:

  • Poisson P(X=0)

We want to model the probability that “nothing happens during the time duration t,” not just during one unit of time.

P(Nothing happens during t time units)
= P(X=0 in the first time unit)
  * P(X=0 in the second time unit)
  * ... * P(X=0 in the t-th time unit)
= e^(−λ) * e^(−λ) * ... * e^(−λ) = e^(−λt)

The Poisson distribution assumes that events occur independently of one another.

Therefore, we can calculate the probability of zero successes during t units of time by multiplying P(X=0 in a single unit of time) t times.

P(T > t) = P(X=0 in t time units) = e^(−λt)
* T : the random variable of our interest!
      the random variable for the waiting time until the first event
* X : the # of events in the future which follows the Poisson dist.
* P(T > t) : The probability that the waiting time until the first event is greater than t time units
* P(X = 0 in t time units) : The probability of zero successes in t time units
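The identity can be verified with a few lines of arithmetic. The sketch below compares three routes to the same number: multiplying the one-unit zero-event probability t times, a single Poisson with rate λt, and e^(−λt) directly. The values λ = 0.25 and t = 3 are arbitrary choices for illustration.

```python
from math import exp, factorial, isclose


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)


lam, t = 0.25, 3

# Zero events in each of t independent unit intervals.
product_of_units = poisson_pmf(0, lam) ** t

# Zero events in one interval of length t (rate scales to lam * t).
single_interval = poisson_pmf(0, lam * t)

# Survival function of the exponential waiting time.
survival = exp(-lam * t)

print(product_of_units, single_interval, survival)
```

All three printed values agree up to floating-point rounding.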

A PDF is the derivative of the CDF. Since we already have the exponential CDF, P(T ≤ t) = 1 - P(T > t) = 1 - e^(−λt), we can get its PDF by differentiating it.
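Written out, the differentiation step looks like this (a short worked derivation, using F for the CDF and f for the PDF as notational assumptions):

$$
F(t) = P(T \le t) = 1 - P(T > t) = 1 - e^{-\lambda t}, \qquad t \ge 0
$$

$$
f(t) = \frac{d}{dt} F(t) = \frac{d}{dt}\left(1 - e^{-\lambda t}\right) = \lambda e^{-\lambda t}
$$

This recovers the λ * e^(−λt) form given in the definition.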

ML example:

(Under construction)

Sum of Exponential Random Variables

X1 and X2 are independent exponential random variables with the rate λ.

X1~EXP(λ) X2~EXP(λ)

Let Y=X1+X2.

Question : What is the PDF of Y? Where do we use the distribution of Y?

In the Poisson Process with rate λ, X1+X2 would represent the time at which the 2nd event happens (addition of time x1 and x2).

An Erlang distribution is then used to answer the question:

“How long do I have to wait before I see n success events occur?”

The answer is a sum of n independent exponentially distributed random variables, which follows an Erlang(n, λ) distribution. The Erlang distribution is a special case of the Gamma distribution: in an Erlang distribution n must be a positive integer, whereas in a Gamma distribution n can be a non-integer.

  • Erlang (2, λ) distribution
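For n = 2, the PDF of Y follows from the convolution of the two exponential densities. A sketch of that derivation, assuming X1 and X2 are independent with the same rate λ as stated above:

$$
f_Y(y) = \int_0^y f_{X_1}(x)\, f_{X_2}(y - x)\, dx = \int_0^y \lambda e^{-\lambda x}\, \lambda e^{-\lambda (y - x)}\, dx = \lambda^2 e^{-\lambda y} \int_0^y dx = \lambda^2 y\, e^{-\lambda y}, \qquad y \ge 0
$$

Integrating the PDF gives the CDF:

$$
F_Y(y) = \int_0^y \lambda^2 s\, e^{-\lambda s}\, ds = 1 - e^{-\lambda y}\left(1 + \lambda y\right)
$$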

Example

[Queuing Theory] You went to Chipotle and joined a line with two people ahead of you. One is being served and the other is waiting. Their service times S1 and S2 are independent, exponential random variables with a mean of 2 minutes. (Thus the mean service rate is 0.5/minute.)

Your conditional time in the queue is T = S1 + S2, given the system state N = 2. T is Erlang distributed.

What is the probability that you wait more than 5 minutes in the queue?

Let’s plug λ = 0.5 and t = 5 into the Erlang(2, λ) CDF, F(t) = 1 - e^(−λt) * (1 + λt): P(T > 5) = 1 - F(5) = e^(−2.5) * (1 + 2.5) ≈ 0.287.

A less-than-30% chance that I’ll wait for more than 5 minutes at Chipotle sounds good to me.
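The waiting probability can also be computed in a few lines. This is a minimal sketch that assumes the Erlang(2, λ) survival function P(T > t) = e^(−λt) * (1 + λt); the function name `erlang2_survival` is illustrative.

```python
from math import exp


def erlang2_survival(t: float, lam: float) -> float:
    """P(T > t) for T ~ Erlang(2, lam), the sum of two Exp(lam) variables."""
    return exp(-lam * t) * (1 + lam * t)


lam = 0.5  # service rate per minute (mean service time of 2 minutes)
p_wait = erlang2_survival(5, lam)
print(f"P(T > 5) = {p_wait:.4f}")  # prints 0.2873
```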
