Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Data can take the form of counts:
Compliments or complaints received
Items returned
Number of E. coli cases
Data can also be expressed in rates:
Percent of web traffic from a user permissions type
Percent of businesses in a region passing a safety audit
A random variable X has the Poisson distribution with parameter lambda > 0 if
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \ldots$$
This is the distribution of the number of events that occur during a fixed time interval when we expect lambda occurrences on average and events occur independently at a constant rate.
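The Poisson mass function, lambda^k * exp(-lambda) / k!, can be checked numerically against R's built-in dpois. The sketch below is only an illustration: the rate of 22 and the range of counts 0 to 40 are assumed values chosen for the check, not part of the definition.

```r
lambda <- 22   # assumed average number of events per interval
k <- 0:40      # counts to evaluate (arbitrary range for this check)

# Poisson mass function written out by hand
pmf_manual <- lambda^k * exp(-lambda) / factorial(k)

# R's built-in Poisson mass function
pmf_builtin <- dpois(k, lambda = lambda)

# TRUE when the two agree up to floating-point tolerance
all.equal(pmf_manual, pmf_builtin)
```

Writing the formula by hand is fine for small counts, but dpois is the safer choice in practice because factorial(k) overflows for large k.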
Example
Let’s say a business receives 22 complaints a month on average. What is the probability that 30 or more complaints are received in a given month?
In R, the probability of receiving 29 or fewer complaints in a particular month is computed with the ppois function:
ppois(29, lambda=22)
[1] 0.9397826
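The probability of receiving 30 or more complaints in a month is the complement of this value, that is, the upper tail of the probability mass function (a count is discrete, so it has a mass function rather than a density). Setting lower.tail=FALSE makes ppois return P(X > 29), which for a count is the same as P(X >= 30):
ppois(29, lambda=22, lower.tail=FALSE)
[1] 0.06021738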
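As a sanity check, the same upper-tail probability can be reached three ways. This is a minimal sketch using only base R; the cutoff of 200 in the summation is an assumption, chosen because the mass beyond it is negligible for a mean of 22.

```r
lambda <- 22   # average complaints per month, from the example

# 1. Upper tail directly: P(X > 29) = P(X >= 30)
upper_direct <- ppois(29, lambda = lambda, lower.tail = FALSE)

# 2. Complement of the lower tail: 1 - P(X <= 29)
upper_complement <- 1 - ppois(29, lambda = lambda)

# 3. Sum of the mass function from 30 up to a far cutoff
#    (200 is an assumed cutoff; the remaining mass is negligible)
upper_sum <- sum(dpois(30:200, lambda = lambda))

c(direct = upper_direct, complement = upper_complement, summed = upper_sum)
```

All three values should agree to within floating-point error. The lower.tail = FALSE form is generally preferable for very small tail probabilities, where subtracting from 1 loses precision.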
In plain English: if complaints follow a Poisson distribution with a mean of 22 per month, the probability of receiving 30 or more complaints in a given month is about 6%.
Understanding these probability distributions allows you to explain some of the variance in your observations. In other words, it can help you judge whether an observation is plausible under the assumed distribution or whether an outside (exogenous) factor is at play.
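One way to put this into practice is to compute, for each observed monthly count, the probability of seeing a count at least that large under the assumed distribution. The sketch below is illustrative only: the counts are hypothetical values, not real data, and the 0.05 cutoff is an assumed convention rather than a rule.

```r
lambda <- 22                      # average complaints per month, from the example
counts <- c(20, 25, 30, 35, 40)   # hypothetical monthly counts, for illustration only

# P(X >= count) equals P(X > count - 1) for a discrete count
tail_prob <- ppois(counts - 1, lambda = lambda, lower.tail = FALSE)

# Flag counts whose upper-tail probability falls below an assumed 0.05 cutoff
unusual <- tail_prob < 0.05

data.frame(count = counts, tail_prob = round(tail_prob, 4), unusual = unusual)
```

With this cutoff a month with 30 complaints is not flagged, since its tail probability is about 6%, while a flagged month is a prompt to look for an outside cause rather than proof that one exists.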