standard deviation of the data. The y-intercept indicates the mean of the data. For
data sets of fewer than 30 observations, only a substantial departure from linearity
should be interpreted as conclusive evidence of non-normality. How the plot differs
from a straight line can give you information about how the data distribution
differs from a Normal distribution.
If the graph is (Zi, X(i)), a light-tailed distribution (relative to the Normal) will
give an S-shaped plot with the left end of the plot curving upward. A heavy-tailed
distribution (relative to the Normal) will give an S-shaped plot with the left end of
the plot curving downward. A right-skewed distribution will give a plot having
middle points falling below the line and end points falling above the line.
Note: Some statistical packages plot (X(i), Zi) instead of (Zi, X(i)). The
interpretation of the shape of the plot as an indication of how the distribution
differs from normality will then be reversed.
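
As a rough illustration of how the plotted points and the fitted line relate to the
mean and standard deviation, here is a minimal Python sketch. It assumes the points
are (Zi, X(i)) with Zi on the horizontal axis, and it assumes the plotting positions
(i - 0.5)/n, which is only one of several conventions in use. The function names
normal_plot_points and fit_line are made up for this example.

```python
import random
from statistics import NormalDist


def normal_plot_points(data):
    """Return (z_i, x_(i)) pairs for a normal probability plot."""
    xs = sorted(data)                      # order statistics x_(1) <= ... <= x_(n)
    n = len(xs)
    std_normal = NormalDist()              # standard Normal: mean 0, sd 1
    # Assumed plotting positions (i - 0.5)/n; other conventions exist.
    zs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(zs, xs))


def fit_line(points):
    """Least-squares slope and intercept of x_(i) regressed on z_i."""
    n = len(points)
    mean_z = sum(z for z, _ in points) / n
    mean_x = sum(x for _, x in points) / n
    s_zz = sum((z - mean_z) ** 2 for z, _ in points)
    s_zx = sum((z - mean_z) * (x - mean_x) for z, x in points)
    slope = s_zx / s_zz
    return slope, mean_x - slope * mean_z


rng = random.Random(0)                     # fixed seed so the run is repeatable
sample = [rng.gauss(50, 10) for _ in range(500)]   # simulated Normal(50, 10) data
slope, intercept = fit_line(normal_plot_points(sample))
print(round(slope, 1), round(intercept, 1))        # slope near the sd, intercept near the mean
```

For simulated Normal data the slope should land close to the standard deviation used
to generate the sample and the intercept close to its mean.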
Probability plots can be used to check distributional assumptions for distributions
other than the Normal. For example, to check whether data come from an exponential
distribution, compare percentiles of the data to percentiles of a standard
Exponential (with rate λ = 1). Additionally, to check whether two sets of data come
from the same underlying distribution, plot the percentiles of the first data set
versus the percentiles of the second data set. If they do, the plot should be
approximately linear.
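
The exponential check described above can be sketched in Python as follows. This is
only an illustration: the plotting positions (i - 0.5)/n and the use of a correlation
coefficient as a crude numerical measure of linearity are assumptions made here, not
part of the notes, and the function names are hypothetical.

```python
import math
import random


def exponential_plot_points(data):
    """Pair standard Exponential(1) percentiles with the sorted data."""
    xs = sorted(data)
    n = len(xs)
    # The percentile of Exponential(1) at probability p is -ln(1 - p).
    qs = [-math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(qs, xs))


def correlation(points):
    """Pearson correlation of the plotted pairs (closer to 1 = more linear)."""
    n = len(points)
    mean_a = sum(a for a, _ in points) / n
    mean_b = sum(b for _, b in points) / n
    s_aa = sum((a - mean_a) ** 2 for a, _ in points)
    s_bb = sum((b - mean_b) ** 2 for _, b in points)
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in points)
    return s_ab / math.sqrt(s_aa * s_bb)


rng = random.Random(1)                                # fixed seed for repeatability
waits = [rng.expovariate(0.5) for _ in range(400)]    # simulated exponential data
uniforms = [rng.uniform(0, 4) for _ in range(400)]    # simulated non-exponential data

r_exp = correlation(exponential_plot_points(waits))
r_unif = correlation(exponential_plot_points(uniforms))
print(round(r_exp, 3), round(r_unif, 3))
```

The exponential sample should give the more nearly linear plot even though its rate
is not 1, because a different rate only changes the slope of the line.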
The exponential probability density function is widely used in engineering to
describe the distribution of many types of variables, most often the distribution
of waiting times between occurrences of successive events. A random variable X has
an exponential distribution with parameter λ (λ > 0) if its pdf is
f(x) = λ exp(-λx) for x > 0 (and f(x) = 0 otherwise).
The mean of an exponential r.v. with parameter λ is 1/λ. The standard deviation of
an exponential r.v. with parameter λ is also 1/λ. The StatConcepts Lab “How are
Populations Distributed” allows you to visualize exponential pdfs with different
parameters λ.
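
A quick simulation can make the mean and standard deviation concrete. The sketch
below is illustrative only: the rate λ = 2, the sample size, and the helper name
exponential_pdf are arbitrary choices for this example.

```python
import math
import random
import statistics


def exponential_pdf(x, lam):
    """f(x) = lam * exp(-lam * x) for x > 0, and 0 otherwise."""
    return lam * math.exp(-lam * x) if x > 0 else 0.0


lam = 2.0                                   # illustrative rate, chosen arbitrarily
rng = random.Random(42)                     # fixed seed for repeatability
draws = [rng.expovariate(lam) for _ in range(100_000)]

sample_mean = statistics.fmean(draws)
sample_sd = statistics.stdev(draws)
# Both sample values should be close to 1/lam.
print(sample_mean, sample_sd, 1 / lam)
```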
Suppose the number of occurrences of an event in a time interval of length t follows
a Poisson distribution with mean αt (that is, events occur according to a Poisson
process with rate α), and the numbers of occurrences in nonoverlapping intervals are
independent. It can be shown that the waiting time until the first occurrence of the
event follows an exponential distribution with parameter α.
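
The argument is short. Writing N(t) for the number of occurrences in the interval
[0, t], which is Poisson with mean αt, and T for the waiting time until the first
occurrence (both symbols are notation assumed here, not taken from the notes), the
event T > t is the same as the event of no occurrences in [0, t]:

```latex
\begin{aligned}
P(T > t) &= P\bigl(N(t) = 0\bigr)
          = \frac{(\alpha t)^{0} e^{-\alpha t}}{0!}
          = e^{-\alpha t}, \qquad t > 0, \\
F_T(t)   &= 1 - e^{-\alpha t}, \\
f_T(t)   &= \frac{d}{dt} F_T(t) = \alpha e^{-\alpha t}, \qquad t > 0 .
\end{aligned}
```

The last line is the exponential pdf with parameter α.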
The exponential distribution has the memoryless property, meaning that the
probability we wait at least b additional minutes for an event to occur, given that
we’ve already waited at least a minutes, is the same as the probability that we have
to wait at least b minutes from the start: P(X > a + b | X > a) = P(X > b) for all
a, b > 0. In other words, the distribution of additional waiting time is exactly the
same as the distribution of original waiting time; the additional waiting time is
independent of how long you’ve already waited. (The geometric distribution, that is,
the distribution of the number of trials until the first success when the trials are
independent with constant probability of success p, also has this property.)
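
The memoryless property can be checked numerically from the survival functions. The
sketch below is a minimal illustration; the values of λ, a, b, p, j, and k are
arbitrary, and the helper names are made up for this example.

```python
import math


def exp_survival(x, lam):
    """P(X > x) for an exponential r.v. with rate lam."""
    return math.exp(-lam * x) if x > 0 else 1.0


def geom_survival(k, p):
    """P(N > k): no success in the first k independent trials."""
    return (1 - p) ** k


lam, a, b = 0.25, 3.0, 5.0                  # illustrative values
# P(X > a + b | X > a) = P(X > a + b) / P(X > a)
conditional = exp_survival(a + b, lam) / exp_survival(a, lam)
unconditional = exp_survival(b, lam)        # P(X > b)

p, j, k = 0.2, 4, 6                         # illustrative values
# P(N > j + k | N > j) compared with P(N > k)
geom_conditional = geom_survival(j + k, p) / geom_survival(j, p)
geom_unconditional = geom_survival(k, p)

print(math.isclose(conditional, unconditional),
      math.isclose(geom_conditional, geom_unconditional))
```

In both cases the conditional probability equals the unconditional one, which is
exactly what "memoryless" means.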