described using a Normal Distribution

- For large data sets, analysis of the shape of the histogram can help you decide whether the data follows approximately a Normal distribution.
- **Normal Probability Plot** (Normal QQ Plot): For data sets of any size (greater than about 6 or so), the percentiles of the data can be compared to the corresponding percentiles of a standard Normal distribution. If the data can be adequately described using a Normal distribution, the percentiles of the data should be approximately linearly related to the percentiles of the standard Normal distribution. To assess whether this holds, plot the pairs (Z_{i}, X_{(i)}), where Z_{i} denotes the [100(i-.5)/n]-th percentile of a standard Normal random variable and X_{(i)} denotes the i-th smallest observation. The points should fall along a straight line if the data follows a Normal distribution. The slope of the line indicates the standard deviation of the data, and the y-intercept indicates the mean of the data. **For data sets of fewer than 30 observations, only substantial departure from linearity should be interpreted as conclusive evidence of non-Normality.** How the plot differs from a straight line can give you information about how the data distribution differs from a Normal distribution.
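The construction of the plot can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names `normal_plot_points` and `fit_line` and the sample values are invented for this example, and the least-squares line is just one reasonable way to read off a slope and intercept.

```python
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Return the (Z_i, X_(i)) pairs of a Normal probability plot."""
    n = len(data)
    ordered = sorted(data)       # X_(1) <= ... <= X_(n)
    std_normal = NormalDist()    # mean 0, standard deviation 1
    # Z_i is the [100(i - .5)/n]-th percentile of the standard Normal.
    z = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, ordered))

def fit_line(points):
    """Least-squares slope and intercept of y on x for (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical sample, invented for this sketch.
sample = [9.8, 10.4, 9.1, 10.9, 10.1, 9.5, 10.6, 9.9, 10.2, 11.3]
points = normal_plot_points(sample)
slope, intercept = fit_line(points)
print(f"slope (rough standard deviation): {slope:.3f}")
print(f"intercept (rough mean): {intercept:.3f}")
```

Passing `points` to any plotting tool with Z_{i} on the horizontal axis gives the plot described above; the fitted slope and intercept are only rough estimates of the standard deviation and mean.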

If the plot is of (Z_{i}, X_{(i)}), with Z_{i} on the horizontal axis, a light-tailed distribution (relative to the Normal) will give an S-shaped plot with the left end of the plot curving upward. A heavy-tailed distribution (relative to the Normal) will give an S-shaped plot with the left end of the plot curving downward. A right-skewed distribution will give a plot having middle points falling below the line and end points falling above the line. **Note:** Some statistical packages plot (X_{(i)}, Z_{i}) instead of (Z_{i}, X_{(i)}). The interpretation of the shape of the plot as an indication of how the distribution differs from Normality will then be reversed.
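The effect of skewness on the plot can be checked numerically. The sketch below is an idealized illustration, not real data: it uses the exact percentiles of an Exponential distribution with rate 1 as a stand-in for a right-skewed sample, puts the standard Normal percentiles Z_{i} on the horizontal axis, and looks at where the points fall relative to a least-squares line. The choice n = 21 and all variable names are assumptions made for this example.

```python
import math
from statistics import NormalDist, mean

n = 21
probs = [(i - 0.5) / n for i in range(1, n + 1)]
z = [NormalDist().inv_cdf(p) for p in probs]   # horizontal axis: Z_i
# Idealized right-skewed "data": exact Exponential(rate 1) percentiles.
x = [-math.log(1.0 - p) for p in probs]        # vertical axis: X_(i)

# Least-squares line of x on z.
z_bar, x_bar = mean(z), mean(x)
sxy = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
szz = sum((zi - z_bar) ** 2 for zi in z)
slope = sxy / szz
intercept = x_bar - slope * z_bar

# Positive residual: point above the line; negative: below the line.
residuals = [xi - (intercept + slope * zi) for zi, xi in zip(z, x)]
print("left end :", round(residuals[0], 3))
print("middle   :", round(residuals[n // 2], 3))
print("right end:", round(residuals[-1], 3))
```

With this orientation the end points sit above the fitted line and the middle point sits below it. Swapping the axes to (X_{(i)}, Z_{i}) flips the pattern, which is why the plotting convention of the software matters.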

Probability plots can be used to check distributional assumptions for distributions other than Normal. For example, to check whether data comes from an exponential distribution, compare percentiles of the data to percentiles of a standard Exponential (with rate λ = 1). Additionally, to check whether two sets of data come from the same underlying distribution, plot the percentiles of the first data set versus the percentiles of the second data set. The plot should be approximately linear.
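Both ideas can be sketched along the same lines as the Normal probability plot. The code below is an illustration under stated assumptions: it reuses the (i - .5)/n plotting positions for the exponential case, and for two samples it simply pairs order statistics, which only works directly when the samples have equal size. The function names and the waiting-time values are hypothetical.

```python
import math

def exponential_plot_points(data):
    """Pair standard Exponential percentiles with the ordered data."""
    n = len(data)
    ordered = sorted(data)
    # The p-th quantile of an Exponential with rate 1 is -ln(1 - p).
    q = [-math.log(1.0 - (i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(q, ordered))

def two_sample_points(first, second):
    """Pair the order statistics of two equal-sized samples."""
    if len(first) != len(second):
        raise ValueError("this sketch assumes equal sample sizes")
    return list(zip(sorted(first), sorted(second)))

# Hypothetical waiting times (minutes), invented for this sketch.
waits = [0.4, 2.9, 1.1, 0.2, 5.3, 1.8, 0.7, 3.6]
for q, x in exponential_plot_points(waits):
    print(f"{q:.3f}  {x:.1f}")
```

If the data are exponential with rate λ, the points should fall near a line through the origin with slope 1/λ.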

The exponential probability density function is widely used in engineering to describe the distribution of many types of variables, most often the distribution of waiting times between occurrences of successive events. A random variable X has an exponential distribution with parameter λ (λ > 0) if its pdf is f(x; λ) = λe^{-λx} for x ≥ 0 and f(x; λ) = 0 otherwise.

The mean of an exponential r.v. with parameter λ is 1/λ. The standard deviation of an exponential r.v. with parameter λ is also 1/λ.
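These values can be checked numerically. The sketch below assumes the usual rate parameterization of the exponential pdf, f(x) = λe^{-λx} for x ≥ 0, and approximates the mean and standard deviation with a simple midpoint rule. The function names, the truncation point and the number of steps are choices made for this example only.

```python
import math

def exponential_pdf(x, lam):
    """Exponential pdf with rate lam (assumed parameterization)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def moments_by_midpoint_rule(lam, upper_multiple=50.0, steps=100_000):
    """Approximate the mean and standard deviation by numerical integration."""
    upper = upper_multiple / lam   # the tail beyond this point is negligible
    h = upper / steps
    m1 = m2 = 0.0
    for k in range(steps):
        x = (k + 0.5) * h          # midpoint of the k-th subinterval
        w = exponential_pdf(x, lam) * h
        m1 += x * w                # accumulates E[X]
        m2 += x * x * w            # accumulates E[X^2]
    return m1, math.sqrt(m2 - m1 * m1)

for lam in (0.5, 2.0):
    m, s = moments_by_midpoint_rule(lam)
    print(f"lambda={lam}: mean~{m:.4f}, sd~{s:.4f}, 1/lambda={1 / lam:.4f}")
```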

Populations Distributed” allows you to visualize exponential pdfs with different parameters λ.

Suppose the **number of occurrences of an event in a time interval of length t follows a Poisson distribution with parameter αt** and the numbers of occurrences in nonoverlapping intervals are independent. It can be shown that the **waiting time until the first occurrence of the event follows an exponential distribution with parameter α.**
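The argument behind this result is short. As a sketch, write N(t) for the number of occurrences in the interval [0, t] and T for the waiting time until the first occurrence (both symbols are introduced here for convenience), and assume N(t) is Poisson with mean αt. The waiting time exceeds t exactly when no event occurs in [0, t]:

```latex
\begin{align*}
P(T > t) &= P\bigl(N(t) = 0\bigr)
          = \frac{e^{-\alpha t}(\alpha t)^{0}}{0!}
          = e^{-\alpha t}, \qquad t \ge 0, \\
F_T(t)   &= 1 - e^{-\alpha t}, \qquad
f_T(t)    = \frac{d}{dt} F_T(t) = \alpha e^{-\alpha t}.
\end{align*}
```

The last expression is the exponential pdf with parameter α.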

The exponential distribution has the **memoryless** property, meaning that the probability we wait at least *b* additional minutes for an event to occur, given that we’ve already waited at least *a* minutes, is the same as the probability that we have to wait at least *b* minutes from the start: P(X ≥ a + b | X ≥ a) = P(X ≥ b) for any a, b ≥ 0. In other words, the distribution of additional waiting time is exactly the same as the distribution of original waiting time; the distribution of additional waiting time is independent of how long you’ve already waited. (The distribution of the number of trials until the first success, given the trials are independent with constant probability of success, p, also has this property; this is the geometric distribution.)
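The property can be verified directly from the survival functions. The sketch below assumes P(X ≥ t) = e^{-λt} for the exponential waiting time and P(N > k) = (1 - p)^k for the number of trials N until the first success; the function names and the numerical values of λ, p, a and b are hypothetical.

```python
import math

def exp_survival(t, lam):
    """P(X >= t) for an exponential r.v. with parameter lam."""
    return math.exp(-lam * t)

def geom_survival(k, p):
    """P(N > k): the first k independent trials all fail."""
    return (1.0 - p) ** k

# Exponential: P(X >= a + b | X >= a) compared with P(X >= b).
lam, a, b = 0.25, 3.0, 5.0
conditional = exp_survival(a + b, lam) / exp_survival(a, lam)
print(math.isclose(conditional, exp_survival(b, lam)))

# Trials until first success: P(N > j + k | N > j) compared with P(N > k).
p, j, k = 0.2, 4, 6
conditional_trials = geom_survival(j + k, p) / geom_survival(j, p)
print(math.isclose(conditional_trials, geom_survival(k, p)))
```

In both cases the conditional probability does not depend on how long we have already waited (a or j), only on the additional wait.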