Q-Q plot

From Wikipedia, the free encyclopedia

A normal Q-Q plot of randomly generated exponential(1) data.
A normal Q-Q plot of randomly generated normal(0,1) data.

In statistics, a Q-Q plot ("Q" stands for quantile) is a graphical method for diagnosing differences between the probability distribution of the statistical population from which a random sample has been taken and a comparison distribution. One example of a difference that can be checked in this way is non-normality of the population distribution.

For a sample of size n, one plots n points. For k = 1, ..., n, the kth point has the kth (n + 1)-quantile of the comparison distribution (e.g. the normal distribution), that is, its quantile at probability k/(n + 1), on the horizontal axis and the kth order statistic of the sample on the vertical axis. If the population distribution is the same as the comparison distribution, the points approximately follow a straight line, especially near the center. Substantial deviations from linearity are taken as evidence against the null hypothesis that the two distributions are the same.
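The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a standard normal comparison distribution, uses the k/(n + 1) probabilities, and the function name `normal_qq_points` and the sample values are hypothetical.

```python
from statistics import NormalDist

def normal_qq_points(sample):
    """Return (theoretical quantile, order statistic) pairs for a normal Q-Q plot."""
    n = len(sample)
    ordered = sorted(sample)  # order statistics of the sample
    std_normal = NormalDist(mu=0.0, sigma=1.0)  # assumed comparison distribution
    # k-th point: standard normal quantile at probability k / (n + 1)
    return [
        (std_normal.inv_cdf(k / (n + 1)), ordered[k - 1])
        for k in range(1, n + 1)
    ]

# Hypothetical sample values, chosen only for illustration.
sample = [0.3, -1.2, 0.8, 1.9, -0.4]
points = normal_qq_points(sample)
for x, y in points:
    print(f"{x:+.3f}  {y:+.3f}")
```

Plotting the first coordinate against the second (with any plotting library) gives the Q-Q plot; the points can then be inspected for departures from a straight line.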

Plotting positions

The quantiles of the comparison distribution are typically evaluated at the probabilities k/(n + 1). Several different formulas have been used or proposed as symmetrical plotting positions. Such formulas have the form (k − a)/(n + 1 − 2a) for some value of a in the range from 0 to 1/2. The expression k/(n + 1) is the special case a = 0. Other expressions include:

  • (k − 1/3)/(n + 1/3) [1]
  • (k − 0.3175)/(n + 0.365) [2]
  • (k − 0.326)/(n + 0.348) [3]
  • (k − 0.375)/(n + 0.25) [4]
  • (k − 0.44)/(n + 0.12) [5]

For large sample sizes n, there is little difference between these various expressions.
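As a rough numerical check of this, the sketch below evaluates the general form (k − a)/(n + 1 − 2a) for the values of a implied by the expressions listed above and compares the spread between them at the first point (k = 1) for a small and a large n. The function names are hypothetical, and the formula is applied uniformly to every k, which may differ from how a given source treats the first and last points.

```python
def plotting_position(k, n, a=0.0):
    """Symmetrical plotting position (k - a) / (n + 1 - 2a)."""
    return (k - a) / (n + 1 - 2 * a)

# Values of a corresponding to the expressions listed in the text.
A_VALUES = [0.0, 1 / 3, 0.3175, 0.326, 0.375, 0.44]

def spread(k, n):
    """Largest minus smallest plotting position across A_VALUES."""
    positions = [plotting_position(k, n, a) for a in A_VALUES]
    return max(positions) - min(positions)

small_n_spread = spread(1, 10)    # first point, n = 10
large_n_spread = spread(1, 1000)  # first point, n = 1000
print(f"n = 10:   spread = {small_n_spread:.5f}")
print(f"n = 1000: spread = {large_n_spread:.5f}")
```

The spread is measured on the probability scale; how much it matters on the quantile scale depends on the comparison distribution.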

Relation with probability plots

Q-Q plots are similar to probability plots (which for a normal distribution are called normal probability plots or rankit plots). The difference is that in a probability plot, instead of placing the quantile of the distribution on the x-axis, one uses the expected value of the kth order statistic from the distribution. Only when n is small is there a substantial difference between a Q-Q plot and a probability plot.
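The two choices of x-coordinate can be compared directly when the expected order statistics have a closed form. The sketch below assumes an exponential(1) comparison distribution, chosen only because the expected value of its kth order statistic out of n is the finite sum 1/n + 1/(n − 1) + ... + 1/(n − k + 1); the Q-Q coordinate uses the k/(n + 1) probabilities. It looks only at the central point of the plot, for odd n, and the function names are hypothetical.

```python
import math

def exp_quantile(p):
    """Quantile function of the exponential(1) distribution."""
    return -math.log(1.0 - p)

def exp_expected_order_stat(k, n):
    """Expected value of the k-th order statistic of n exponential(1) draws."""
    return sum(1.0 / (n - j) for j in range(k))

def central_gap(n):
    """Probability-plot x minus Q-Q x at the central point (n assumed odd)."""
    k = (n + 1) // 2
    return exp_expected_order_stat(k, n) - exp_quantile(k / (n + 1))

gap_small = central_gap(5)    # n = 5
gap_large = central_gap(101)  # n = 101
print(f"n = 5:   gap = {gap_small:.4f}")
print(f"n = 101: gap = {gap_large:.4f}")
```

Under these assumptions the gap at the central point shrinks as n grows; the sketch does not examine the extreme points of the plot.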

References

  1. ^ A simple (and easy to remember) formula for plotting positions.
  2. ^ Engineering Statistics Handbook: Normal Probability Plot – Note that this also uses a different expression for the first & last points. [1] cites the original work by Filliben 1975.
  3. ^ Distribution free plotting position, Yu & Huang
  4. ^ This is Blom's earlier approximation 1953 and is the expression used in MINITAB.
  5. ^ This plotting position was used by Gringorten 1963 to plot points in tests for the Gumbel distribution.
