The Shapiro-Wilk goodness-of-fit test for normality
MuPAD® notebooks will be removed in a future release. Use MATLAB® live scripts instead.
MATLAB live scripts support most MuPAD functionality, though there are some differences. For more information, see Convert MuPAD Notebooks to MATLAB Live Scripts.
stats::swGOFT(x1, x2, …)
stats::swGOFT([x1, x2, …])
stats::swGOFT(s, <c>)
stats::swGOFT([x1, x2, …]) applies the Shapiro-Wilk goodness-of-fit test for the null hypothesis: “the data x1, x2, … are normally distributed (with unknown mean and variance)”. The sample size must be at least 3 and at most 5000.
An error is raised by stats::swGOFT if any of the data cannot be converted to a real floating-point number or if the sample size is too large or too small.
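The input conditions stated above can be illustrated with a short Python sketch (Python, not MuPAD). The helper name `validate_sample` is hypothetical and only mirrors the documented conditions; it is not the implementation used by stats::swGOFT.

```python
MIN_SIZE = 3      # documented lower bound on the sample size
MAX_SIZE = 5000   # documented upper bound on the sample size


def validate_sample(data):
    """Return the data as a list of floats, or raise ValueError.

    Hypothetical helper: it rejects items that cannot be converted to a
    real floating-point number and samples outside the size limits.
    """
    try:
        values = [float(x) for x in data]
    except (TypeError, ValueError) as exc:
        raise ValueError(
            "all data must be convertible to real floating-point numbers"
        ) from exc
    if not MIN_SIZE <= len(values) <= MAX_SIZE:
        raise ValueError(
            f"sample size must be between {MIN_SIZE} and {MAX_SIZE}, "
            f"got {len(values)}"
        )
    return values
```

A sample such as `[1, 2.5, "3"]` passes the check, while a two-element sample or a sample containing a non-numeric string raises `ValueError`.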
Let y1, …, yn be the input data x1, …, xn arranged in ascending order.
stats::swGOFT returns the list [PValue = p, StatValue = w] containing the following information:
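w is the attained value of the Shapiro-Wilk statistic W = (a1 y1 + … + an yn)^2 / ((n - 1) S^2). Here, the ai are the Shapiro-Wilk coefficients, and S^2 is the statistical variance of the sample (with divisor n - 1, so that (n - 1) S^2 is the sum of the squared deviations from the sample mean).
p is the observed significance level of the Shapiro-Wilk statistic W.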
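As an illustration of how the statistic is formed, the following Python sketch evaluates W for given coefficients, assuming the standard definition: the squared weighted sum of the ordered data divided by the sum of squared deviations from the sample mean. The function name `shapiro_wilk_statistic` is hypothetical, and the coefficients are passed in by the caller rather than computed. For n = 3 the exact coefficients are (-1/sqrt(2), 0, 1/sqrt(2)), which are used in the usage lines.

```python
import math


def shapiro_wilk_statistic(data, coeffs):
    """Evaluate W for the given data and Shapiro-Wilk coefficients.

    Hypothetical helper: coeffs[i] is the coefficient that multiplies
    the (i+1)-th smallest data value.
    """
    y = sorted(float(x) for x in data)   # y1 <= ... <= yn
    n = len(y)
    if len(coeffs) != n:
        raise ValueError("need exactly one coefficient per data value")
    mean = sum(y) / n
    # Sum of squared deviations, i.e. (n - 1) times the sample variance.
    sum_sq = sum((v - mean) ** 2 for v in y)
    if sum_sq == 0.0:
        raise ValueError("W is undefined for a constant sample")
    weighted = sum(a * v for a, v in zip(coeffs, y))
    return weighted ** 2 / sum_sq


# Exact coefficients for a sample of size 3.
a3 = [-math.sqrt(0.5), 0.0, math.sqrt(0.5)]
w_even = shapiro_wilk_statistic([2.0, 1.0, 3.0], a3)   # evenly spaced data
w_skew = shapiro_wilk_statistic([1.0, 2.0, 6.0], a3)   # skewed data
```

Evenly spaced data give the largest possible value of W, and skewed data give a smaller value; small values of W speak against normality.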
The observed significance level PValue = p returned by stats::swGOFT has to be interpreted in the following way:
If p is smaller than a given significance level α, the null hypothesis may be rejected at level α.
If p is larger than α, the null hypothesis should not be rejected at level α.
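The decision rule above can be written as a small Python sketch. The function name `interpret_p_value` and the default level 0.05 are assumptions made for illustration, and the borderline case p = α, which the rule above does not cover, is treated here as "do not reject".

```python
def interpret_p_value(p, alpha=0.05):
    """Apply the decision rule to an observed significance level p.

    Hypothetical helper: alpha is the chosen significance level; the
    default 0.05 is only a common convention, not part of the test.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if p < alpha:
        return "reject"          # normality may be rejected at level alpha
    return "do not reject"       # no evidence against normality at level alpha


decision_small = interpret_p_value(0.003)        # p well below alpha
decision_large = interpret_p_value(0.42, 0.10)   # p above alpha
```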
The function is sensitive to the environment variable DIGITS, which determines the numerical working precision.
We test a list of random data that purport to be a sample of normally distributed numbers:
f := stats::normalRandom(0, 1, Seed = 123):
data := [f() $ i = 1..400]:
stats::swGOFT(data)
The observed significance level is not small. Consequently, one should not reject the null hypothesis that the data are normally distributed.
Next, we contaminate the data with some uniformly distributed deviates:
impuredata := data . [frandom() $ i = 1..101]:
stats::swGOFT(impuredata)
The contaminated data may be rejected as a sample of normal deviates at any significance level larger than the returned PValue.
delete f, data, impuredata:
We create a sample consisting of one string column and two non-string columns:
s := stats::sample([["1996", 1242, PI - 1/2],
                    ["1997", 1353, PI + 0.3],
                    ["1998", 1142, PI + 0.5],
                    ["1999", 1201, PI - 1],
                    ["2001", 1201, PI]])
"1996"  1242  PI - 1/2
"1997"  1353  PI + 0.3
"1998"  1142  PI + 0.5
"1999"  1201  PI - 1
"2001"  1201  PI
We check whether the data of the third column are normally distributed:
stats::swGOFT(s, 3)
The observed significance level returned by the test is not small: the test does not indicate that the data are not normally distributed.
x1, x2, …: The statistical data: real numerical values
s: A sample of domain type stats::sample
c: An integer representing a column index of the sample s
List of two equations [PValue = p, StatValue = w] with the observed significance level p and the attained value w of the Shapiro-Wilk statistic.
See the “Details” section for the interpretation of these values.
The implemented algorithm for the computation of the Shapiro-Wilk coefficients, the Shapiro-Wilk statistic and the observed significance level is based on: Patrick Royston, “Algorithm AS R94”, Applied Statistics, Vol. 44, No. 4 (1995).
Following Royston, the Shapiro-Wilk coefficients ai are computed by an approximation of
a = (a1, …, an) = M^T V^(-1) / (M^T V^(-1) V^(-1) M)^(1/2),
where M denotes the vector of expected values of the standard normal order statistics for a sample of size n, V is the corresponding covariance matrix, and M^T is the transpose of M.
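To give a feel for such coefficients, here is a deliberately simplified Python sketch. It is not Royston's algorithm and not the code behind stats::swGOFT. It makes two assumptions: the expected order statistics M are replaced by Blom's approximation, the standard normal quantiles at (i - 3/8)/(n + 1/4), and the covariance matrix V is replaced by the identity matrix, so that the coefficient vector reduces to M divided by its Euclidean norm. The function name `approximate_coefficients` is hypothetical.

```python
import math
from statistics import NormalDist


def approximate_coefficients(n):
    """Return simplified Shapiro-Wilk-type coefficients for sample size n.

    Simplification (assumption): V is taken as the identity matrix, so
    a = M / |M|, with M given by Blom's approximation of the expected
    standard normal order statistics.
    """
    if n < 3:
        raise ValueError("sample size must be at least 3")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    m = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    norm = math.sqrt(sum(v * v for v in m))
    return [v / norm for v in m]


a = approximate_coefficients(3)
```

For n = 3 this simplification happens to reproduce the exact coefficients (-1/sqrt(2), 0, 1/sqrt(2)). For larger n the result is normalized and antisymmetric like the true coefficients, but it is only a rough stand-in for them.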