Mean and standard deviation in six sigma
I wrote code to calculate the mean and standard deviation (sigma) and to plot lines at ±1 sigma, ±2 sigma, and ±3 sigma. However, I noticed that the 2 sigma and 3 sigma lines are plotted outside the curve. According to statistical theory, ±1 sigma, ±2 sigma, and ±3 sigma should encompass 68.26%, 95.44%, and 99.73% of the area under the curve, respectively. Therefore, these sigma lines should be inside the curve. The data file is attached.
My code is:
Corrected_UV_D4_spectra_1 = xlsread('corrected_uvd4_spectra_1','Sheet1','A26:B341');
data = Corrected_UV_D4_spectra_1;
x = data(:,1);
y = data(:,2);
figure;scatter(x,y);
mu = mean(x)
xline(mu,'g--')
sd = std(x)
xline(mu + sd,'m--')
xline(mu - sd,'m--')
xline(mu + 2*sd,'b:')
xline(mu - 2*sd,'b:')
xline(mu + 3*sd,'k-.')
xline(mu - 3*sd,'k-.')
The result I am getting is also attached.
Answers (1)
akshatsood
13 Sep 2023
Edited: akshatsood
13 Sep 2023
Hi SAKSHAM,
I understand that you want to visualize what statistical theory calls the empirical rule. I reviewed the attached code and noticed that the plotted data does not reach the lines at μ ± 2σ and μ ± 3σ. I believe this is an issue with the range of the x data. To make this concrete, consider the following observations:
x = data(:,1);
min(x); % 425
max(x); % 740
Further, observe the range for μ ± 3σ
mu - 3*sd % 308.4033
mu + 3*sd % 856.5967
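These values can be reproduced without the data file. The sketch below assumes that x is the integer grid 425:740 (316 points), which is consistent with the reported minimum and maximum and with the 316 rows in A26:B341; the actual file may differ. Under that assumption, mean(x) and std(x) describe the spacing of the sampling grid only, and the y values play no part in them.

```matlab
% Assumption: x is the integer grid 425:740 (316 points). This matches
% the reported min/max and the row count of A26:B341, but it is not
% taken from the attached file itself.
x = (425:740)';

mu = mean(x);        % midpoint of the grid
sd = std(x);         % spread of the grid, independent of y

lower = mu - 3*sd;   % lower 3-sigma limit
upper = mu + 3*sd;   % upper 3-sigma limit

fprintf('mu = %.4f, sd = %.4f\n', mu, sd);
fprintf('mu - 3*sd = %.4f, mu + 3*sd = %.4f\n', lower, upper);

% true when both 3-sigma limits fall outside the sampled x range
outside = (lower < min(x)) && (upper > max(x));
```

With this grid the limits agree with the values quoted above to four decimal places.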
Clearly, min(x) >= mu - 3*sd and max(x) <= mu + 3*sd, so the μ ± 3σ lines fall outside the range of the plotted data. This is not desired: to illustrate the empirical rule effectively, the range of x should extend beyond μ ± 3σ. To achieve this, a new x range can be built from the mean and standard deviation of the original data. Here is a code snippet that demonstrates this:
Corrected_UV_D4_spectra_1 = xlsread('corrected_uvd4_spectra_1','Sheet1','A26:B341');
data = Corrected_UV_D4_spectra_1;
x = data(:,1);
y = data(:,2);
mu = mean(x);
sd = std(x);
x0 = (mu-4*sd):0.1:(mu+4*sd); % extend the x range to cover mu ± 4*sd
% pdf of the normal distribution with mean mu and standard deviation sd
% (normpdf requires Statistics and Machine Learning Toolbox)
pdf_values = normpdf(x0, mu, sd);
figure;
plot(x0, pdf_values); % plot normal distribution
% draw the reference lines after plot, otherwise plot clears the mean line
xline(mu,'g--')
xline(mu + sd,'m--')
xline(mu - sd,'m--')
xline(mu + 2*sd,'b:')
xline(mu - 2*sd,'b:')
xline(mu + 3*sd,'k-.')
xline(mu - 3*sd,'k-.')
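As a quick numerical check of the percentages quoted by the empirical rule, the probability that a normal variable lies within k standard deviations of its mean is erf(k/sqrt(2)). The sketch below uses only base MATLAB; the variable names are illustrative.

```matlab
% Coverage of mu ± k*sigma for a normal distribution, k = 1, 2, 3.
% P(|X - mu| < k*sigma) = erf(k / sqrt(2)), independent of mu and sigma.
k = 1:3;
coverage = erf(k ./ sqrt(2));

for i = 1:numel(k)
    fprintf('within %d sigma: %.2f%%\n', k(i), 100*coverage(i));
end
```

Rounded to two decimals, the results are 68.27%, 95.45%, and 99.73%; the figures 68.26% and 95.44% that are often quoted are the same values truncated rather than rounded.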
Have a look at the documentation page for better understanding
I hope this helps.
3 Comments
akshatsood
14 Sep 2023
Edited: akshatsood
14 Sep 2023
Hi SAKSHAM,
I understand that you want to include the Y data. Since you said that the two datasets share the same X, it would help if you could explain how they differ in terms of mean and standard deviation. Additionally, one possible reason for the sigma lines not falling inside the curve is an insufficient number of data points. The empirical rule states that
for normal distributions, 68.26% of observed data points will lie inside one standard deviation of the mean, 95.44% will fall within two standard deviations, and 99.73% will occur within three standard deviations.
However, the empirical rule assumes a truly normal distribution. If the datasets deviate significantly from normality, or if outliers are present, the rule may not hold. One possible workaround to replicate the behaviour is interpolation and extrapolation, although this approach may not yield a good fit given the limited number of data points available.
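If the goal is to describe the peak traced by y over x, one option for including the Y data is to treat y as weights and compute an intensity-weighted mean and standard deviation. This is a different technique from the normpdf approach in the answer, and the sketch below rests on assumptions: y is non-negative and baseline-corrected, and the spectrum is replaced by a hypothetical single Gaussian-shaped peak because the attached file is not available here.

```matlab
% Hypothetical stand-in for the spectrum: one peak on the grid 425:740.
% Replace x and y with the two columns read from the data file.
x = (425:740)';
y = exp(-(x - 580).^2 ./ (2*30^2));

% Assumption: y >= 0 with the baseline removed, so it can act as weights.
w = y ./ sum(y);                        % normalise intensities to sum to 1
mu_w = sum(w .* x);                     % intensity-weighted mean
sd_w = sqrt(sum(w .* (x - mu_w).^2));   % intensity-weighted standard deviation

figure;
plot(x, y);
hold on
xline(mu_w, 'g--');
for k = 1:3
    xline(mu_w + k*sd_w, 'm--');        % upper k-sigma line
    xline(mu_w - k*sd_w, 'm--');        % lower k-sigma line
end
hold off
```

For this synthetic peak all three pairs of sigma lines fall inside the plotted range, because the spread now reflects the shape of y and not the extent of the x grid. Whether the same holds for the real spectrum depends on its baseline and on how close it is to a single symmetric peak.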
Have a look at the below references for interpolation and extrapolation