Mean and standard deviation in six sigma

SAKSHAM SHRIVASTAVA on 12 Sep 2023
Edited: SAKSHAM SHRIVASTAVA on 22 Sep 2023
I wrote code to calculate the mean and standard deviation (sigma) and to draw lines at ±1 sigma, ±2 sigma, and ±3 sigma. However, the 2 sigma and 3 sigma lines are plotted outside the curve. According to statistical theory, ±1 sigma, ±2 sigma, and ±3 sigma should encompass 68.26%, 95.44%, and 99.73% of the area under the curve, respectively, so these sigma lines should fall inside the curve. The data file is attached.
My code is:
Corrected_UV_D4_spectra_1 = xlsread('corrected_uvd4_spectra_1','Sheet1','A26:B341');
data = Corrected_UV_D4_spectra_1;
x = data(:,1);
y = data(:,2);
figure;scatter(x,y);
mu = mean(x)   % displays mu = 582.5000
xline(mu,'g--')
sd = std(x)    % displays sd = 91.3656
xline(mu + sd,'m--')
xline(mu - sd,'m--')
xline(mu + 2*sd,'b:')
xline(mu - 2*sd,'b:')
xline(mu + 3*sd,'k-.')
xline(mu - 3*sd,'k-.')
The result I am getting is also attached.

Answers (1)

akshatsood on 13 Sep 2023
Edited: akshatsood on 13 Sep 2023
Hi SAKSHAM,
I understand that you want to visualize what statistical theory calls the empirical rule. I reviewed the attached code and noticed that the plot does not reach the lines at μ ± 2σ and μ ± 3σ. I see this as an issue with the range of the x data. To be clearer, consider the following observations:
x = data(:,1);
min(x); % 425
max(x); % 740
Next, observe the range for μ ± 3σ:
mu - 3*sd % 308.4033
mu + 3*sd % 856.5967
It is easy to see that min(x) >= mu - 3*sd and max(x) <= mu + 3*sd, so every x value lies inside μ ± 3σ. The data also lie inside μ ± 2σ (about 399.77 to 765.23), which is why both the 2σ and 3σ lines fall outside the plotted points. This is not what is wanted: to illustrate the empirical rule effectively, the range of x should extend beyond μ ± 3σ. To achieve this, a wider x vector can be built from the mean and standard deviation of the original data. Here is a code snippet that demonstrates this:
Corrected_UV_D4_spectra_1 = xlsread('corrected_uvd4_spectra_1','sheet1','A26:B341');
data = Corrected_UV_D4_spectra_1;
x = data(:,1);
y = data(:,2);
mu = mean(x);
sd = std(x);
x0 = (mu-4*sd):0.1:(mu+4*sd); % widen the x range to mu ± 4*sd
% pdf of the normal distribution with mean mu and standard deviation sd
pdf_values = normpdf(x0, mu, sd); % normpdf needs the Statistics and Machine Learning Toolbox
plot(x0, pdf_values); % plot normal distribution
xline(mu,'g--') % drawn after plot, which would otherwise clear it
xline(mu + sd,'m--')
xline(mu - sd,'m--')
xline(mu + 2*sd,'b:')
xline(mu - 2*sd,'b:')
xline(mu + 3*sd,'k-.')
xline(mu - 3*sd,'k-.')
Have a look at the documentation page for a better understanding.
I hope this helps.
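The snippet above draws a theoretical normal curve from the mean and standard deviation of the x column alone, so y is never used. If the aim is instead to place the sigma lines relative to the measured curve y(x), one possible approach is to treat y as weights and compute an intensity-weighted mean and standard deviation. This is a different technique from the one in the answer, and it is only a minimal sketch. It assumes that y is non-negative and that the curve has a single peak. The synthetic peak (centre 560, width 30) and the names w, mu_w, and sd_w are illustrative stand-ins, not the attached data.

```matlab
% Synthetic stand-in for the spectrum (hypothetical values, not the attached file):
% a bell-shaped peak sampled on the same 425..740 grid as in the question.
x = (425:740)';
y = exp(-(x - 560).^2 / (2*30^2));

w    = y / sum(y);                        % normalise intensities into weights (assumes y >= 0)
mu_w = sum(w .* x);                       % intensity-weighted mean (peak centre)
sd_w = sqrt(sum(w .* (x - mu_w).^2));     % intensity-weighted standard deviation (peak width)

figure; plot(x, y);
xline(mu_w, 'g--')
for k = 1:3
    % sigma lines now follow the width of the curve, not of the x grid
    xline(mu_w + k*sd_w, 'm--')
    xline(mu_w - k*sd_w, 'm--')
end
```

With weights taken from y, the lines track where the signal actually is, so for a peak that sits well inside the measured range the ±3σ lines stay inside the plotted curve.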
3 Comments
akshatsood on 14 Sep 2023
Edited: akshatsood on 14 Sep 2023
Hi SAKSHAM,
I understand that you want to include the Y data. Since you said the two datasets share the same X, it would be helpful if you could explain how they differ in terms of mean and standard deviation. I would also like to highlight a possible reason for the sigma lines not being inside the curve: there may be too few data points. The empirical rule states that
for normal distributions, 68.26% of observed data points will lie inside one standard deviation of the mean, 95.44% will fall within two standard deviations, and 99.73% will occur within three standard deviations.
However, the empirical rule assumes a truly normal distribution. If the datasets deviate significantly from normality, or if outliers are present, the rule may not hold. One possible workaround to replicate the behaviour is interpolation and extrapolation, although this may not yield a good fit given the limited number of data points available.
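The dependence on normality can be checked directly by counting how many points fall within k standard deviations of the mean. The sketch below compares an evenly spaced grid like the x column in the question with a normally distributed sample. The random sample, the seed, and the helper name frac_within are assumptions made for illustration only.

```matlab
rng(0);                                   % fixed seed so the run is repeatable
x_grid = (425:740)';                      % evenly spaced values, like x in the question
x_norm = 582.5 + 91.3656*randn(1e5, 1);   % hypothetical normally distributed sample

% fraction of points within k standard deviations of the mean
frac_within = @(v, k) mean(abs(v - mean(v)) <= k*std(v));

for k = 1:3
    fprintf('k = %d: grid %.4f, normal sample %.4f\n', ...
        k, frac_within(x_grid, k), frac_within(x_norm, k));
end
```

On the evenly spaced grid, roughly 58% of the points lie within one standard deviation and all of them lie within two, which is consistent with the 2σ and 3σ lines landing outside the plotted data. The normal sample should come out close to the empirical-rule percentages.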
Have a look at the references below for interpolation and extrapolation.
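As a rough illustration of the interpolation and extrapolation idea, the sketch below uses interp1 on a coarse, synthetic bell-shaped curve. The sample values, the query grids, and the choice of the 'pchip' and 'linear' methods are assumptions made for the example and are not taken from the attached data.

```matlab
% Hypothetical coarse samples of a bell-shaped curve (not the attached data)
x = (425:15:740)';
y = exp(-(x - 582.5).^2 / (2*40^2));

xq_in  = (425:1:740)';                       % finer grid inside the measured range
yq_in  = interp1(x, y, xq_in, 'pchip');      % shape-preserving interpolation

xq_out = (300:1:860)';                       % grid extending past the measured range
yq_out = interp1(x, y, xq_out, 'linear', 'extrap');   % linear extrapolation beyond the ends

figure; plot(x, y, 'o', xq_in, yq_in, '-', xq_out, yq_out, ':');
legend('samples', 'pchip interpolation', 'linear extrapolation');
```

Interpolation reproduces the samples inside the measured range, while extrapolation only continues the trend at the ends. Extrapolated values can drift to implausible levels, including negative ones, so they should be treated with caution.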
SAKSHAM SHRIVASTAVA on 21 Sep 2023
Edited: SAKSHAM SHRIVASTAVA on 22 Sep 2023
Hi Akshatsood, thanks for your detailed explanation. All of your answers have helped me a lot.
I am working on the normal distribution as you suggested in one of the comments above. I have a question: the 1 sigma line in the figure does not coincide with the standard deviation line drawn by the Data Statistics tool in the figure window. Please see the attached image.
I am also using the area function to get the area under the curve between -1 sigma and +1 sigma, but I am not getting a numerical value for that area.
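On the last point, area draws a filled plot and returns a graphics object rather than a number. A numerical value can be obtained with trapz instead. The sketch below is one way to do it, assuming the curve is the normal pdf built from the mu and sd reported earlier in the thread. The pdf is written out explicitly so that no toolbox is required, and the names in1, area_1sigma, and area_exact are illustrative.

```matlab
mu = 582.5;  sd = 91.3656;                  % values reported earlier in the thread
x0 = (mu-4*sd):0.1:(mu+4*sd);
% normal pdf written out explicitly (same curve normpdf would give)
pdf_values = exp(-(x0 - mu).^2 / (2*sd^2)) / (sd*sqrt(2*pi));

in1 = (x0 >= mu - sd) & (x0 <= mu + sd);    % samples between -1 sigma and +1 sigma
area_1sigma = trapz(x0(in1), pdf_values(in1));   % numerical area, roughly 0.68

% closed-form check for a normal curve: erf(k/sqrt(2)) is the area within k sigma
area_exact = erf(1/sqrt(2));
fprintf('trapz: %.4f, closed form: %.4f\n', area_1sigma, area_exact);
```

The trapz result is slightly below the closed-form value because the 0.1 grid does not land exactly on the ±1 sigma limits. A finer grid, or an integration routine such as integral, narrows the gap.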
