How pwelch computes confidence intervals
30 ビュー (過去 30 日間)
古いコメントを表示
Can Mathworks please point to exactly which formulae are being used to compute confidence intervals in pwelch? Perhaps state the formulae being used in the code? This would include how exacty the number of degrees of freedom are being computed.
3 件のコメント
William Rose
2021 年 2 月 15 日
I read one of the two books which is cited in document pwelch, which was recommended by Kiran. The book I read is Statistical digital signal processing and modeling by M.H. Hayes, 1996. The relevant portion is seciton 8.2.5, pp.415-419 (which are pages 435-439 in the electronic version). It does not give formulas for the confidence interval for the Welch periodogram. Therefore it does not answer the original question: how is the confidence interval determined.
回答 (2 件)
Kiran Felix Robert
2021 年 2 月 5 日
Hi Stephen,
2 件のコメント
William Rose
2021 年 2 月 14 日
Kiran, thak you for this answer but plese see my request above for more info. Thanks.
William Rose
2021 年 2 月 15 日
Kiran,
I read the source code for pwelch.m (C:\Program Files\MATLAB\R2018b\toolbox\signal\signal\pwelch.m). It does not reveal how the confidence interval is computed. pwelch.m calls welch.m (C:\Program Files\MATLAB\R2018b\toolbox\signal\signal\+spectrum\welch.m), which I inspected. It calls another version of pwelch() which I oculd not find, probably because it is in a compiled library. Therefore I would appreciate an actual answer to Stephen's question. Thank you.
William Rose
2021 年 2 月 26 日
The 2-sided confidence interval (C.I.) with probability p on the power spectral denisty (p.s.d.), which is returned by pwelch(), is given by
Pxxhat(f)*k/chi2((1+p)/2,k) < Pxx(f) < Pxxhat(f)*k/chi2((1-p)/2,k)
where Pxxhat(f) is the experimental estimate of the p.s.d. at frequency f, and Pxx(f) is the true, but unknown, value of the p.s.d., and k is the degrees of freedom. This is analagous to a confidence interval for a variance. The degrees of freedom is given by
k = 2*K
where upper case K is the number of segments or windows used when esitmating the p.s.d.
If the window length and offset are chosen so that the windows fit the signal without any leftover bits, then
K=(N-L)/D+1
where N=sinal length, L=window length, D=offset. (Don't confuse overlap and offset. Overlap=L-D. pwelch() takes overlap as an argument, but Welch (1967) uses offset=D in his formulas.
Example: Signal length=1024, window=Hamming, with length 256, half-overlapped. Then N=1024, L=256, and D=offset=128. Therefore K=7, by the formula above, and we can verify that 7 half overlapped windows cover the signal exactly. Then k=2*K = 14.
Prob{Pxxhat*14/chi2(.975,14) < Pxx < Pxxhat*14/chi2(.025,14)} = 0.95
=> Prob{Pxxhat*0.536 < Pxx < Pxxhat*2.487} = 0.95
I have checked the above formulas for various combinations of N, L, D, and probability p. The formuals correctly reproduce the confidence intervals reported by pwelch().
The formulas above are not correct from a statistical point of view, when the overlap is more than zero, but they are the formulas that pwelch() uses. The error is that k, the degrees of freedom, should be somewhat less than 2*K when the windows overlap. The exact formula is given by Welch, IEEE Trans Audio Electroacoustics AU-15:70-73, 1967, https://ieeexplore.ieee.org/document/1161901. It is complicated so I will not reproduce it here. It has a varible number of terms, depending on window shape and amount of overlap. The C.I. from pwelch() does not take window shape or overlap into account. pwelch()'s confidence intervals are correct when the windows do not overlap. The error in the confidence interval reported by pwelch() is relatively small when using Hamming or Hann or similar windows at half-overlap. In general, the C.I. reported by pwelch() is narrower than it should be. See table below:
The errors in the confidence intervals reported by pwelch() become more severe if the overlap is greater. By more severe I mean that the C.I. from pwelch() is significantly more narrow that than the true C.I.
5 件のコメント
William Rose
2023 年 11 月 25 日
No problem.
The estimate of the power spectrum at a specific frequency, when scaled appropriately, has an approximately chi-squared distribution.* Therefore the formula that says the 95% C.I. is approximately twice the standard deviation does not apply. That formula is for estimates that have a normal distribution - such as the mean of a set of measurements, which is approximately normal, due to the central limit theorem.
* I say "approximately" because, if overlapped segments are used to compute the spectrum, as is usually done, then the different components of the estimate are not all independent - due to the partial overlap. A true chi-squared random variable is the sum of the squares of independent standard normal random variables.
William Rose
2023 年 11 月 26 日
You said you are trying to understand the CIs given by pwelch(). I too am interested in this topic.
You may wonder if there is a simple formula, comparable to
which relates μ and σ for a group of power spectrum estimates to the upper 95% C.I. of the power spectrum. I do not think such a formula exists. The C.I. returned by pwelch() uses the formula I gave in an earlier comment. I have done millions of Monte Carlo simulations of power spectrum estimates. I conclude that the formula which pwelch() uses for confidence intervals is accurate when there is no overlap of segments in the spectrum estimate. The CI returned by pwelch() is somewhat narrower than it should be, when overlapping segments are used to estimate the spectrum. In other words, the actual 95% C.I. is slightly wider than the C.I. returned by pwelch().
If you wish to discuss this further, please email me separately by clicking on the envelope at the upper right corner of the box that pops up when you click on my name.
参考
カテゴリ
Help Center および File Exchange で Spectral Estimation についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!