Meaning of box plot notches
I am confused by the explanation of the notches on box plots. The MATLAB help indicates that the notches are at q ± 1.57*iqr/sqrt(n), where q is the median, iqr is the interquartile range, and n is the number of observations. It then states that this is equivalent to the 5% confidence limits on the median. From what I learned about statistics, a multiplier of 1.96 would give the 95% confidence limits on the median, so I'm not sure 1) why MATLAB chose to use the 1.57 multiplier, or 2) where the 5% confidence limit result comes from when using this multiplier. It looks more like 67% confidence limits, or the equivalent of 1 sigma for a normal distribution. Is there a way to get box plots to draw notches at the 95% confidence limits of the median (i.e., using the 1.96 multiplier)?
Thanks for clarifying.
Answers (1)
the cyclist
26 Jul 2021
The value 1.57 is not something that "MATLAB chose"; it comes directly from the original research paper that introduced notches on box plots. There is a nice explanation in this CrossValidated answer.
If you want to change the value -- which I would only do if you develop your own rigorous theory of confidence intervals of medians -- you could copy the boxplot.m file to your own directory, and edit the value there.
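Before touching boxplot.m, it may help to see how much the notch would actually move. The sketch below is in Python rather than MATLAB and is only an illustration of the formula from the question, not MATLAB's implementation. The function name `notch_limits`, the sample data, and the "inclusive" quartile method are all assumptions made for the example; MATLAB may compute quartiles differently, so the numbers need not match `boxplot` exactly.

```python
import math
import statistics


def notch_limits(data, c=1.57):
    """Return (lower, upper) notch limits: median -/+ c * IQR / sqrt(n).

    `c` is the multiplier from the formula in the question; 1.57 is the
    documented default, any other value is the caller's own choice.
    """
    n = len(data)
    med = statistics.median(data)
    # Quartile conventions differ between packages; "inclusive" is just
    # one reasonable choice for this illustration.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    half_width = c * (q3 - q1) / math.sqrt(n)
    return med - half_width, med + half_width


# Made-up sample, for illustration only.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 13, 15]

default_notch = notch_limits(sample)        # multiplier 1.57
wider_notch = notch_limits(sample, c=1.96)  # the multiplier asked about

print("c = 1.57:", default_notch)
print("c = 1.96:", wider_notch)
```

The two intervals share the same center (the median) and differ only in width, by the ratio 1.96/1.57, so the change is a pure rescaling of the notch.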
2 Comments
the cyclist
30 Jul 2021
I think the language in the original paper is not crystal clear, but here is my interpretation (which aligns with what is stated in the MATLAB documentation). The paper states ...
"Should one desire a notch indicating a 95 percent confidence interval about each median, C = 1.96 would be used." [I added the emphasis.]
To me, this sentence makes it very clear that the notch around a single box is not the 95% confidence interval of that individual median (because C = 1.96 was not chosen).
The next sentence of the paper is ...
" ... a form of 'gap gauge' ... at the 95th percent level was desired."
The exact meaning of "gap gauge" is not perfectly clear to me, but I interpret that to mean that if the notches do not overlap, then the medians are significantly different. This is consistent with both the sentences from the documentation that you quoted in your comment.
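For what it's worth, here is the arithmetic behind 1.57 as I understand it from the paper. Treat it as a sketch under a rough normality assumption, not a quotation. The symbols are M for the median, R for the interquartile range, N for the sample size, and C for the multiplier in units of the median's standard error.

```latex
% For roughly normal data, \sigma \approx R / 1.35 and the standard
% error of the median is about 1.25\,\sigma / \sqrt{N}, so
\mathrm{SE}(M) \approx \frac{1.25\,R}{1.35\,\sqrt{N}}

% Notch limits with multiplier C:
M \pm C \, \frac{1.25\,R}{1.35\,\sqrt{N}}

% Comparing two medians at the 95% level needs
%   C = 1.96 / \sqrt{2} \approx 1.39   (equal standard errors)
%   C = 1.96                           (one standard error negligible)
% The compromise value C = 1.7 lies between the two:
1.7 \times \frac{1.25}{1.35} \approx 1.57

% A 95% interval for a single median, C = 1.96, would instead give
1.96 \times \frac{1.25}{1.35} \approx 1.81
```

If this reading is right, then replacing 1.57 with 1.96 in q ± 1.57*iqr/sqrt(n) does not reproduce the paper's C = 1.96 case; the corresponding multiplier on iqr/sqrt(n) would be about 1.81.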