A frequency distribution table is an orderly, summarized grouping of data into classes, together with the number of occurrences in each class. The classes are mutually exclusive (no data value can fall into two different classes) and exhaustive (every data value must be included). It is a way of organizing raw data. Graphs that can be used with frequency distributions include histograms, line charts, bar charts and pie charts.
Frequency distributions are used for both qualitative and quantitative data. Here, they are presented for quantitative (measured) data.
An essential requirement for a frequency distribution is deciding on the number of classes. Theory recommends between 5 and 20 classes, although sometimes more are required. Too many or too few classes might not reveal the basic shape of the data set, and such a frequency distribution will also be difficult to interpret. The maximum number of classes may be determined by a formula. Generally the class interval or class width is the same for all classes.
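As an illustration of equal-width classes, here is a minimal Python sketch (not the fdt m-code) that groups a data vector into k classes. The function name `frequency_table`, the dictionary keys, and the convention that each class is closed on the left and open on the right (with the last class also including the maximum) are assumptions made for this example.

```python
def frequency_table(data, k):
    """Group data into k equal-width, mutually exclusive, exhaustive classes."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / k  # same class width for every class

    counts = [0] * k
    for value in data:
        # Assumed convention: classes are [lower, upper); the last class
        # also includes the maximum so that every value is counted once.
        idx = min(int((value - lo) / width), k - 1)
        counts[idx] += 1

    n = len(data)
    rows = []
    cumulative = 0
    for i, count in enumerate(counts):
        lower = lo + i * width
        upper = lower + width
        cumulative += count
        rows.append({
            "lower": lower,
            "upper": upper,
            "mark": (lower + upper) / 2,   # class mark (midpoint)
            "freq": count,                 # absolute frequency
            "rel_freq": count / n,         # relative frequency
            "cum_freq": cumulative,        # cumulative absolute frequency
        })
    return rows


if __name__ == "__main__":
    for row in frequency_table([1, 2, 2, 3, 4, 5, 5, 5, 6, 10], k=3):
        print(row)
```

Because every value lands in exactly one class, the absolute frequencies always add up to the number of observations.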
There are several mathematical procedures that can help calculate the number of classes. All of them have their pros and cons, and they can be found in many statistical texts. Here, we include a menu to choose one from:
-- Square root rule
-- 2 to the k rule
-- Rice rule
-- Sturges rule
-- Doane formula
-- Freedman-Diaconis rule
-- Scott rule
-- Shimazaki-Shinomoto method*
Otherwise, you must give the number of classes you need.
*For the last option, it is necessary to download the sshist m-function (Histogram Binwidth Optimization). It returns the optimal number of bins in a histogram used for density estimation. The optimization principle is to minimize the expected L2 loss function between the histogram and an unknown underlying density function. The only assumption made is that samples are drawn from the density independently of each other. It can be found at
http://www.mathworks.com/matlabcentral/fileexchange/24913-histogram-binwidth-optimization
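For orientation, the closed-form rules in the menu can be sketched in Python using their usual textbook forms. This is an illustrative sketch under stated assumptions, not the fdt implementation: the function name `class_count_rules` is hypothetical, results are rounded up with `ceil`, the "2 to the k" rule is taken as the smallest k with 2^k >= n (some texts use a strict inequality), quartiles follow the default convention of `statistics.quantiles`, and Scott's constant is taken as 3.49. The Shimazaki-Shinomoto method is an optimization procedure and is not covered here.

```python
import math
import statistics


def class_count_rules(data):
    """Suggested number of classes under several common textbook rules."""
    n = len(data)
    data_range = max(data) - min(data)
    s = statistics.stdev(data)                   # sample standard deviation
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile convention is an assumption
    iqr = q3 - q1

    # Moment-based sample skewness g1 and its standard error, used by Doane.
    mean = statistics.fmean(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    g1 = m3 / m2 ** 1.5
    sigma_g1 = math.sqrt(6 * (n - 2) / ((n + 1) * (n + 3)))

    # "2 to the k" rule: smallest k such that 2**k >= n.
    k = 1
    while 2 ** k < n:
        k += 1

    # Width-based rules give a class width; divide the range to get a count.
    fd_width = 2 * iqr / n ** (1 / 3)
    scott_width = 3.49 * s / n ** (1 / 3)

    return {
        "square_root": math.ceil(math.sqrt(n)),
        "two_to_the_k": k,
        "rice": math.ceil(2 * n ** (1 / 3)),
        "sturges": math.ceil(math.log2(n)) + 1,
        "doane": math.ceil(1 + math.log2(n) + math.log2(1 + abs(g1) / sigma_g1)),
        "freedman_diaconis": math.ceil(data_range / fd_width),
        "scott": math.ceil(data_range / scott_width),
    }


if __name__ == "__main__":
    print(class_count_rules(list(range(1, 101))))
```

The rules based only on n (square root, 2 to the k, Rice, Sturges) ignore the spread of the data, while Freedman-Diaconis and Scott derive a class width from the interquartile range and the standard deviation respectively, so they can give noticeably different answers for the same sample.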
Why should one organize data in a frequency distribution table?
-- To organize data in a meaningful, intelligible way.
-- To enable the reader to determine the nature or shape of the
distribution (can make patterns within the data more evident).
-- To facilitate computational procedures for measuring the center,
variation, distribution shape, outlier(s), and time.
-- To enable the researcher to draw charts and graphs for the
presentation of data.
-- To enable the reader to make comparison among different data sets.
This m-function also offers a data graph display menu from which you can select one
option:
-- Histogram
-- Frequency polygon
-- Absolute ogive
-- Relative ogive (here with the observed and the predicted cdf)
-- All
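To make the graph types concrete, the following Python sketch computes the points that a frequency polygon and the two ogives would plot, given class limits and absolute frequencies. It is only an illustration under assumptions: the function name `polygon_and_ogives` is hypothetical, the polygon is closed with zero-frequency points one class width beyond each end (a common convention), and the predicted cdf mentioned for the relative ogive is not reproduced because the source does not say how it is computed.

```python
def polygon_and_ogives(edges, freqs):
    """Return plot coordinates for a frequency polygon and both ogives.

    edges: class limits, length len(freqs) + 1, equally spaced
    freqs: absolute frequency of each class
    """
    width = edges[1] - edges[0]
    marks = [(a + b) / 2 for a, b in zip(edges, edges[1:])]

    # Frequency polygon: class marks against absolute frequencies,
    # anchored to zero one class width before and after the data.
    polygon = (
        [(marks[0] - width, 0)]
        + list(zip(marks, freqs))
        + [(marks[-1] + width, 0)]
    )

    # Absolute ogive: cumulative frequency at each class limit,
    # starting from 0 at the lowest limit.
    cumulative = [0]
    for f in freqs:
        cumulative.append(cumulative[-1] + f)
    absolute_ogive = list(zip(edges, cumulative))

    # Relative ogive: the same curve scaled to end at 1.
    n = cumulative[-1]
    relative_ogive = [(x, c / n) for x, c in absolute_ogive]

    return polygon, absolute_ogive, relative_ogive


if __name__ == "__main__":
    poly, abs_og, rel_og = polygon_and_ogives([1, 4, 7, 10], [4, 5, 1])
    print(poly)
    print(abs_og)
    print(rel_og)
```

A histogram uses the same inputs directly: one bar per class, spanning its limits, with height equal to the class frequency.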
Syntax: function [y] = fdt(x)
Input:
x - data vector (the procedure for choosing the number of classes is
selected from a menu)
Output:
- frequency (distribution) table and, optionally from a menu, a data graph display
[y] - frequency (distribution) table, a data graph display (optionally from a menu), and, optionally, a matrix of absolute frequencies and class marks. This last matrix can be further used for some of the grouped statistics procedures you can find on my Matlab FEX author page.
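As an example of what such a matrix allows, the sketch below computes a grouped mean, variance and standard deviation from class marks and absolute frequencies. It is a Python illustration rather than one of the author's procedures; the function name `grouped_stats` and the use of the n - 1 divisor for the variance are assumptions.

```python
import math


def grouped_stats(marks, freqs):
    """Grouped mean, sample variance and standard deviation.

    Each class is represented by its class mark, weighted by its
    absolute frequency.
    """
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(marks, freqs)) / n
    # Assumed: sample variance with the n - 1 divisor.
    variance = sum(f * (m - mean) ** 2 for m, f in zip(marks, freqs)) / (n - 1)
    return mean, variance, math.sqrt(variance)


if __name__ == "__main__":
    print(grouped_stats([2.5, 5.5, 8.5], [4, 5, 1]))
```

Grouped statistics are approximations: they treat every observation in a class as if it sat at the class mark, so they generally differ slightly from the values computed on the raw data.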
Cite As
Antonio Trujillo-Ortiz (2024). fdt (https://www.mathworks.com/matlabcentral/fileexchange/47955-fdt), MATLAB Central File Exchange.
Platform Compatibility
Windows macOS Linux