How to know the distribution of my data

MAHMOUD ALZIOUD on 24 Sep 2018
Commented: MAHMOUD ALZIOUD on 26 Sep 2018
Dear MATLAB Community, I have attached an Excel file with some data. The data represents the percentage of loads in each load bin, along with their histogram. How can I determine, using MATLAB, which distribution my data follows? Is it normal, exponential, or something else? And after that, how do I find the parameters of the distribution? Thanks a lot.
  8 Comments
dpb on 25 Sep 2018
Your institution may have access to more; ask your advisor what is available for your use on university machines.
MAHMOUD ALZIOUD on 25 Sep 2018
Thank you for your help.


Accepted Answer

Image Analyst on 26 Sep 2018
When I fit the data to the sum of 3 Gaussians, the fit looks pretty reasonable. What do you think? And why do you need analytical equation(s) for the distribution rather than just using the ACTUAL distribution obtained from the histogram?
% Uses fitnlm() to fit a non-linear model (sum of three Gaussians with an offset) through noisy data.
% Requires the Statistics and Machine Learning Toolbox, which is where fitnlm() is contained.
% Initialization steps.
clc; % Clear the command window.
close all; % Close all figures (except those of imtool.)
clear; % Erase all existing variables. Or clearvars if you want.
workspace; % Make sure the workspace panel is showing.
format long g;
format compact;
fontSize = 20;
% % Create 4000 X coordinates evenly spaced from 0 to 40000.
% X = linspace(0, 40000, 4000);
% mu1 = 6000; % Mean, center of Gaussian.
% sigma1 = 2000; % Standard deviation.
% mu2 = 13000; % Mean, center of Gaussian.
% sigma2 = 2500; % Standard deviation.
%
% % Define function that the X values obey.
% a = 0 % Arbitrary sample values I picked.
% b = 3
% c = 18
% Y = a + b * exp(-((X - mu1)/sigma1) .^ 2) + ...
% c * exp(-((X - mu2)/sigma2) .^ 2); % Get a vector. No noise in this Y yet.
% X=X';
% Y=Y';
data = xlsread('matlab.xlsx');
X = data(:, 1);
Y = data(:, 2);
% Now we have noisy training data that we can send to fitnlm().
% Plot the noisy initial data.
plot(X, Y, 'b.', 'LineWidth', 2, 'MarkerSize', 15);
grid on;
drawnow;
% Convert X and Y into a table, which is the form fitnlm() likes the input data to be in.
tbl = table(X, Y);
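% Define the model as the sum of three Gaussians plus a constant offset:
% Y = b(1) + b(2)*exp(-((x-b(3))/b(4))^2) + b(5)*exp(-((x-b(6))/b(7))^2) + b(8)*exp(-((x-b(9))/b(10))^2)
% Note that the "x" of modelfun holds only the predictor: x(:, 1) is X, the first column of the table.
% The last column of the table, Y, is the response that fitnlm() fits against.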
modelfun = @(b,x) b(1) + b(2) * exp(-((x(:, 1) - b(3))/b(4)).^2) + ...
b(5) * exp(-((x(:, 1) - b(6))/b(7)).^2) + ...
b(8) * exp(-((x(:, 1) - b(9))/b(10)).^2);
beta0 = [0, 2, 6000, 2000, 18, 13000, 2000, 2, 14000, 9000]; % Guess values to start with. Just make your best guess.
% Now the next line is where the actual model computation is done.
mdl = fitnlm(tbl, modelfun, beta0);
% Now the model creation is done and the coefficients have been determined.
% YAY!!!!
% Extract the coefficient values from the model object.
% The actual coefficients are in the "Estimate" column of the "Coefficients" table that's part of the model.
coefficients = mdl.Coefficients{:, 'Estimate'}
% Let's do a fit, but let's get more points on the fit, beyond just the widely spaced training points,
% so that we'll get a much smoother curve.
X = linspace(min(X), max(X), 1920); % Let's use 1920 points, which will fit across an HDTV screen about one sample per pixel.
% Create smoothed/regressed data using the model:
yFitted = coefficients(1) + coefficients(2) * exp(-((X - coefficients(3))/ coefficients(4)) .^2) + ...
coefficients(5) * exp(-((X - coefficients(6))/ coefficients(7)) .^2) + ...
coefficients(8) * exp(-((X - coefficients(9))/ coefficients(10)) .^2);
% Now we're done and we can plot the smooth model as a red line going through the noisy blue markers.
hold on;
plot(X, yFitted, 'r-', 'LineWidth', 2);
grid on;
title('Sum of Three Gaussians Fit with fitnlm()', 'FontSize', fontSize);
xlabel('X', 'FontSize', fontSize);
ylabel('Y', 'FontSize', fontSize);
legendHandle = legend('Noisy Y', 'Fitted Y', 'Location', 'northeast');
legendHandle.FontSize = 25;
% Set up figure properties:
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'OuterPosition', [0 0 1 1]);
% Get rid of tool bar and pulldown menus that are along top of figure.
% set(gcf, 'Toolbar', 'none', 'Menu', 'none');
% Give a name to the title bar.
set(gcf, 'Name', 'Demo by ImageAnalyst', 'NumberTitle', 'Off')
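
On the original question of how to get the distribution parameters: the fitted coefficients can be converted into conventional mixture parameters. The following is only a sketch, assuming the coefficient layout used above (offset first, then amplitude/center/width triplets) and using placeholder numbers copied from the starting guesses, not the values actually fitted to matlab.xlsx. Because the model uses exp(-((x - mu)/w)^2) without the usual factor of 2 in the exponent, each component's standard deviation is w/sqrt(2) and its area is amplitude*|w|*sqrt(pi).

```matlab
% Hypothetical coefficient vector, in the same layout as beta0 above:
% [offset, amp1, mu1, w1, amp2, mu2, w2, amp3, mu3, w3].
% These are placeholder numbers, NOT the values fitted from matlab.xlsx.
coefficients = [0; 2; 6000; 2000; 18; 13000; 2000; 2; 14000; 9000];

% Rearrange so that each row is one Gaussian: [amplitude, center, width].
params = reshape(coefficients(2:end), 3, []).';
amp = params(:, 1);
mu = params(:, 2);
w = abs(params(:, 3));      % the sign of w is irrelevant because it is squared

sigma = w / sqrt(2);        % conventional standard deviation of each component
area = amp .* w * sqrt(pi); % integral of each component over all x
weight = area / sum(area);  % mixture weights (the constant offset is ignored)

% Print one line per component.
fprintf('mu = %8.1f   sigma = %8.1f   weight = %.3f\n', [mu, sigma, weight].');
```

The weights are only meaningful as mixture proportions if all fitted amplitudes come out positive and the offset is close to zero, which should be checked against the actual fit.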
  1 Comment
MAHMOUD ALZIOUD on 26 Sep 2018
This is genius and beautiful. Thank you very, very much for your amazing help.


More Answers (2)

dpb on 25 Sep 2018
Plotting the data, it is definitely not normal; it has a long right-hand tail and isn't symmetric.
For hypothesis testing it would be better to go back to the underlying data from which the histogram was made if you have it.
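
A rough sketch of what testing the underlying data could look like, assuming the raw load values are available as a column vector (called rawLoads here, a hypothetical name). The data generated below is a synthetic stand-in, not the poster's measurements. It uses fitdist(), lillietest(), and negloglik() from the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in for the raw measurements (assumption: replace with real data).
rng(0);  % make the example repeatable
rawLoads = [6000 + 2000*randn(3000, 1); 13000 + 2500*randn(7000, 1)];
rawLoads = rawLoads(rawLoads > 0);  % loads are positive; some candidates below require that

% Maximum-likelihood fit of a single normal distribution.
pdNormal = fitdist(rawLoads, 'Normal');
fprintf('Normal fit: mu = %.1f, sigma = %.1f\n', pdNormal.mu, pdNormal.sigma);

% Lilliefors test. Null hypothesis: the data come from a normal distribution
% with unspecified mean and variance. h = 1 means rejected at the 5% level.
[h, p] = lillietest(rawLoads);
fprintf('Lilliefors test: h = %d, p = %.4g\n', h, p);

% Compare a few two-parameter candidates by negative log-likelihood (smaller is better).
candidates = {'Normal', 'Lognormal', 'Gamma', 'Weibull'};
for k = 1:numel(candidates)
    pd = fitdist(rawLoads, candidates{k});
    fprintf('%-10s negative log-likelihood = %.1f\n', candidates{k}, negloglik(pd));
end
```

With tens of thousands of rows, a formal test tends to reject even small departures from normality, so a normal probability plot (normplot) is a useful visual complement. lillietest() may also warn that the p-value falls outside its tabulated range.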
  4 Comments
MAHMOUD ALZIOUD on 25 Sep 2018
Actually, when I went back to the original data (55000 rows), I found that it is normal.
dpb on 25 Sep 2018
By what measure? As IA says, it looks bimodal (if not trimodal; that's a rather suspicious hump on the left-hand side of the central lobe), and the right-hand tail is definitely not consistent with a Gaussian.
If the raw data look markedly different that would be surprising.
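
One way to put a number on the asymmetry directly from the binned data is to compute weighted moments. This is a minimal sketch, assuming one column of bin centers and one column of percentages per bin; the bin values below are invented for illustration and are not taken from the attached spreadsheet.

```matlab
% Invented example histogram (assumption: bin centers and percent per bin).
binCenters = (1000:1000:10000).';
percent = [2; 8; 20; 25; 18; 11; 7; 5; 3; 1];

wts = percent / sum(percent);                    % normalize to probabilities
m = sum(wts .* binCenters);                      % weighted mean
s = sqrt(sum(wts .* (binCenters - m).^2));       % weighted standard deviation
skewVal = sum(wts .* ((binCenters - m) / s).^3); % 0 for a symmetric distribution
exKurt = sum(wts .* ((binCenters - m) / s).^4) - 3; % 0 for a normal distribution

fprintf('mean = %.1f, std = %.1f, skewness = %.3f, excess kurtosis = %.3f\n', ...
    m, s, skewVal, exKurt);
```

A clearly positive skewness indicates a longer right-hand tail than a Gaussian would have. Binning discards information, so the same moments computed from the raw rows are preferable when those are available.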



Image Analyst on 25 Sep 2018
Since your data didn't look like one Gaussian to me, I fit it to the sum of two Gaussians with the attached m-file. I got this:
