Wrong values in histogram plotting
Hello,
I'm trying to plot a histogram of an array. I have a CSV file with a list of double values, and I want to see how many elements have a value less than or equal to 10% of the maximum value, 20%, 30%, and so on. The histogram I get shows wrong statistics. When I check directly how many elements are less than or equal to 10% of the maximum element, I find 11173940 such elements. I counted them with the following code:
maxElement = max(array);
elementCount = sum(array < maxElement * 0.1);
However, when I plot the histogram, it shows fewer than 180 elements satisfying this condition. This is the code I used (I have many CSV files that I want to read and analyze in the same manner, hence the loop over file names):
clear; clc;
dataDir = 'hist_res_rel';
fileList = dir(fullfile(dataDir, '*.csv'));
plotDir = 'plot_dir_rel';
for i = 1:numel(fileList)
    fileName = fileList(i).name;
    % Two-character epoch number just before the '.csv' extension
    epoch = fileName(end-5:end-4);
    nextEpoch = num2str(str2double(epoch) + 10);
    % sprintf keeps the space after the colon (strcat strips trailing
    % whitespace from character array inputs)
    if contains(fileName, 'a_rel')
        plot_title = sprintf('A Relative Value Change Between Epochs: %s-%s', epoch, nextEpoch);
    end
    if contains(fileName, 'b_rel')
        plot_title = sprintf('B Relative Value Change Between Epochs: %s-%s', epoch, nextEpoch);
    end
    % fullfile inserts the path separator between folder and file name
    rel_val = readmatrix(fullfile(dataDir, fileName));
    rel_val = abs(rel_val);
    Max = max(rel_val);
    p = 0.1;
    x = zeros(10, 1);   % thresholds: 10%, 20%, ..., 100% of Max
    y = zeros(10, 1);   % manual count of elements per interval
    for index = 1:10
        percentage = Max * p;
        x(index) = percentage;
        if index == 1
            y(index) = sum(rel_val <= x(index));
        else
            y(index) = sum(rel_val <= x(index) & rel_val > x(index-1));
        end
        p = p + 0.1;
    end
    f = histogram(rel_val, x);
    xticks(x);
    title(plot_title);
    xlabel('Percentage of Relative Change');
    ylabel('Amount of Parameters');
    xticklabels({'0', '10', '20', '30', '40', '50', '60', '70', '80', '90', '100'});
    % Drop the '.csv' extension before appending '.jpg'
    saveas(f, fullfile(plotDir, ['plot_', fileName(1:end-4), '.jpg']));
end
This is the histogram that I get:
And this is the CSV file that I'm trying to analyze, just to make sure everything works (sorry, it's so large that I had to use an external site for the upload):
Thank you so much for your time and attention, I appreciate your help.
Accepted Answer
Ganesh
27 December 2023
I understand that your histogram is inconsistent with your data. You can resolve this by adding 0 at the start of the variable "x".
A histogram counts the data points that fall between consecutive edges and ignores any values outside the first and last edge. Because your variable "x" begins at Max*0.1, the first bin spans Max*0.1 to Max*0.2, and every element below Max*0.1 (the 11173940 elements you counted) is left out of the plot. Adding 0 at the start makes the first bin span 0 to Max*0.1, which will give you the right result.
x = [0; x]; % Add this line before plotting the histogram
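To see the effect of the missing edge without the original CSV, here is a minimal sketch on synthetic data. The data, the variable names `xFixed`, `countsWithout`, `countsWith`, and `manualFirst`, and the use of `histcounts` (which bins the same way as `histogram` for given edges) are my own assumptions for illustration, not taken from the question.

```matlab
% Synthetic stand-in for rel_val (hypothetical data, not the poster's CSV)
rng(0);                                    % make the example reproducible
rel_val = abs(randn(1e5, 1)) .^ 4;         % heavily skewed toward small values
Max = max(rel_val);

x = Max * (0.1:0.1:1).';                   % edges as in the question: 10%, ..., 100%
countsWithout = histcounts(rel_val, x);    % 9 bins; values below Max*0.1 are ignored

xFixed = [0; x];                           % prepend 0 so the first bin starts at 0
countsWith = histcounts(rel_val, xFixed);  % 10 bins covering the full range

manualFirst = sum(rel_val < xFixed(2));    % manual count for the first bin
fprintf('first bin: %d, manual: %d, binned without the 0 edge: %d of %d\n', ...
    countsWith(1), manualFirst, sum(countsWithout), numel(rel_val));
```

With the 0 edge in place, the first bin should agree with the manual count, and the bin counts should add up to the number of elements.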
For more information and examples, refer to the MATLAB documentation for the "histogram()" function.
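One further detail, in case the bar heights still differ slightly from your manual counts "y": as far as I know, "histogram()" includes the left edge of each bin and excludes the right one (except for the last bin), whereas your loop includes the right edge. This only matters for values that sit exactly on an edge. The sketch below uses a small made-up vector `v` and made-up `edges` to show the difference, and uses `discretize` with its 'IncludedEdge' option to get right-inclusive counts; treat it as an illustration rather than a required change.

```matlab
% Hypothetical small data set chosen so that some values sit exactly on an edge
v = [0; 1; 1; 2; 5; 10];
edges = [0; 1; 5; 10];

% Left-inclusive bins, as used by histogram/histcounts: [0,1), [1,5), [5,10]
leftCounts = histcounts(v, edges);

% Right-inclusive bins: [0,1], (1,5], (5,10] (the first bin includes both edges)
binIdx = discretize(v, edges, 'IncludedEdge', 'right');
rightCounts = accumarray(binIdx, 1, [numel(edges) - 1, 1]).';

disp(leftCounts);    % 1 3 2
disp(rightCounts);   % 3 2 1
```

If right-inclusive counts are what you need in the plot, one option is to draw `rightCounts` with `bar` instead of calling `histogram` on the raw data.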
Hope this helps!