Is it possible to create a histogram with fractional entries for each bin?

Question

Leonard, 6 Aug 2015

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/232653-is-it-possible-to-create-a-histogram-with-fractional-entries-for-each-bin

Commented: Leonard, 7 Aug 2015

Thank you for looking at my question! I have included a brief introduction below; any suggestions or comments would be greatly appreciated!

A traditional histogram is built from an array (e.g. sample_array = [1,1,1,2,2,3,3,3,3,4]) using h = histogram(sample_array,nbins);. In this example, with nbins = 4, I would have a simple histogram in which each column height is the number of times a particular value is observed in the sample array.

However, in my work I have come upon the need to instead use an array in place of a single value. For example:

sample_array = [1,1,[1,2],2,2,3,[2,3,4,5],3,4];

I am aware this is not an array. For convenience I am instead using a cell to contain the data:

sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};

What I need to do is generate the resulting histogram of sample_cell where I give EACH ENTRY of the cell EQUAL WEIGHT. The corresponding weights would be as follows:

sample_weight = {1,1,[1/2,1/2],1,1,1,[1/4,1/4,1/4,1/4],1,1};

From this, the resulting histogram would have the following counts in the bins for 1 thru 4:

Bin: Count
1: 2.5
2: 2.75
3: 2.25
4: 1.25
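
The weighting rule above can be written as a formula. The notation here is an assumption introduced for illustration, not taken from the post: N is the number of entries in the cell, n_i is the number of values in entry i, and x_ij is the j-th value of entry i.

```latex
c_b = \sum_{i=1}^{N} \frac{1}{n_i} \sum_{j=1}^{n_i} \mathbf{1}\left[ x_{ij} \in \text{bin } b \right],
\qquad
\sum_{b} c_b = N \quad \text{(if every value falls in some bin)}
```

For the example, c_1 = 1 + 1 + 1/2 = 2.5, c_2 = 1/2 + 1 + 1 + 1/4 = 2.75, c_3 = 1 + 1/4 + 1 = 2.25 and c_4 = 1/4 + 1 = 1.25. The value 5 carries the remaining 1/4, so the counts over all five values sum to N = 9.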

I am looking for a way to generate this resulting histogram which does not include using the least common multiple of the sizes of each entry. (I have a temporary solution to the problem including this quantity, however, I am unable to scale it up properly as I am dealing with very large prime numbers which result in LCM > 10^9.)

Again, any help or suggestions that you might have would be greatly appreciated!


Answer 1

David Young, 6 Aug 2015

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/232653-is-it-possible-to-create-a-histogram-with-fractional-entries-for-each-bin#answer_188477

Edited: David Young, 6 Aug 2015

If all the samples are positive integers, and the bins are all centred on the positive integers and with unit width, as in the initial example, you can just do this:

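% data
sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};
% concatenate all the values into one row vector
samples = cat(2, sample_cell{:});
% each value in an entry of n values gets weight 1/n
weight_cell = cellfun(@(a) ones(size(a))/length(a), sample_cell, ...
    'UniformOutput', false);
weights = cat(2, weight_cell{:});
% sum the weights per integer value; counts(k) is the weighted count for value k
counts = accumarray(samples(:), weights(:)).';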
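
As a quick check, the following sketch reruns the same steps and compares the result with the bin counts listed in the question. The variable expected and the tolerance are additions for illustration only; the fifth value, 0.25, is the weight carried by the sample 5, which the question's table (bins 1 through 4) does not list.

```matlab
% Same data and steps as in the answer
sample_cell = {1,1,[1,2],2,2,3,[2,3,4,5],3,4};
samples = cat(2, sample_cell{:});
weight_cell = cellfun(@(a) ones(size(a))/numel(a), sample_cell, ...
    'UniformOutput', false);
weights = cat(2, weight_cell{:});
counts = accumarray(samples(:), weights(:)).';

% Illustrative check (not part of the original answer):
% bins 1 to 4 from the question, plus 0.25 for the value 5
expected = [2.5 2.75 2.25 1.25 0.25];
assert(max(abs(counts - expected)) < 1e-12);

% The weighted counts sum to the number of cell entries (9 here)
assert(abs(sum(counts) - numel(sample_cell)) < 1e-12);
disp(counts)
```

Because every entry's weights sum to 1, the total of counts always equals the number of entries in sample_cell, which is the "integral" property asked for.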

If this isn't the case (as in your more accurate example in the comments), you have to modify the code above by putting the samples into bins before weighting and counting them. This then looks like this:

% data and histogram parameters
sample_cell = {[0,0.41],0.32,[0.13,0.67,0.2],0.9,[0.3,1,0.89]};
edges = 0:0.1:1;
nbins = numel(edges) - 1;
% put all the samples into one vector, and make a vector of their weights
samples = cat(2, sample_cell{:});
weight_cell = cellfun(@(a) ones(size(a))/length(a), sample_cell, ...
    'UniformOutput', false);
weights = cat(2, weight_cell{:});
% work out which bin of the histogram each sample falls into
% (samples outside the range of edges give NaN and must be removed first)
bins = discretize(samples, edges);
% Now form the counts, applying the weights for each sample; the size
% argument keeps one count per bin even if the top bins are empty
wtdcounts = accumarray(bins(:), weights(:), [nbins, 1]).';
% and normalise to probabilities
normcounts = wtdcounts/sum(wtdcounts);    % normalise to sum to 1
% plot like histogram
centres = conv(edges, [0.5 0.5], 'valid');    % midpoints of adjacent edges
bar(centres, normcounts, 1);

This gives the same results as the code in your comment, but will be a great deal more economical I think.
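
The same idea should extend to the goal described in the comments, combining several cells while keeping the number of entries as the "integral" of the histogram. The following is a minimal sketch, not a tested solution to the full problem: sample_cell_2 is made-up data, all values are assumed to lie within edges, and the unnormalised weighted counts are simply added bin by bin.

```matlab
% First cell is the test data from the comments; the second is hypothetical
sample_cell_1 = {[0,0.41],0.32,[0.13,0.67,0.2],0.9,[0.3,1,0.89]};          % 5 entries
sample_cell_2 = {0.05,[0.15,0.25],0.35,[0.45,0.55,0.65],0.75,0.85, ...
    [0.95,0.05],0.5};                                                      % 8 entries
edges = 0:0.1:1;
nbins = numel(edges) - 1;

all_cells = {sample_cell_1, sample_cell_2};
total = zeros(1, nbins);
for c = 1:numel(all_cells)
    sc = all_cells{c};
    samples = cat(2, sc{:});
    weight_cell = cellfun(@(a) ones(size(a))/numel(a), sc, ...
        'UniformOutput', false);
    weights = cat(2, weight_cell{:});
    bins = discretize(samples, edges);
    % Fixed output size, so every cell yields exactly one count per bin
    total = total + accumarray(bins(:), weights(:), [nbins, 1]).';
end

% The combined counts sum to the total number of entries, 5 + 8 = 13
assert(abs(sum(total) - 13) < 1e-9);
```

No replication of samples is needed, so the cost grows with the number of stored values rather than with any least common multiple. The result can be plotted with bar in the same way as above.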

3 Comments

Leonard, 6 Aug 2015

Thank you for your response! To answer your question: No, my full problem begins with a cell of sets of unique values in the range [0,1], which will require binning. For example:

sample_cell = {[0,0.41],0.32,[0.13,0.67,0.2],0.9,[0.3,1,0.89]};

For the time being, my histogram is generated using:

% collect the number of values in each entry
length_list = [];
for i = 1:length(sample_cell)
   length_list = [length_list,length(sample_cell{i})];
end
% lcms (least common multiple of a list) is from the MATLAB File Exchange, not built in
LCM_length_list = lcms(length_list);
% replicate each value so that every entry contributes LCM_length_list values
final_array = [];
for i = 1:length(sample_cell)
   array = sample_cell{i};
   for j = 1:length(array)
      for k = 1:(LCM_length_list/length(array))
         final_array = [final_array,array(j)];
      end
   end
end
h = histogram(final_array,0:0.1:1,'Normalization','Probability');

While this works as a temporary solution, I am ultimately looking to combine the histograms of many "sample_cell" data sets while keeping the number of entries in each "sample_cell" as the "integral" of its histogram. For example, in my code above "sample_cell" has 5 entries of equal weight. Another cell, sample_cell_2, could have 8 entries of equal weight. I am not able to combine the two resulting "final_array" arrays, however, because the least common multiple could result in upwards of 10^5 entries (due to large prime numbers).
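
To illustrate why the replication approach grows so quickly, here is a small sketch with made-up entry lengths (six entries whose lengths are distinct primes). It uses the built-in two-argument lcm rather than the File Exchange lcms function, and compares the length of the replicated array with the number of values a weighted approach would store.

```matlab
% Hypothetical entry lengths: distinct primes, so the LCM is their product
length_list = [2 3 5 7 11 13];

% Built-in lcm takes two inputs, so fold it over the list
L = 1;
for n = length_list
    L = lcm(L, n);
end

% Replication expands every entry to L values
replicated_length = numel(length_list) * L;
% A weighted approach stores each value once, with one weight per value
weighted_length = sum(length_list);

fprintf('LCM = %d, replicated = %d, weighted = %d\n', ...
    L, replicated_length, weighted_length);
```

Here the LCM is 30030, so the replicated array has 180180 elements, while only 41 values and weights are needed otherwise.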

David Young, 6 Aug 2015

I've modified my answer to deal with the more general case. The second piece of code in the answer gives the same results as your lcm code above on the test data.

Leonard, 7 Aug 2015

Exactly what I was looking for! Thank you so much!
