Generate random numbers given distribution/histogram
    30 ビュー (過去 30 日間)
  
       古いコメントを表示
    
MATLAB provides built-in functions to generate random numbers with an uniform or Gaussian (normal) distribution. My question is: if I have a discrete distribution or histogram, how can I can generate random numbers that have such a distribution (if the population (numbers I generate) is large enough)?
Please post here if anyone knows of a good method of doing this.
Thanks, David
0 件のコメント
採用された回答
  Jonathan Epperl
      
 2012 年 10 月 27 日
        Since nobody has any suggestions, here's one. If you have a discrete distribution, say it is a Nx2 matrix PD, first column the discrete values, second the probabilities of the corresponding value -- so sum(PD(:,2))==1.
Then map the probablities to the unit interval and use rand. What mean by that:
% Those are your values and the corr. probabilities:
PD =[
  1.0000    0.1000
  2.0000    0.3000
  3.0000    0.4000
  4.0000    0.2000];
% Then make it into a cumulative distribution
D = cumsum(PD(:,2));
% D = [0.1000    0.4000    0.8000    1.0000]'
Now for every r generated by rand, if it is between D(i) and D(i+1), then it corresponds to an outcome PD(1,i+1), with the obvious extension at i==0. Here's a way you could do that, even though I'm sure there are better ones:
R = rand(100,1); % Your trials
p = @(r) find(r<pd,1,'first'); % find the 1st index s.t. r<D(i);
% Now this are your results of the random trials
rR = arrayfun(p,R);
% Check whether the distribution looks right:  
hist(rR,1:4)
% It does, roughly 10% are 1, 30% are 2 and so on
If you want more help you should post a minimal example of the form in which you have the discrete distribution.
5 件のコメント
  Aasheesh Dixit
 2020 年 6 月 8 日
				one change is required: 
p = @(r) find(r<d,1,'first'); % find the 1st index s.t. r<D(i);
  Himanshu Tanwar
 2022 年 5 月 25 日
				Overall code may be written (with some slight modifications) as:
dx = 0.001;
x = -100 : dx : 100; % limits depending on the random variable definition.
% For example, (dx : dx : 100) for Rayleigh (0 not recommended).
f = PDF(x);
F = cumsum(f) * dx; % CDF
F = F / F(end); % recommended when max(F) is close to 1. Otherwise increase x - points to achieve F(end) - max value - close to 1.
N = 10000;
U = rand(1, N);
pdf2rand = @(u) x(find(u <= F, 1, 'first'));
X = arrayfun(pdf2rand, U);
X % desired random variable points
function f = PDF(x)
% Probability Distribution Function
...
end
その他の回答 (2 件)
  Image Analyst
      
      
 2012 年 10 月 28 日
        I didn't notice your question or I would have answered, especially since it's asked so often. Try RANDRAW ( http://www.mathworks.com/matlabcentral/fileexchange/7309-randraw) for a list of common distributions. Or to refresh yourself on the theory of why using the CDF works, see Wikipedia: http://en.wikipedia.org/wiki/Inverse_transform_sampling
  Theron FARRELL
      
 2019 年 4 月 30 日
        
      編集済み: Theron FARRELL
      
 2019 年 4 月 30 日
  
      Hi there,
I use this naive function to generate artificial outliers applied in machine learning. Hope that it will be a bit help in your case.
function [Out_Data, Out_PDF, CHist] = Complement_PDF(Hist, Data_Num, p)
% Generate a 1D vector of data with a PDF specified as the complementary PDF of input historgram. Note that the larger
% Data_Num is, the more Out_PDF will resemble to CHist
% Input
% Hist: PDF/Histogram of data
% Data_Num: Desired number of data to be generated
% p: Precision given by number of digits after 0
% Output
% Out_Data: Generated data as per the complementary PDF
% Out_PDF: The complementary PDF as per Out_Data
% CHist: The complementary PDF as per Hist
% Example
% Hist = [1, 6, 7, 100, 0, 0, 0, 2, 3, 5];
% Data_Number = 100000;
% p = 3
Hist = Hist/sum(Hist);
CHist = 1- Hist;
CHist = CHist/sum(CHist);
CDF_CHist = cumsum(CHist);
CDF_CHist = double(int32(CDF_CHist*10^p))/10^p;
Out_Data = zeros(1, Data_Num);
Out_PDF = zeros(1, length(CDF_CHist));
for i = 1:Data_Num
    % Generate a uniformly distributed variable
    x = double(int32(rand*10^p))/10^p;
    % Inversely index CDF
    Out_Data(i) = Inverse_CDF(x, CDF_CHist);
    temp = floor(Out_Data(i) * length(CDF_CHist));
    Out_PDF(temp) = Out_PDF(temp) + 1;
end
figure;
subplot 221, bar(Hist); 
subplot 222, bar(CHist); 
subplot 223, plot(CDF_CHist);
subplot 224, bar(Out_PDF);
end
function [y] = Inverse_CDF(x, CDF_CHist)
CDF_CHist_Ext = [0, CDF_CHist];
y = 1;
for ind = 1:length(CDF_CHist)
    if (x >= CDF_CHist_Ext(ind)) && (x < CDF_CHist_Ext(ind+1))
        y = ind/length(CDF_CHist);
        break;
    end
end
end

0 件のコメント
参考
カテゴリ
				Help Center および File Exchange で Random Number Generation についてさらに検索
			
	Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!







