Randsample with matrix: extract multiple values from every column of a matrix without loop!

I have a matrix of weights pV = rand(N,F),
e.g. pV= [0.5522 0.3922 0.0221 1 0;...
0.0947 0.4357 0.0000 0 0.0064;...
0.0214 0.0000 0.0062 0 0;...
0.3317 0.1720 0.9717 0 0.9936];
For F times, I want to extract z_k numbers from a vector 1:N, using weights taken from the matrix pV. The loop version of the code is:
F = 5;
N = 4;
z_k = 2;
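seller = zeros(F,z_k); % preallocate one row per column of pV
for f=1:F
seller(f,:)=randsample(1:N,z_k,true,pV(:,f)); % logical true = sample with replacement
end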
I am looking for a solution without the loop, to improve efficiency. I have found the following approach (pV is already normalised to 1 by column, i.e. sum(pV) = 1 1 1 1 1), but I do not know how to fill the final matrix "seller" for those columns where fewer than z_k numbers satisfy the condition below.
If not enough numbers are found for a column, I would like to fill it first with the numbers satisfying the condition and then with random numbers.
pV= [0.5522 0.3922 0.0221 1.0000 0;...
0.0947 0.4357 0.0000 0 0.0064;...
0.0214 0.0000 0.0062 0 0;...
0.3317 0.1720 0.9717 0 0.9936];
[rand_pV_val,rand_pV_rank_idx] = sort(pV,1);
pV_cdf = cumsum(rand_pV_val,1);
%rand_pV = rand(1,size(pV,2)); Use this for reproducibility:
rand_pV = [0.1119 0.2180 0.0649 0.4878 0.7268];
seller_full = repmat(rand_pV,size(pV,1),1)<pV_cdf;
seller_zk = cumsum(seller_full,1)==1|cumsum(seller_full,1)==2 & seller_full; %alternative ways?
enough_sellers = any(cumsum(seller_zk,1)>=z_k);
seller = zeros(z_k,F);
seller(:,enough_sellers) = reshape(rand_pV_rank_idx(seller_zk(:,enough_sellers)),z_k,[]);
seller =
2 1 0 0 0
4 2 0 0 0
% How can I fill the columns of seller where I didn't find enough sellers, i.e. seller(:,~enough_sellers)?
My desired results:
seller = 2 1 4 1 4
4 2 2 3 3
where 4, 1, 4 are the entries of rand_pV_rank_idx at the positions of the ones in the seller_zk matrix, and 2, 3, 3 are just random numbers different from 4, 1, 4. It can also happen that some columns of seller_zk contain only zeros.
How can I fill the matrix seller(:,~enough_sellers) as described above?
Also, I'm looking for alternative ways to build the matrix seller_zk, as I would like it to work for different values of z_k.
Thanks

Accepted Answer

the cyclist on 2 Feb 2023
Edited: the cyclist on 2 Feb 2023
Here is a pretty obfuscated one-liner, but I think it does what you want, and should be fast:
% Your data
z_k = 2;
pV= [0.5522 0.3922 0.0221 1.0000 0;...
0.0947 0.4357 0.0000 0 0.0064;...
0.0214 0.0000 0.0062 0 0;...
0.3317 0.1720 0.9717 0 0.9936];
F = width(pV);
N = height(pV);
% The algorithm
seller = (squeeze(sum(rand(1,F,z_k) >= cumsum(pV))) + 1)'
seller = 2×5
     1     2     4     1     4
     1     1     4     1     4
There are two potentially non-intuitive elements to this:
  • Generate z_k*F draws from a uniform distribution, with the z_k draws for each column lined up along the 3rd dimension. This takes advantage of the fact that MATLAB will implicitly expand those vectors into an array for the comparison described next.
  • Compare those random draws to the cumulative sum of pV. Each column of cumsum(pV) is the cumulative distribution function (CDF) of that column's weights, and counting how many CDF entries a draw has reached or passed (then adding 1) gives the index of the first entry that exceeds it. This is a well-known (but perhaps not widely known) method of doing the weighting you want.
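To see why comparing uniform draws against the CDF reproduces the weights, here is a minimal sketch for a single weight vector. The vector w, the number of draws, and all variable names are made up for illustration and are not part of the data above; the weights were chosen so that they sum to exactly 1 in floating point.

```matlab
% Sketch: inverse-CDF sampling for one weight vector (all names illustrative).
w = [0.5; 0.125; 0.125; 0.25];   % assumed example weights, summing to exactly 1
cdfW = cumsum(w);                % cumulative distribution: 0.5 0.625 0.75 1
nDraws = 1e5;
u = rand(1, nDraws);             % uniform draws in the open interval (0,1)

% For each draw, count how many CDF entries it has reached or passed.
% Adding 1 gives the index of the first CDF entry that exceeds the draw.
% The comparison expands 4-by-1 against 1-by-nDraws into 4-by-nDraws.
idx = sum(u >= cdfW, 1) + 1;

% Empirical frequencies of each index; these should be close to w.
freq = accumarray(idx(:), 1, [numel(w) 1]) / nDraws;
disp([w freq])
```

The two displayed columns should agree to within sampling noise, which is what weighted sampling with replacement requires.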
I suggest you unpeel the algorithm from the "inside out" to understand what it is doing. I did a little bit of testing to make sure the results are sensible and accurate, but not a ton.
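One way to do that unpeeling is to split the one-liner into named steps and look at the size of each intermediate array. The sketch below is meant to be equivalent to the one-liner, with two deliberate differences that are my assumptions rather than part of the original: reshape replaces squeeze so the orientation also holds for z_k = 1, and a min(...,N) clamp guards against rounded weights that sum to slightly less than 1 (column 2 of the example sums to 0.9999). The intermediate names are illustrative.

```matlab
% Step-by-step version of the one-liner (intermediate names are illustrative).
z_k = 2;
pV = [0.5522 0.3922 0.0221 1.0000 0; ...
      0.0947 0.4357 0.0000 0      0.0064; ...
      0.0214 0.0000 0.0062 0      0; ...
      0.3317 0.1720 0.9717 0      0.9936];
[N, F] = size(pV);

cdfV   = cumsum(pV, 1);      % N-by-F: column-wise CDF of the weights
u      = rand(1, F, z_k);    % 1-by-F-by-z_k: one uniform draw per column and sample
passed = u >= cdfV;          % N-by-F-by-z_k via implicit expansion
idx    = sum(passed, 1) + 1; % 1-by-F-by-z_k: first row whose CDF exceeds the draw

% Assumed safeguard (not in the one-liner): if a column's weights sum to
% slightly less than 1, a draw above the last CDF value would give N+1.
idx = min(idx, N);

% Rearrange to z_k-by-F; reshape is used instead of squeeze so that the
% result keeps this orientation when z_k is 1.
seller = reshape(idx, F, z_k).';
disp(seller)
```

With the example weights, column 4 always yields 1 (all of its weight is on row 1), and column 5 can only yield 2 or 4.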
  3 Comments
the cyclist on 2 Feb 2023
I am confused by your comments here (specifically about not wanting repeated elements), and by the first code you posted.
My first point of confusion is that the 4th column of pV is [1; 0; 0; 0]. This means that a weighted draw from that column will always select the value 1. So, it is not consistent to say you want no repeats, but also that you want that weighting.
Second, in the call to randsample in your original code, you set the 'replacement' parameter to 'true', meaning that you are explicitly saying that repeated elements should be allowed.
Maybe I did not read your question carefully enough the first time, but when I look at it now, the second half of your question seems to ask for something quite different from the first half of the question. I think perhaps you edited it, after you first posted it?
I have to admit, I am now pretty lost in trying to fully understand what you want as output.
esperanta on 3 Feb 2023
Sorry, my mistake. I was actually looking for randsample with replacement, so your answer is correct. Thanks!

Release

R2021a
