Random Sorting within a set of columns
Hi,
I have a matrix (ParmXStrtMatrix) with 1 million rows and 4 columns (providing names for easy reference) :
Parm1,Parm2,XStrtPoint1, XStrtPoint2
Let's assume I have 10,000 different combinations of the parameters Parm1/Parm2, and that each parameter combination has 100 start points XStrtPoint1/XStrtPoint2.
How can I randomly sort the rows so that every block of 10,000 rows, counting from the top, contains each parameter combination exactly once, each paired with one randomly chosen start point for that combination?
Background (for context): The parameter columns serve as inputs to the definition of an anonymous function. I'm running fmincon in a loop over all the rows to minimize the function for different sets of parameters. For each parameter combination, the remaining two columns are the start points supplied to fmincon.
Each individual run of fmincon takes ~2 seconds (I use a symbolic function for high precision), so the full run takes almost 3 weeks. So that I don't have to wait for it to finish, I can pause the algorithm after, say, 2 days, look at the results, and depending on their quality either quit or let it continue from the paused point.
Presently I use,
ParmXStrtMatrix = ParmXStrtMatrix(randperm(size(ParmXStrtMatrix,1)),:)
The only issue with this approach is that I would like to get representative results from fmincon whenever I pause it. randperm has no notion of groups, so the groups will not be equally represented. For example, if I have processed 100K rows at the time of pausing, I would like to have 10% of the start points from each of the parameter combinations.
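To make the imbalance concrete, here is a minimal sketch on toy data. The sizes (5 combinations with 20 start points each) and the names grp, nDone and counts are assumptions chosen for illustration, not part of the original matrix.

```matlab
% Toy data (hypothetical sizes): 5 parameter combinations x 20 start points.
[g, s] = ndgrid(1:5, 1:20);
P = [g(:), 10*g(:), s(:), -s(:)];   % Parm1, Parm2, XStrtPoint1, XStrtPoint2

P = P(randperm(size(P,1)), :);            % plain shuffle of all rows
[~, ~, grp] = unique(P(:,1:2), 'rows');   % group index of each row
nDone = 10;                               % pretend we paused after 10% of the rows
% Number of processed rows per parameter combination.
counts = accumarray(grp(1:nDone), 1, [max(grp), 1]);
disp(counts.')                            % equal representation would be 2 per group
```

With a plain shuffle the counts generally differ from group to group, which is the uneven representation described above.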
One potentially correct (but very inefficient and hardcoded) approach is to loop over the 10,000 parameter combinations and, within the loop, use randperm to randomly order the X start points, recording the resulting positions 1 to 100 in a new column. Outside the loop, the matrix is then sorted by that new column.
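The loop-based idea just described could be sketched roughly as follows. This is only an illustration on a small toy matrix; the sizes and the names grp, rnk, order and Psorted are assumptions.

```matlab
% Toy stand-in for ParmXStrtMatrix (hypothetical sizes):
% 3 parameter combinations with 4 start points each.
[g, s] = ndgrid(1:3, 1:4);
P = [g(:), 10*g(:), s(:), -s(:)];   % Parm1, Parm2, XStrtPoint1, XStrtPoint2

[~, ~, grp] = unique(P(:,1:2), 'rows');   % group index of each row
rnk = zeros(size(P,1), 1);                % the "new column" of random ranks
for gi = 1:max(grp)
    idx = find(grp == gi);                % rows belonging to this combination
    rnk(idx) = randperm(numel(idx));      % random rank 1..(group size)
end
[~, order] = sort(rnk);                   % all rank-1 rows first, then rank-2, ...
Psorted = P(order, :);
```

Each consecutive block of 3 rows in Psorted then holds every combination once. The scan grp == gi touches every row once per group, which is why this gets slow for 10,000 groups and a million rows.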
Any ideas appreciated.
Thanks
2 Comments
Yu Jiang
13 Aug 2014
Hi Hari,
It would be easier to understand what you would like to achieve if you could provide a simple example.
-Yu Jiang
Accepted Answer
Roger Stafford
14 Aug 2014
The following code is a bit awkward and could be slow for a 1000000 x 4 matrix, but if you are taking three weeks to use it, I doubt if rearranging it initially is an important consideration.
I'm going to call your matrix 'P' for brevity instead of 'ParmXStrtMatrix'.
The use of 'randperm' in the first line below is meant to make the ordering in your "start points" random if that is desired rather than being determined by the initial ordering in P.
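P = P(randperm(size(P,1)),:); % <-- Optional
[~,~,ib] = unique(P(:,1:2),'rows'); % ib(i) = index of the parameter pair in row i
[t,p] = sort(ib);                   % p lists the rows of P grouped by parameter pair
f = find([true;diff(t)~=0;true]);   % start of each group in t, plus an end marker
n = length(f)-1;                    % number of distinct parameter pairs
f1 = f(1:n);                        % next unused position in each group
f2 = f(2:n+1);                      % one past the last position in each group
P2 = zeros(size(P));
k = 0;
b = true;
while b                             % each pass fills one "block" of P2
    b = false;
    for ix = 1:n
        if f1(ix) < f2(ix)          % group ix still has unused rows
            k = k+1;
            P2(k,:) = P(p(f1(ix)),:);
            f1(ix) = f1(ix)+1;
            b = true;
        end
    end
end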
P2 contains all the rows that are in P, but in a different ordering. Each succeeding "block" in P2 will have one copy of each possible parameter pair from the first two columns, along with single pairs of "start points" in the second two columns. Different pairs of "start points" are taken in different blocks for like parameter pairs.
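For comparison, here is a loop-free sketch of the same ordering idea: rank each row within its parameter pair, then order by rank first and pair second. This is not the code from the answer. The toy data and the names r and q are assumptions, and only the resulting block structure has been checked, on small inputs.

```matlab
% Toy data (hypothetical sizes): 3 parameter pairs, 4 start points each.
[g, s] = ndgrid(1:3, 1:4);
P = [g(:), 10*g(:), s(:), -s(:)];

P = P(randperm(size(P,1)), :);            % shuffle so within-group order is random
[~, ~, ib] = unique(P(:,1:2), 'rows');    % group index of each row
[t, p] = sort(ib(:));                     % rows gathered by group
f = find([true; diff(t) ~= 0]);           % first position of each group in t
r = (1:numel(t)).' - f(t) + 1;            % rank of each row within its group
[~, q] = sortrows([r, t]);                % order by rank, then by group
P2 = P(p(q), :);

% Check: every block of 3 rows contains each parameter pair exactly once.
for b = 0:3
    blk = P2(3*b + (1:3), 1:2);
    assert(size(unique(blk, 'rows'), 1) == 3);
end
```

When some pairs have fewer start points than others, the later blocks simply become shorter, as with the while loop.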