special kind of sort

Question

0 投票

My data has two columns and I am trying to find a way to sort the rows such that the values in the second column appear before they are mentioned in the first column

To make this more clear some example data where row5 should be placed before row3:

row1    NaN        br1_0_40
row2    br1_0_40   br1_40_100
row3    br2_0_20   br1_100_120
row4    NaN        br1_120_200
row5    NaN        br2_0_20
row6    br2_0_20   br2_20_25

In reality my database is of course much larger so I am trying to find an efficient algorithm to sort in this manner so any help is appreciated

2 件のコメント
なしを表示なしを非表示

Guru 2013 年 7 月 4 日

Maybe I'm just missing something but based on your criteria do you always want row5 before row 3, no matter what?? If not, what is exactly is your criteria for sorting?

dpb 2013 年 7 月 4 日

編集済み: dpb 2013 年 7 月 4 日

Clarification of objective...is the objective to only have the condition met w/o sorting anything? Or is sorting on either column first desirable or not?

Can there be duplicates in either/both lists?

サインインしてコメントする。

サインインしてこの質問に回答する。

Follow Question

Answer 1

dpb 2013 年 7 月 4 日

MATLAB Online で開く

0 投票

One possible solution...

>> sortrows(a,[2,3])
ans = 
    'r1'    'NaN'         'br1_0_40'   
    'r4'    'NaN'         'br1_120_200'
    'r5'    'NaN'         'br2_0_20'   
    'r2'    'br1_0_40'    'br1_40_100' 
    'r3'    'br2_0_20'    'br1_100_120'
    'r6'    'br2_0_20'    'br2_20_25'  
>>

2 件のコメント
なしを表示なしを非表示

Ingrid 2013 年 7 月 4 日

This does not necessarily lead to a correct solution...

for example if row5 has br3_100_130 in the second column instead of NaN than it would not be sorted before row 3

I wish it would be so simple but I think a more complicated solution is required but I hope someone can prove me wrong

dpb 2013 年 7 月 4 日

Not positive, but it would seem there isn't necessarily a solution otomh...

But, best I can do at the moment is to suggest a heuristic approach something like (after an initial sort, of course)--

Identify "bad" places in your list - and remember them as indices. For each index shuffle that offending row until it looks ok relative to your condition.

In this way you should perturb sorting only in limited intervals, and general sorting shouldn't change much.

Again, whether there's a solution possible in the general case is, in my mind, questionable altho I've not tried to actually prove it one way or t'other...

HTH w/ thought process, anyway...

サインインしてコメントする。

Answer 2

dpb 2013 年 7 月 4 日

編集済み: dpb 2013 年 7 月 6 日

MATLAB Online で開く

0 投票

OK, here's a simple demonstration of what had in mind that works for the sample dataset -- I've not tested it on anything more complex; in particular haven't thought a lot (as in any) about the ramifications on traversing the unique list after the perturbation...

MATLAB code
u=unique(a(:,3));
for i=1:length(u)
  [~,~,ib]=intersect(u(i),a(:,2),'stable');
  if ~isempty(ib)&(i>ib) 
    a(ib:i,:)=circshift(a(ib:i,:),1);
  end
end
%At the command line w/ the sample dataset as modified...
>> a=t
a = 
  'r1'    'NaN'            'br1_0_40'   
  'r2'    'br1_0_40'       'br1_40_100' 
  'r3'    'br2_0_20'       'br1_100_120'
  'r4'    'NaN'            'br1_120_200'
  'r5'    'br3_100_130'    'br2_0_20'   
  'r6'    'br2_0_20'       'br2_20_25'  
>> u=unique(a(:,3));
>> for i=1:length(u),
[~,~,ib]=intersect(u(i),a(:,2),'stable');
if ~isempty(ib)&(i>ib) 
a(ib:i,:)=circshift(a(ib:i,:),1);end,end
>> a
a = 
  'r1'    'NaN'            'br1_0_40'   
  'r2'    'br1_0_40'       'br1_40_100' 
  'r5'    'br3_100_130'    'br2_0_20'   
  'r3'    'br2_0_20'       'br1_100_120'
  'r4'    'NaN'            'br1_120_200'
  'r6'    'br2_0_20'       'br2_20_25'  
>>

Salt to suit... :)

How much any of this might be vectorized w/ arrayfun and friends I've not even begun to consider, either...

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示

dpb 2013 年 7 月 4 日

MATLAB Online で開く

OK, just had an idea to test it just a little...

M
>> sortrows(a,[3 2])
ans = 
  'r1'    'NaN'            'br1_0_40'   
  'r3'    'br2_0_20'       'br1_100_120'
  'r4'    'NaN'            'br1_120_200'
  'r2'    'br1_0_40'       'br1_40_100' 
  'r5'    'br3_100_130'    'br2_0_20'   
  'r6'    'br2_0_20'       'br2_20_25'  
>> for i=1:length(u),[~,~,ib]=intersect(u(i),a(:,2),'stable');if ~isempty(ib)&(i>ib) a(ib:i,:)=circshift(a(ib:i,:),1);end,end
>> a
a = 
  'r1'    'NaN'            'br1_0_40'   
  'r2'    'br1_0_40'       'br1_40_100' 
  'r5'    'br3_100_130'    'br2_0_20'   
  'r4'    'NaN'            'br1_120_200'
  'r3'    'br2_0_20'       'br1_100_120'
  'r6'    'br2_0_20'       'br2_20_25'  
>>

IOW, resorted which screwed it up again but in a different initial order. Looks like the same logic still worked. Granted this is still undoubtedly a toy case, but...

サインインしてコメントする。

Answer 3

Ingrid 2013 年 7 月 10 日

MATLAB Online で開く

0 投票

for some strange reason I did not get notification that answers were posted on this topic (although I indicated to be alerted) so I have given it a go anyway and this is what I can up with. My data can have several columns with input names (variable number, determined by code before this piece of code), so I placed the blocknames in the first column and the inputnames in the following columns. The sort now makes sure that no inputnames occurs before the blocknames (so that I can perform calculations later in the correct order) so it does the trick I think but it is not vectorized so probably not the fastest solution

ii=1;
while ii < totalNumberBlocks + 1
    shiftOccured = 0;
    inNames = sortedInputsOutputs(:,2:end);
    indexFound = strcmpi(sortedInputsOutputs{ii,1},inNames);
    rowsFound = find(indexFound);
    for kk = 1:numel(rowsFound)
        while rowsFound(kk) > totalNumberBlocks
            rowsFound(kk) = rowsFound(kk) - totalNumberBlocks;
        end
        if rowsFound(kk) < ii
            % move current row up to above rowFound
            sortedInputsOutputs = [sortedInputsOutputs(1:rowsFound(kk)-1,:);...
                sortedInputsOutputs(ii,:); sortedInputsOutputs(rowsFound(kk):ii-1,:);
                sortedInputsOutputs(ii+1:end,:)];
            ii = 1; % start back from the top to make sure that because of shifting no other order is changed!
            shiftOccured = 1;
        end
    end
    if shiftOccured == 0
        ii = ii + 1;
    end
end

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

サインインしてコメントする。

special kind of sort

2 件のコメント
なしを表示なしを非表示

回答 (3 件)

2 件のコメント
なしを表示なしを非表示

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示

カテゴリ

タグ

Community Treasure Hunt

special kind of sort

2 件のコメント なしを表示 なしを非表示

回答 (3 件)

2 件のコメント なしを表示 なしを非表示

1 件のコメント -1 件の古いコメントを表示 -1 件の古いコメントを非表示

0 件のコメント -2 件の古いコメントを表示 -2 件の古いコメントを非表示

カテゴリ

タグ

参考

Community Treasure Hunt

2 件のコメント
なしを表示なしを非表示

2 件のコメント
なしを表示なしを非表示

1 件のコメント
-1 件の古いコメントを表示 -1 件の古いコメントを非表示

0 件のコメント
-2 件の古いコメントを表示 -2 件の古いコメントを非表示