
sortrows not working with logical column

avleed on 27 Jan 2017
Edited: dpb on 29 Jan 2017
The cell array attached has a character first column, logical second column and numeric third column. I would like to sort by second column in descending order to put rows with the logical value =1 at the top. sortrows(TCRes,2) would be great, but it doesn't work and the error I get is below
Error using char
Cell elements must be character arrays.
Error in sortrows>sort_cell_back_to_front (line 135)
tmp = char(x(ndx,k));
Error in sortrows (line 87)
ndx = sort_cell_back_to_front(x_sub, col);
sortrows(TCRes,3) and
sortrows(TCRes,1) work, but not
sortrows(TCRes,2).
The description of the command at https://www.mathworks.com/help/matlab/ref/sortrows.html#inputarg_A says it should support logical inputs.

Accepted Answer

dpb on 27 Jan 2017
Edited: dpb on 28 Jan 2017
The doc at the link describes an array input, which can be any of the listed types. It isn't describing a cell array. In fact, although the documentation gives an example for a cell array of string values, its input description doesn't appear to address cell array inputs at all.
Hence, I'm not sure it can be called a bug, but it is certainly a quality-of-implementation failure.
Looks like the workaround will be to either cast the logical to a numeric value before sorting or convert the cell array to a table and sortrows it--I'm guessing that would work.
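As a rough sketch of the first workaround mentioned above (casting the logical column to numeric before sorting), something like the following should do. The cell array C is a made-up stand-in for TCRes, not the attached data, and the sort is done on the extracted column rather than through sortrows.

```matlab
% Hypothetical stand-in for TCRes: char, logical, numeric columns
C = {'b', true, 3; 'a', false, 1; 'c', true, 2};

% Cast the flag column to double; this also tolerates a stray double value
flag = cellfun(@double, C(:,2));

% Sort the flags in descending order and reuse the permutation on all rows
[~, order] = sort(flag, 'descend');
out = C(order, :);
```

The original cell array is only reindexed, so the logical values in column 2 keep their class.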
I would submit a bug report, though...at least make TMW commit on what it is supposed to do.
ADDENDUM
Well, after the earlier comments, it turns out we can go further, and it can undoubtedly be made to work. The wrapper around the mex routines that do the bulk of the work is an m-file, so (at least through R2014b here) one can see what's going on. In the end one gets to the following in the internal function sort_cell_back_to_front, which is the one that errors.
if ~isempty(x)
    for k = n:-1:1
        if isnumeric(x{1, k})
            % ...error test for array cell content elided for clarity in function...
            tmp = cell2mat(x(ndx, k));
            ind = sortrowsc(tmp, col(k));
            ndx = ndx(ind);
        else
            tmp = char(x(ndx,k));
            ind = sortrowsc(tmp, sign(col(k))*(1:size(tmp,2)));
            ndx = ndx(ind);
        end
    end
end
and we see that since a LOGICAL is not numeric, it's going to be treated as string. Fixing up this test in a copy of sortrows might correct the issue. I'd suggest pasting this code snippet with the bug report.
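To make the point about the type test concrete, here is a small check you can run at the command line. It only uses the built-in isnumeric and islogical; the variable names are just for illustration.

```matlab
v = true;

a = isnumeric(v);                   % false: logical is not a numeric class
b = islogical(v);                   % true
c = isnumeric(double(v));           % true after an explicit cast to double
d = isnumeric(v) || islogical(v);   % true: a test that accepts both kinds
```

So a column whose first element is logical falls through to the char branch of the code above.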
ADDENDUM 2
I tried the simple expedient of
if isnumeric(x{1, k}) || islogical(x{1, k})
but it then errored on the subsequent cell2mat line with mixed types inside cell2mat. I don't have sufficient time at the moment to try to delve more deeply into the guts to work on a fixup, but that's where the issues are, anyways.
You can try to either patch this or use the workaround of making the type change or use table that may be more robust; not sure on that count. Well, let's just try and see...
>> res=table2cell(sortrows(cell2table(TCRes),-2));
>>
does seem to work ok. It didn't seem to take long for your dataset size to create the table and doing it as above makes it a temporary so doesn't clutter up workspace afterwards. The additional conversion back to cell didn't seem to take a reliably measurable additional amount of time over that without. If datasets got really large possibly...
The base case still deserves a bug report probably, but TMW may not treat it with great urgency...
ADDENDUM 3
I had a few minutes to look at this a little more--seemed puzzling that the above fixup failed on mixed metaphors so I dug into the dataset you supplied a little more...first try out of the box croaks--
>> tmp=cell2mat(TCRes(:,2));
Error using cell2mat (line 45)
All contents of the input cell array must be of the same data type.
>>
So, the error in the modified sortrows really isn't anything to do with it at all...
So, let's just look at the elements--
>> cellfun(@(x) x,TCRes(:,2));
Error using cellfun
Mismatch in type of outputs, at index 16, output 1 (logical versus double).
Set 'UniformOutput' to false.
>>
Your data are corrupt; the 16th value is a double instead of logical, how many more?
>> sum(cellfun(@islogical,arrayfun(@(x) x,TCRes(:,2))))
ans =
554
>>
so there are 16 bad apples in the barrel. With those fixed, the code change above will likely work.
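A somewhat more direct way to run the same kind of check is to list the classes present in the column and locate the odd ones. This is only a sketch on a made-up four-element column named col, not the attached data.

```matlab
% Hypothetical column with one double mixed in among logicals
col = {true; false; 1; true};

% Class name of every element, as a cell array of char vectors
classes = cellfun(@class, col, 'UniformOutput', false);
kinds = unique(classes);

% Row numbers of the elements that are not logical
bad = find(~cellfun(@islogical, col));
```

If kinds has more than one entry, cell2mat will refuse the column, which is the error seen above.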
IN FINE
>> idx=find(cellfun(@isnumeric,arrayfun(@(x) x,TCRes(:,2))))'
idx =
16 54 92 130 168 206 244 282 320 358 396 434 472 510 548 552
>> diff(idx)
ans =
38 38 38 38 38 38 38 38 38 38 38 38 38 38 4
>>
It looks like whatever logic you used to build the array has some regularity in setting values and every 38th after the 16th plus the last is a double instead of logical. This is a bug in your code logic that needs fixing.
With that and the change to sortrows noted above, then the "classic" sortrows works on your cell array.
>> for i=1:length(idx),TCPRes(idx(i),2)={logical(TCPRes{idx(i),2})};end
>> out=sortrows(TCPRes,-2);
>>
after the aforementioned code change. I made local copy to alias the original for the test; I'd recommend if you choose to go this route to make a copy of the original and name it something else.
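The loop above can also be written without the index list, by converting every non-logical entry in the column in one go. A minimal sketch, assuming the stray entries are numeric 0 or 1 as they were here; C is again a made-up stand-in for the real array.

```matlab
% Hypothetical data: rows 2 and 3 hold doubles where logicals belong
C = {'a', true, 1; 'b', 0, 2; 'c', 1, 3; 'd', false, 4};

% Mark the entries of column 2 that are not already logical
isBad = ~cellfun(@islogical, C(:,2));

% Convert just those entries, keeping the result as a cell array
C(isBad, 2) = cellfun(@logical, C(isBad, 2), 'UniformOutput', false);
```

This repairs the data but not whatever logic produced the doubles in the first place, so that still wants fixing at the source.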
5 Comments
avleed on 29 Jan 2017
Agreed that there was an issue with the cell array. But I tried basic stuff - created a 1*4 logical cell array and tried putting it through sortrows. Got the same error.
dpb on 29 Jan 2017
Edited: dpb on 29 Jan 2017
As noted in Answer Addendum 2, posted after I had time to actually look at the code, a logical in a cell array is treated internally as char, not numeric. I showed a patch that will fix that problem.
That patch is to replace the line
if isnumeric(x{1, k})
by
if isnumeric(x{1, k}) || islogical(x{1, k})
in the internal function sort_cell_back_to_front in sortrows (or, more safely, in a copy of it under another name rather than the original).
I'm simply pointing out that it can't really be claimed to be a bug from the documentation as it doesn't address cell arrays as inputs specifically, only arrays and the one example that does use a cell array is for strings only.
I also agree it is a "quality of implementation" issue and is worthy of a service request (aka bug report).
>> z=logical(randi(2,4,1)-1); % logical array
>> sortrows(mat2cell(z,ones(size(z)))) % original sortrows
Error using char
Cell elements must be character arrays.
Error in sortrows>sort_cell_back_to_front (line 135)
tmp = char(x(ndx,k));
Error in sortrows (line 87)
ndx = sort_cell_back_to_front(x_sub, col);
>>
(edit a copy of sortrows in the current working directory here with the patch)
>> sortrows(mat2cell(z,ones(size(z))))
ans =
[0]
[0]
[1]
[1]
>>
and Voila!! it does work as expected. Also note the result is a cell array of logical in second case--
>> whos ans
Name Size Bytes Class Attributes
ans 4x1 244 cell
>> all(cellfun(@islogical,ans))
ans =
1
>>
Also note that if the logical array z is passed to the original, unmodified, sortrows it also succeeds as it is an array of logical, not a cell array containing logical elements which are, in the original, the only case where they do get treated as character. IOW, cell arrays are not arrays and the function internally is quite different and the documentation doesn't address what is expected to work and what isn't for the former.
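The distinction between a logical array and a cell array of logicals can be seen in a few lines. This is a sketch with made-up values; it sidesteps the cell code path by going through cell2mat and num2cell, which assumes every cell holds a logical scalar.

```matlab
z = logical([1; 0; 1; 0]);

% A plain logical array sorts with the stock sortrows
s = sortrows(z);

% The same values held in a cell array, one logical scalar per cell
c = num2cell(z);

% Unpack to a logical array, sort, and pack back into cells
sc = num2cell(sortrows(cell2mat(c)));
```

Both s and the contents of sc stay logical throughout.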
The upshot is that you can make sortrows work, but only by patching it. If you're uncomfortable doing that, you'll need to use one of the other workarounds. (The discomfort is understandable, and it's the reason I suggested working on a copy, at least at first. I've made a number of patches to installed MATLAB releases here over the years, since I don't have to worry about whether that makes me incompatible with the world, and I have enough "time in grade" that doing so no longer seems scary.)
It is true that that particular patch is still susceptible to the issue of there not being a uniform data type in a cell array column so that to use it you do have to fix that issue in your original data array (probably a good idea anyway).
There have also been three separate workarounds: the cast to and from table that I showed, and two others that take more or less effort to use. From a coding standpoint I'd say table is the simplest. The FEX submission for natural string sorting is not bad, although you may need to write a regexp pattern for floating point. The workaround that uses the index from unique is clever but probably the least intuitive.
It's your call as to which way to approach your problem, I'll just note I've gone ahead and made the patch in the distribution code here going forward, but that's only appropriate probably if you're working alone as I. :)
Again, I urge you to submit this and as much of my analysis or a link to the Answer to TMW as a bug report; it should be addressed even though in strictest interpretation they can say it isn't actually a bug owing to the doc not covering the case.


More Answers (1)

Andrei Bobrov on 29 Jan 2017
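% Map the strings in column 1 to integer indices (in sorted order)
[~,~,c]=unique(TCRes(:,1));
% Build a numeric matrix [string index, column 2, column 3] and sort by
% column 2 descending, then string index, then column 3
[~,ii] = sortrows([c,reshape([TCRes{:,2:3}],[],2)],[-2,1,3]);
% Apply the resulting row order to the original cell array
out = TCRes(ii,:);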
