How can my code, which collects bad data with 'cellfun' and 'try, catch', be improved?

Simon on 11 Sep 2022
Edited: Simon on 12 Sep 2022
My real situation is that I have a large number of files, some of which I suspect are bad. I want to know the file names of all the bad files so that they do not go into the downstream workflow.
Here is a toy example I wrote to show the problem. I only want to keep the bad data, so I must remove the good data (coded as '0') from cellfun's output. Is there a better way to do the whole thing? I appreciate any suggestions.
list = {'a', 'bc', 'defg'};
T = cellfun(@(x) func(x), list, 'UniformOutput',false)
Warning: --- error caused by bad data ---.\n
a
MException with properties:
identifier: 'MATLAB:badsubscript'
message: 'Index exceeds the number of array elements. Index must not exceed 1.'
cause: {}
stack: [5×1 struct]
Correction: []
Warning: --- error caused by bad data ---.\n
bc
MException with properties:
identifier: 'MATLAB:badsubscript'
message: 'Index exceeds the number of array elements. Index must not exceed 2.'
cause: {}
stack: [5×1 struct]
Correction: []
T = 1×3 cell array
{'a'} {'bc'} {[0]}
function out = func(x)
    try
        x(4); % This throws an error for a character vector shorter than 4.
        out = 0;
    catch ME
        warning("--- error caused by bad data ---.\n")
        out = x; % This collects the bad data.
        disp(x)
        disp(ME)
    end
end
2 Comments
dpb on 11 Sep 2022
I'd revert to asking what defines a "bad" file vis a vis a "good" one...and what is the input format for the files?
Simon on 12 Sep 2022
I have lots of .html files. They will be further processed into MATLAB tables. When they are processed with readtable(htmlfile), two of them, just detected by my code, produce the following error.
MException with properties:
identifier: 'MATLAB:io:common:xmlTree:ParseError'
message: 'Error in XML: Premature end of data in tag body line 71↵'


Accepted Answer

Paul on 11 Sep 2022
Have func() return a logical with true indicating the file is bad and false indicating the file is good. In this case we'd have
list = {'a', 'bc', 'defg'};
T = cellfun(@(x) func(x), list)
Warning: --- error caused by bad data ---.\n
a
MException with properties:
identifier: 'MATLAB:badsubscript'
message: 'Index exceeds the number of array elements. Index must not exceed 1.'
cause: {}
stack: [5×1 struct]
Correction: []
Warning: --- error caused by bad data ---.\n
bc
MException with properties:
identifier: 'MATLAB:badsubscript'
message: 'Index exceeds the number of array elements. Index must not exceed 2.'
cause: {}
stack: [5×1 struct]
Correction: []
T = 1×3 logical array
1 1 0
% remove the good data
list(T)
ans = 1×2 cell array
{'a'} {'bc'}
The try/catch scheme may not be necessary, depending on what the criteria really are for determining the goodness of a file and how those criteria can be tested.
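As a minimal sketch of that last point, assume the only criterion in the toy example is that a character vector needs at least four elements (the condition that makes x(4) throw). Under that assumption the same logical mask can be computed by testing the length directly, with no exception handling. The names isBad, badItems and goodItems are illustrative only.

```matlab
list = {'a', 'bc', 'defg'};

% Assumption: "bad" means fewer than 4 elements, which is exactly
% the condition under which x(4) errors in the toy example.
isBad = cellfun(@(x) numel(x) < 4, list);   % 1x3 logical mask, true = bad

badItems  = list(isBad);     % entries flagged as bad
goodItems = list(~isBad);    % entries that can go downstream
```

This only works when the criterion can be stated as an explicit test. If "bad" really means "some function throws on it", a try/catch wrapper is still the simpler route.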
function out = func(x)
    out = false;
    try
        x(4); % This throws an error for a character vector shorter than 4.
    catch ME
        warning("--- error caused by bad data ---.\n")
        out = true; % This flags the bad data.
        disp(x)
        disp(ME)
    end
end
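For the real case described in the comments (.html files read with readtable), the same logical-mask pattern might look like the sketch below. It assumes a file counts as bad exactly when readtable throws on it. The folder name html_files, the helper isBadFile and the variable names are hypothetical and not part of the original thread.

```matlab
% Hypothetical folder; replace with the real location of the .html files.
htmlFolder = fullfile(pwd, 'html_files');
files = dir(fullfile(htmlFolder, '*.html'));

% Build full paths as a cell array of character vectors.
names = arrayfun(@(f) fullfile(f.folder, f.name), files, 'UniformOutput', false);

isBad = cellfun(@isBadFile, names);   % one flag per file, true = bad
badFiles  = names(isBad);             % files to inspect or exclude
goodFiles = names(~isBad);            % files that can go downstream

function bad = isBadFile(filename)
    % Returns true when readtable cannot parse the file (assumed criterion).
    bad = false;
    try
        readtable(filename);
    catch ME
        warning('Could not read %s: %s', filename, ME.message);
        bad = true;
    end
end
```

Each good file is parsed here and then again downstream. If that cost matters, the helper could return the table as a second output instead of discarding it.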
1 Comment
Simon on 12 Sep 2022
Edited: Simon on 12 Sep 2022
That's a wonderfully clean solution. Even better, I learned a good place to use local values. I am grateful for your help.
I want to accept your answer, but when I click 'Accept', I get an error message asking me to reload the page. I did that, and it still would not let me click 'Accept'. I will try again later.

Release

R2022a
