Use terminal to speed up file removal

Question

Pete on 17 Oct 2017

https://jp.mathworks.com/matlabcentral/answers/361748-use-terminal-to-speed-up-file-removal

Answered: Stephen23 on 17 Oct 2017

Hi all, I've got a large number of CSVs, generated each time a system changes state. Each CSV starts as a single-row [1x3] array, and any data is added as a new row. I've written a simple loop that checks for "empty" CSVs (those containing only the single row) and removes them. However, this takes many (>10) minutes to complete, and I'd like to try the same thing in the terminal. The code is shown below:

CSV_Filenames_STRUCT    = dir(sprintf('%s/*.csv',ResultDirectory));
CSV_Filenames_CELL      = {CSV_Filenames_STRUCT.name};
StartingNumberOfFiles    = size(CSV_Filenames_CELL,2);
for NthFile = 1:StartingNumberOfFiles
  NumberOfPeaks = size(textread(sprintf('%s/%s',ResultDirectory,CSV_Filenames_CELL{1,NthFile}),'%s'),1) - 1;  % Number of rows less one for the 'x,y,value'
  if ~NumberOfPeaks  % Essentially empty
    delete(sprintf('%s/%s',ResultDirectory,CSV_Filenames_CELL{1,NthFile}));
  end
end

I've not used the terminal much, and I'm wondering whether it would be faster for the above when there are many files to process, and how to code the single-line check. So far, I've got something like:

for f in *.csv;
do
    L=`wc -l "$f" |  awk '{print $1}'`
    if test $L -eq 1
    then
        mv $f ./MT;
    fi
done

which isn't quite working (the filenames contain spaces, see the example below). I'm out of my depth here, so I'm asking for help on how to use the "system"/"unix" options through MATLAB. I'm running OS X and Kubuntu Linux. A typical filename looks like: "Filter 0000001 Fwd,Alignment Black Screen - Ref_01 Input_19 (2017-10-17 @ 13.30.20.103).csv"
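A minimal sketch of how the shell loop above could be made to cope with these filenames, assuming a POSIX shell. Three things in the original attempt tend to cause trouble: the unquoted `$f` in the `mv` command is split at every space, `wc -l` counts newline characters (so a header row without a trailing newline counts as 0 lines), and if `MT` does not already exist as a directory, `mv` renames the file to `MT` instead of moving it there. The function names `is_header_only` and `move_header_only_csvs` are hypothetical, chosen for illustration only.

```shell
# Sketch: move CSVs that contain only the header row into a subfolder "MT".
# is_header_only and move_header_only_csvs are hypothetical helper names.

# Succeeds (status 0) if the file has a first line and nothing after it.
# Uses only the shell builtin "read", so no extra process is started per file.
is_header_only() {
    {
        # A zero-byte file has no header row at all, so it is left alone.
        IFS= read -r first || [ -n "$first" ] || return 1
        # A second line exists if read succeeds, or if it hit end-of-file
        # after picking up some text that lacks a trailing newline.
        if IFS= read -r second || [ -n "$second" ]; then
            return 1
        fi
        return 0
    } < "$1"
}

# Move every header-only CSV in directory $1 into $1/MT.
move_header_only_csvs() {
    dir=$1
    mkdir -p "$dir/MT" || return 1
    for f in "$dir"/*.csv; do
        [ -f "$f" ] || continue    # glob matched nothing, or not a regular file
        if is_header_only "$f"; then
            mv -- "$f" "$dir/MT/"  # quoting keeps names with spaces intact
        fi
    done
}
```

With these definitions saved in a script, `move_header_only_csvs "/path/to/results"` would process one directory (the path is a placeholder). The same script could presumably be launched from MATLAB through the `system` function mentioned in the question, although whether that ends up faster than the MATLAB loop would need to be measured.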

Comments

Pete on 17 Oct 2017

I've just started a set with 2,000,000 files, but I only expect about 10% of these (200k) to have genuine results, so the rest are just 'empty' CSVs (one row of title data). Looking at the profiler, I think the MATLAB functions called from textread are possibly what is taking the time. I've removed sprintf's and replaced with concatenation strings, i.e. [PathPart1 '/' PathPart2] etc. That sped things up a bit, but processing still takes a long time. Any other suggestions?
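For a directory this large, one option on the shell side is to avoid starting a new process for every file. The sketch below lets `find` hand the CSVs to `awk` in large batches, and `awk` prints the name of each file that ends after its first line. It is an illustration under stated assumptions, not a benchmarked solution: `list_header_only_csvs` is a hypothetical name, file names are assumed to contain no newline characters, `-maxdepth` is a GNU/BSD extension to `find`, and `nextfile` is assumed to be available (where an older awk does not implement it, the whole file is read and the result should be the same, only slower).

```shell
# Sketch: print the path of every CSV in directory $1 that has exactly one line.
# list_header_only_csvs is a hypothetical name used for illustration.
list_header_only_csvs() {
    find "$1" -maxdepth 1 -type f -name '*.csv' -exec awk '
        # First line of a new file: report the previous file if it stopped at one line.
        FNR == 1 { if (seen && lines == 1) print prev; prev = FILENAME; seen = 1 }
        # Remember how many lines of the current file have been read so far.
        { lines = FNR }
        # Two lines are enough to know the file holds data, so skip the rest.
        FNR == 2 { nextfile }
        # The last file of a batch has no successor, so it is checked here.
        END { if (seen && lines == 1) print prev }
    ' {} +
}
```

Running `list_header_only_csvs "/path/to/results"` only prints the candidates (the path is a placeholder), which allows the list to be checked before anything is removed. One possible way to then delete in batches is `list_header_only_csvs "/path/to/results" | tr '\n' '\0' | xargs -0 rm -f --`, where `xargs -0` is again a GNU/BSD extension.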

Jan on 17 Oct 2017

You mean "shell", not "terminal".


Answer 1

Jan on 17 Oct 2017

https://jp.mathworks.com/matlabcentral/answers/361748-use-terminal-to-speed-up-file-removal#answer_286236


I'm not sure if I understand your question correctly: you want to delete all files that have only one row, correct?

FULLFILE is smarter than creating file names by sprintf().

CSV_Filenames_STRUCT  = dir(fullfile(ResultDirectory, '*.csv'));
CSV_Filenames_CELL    = {CSV_Filenames_STRUCT.name};
StartingNumberOfFiles  = numel(CSV_Filenames_CELL);
for NthFile = 1:StartingNumberOfFiles
  File = fullfile(ResultDirectory, CSV_Filenames_CELL{NthFile});
  fid  = fopen(File, 'r');
  if fid == -1, error('Cannot open file: %s', File); end
  line1 = fgetl(fid);
  line2 = fgetl(fid);
  fclose(fid);
  if ~ischar(line2)
    delete(File);
  end
end

Is this faster? It reads at most 2 lines from each file.


Answer 2

Stephen23 on 17 Oct 2017

https://jp.mathworks.com/matlabcentral/answers/361748-use-terminal-to-speed-up-file-removal#answer_286241


Remove the textread and replace it with something like this (pseudocode):

fid = fopen(...,'rt');
fgetl(fid);           % read (and discard) the first row
atEnd = feof(fid);    % check if end of file
fclose(fid);          % close the file so that handles are not left open
if atEnd
    delete(...)
end

"I've removed sprintf's and replaced with concatenation strings"

I would recommend using fullfile: it actually makes the intention clearer.
