Searching through files for missing data
I have a set of 8000 files in the format of
The file names should increase by 5 minutes each time. I need to write a function that checks that the files are named in this sequence and identifies any that are missing.
The functions strcmp and join have been recommended to me.
Does anyone know how to do this?
KSSV on 4 Feb 2021
You may follow something like this:
files = dir('*.txt') ; % give your extension
% Create a datetime vector for the files present with the names mentioned
[~,N,~] = cellfun(@fileparts,{files.name},'UniformOutput',0) ;
% this is datetime for the files present; assumes names like 20210204101500
% (lowercase 'ss' is seconds, uppercase 'SS' would mean fractional seconds)
t = datetime(N,'InputFormat','yyyyMMddHHmmss') ;
%% Create 5 mins possible datetime vector
t0 = min(t) ; % earliest file present
t1 = max(t) ; % latest file present
% Make datetime array
tAll = t0:minutes(5):t1 ; % this is used for comparison
% Get the indices which are present
idx = ismember(tAll,t) ;
% Dates which do not exist
tMissing = tAll(~idx)
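To try the same idea without touching the disk, here is a minimal sketch. It assumes the file names are 14-digit timestamps (yyyyMMddHHmmss) followed by an extension, which the question does not confirm, and the names in it are made up for illustration. It uses setdiff instead of ismember to list the expected timestamps that have no file.

```matlab
% Hypothetical file names, 5 minutes apart, with 10:10 and 10:20 left out
names = {'20210204100000.txt','20210204100500.txt', ...
         '20210204101500.txt','20210204102500.txt'} ;
[~,stems] = cellfun(@fileparts,names,'UniformOutput',false) ; % strip extensions
t = datetime(stems,'InputFormat','yyyyMMddHHmmss') ;          % timestamps present
expected = min(t):minutes(5):max(t) ;  % every timestamp that should exist
missing = setdiff(expected,t) ;        % expected timestamps with no file
missing.Format = 'yyyyMMddHHmmss' ;    % display in the same style as the names
missingNames = strcat(cellstr(missing),'.txt') ;
disp(missingNames)
```

Swapping the hard-coded names for {files.name} from dir() should give the missing names for a real folder, provided the naming assumption holds.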
More Answers (1)
Adam Danz on 4 Feb 2021
>Does anyone know how to do this?
Lots of people know how to do this and we're here to help but few people will devote a portion of their day to do it for you.
Let's start by figuring out where you're stuck. There are just a few basic steps in your process and you can find lots of information in this forum, on the web, and in the documentation for each step.
- Get a list of files. See dir()
- Read in the file. There are lots of ways to read files depending on the filetype and content (review).
- Are your time stamps in datetime format? If not convert them to datetime.
- If all you want to do is check whether a file is missing, you only need to store 3 data points per file, in 2 separate variables that you fill inside your loop: the first and last datetime values go in an nx2 datetime array for n files, and the filenames go in an nx1 string array.
- Once all files are read and the 3 data points are stored for each file, you can sort the datetime values in case the files are read out of order and then compare the first datetime of file n with the last datetime from file n-1. If that difference is more than 5 minutes, you know you're missing a file and you can use the filename array to help identify which file is missing.
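The last two steps can be sketched as follows. This is only an illustration under assumptions: the variable names span and fileNames are hypothetical, and the values are invented stand-ins for what the reading loop would collect.

```matlab
% Hypothetical values gathered in the reading loop, one row per file
fileNames = ["a.txt"; "b.txt"; "d.txt"] ;
% Column 1 = first timestamp in the file, column 2 = last timestamp
span = datetime(2021,2,4,10,[0 4; 5 9; 15 19],0) ;

[~,order] = sort(span(:,1)) ;   % in case the files were read out of order
span = span(order,:) ;
fileNames = fileNames(order) ;

% First datetime of file n minus last datetime of file n-1
gap = span(2:end,1) - span(1:end-1,2) ;
k = find(gap > minutes(5)) ;    % gap k lies between file k and file k+1

for i = k(:)'
    fprintf('Gap of %s between %s and %s\n', ...
        string(gap(i)), fileNames(i), fileNames(i+1)) ;
end
```

With these made-up values only the gap between b.txt and d.txt exceeds 5 minutes, so that is the one pair reported.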
If you get stuck on any step leave a comment below and show us where you're at with the code and what the problem is.