Readtable not reading time as expected

Hi there
I have run into a puzzling mishap by readtable. I'm running it as such::
T = readtable('file.data','FileType','text','NumHeaderLines',7);
Here is a cropped screenshot of the dataset I'm running it on (headerlines not shown):
I am specifically puzzled by the last column shown. This is the time of day. In the resulting table it comes back looking like this:
So it seems readtable has misinterpreted what is a time signature as a duration value. And that it is only able to determine this duration value when the third column (a nanosecond value) cycles back to zero.
Any ideas as to how I might resolve this problem?
Thanks in advance!

2 件のコメント

Jon
Jon 2023 年 11 月 16 日
Someone may be able to help you just by looking at your screenshot, but it would be helpful for me if you could attach your data file so I could try to import it myself and see what is going on.
Jakob Sievers
Jakob Sievers 2023 年 11 月 16 日
Ah yes, ofcourse. I've attached a zipped version here :)

サインインしてコメントする。

 採用された回答

Jon
Jon 2023 年 11 月 16 日
編集済み: Jon 2023 年 11 月 16 日

0 投票

I think you are having problems because your time column is of the form 19:10:54:300 and in particular the last separator is a colon not a decimal point. I couldn't find a simple way of dealing with this as it seems all of the MATLAB time formats use a decimal fraction for the last portion.
Here is some code that I wrote to handle this. Probably could be streamlined but I think it does what you want
filename = '2022-08-29T191054_AIU-2371.data'
% Define import options
opts = detectImportOptions(filename,"FileType","text")
% Modify options to have time column read in as a character type for further
% manipulation
opts = setvaropts(opts,'Time','Type','char')
% Read in the table using the customized options
T = readtable('2022-08-29T191054_AIU-2371.data',opts);
% Replace milliseconds with fractions of a second to allow using MATLAB time
% formatting
T.Time = cellfun(@fun,T.Time,'UniformOutput',false);
% Convert to duration
T.Time = duration(T.Time,'InputFormat','hh:mm:ss.SSSSSSSSS')
function t = fun(tChar)
% Replace nanoseconds with fractions of a second to allow using MATLAB time
% formatting
parts = strsplit(tChar,':');
fsec = str2double(parts{4})/1e9; % fraction of second
parts{4} = num2str(fsec,'%.9f'); % e.g '0.000000300'
parts{4} = parts{4}(2:end); % e.g. '.000000300'
% build new character vector with decimal time
t = [parts{1},':',parts{2},':',parts{3},parts{4}];
end

9 件のコメント

Jon
Jon 2023 年 11 月 16 日
Oh, maybe you wanted that fractional time in milliseconds not nano, let me adjust
Jon
Jon 2023 年 11 月 16 日
編集済み: Jon 2023 年 11 月 16 日
filename = '2022-08-29T191054_AIU-2371.data'
% Define import options
opts = detectImportOptions(filename,"FileType","text")
% Modify options to have time column read in as a string for further
% manipulation
opts = setvaropts(opts,'Time','Type','char')
% Read in the table using the customized options
T = readtable('2022-08-29T191054_AIU-2371.data',opts);
% Replace millisecond with fractions of a second to allow using MATLAB time
% formatting
T.Time = cellfun(@fun,T.Time,'UniformOutput',false);
% Convert to duration
T.Time = duration(T.Time,'InputFormat','hh:mm:ss.SSS','Format','hh:mm:ss.SSS')
%
function t = fun(tChar)
t = tChar;
t(9) = '.';
end
Walter Roberson
Walter Roberson 2023 年 11 月 16 日
When MATLAB thinks something might be duration type, then it uses the duration format rules to parse the data -- and for reasons I do not understand, duration formats just are not nearly as flexible as time formats. That's why in my Answer I said make it a time element and set the input format up for it... and then afterwards convert it to duration relative to the beginning of the day.
Jon
Jon 2023 年 11 月 16 日
@Walter Roberson What did you do about the colon separator instead of a decimal point e.g 19:10:54:300 rather than 19:10:54.300 ?
Jon
Jon 2023 年 11 月 16 日
@Walter Roberson Oh, I see, what you mean about the flexibility, as datetime it allows the colon separator
Walter Roberson
Walter Roberson 2023 年 11 月 16 日
編集済み: Walter Roberson 2023 年 11 月 16 日
datetime() is happy to support : at that point.
str = '19:10:56:400';
t = datetime(str, 'InputFormat', 'HH:mm:ss:SSS')
t = datetime
16-Nov-2023 19:10:56
t1 = t - dateshift(t, 'Start', 'day') %what I used
t1 = duration
19:10:56
t2 = timeofday(t) %Cris suggests
t2 = duration
19:10:56
Jon
Jon 2023 年 11 月 16 日
Thanks, I learned a lot!
Jakob Sievers
Jakob Sievers 2023 年 11 月 17 日
Thank you very much for taking your time to look deeper into this question. I'm glad we both learned something from this :-)
Jakob Sievers
Jakob Sievers 2023 年 11 月 22 日
I finally had a moment to test this out and it did the job splendedly. Thanks again. I will accept this as the answer :)

サインインしてコメントする。

その他の回答 (1 件)

Walter Roberson
Walter Roberson 2023 年 11 月 16 日

0 投票

Use detectImportOptions() on the file. Then use setvartype() to set variable 8 to datetime instead of duration. Then use setvaropts() to set the InputFormat to 'HH:mm:ss:SSS'
Now readtable() the file passing in the modified options.
Variables 7 and 8 will now both be datetime.
Take
day_offset = TheTable{:,8} - dateshift(TheTable{:,8}, 'start', 'day');
TheTable.DateTime = TheTable{:,7} + day_offset;
Now TheTable.DateTime variable will hold the full-precision date and time together.
Yes, there are other ways, including passing a format to readtable() indicating what the datetype is for each column... but using %D and %T properly gets a bit messy. And in order for a time format to not automatically end at the space between date and time, you have to do the hack of telling the parser that whitespace is not a field delimiter... gets ugly.

4 件のコメント

Cris LaPierre
Cris LaPierre 2023 年 11 月 16 日
Or use timeofday instead of dateshift.
opts = detectImportOptions('2022-08-29T191054_AIU-2371.data','FileType','text','NumHeaderLines',7);
opts = setvartype(opts,"Time","datetime");
opts = setvaropts(opts,"Time","InputFormat","HH:mm:ss:SSS");
data = readtable('2022-08-29T191054_AIU-2371.data',opts)
data.Time = timeofday(datetime(data.Time))
Jon
Jon 2023 年 11 月 16 日
You may also want to set the 'DatetimeFormat' for Time to be "HH:mm:ss.SSS" so it displays the milliseconds
opts = setvaropts(opts,"Time","InputFormat","HH:mm:ss:SSS",'DatetimeFormat','HH:mm:ss.SSS');
Jakob Sievers
Jakob Sievers 2023 年 11 月 17 日
Thank you so much for looking into this question! I'll explore all answers as soon as I can!
Peter Perkins
Peter Perkins 2023 年 11 月 17 日
Yes, the root cause here is that reading in duration text timestamps in the format hh:mm:ss:SSS is not currently supported. So Walter is correct, use datetime's more flexible parsing, and Chris is right, use timeofday to convert those to durations.
I have made note of adding support for this duration format, it's something we've had questions about before.

サインインしてコメントする。

カテゴリ

製品

リリース

R2021a

質問済み:

2023 年 11 月 16 日

コメント済み:

2023 年 11 月 22 日

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by