Create 30 minute bins by reading time stamps

Question

Bradley 2023 年 12 月 13 日

0
リンク

この質問への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2060074-create-30-minute-bins-by-reading-time-stamps

コメント済み: Mathieu NOE 2023 年 12 月 19 日

Data.xlsx

I have a file with time stamps, due to error in the files (gaps in time due to faulty equipment) binning 30 minute sections using the following code creates errors in the actograms produced.

function[Average] = Av_30min(y)

Average = zeros(288,size(y,2));

k = 1;

for l =1:288

Average(l,1) = mean(y(k:k+(round(length(y)/288)-1),1));

k = k+((round(length(y)/288)-1));

end

I want to read from say 11:30:00 - 12:00:00 and average this bin and so on and so forth. Can anyone help?

Thanks

0 件のコメント
-2 件の古いコメントを表示-2 件の古いコメントを非表示

サインインしてコメントする。

サインインしてこの質問に回答する。

Answer 1

Star Strider 2023 年 12 月 14 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2060074-create-30-minute-bins-by-reading-time-stamps#answer_1371843

MATLAB Online で開く

What do the ‘Calculated time’ values represent? Are they fractions of a day (from about ¼ to about 3¼ days) or something else?

Until that is resolved, converting them into something useful is not likely to be possible.

Here, I used sortrows and then fillmissing and resample to create some sort of order.

% imshow(imread('Timestamps in output.PNG'))

% function readfaultydata

% clear; close all

Data = xlsread('Data.xlsx',1,'A2:F60099');

T1 = readtable('Data.xlsx', 'VariableNamingRule','preserve');

t = Data(:,1);

dt = t(2)-t(1); % Your sample rate

t = (1:length(t))*dt; % Linear timeline

x = Data(:,3);

y = Data(:,4);

z = Data(:,5);

subplot(311);plot(t,x);grid;

subplot(312);plot(t,y);grid;

subplot(313);plot(t,z);grid;

T1 = sortrows(T1,1);

T1 = fillmissing(T1, 'nearest') % Sorted With NaN Values Interpolated

T1 = 60103×9 table

Calculated time Device X Y Z LDR Var7 Var8 Var9 _______________ __________ __ __ __ ___ ______ _______ _______ 0.26653 {'BSB_6A'} 0 70 56 0 0.2611 0.85946 0.96134 0.26664 {'BSB_6A'} 0 61 72 0 0.2611 0.85946 0.96134 0.26676 {'BSB_6A'} 5 57 67 0 0.2611 0.85946 0.96134 0.26687 {'BSB_6A'} 9 28 57 0 0.2611 0.85946 0.96134 0.26699 {'BSB_6A'} 60 0 44 0 0.2611 0.85946 0.96134 0.26711 {'BSB_6A'} 62 52 57 0 0.2611 0.85946 0.96134 0.26722 {'BSB_6A'} 48 36 70 0 0.2611 0.85946 0.96134 0.26734 {'BSB_6A'} 66 33 63 0 0.2611 0.85946 0.96134 0.26745 {'BSB_6A'} 70 50 60 0 0.2611 0.85946 0.96134 0.26757 {'BSB_6A'} 34 26 56 0 0.2611 0.85946 0.96134 0.26769 {'BSB_6A'} 51 47 75 0 0.2611 0.85946 0.96134 0.2678 {'BSB_6A'} 17 56 64 0 0.2611 0.85946 0.96134 0.26792 {'BSB_6A'} 50 40 29 0 0.2611 0.85946 0.96134 0.26803 {'BSB_6A'} 26 65 79 0 0.2611 0.85946 0.96134 0.26815 {'BSB_6A'} 22 73 85 0 0.2611 0.85946 0.96134 0.26826 {'BSB_6A'} 51 70 79 0 0.2611 0.85946 0.96134

VN = T1.Properties.VariableNames;

% timestats = [mean(diff(T1{:,1})) std(diff(T1{:,1}))]

Ts = mean(diff(T1{:,1}))

Ts = 4.9837e-05

Fs = 1/Ts

Fs = 2.0065e+04

Fn = Fs/2;

[XYZr, tr] = resample(T1{:,[3 4 5]}, T1{:,1}, Fs);

% format longg

% XYZr(end-4:end,:)

XYZr = XYZr(1:end-3,:);

tr = tr(1:end-3);

% XYZr(end-4:end,:)

x = XYZr(:,1);

y = XYZr(:,3);

z = XYZr(:,3);

figure

plot(tr, XYZr)

grid

xlabel([string(VN{1}) + " (Sorted & Resampled)"])

ylabel("Amplitude")

figure

tiledlayout(3,1)

for k = 1:size(XYZr,2)

nexttile

plot(tr, XYZr(:,k))

grid

ylabel("Amplitude")

title(["Column "+k])

end

xlabel([string(VN{1}) + " (Sorted & Resampled)"])

figure

scatter3(x,y,z, 10, z, '.')

colormap(turbo)

grid on

xlabel('X')

ylabel('Y')

zlabel('Z')

.

7 件のコメント
5 件の古いコメントを表示5 件の古いコメントを非表示

Bradley 2023 年 12 月 15 日

Thanks so much for all of this, making it seem like this problem is ever more unsolveable due to the errored data, hopefully our device gets fixed so the next experiments are simpler to analyse!

The device records movement every 10 seconds but due to the error there are time gaps resulting in this complexity. I have revised the data to include date aswell as the time, I'm not sure whether that makes it easier to analyse or not. See attached document.

Basically I want to get to a point where I have a 48x6 double ie 6 days from 7pm to 7am, but because of the time jumps in the data my results have been time shifted in everything I have tried so far.

Also would like to get an 8640x3 double for 1 minute bins, the same 6 days of data but 1 minute averages. But again get massive time drift when i do this because of the error in the data.

Using the following function for this part:

function[Min_Average] = Av_1min(y)

k = 1;

Min_Average = zeros(8640,size(y,2));

for l =1:8640

Min_Average(l) = mean(y(k:k+(round(length(y)/8640)-1)));

k = k+((round(length(y)/8640)-1));

end

Hopefully this makes my aim much more clearer: have been trying to use datetime to work with the data but my level of MATLAB is far too low at this moment in time.

Star Strider 2023 年 12 月 15 日

MATLAB Online で開く

Using timetable and retime to do the 30-minute averages —

files = dir('*.xlsx');

for k = 1:numel(files)

filename{k} = files(k). name;

Data{k,:} = readtable(filename{k}, 'VariableNamingRule','preserve');

end

filename{:}

ans = '30min bins.xlsx'

ans = 'DataRevised.xlsx'

Data{1}

ans = 332×9 table

Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 ____________________ ____ ________ _________ ________ ____ _______ _______ ____ 01-Jan-1900 07:30:02 NaN 4.5746 0.022099 0.14365 1 0.32697 0.33529 NaN 01-Jan-1900 08:00:02 NaN 5.4972 12.884 0.9116 1 0.34578 0.35404 NaN 01-Jan-1900 08:30:02 NaN 0.22099 0 13.928 1 NaN NaN NaN 01-Jan-1900 09:00:02 NaN 0.32044 0 0.044199 1 NaN NaN NaN 01-Jan-1900 09:30:02 NaN 0.25414 4.5967 0 1 NaN NaN NaN 01-Jan-1900 10:00:02 NaN 0.1547 0.01105 0 1 NaN NaN NaN 01-Jan-1900 10:30:02 NaN 10.105 0.01105 0.033149 1 NaN NaN NaN 01-Jan-1900 11:00:02 NaN 5.8398 0.1989 0.40884 1 NaN NaN NaN 01-Jan-1900 11:30:02 NaN 0.14917 45.188 17.475 1 NaN NaN NaN 01-Jan-1900 12:00:02 NaN 0.033149 0.071823 0.1768 1 NaN NaN NaN 01-Jan-1900 12:30:02 NaN 0.47514 0.016575 0.038674 1 NaN NaN NaN 01-Jan-1900 13:00:02 NaN 0.093923 0.033149 0.1547 1 NaN NaN NaN 01-Jan-1900 13:30:02 NaN 3.1713 0.0055249 5.8674 1 NaN NaN NaN 01-Jan-1900 14:00:02 NaN 3.9006 2.884 7.8674 1 NaN NaN NaN 01-Jan-1900 14:30:02 NaN 0.082873 6.337 0 1 NaN NaN NaN 01-Jan-1900 15:00:02 NaN 0.21547 0.044199 0 1 NaN NaN NaN

Data{2} % Analyse 'Data{2}'

ans = 50317×6 table

Calculated time Device KO1 KO2 WT LDR ____________________ __________ ___ ___ __ ___ 15-Nov-2023 19:00:58 {'BSB_6A'} 0 12 0 0 15-Nov-2023 19:01:08 {'BSB_6A'} 0 53 0 0 15-Nov-2023 19:01:18 {'BSB_6A'} 0 78 0 0 15-Nov-2023 19:01:28 {'BSB_6A'} 0 83 0 0 15-Nov-2023 19:01:38 {'BSB_6A'} 0 45 0 0 15-Nov-2023 19:01:47 {'BSB_6A'} 0 70 0 0 15-Nov-2023 19:01:57 {'BSB_6A'} 0 70 0 0 15-Nov-2023 19:02:07 {'BSB_6A'} 7 59 0 0 15-Nov-2023 19:02:17 {'BSB_6A'} 10 53 0 0 15-Nov-2023 19:02:27 {'BSB_6A'} 0 55 0 0 15-Nov-2023 19:02:37 {'BSB_6A'} 0 81 0 0 15-Nov-2023 19:02:47 {'BSB_6A'} 0 33 0 0 15-Nov-2023 19:02:57 {'BSB_6A'} 0 39 6 0 15-Nov-2023 19:03:07 {'BSB_6A'} 0 60 66 0 15-Nov-2023 19:03:17 {'BSB_6A'} 0 60 83 0 15-Nov-2023 19:03:27 {'BSB_6A'} 0 59 82 0

DataRevisedTT = table2timetable(Data{2}(:,[1 3 4 5 6])) % Need To Leave Out 'Device' For This to Work

DataRevisedTT = 50317×4 timetable

Calculated time KO1 KO2 WT LDR ____________________ ___ ___ __ ___ 15-Nov-2023 19:00:58 0 12 0 0 15-Nov-2023 19:01:08 0 53 0 0 15-Nov-2023 19:01:18 0 78 0 0 15-Nov-2023 19:01:28 0 83 0 0 15-Nov-2023 19:01:38 0 45 0 0 15-Nov-2023 19:01:47 0 70 0 0 15-Nov-2023 19:01:57 0 70 0 0 15-Nov-2023 19:02:07 7 59 0 0 15-Nov-2023 19:02:17 10 53 0 0 15-Nov-2023 19:02:27 0 55 0 0 15-Nov-2023 19:02:37 0 81 0 0 15-Nov-2023 19:02:47 0 33 0 0 15-Nov-2023 19:02:57 0 39 6 0 15-Nov-2023 19:03:07 0 60 66 0 15-Nov-2023 19:03:17 0 60 83 0 15-Nov-2023 19:03:27 0 59 82 0

% DataRevisedTT.('Calculated time')

DataRevisedTT30 = retime(DataRevisedTT, 'regular', 'mean', 'TimeStep',minutes(30)) % Use 'retime' To Calculate 30-Minute 'mean' Values

DataRevisedTT30 = 289×4 timetable

Calculated time KO1 KO2 WT LDR ____________________ ______ ________ ______ ___ 15-Nov-2023 19:00:00 16.046 57.131 61.857 0 15-Nov-2023 19:30:00 39.322 60.456 60.317 0 15-Nov-2023 20:00:00 39.067 60.972 60.206 0 15-Nov-2023 20:30:00 26.85 61.622 65.95 0 15-Nov-2023 21:00:00 23.8 62.456 73.011 0 15-Nov-2023 21:30:00 20.111 56.239 52.256 0 15-Nov-2023 22:00:00 41.272 53.517 65.856 0 15-Nov-2023 22:30:00 23.8 55.422 66.206 0 15-Nov-2023 23:00:00 27.711 48.967 66.189 0 15-Nov-2023 23:30:00 47.567 44.594 57.467 0 16-Nov-2023 00:00:00 30.083 43.65 60.528 0 16-Nov-2023 00:30:00 36.522 44.494 49.117 0 16-Nov-2023 01:00:00 35.667 39.033 61.156 0 16-Nov-2023 01:30:00 20.161 25.444 46.489 0 16-Nov-2023 02:00:00 7.9111 0.083333 41.811 0 16-Nov-2023 02:30:00 27.933 0.083333 4.1222 0

VN = DataRevisedTT30.Properties.VariableNames; % Variable Names

figure

stairs(DataRevisedTT30.('Calculated time'), DataRevisedTT30{:,1:end})

grid

xlabel('Calculated time')

ylabel('Dependent Variable Value')

legend(VN{:}, 'Location','best')

title(["Entire ‘"+string(filename{2})+" Record: 30-Minute Averages"])

[y,m,d] = ymd(DataRevisedTT30.('Calculated time'));

[G,ID] = findgroups(d);

Days = accumarray(G, (1:numel(G)).', [], @(x){DataRevisedTT30(x,:)}) % 'timetable' Arrays For Each Day

Days = 7×1 cell array

{10×4 timetable} {48×4 timetable} {48×4 timetable} {48×4 timetable} {48×4 timetable} {48×4 timetable} {39×4 timetable}

figure % Plot Daily 'timetable' Arrays

tiledlayout(4,2)

for k = 1:numel(Days)

nexttile

stairs(Days{k}.('Calculated time'), Days{k}{:,1:end})

grid

end

hl = legend(VN{:}, 'NumColumns',2);

nexttile;

Ax = gca;

Ax.Visible = 0;

pos = Ax.Position;

hl.Position = pos;

title(hl, "Common Legend")

sgtitle('Daily 30-Minute Averages')

Having accurate datetime values in the ‘DataRevised.xlsx’ file made all the difference.

Plotting the individual days makes the data a bit easier to visualise.

.

Bradley 2023 年 12 月 18 日

Thank you very much! This works perfectly!

Star Strider 2023 年 12 月 18 日

My pleasure!

If my Answer helped you solve your problem, please Accept it!

.

サインインしてコメントする。

Answer 2

Alexander 2023 年 12 月 13 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2060074-create-30-minute-bins-by-reading-time-stamps#answer_1370764

DataNew.xlsx

Whers is this file Data.xlsx from? I think someone has corrupted the "Calculated time" column. In this column is a mixture of formulars and manual input (see cell A1793:A1795, e.g.). If you copy A3 and paste it to A60099 your timeline is correct your program might run (I'll attach a modified xlsx). Hopefully this was your intension.

2 件のコメント
なしを表示なしを非表示

Bradley 2023 年 12 月 14 日

Hi Alexander,

Thanks for the suggestion but that is exactly the issue, if the data ran sequentially then my analytical code would have worked fine, but alas here we are trying to find another solution to atleast make some use of this data.

Alexander 2023 年 12 月 14 日

Here some code, w/o editing the xlsx:

function readfaultydata

clear; close all

Data = xlsread('Data.xlsx',1,'A2:F60099');

t = Data(:,1);

dt = t(2)-t(1); % Your sample rate

t = (1:length(t))*dt; % Linear timeline

x = Data(:,3);

y = Data(:,4);

z = Data(:,5);

subplot(311);plot(t,x);grid;

subplot(312);plot(t,y);grid;

subplot(313);plot(t,z);grid;

end

Hope it helps.

サインインしてコメントする。

Answer 3

Mathieu NOE 2023 年 12 月 13 日

0
リンク

この回答への直接リンク

https://jp.mathworks.com/matlabcentral/answers/2060074-create-30-minute-bins-by-reading-time-stamps#answer_1370839

MATLAB Online で開く

hello

this is certainly not the best and most modern way to solve your problem , but as I have not really started digging with timetables and alike , so here's a (very) low level approach

someone younger / smarter may come up with a 2 lines solutions, but for the time being this is what I can offer

in very short , I am simply looking where 30 or 00 min appears in your data and the duration is exactly 180 samples (= 30 mins) then this data is averaged and saved

the new time data correspond to the beginning of the 30 min buffer

also I didn't copy paste the header line, I think you could manage that by yourself...

result should look like :

code

y = xlsread('Data.xlsx');
%% convert y firt column (time HH:MM:SS) to hours (h), minutes (m) and seconds (s)
h = y(:,1)*24;
m = 60*(h - floor(h));
s = 60*(m - floor(m));
h = floor(h);
m = floor(m);
s = round(s);
id1 = (m ==30); % find time index matching the 30 min timestamp
[begin1,ends1] = find_start_end_group(id1);
id2 = (m ==00); % find time index matching the 00 min timestamp
[begin2,ends2] = find_start_end_group(id2);
begin = sort([begin1;begin2]); % concat all index corresponding to 00 or 30 min time stamps
%% main loop 
k = 0;
for ci = 1:numel(begin)-1
    ind_start = begin(ci);
    ind_stop = begin(ci+1);
    samples = ind_stop - ind_start;
    if samples == 180 % only valid segments are processed
        k = k+1;
        data = y(ind_start:ind_stop,2:end);
        mean_data{k,1} = mean(data,'omitnan');
        date{k,1} = [sprintf('%02d', h(ind_start)) ':' sprintf('%02d', m(ind_start)) ':' sprintf('%02d', s(ind_start))]; % a very crude method to create the time string 
    end
end
% export the data
out = [date mean_data];
writecell(out,'toto.xlsx');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [begin,ends] = find_start_end_group(ind)
% This locates the beginning /ending points of data groups
% Important : ind must be a LOGICAL array
    D = diff([0;ind(:);0]);
    begin = find(D == 1);
    ends = find(D == -1) - 1;
end

4 件のコメント
2 件の古いコメントを表示2 件の古いコメントを非表示

Bradley 2023 年 12 月 19 日

Thanks Mathieu. If you see above Star Strider has answered this in a more succint way.

Mathieu NOE 2023 年 12 月 19 日

I know I would not win this time (again !)

サインインしてコメントする。