How to compare data series?

Question


Suppose I'm running five time-domain simulations, each with a different coefficient value (say beta = [0.2, 0.3, 0.4, 0.5, 0.6]). For each beta value, I obtain a complete set of results as time series (say, for one beta value, results = [speed, angle, wind force, heat]). I can then plot each of those results against time. The goal is to identify the effect of beta on each simulated parameter.

I can plot each result for each beta against time to compare them qualitatively. The problem is that minute deviations cannot be seen clearly in this type of plot.

I read about Dynamic Time Warping (DTW), but it is a bit difficult to wrap my head around. Is there a simpler method that one can use to analyse this type of time series data?


Answer 1

William Rose, 10 May 2024


@Jake,

It is hard to say without having the actual data.

If you restrict your analysis to one variable at a time, then you may want to plot "deviation from the mean" as a function of time for the different values of beta, where "the mean" is the mean at each instant over all values of beta examined. This could reveal features that might be hard to see otherwise.
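The "deviation from the mean" idea can be sketched as follows. This is only an illustration under stated assumptions: the traces are synthetic stand-ins (not the real simulation results), all five runs are assumed to share one time vector, and the names x, xMeanAcrossBeta and xDev are hypothetical.

```matlab
% Synthetic stand-in data (hypothetical): five traces of ONE variable,
% one row per beta value, all sampled on the same time vector.
beta = [0.2 0.3 0.4 0.5 0.6];
t = 0:0.1:200;                                    % assumed common time base
x = zeros(numel(beta), numel(t));
for k = 1:numel(beta)
    x(k,:) = sin(2*pi*0.09*t) + beta(k)*t/200;    % made-up signal, not real results
end

% Mean over the five beta values at each instant (1-by-N row vector)
xMeanAcrossBeta = mean(x, 1);

% Deviation of each trace from that instantaneous mean.
% Uses implicit expansion (R2016b or later); on older releases,
% bsxfun(@minus, x, xMeanAcrossBeta) does the same job.
xDev = x - xMeanAcrossBeta;

% One line per beta value
plot(t, xDev)
xlabel('Time'); ylabel('Deviation from mean across \beta')
legend(arrayfun(@(b) sprintf('\\beta=%.1f', b), beta, 'UniformOutput', false))
```

Because the common oscillation is subtracted out, small beta-dependent differences fill the vertical range of the plot instead of being hidden under the large shared signal.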

If you want to analyze the effects of beta on all four variables simultaneously, you may think of your system as evolving in a four-dimensional space over time (time would be a fifth dimension). Since the four variables have different units, you may want to remove the mean from each and normalize each variable by its standard deviation, in order to have a dimensionless "z-score" for each variable as a function of time.
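A minimal sketch of the z-score step is shown here. It makes several assumptions that are not in the post: the data are synthetic, each run is stored as an N-by-4 matrix with columns [speed, angle, wind force, heat], and the mean and standard deviation are pooled over all five runs so that beta-dependent offsets survive the normalization. Normalizing each run separately is an equally possible reading of the suggestion.

```matlab
beta = [0.2 0.3 0.4 0.5 0.6];
t = (0:0.1:200).';                       % assumed common time base (column vector)

% Hypothetical results: one N-by-4 matrix per beta value,
% columns = speed, angle, wind force, heat (made-up values)
runs = cell(1, numel(beta));
for k = 1:numel(beta)
    runs{k} = [10 + beta(k)*sin(0.5*t), ...
               0.1*cos(0.5*t) + beta(k), ...
               500 + 50*beta(k)*sin(0.3*t), ...
               20 + 0.01*beta(k)*t];
end

% Pooled statistics of each variable over all runs and all times
allRuns = vertcat(runs{:});              % (5*N)-by-4
mu    = mean(allRuns, 1);
sigma = std(allRuns, 0, 1);

% Dimensionless z-scores for each run, same size as the original matrices
Z = cell(size(runs));
for k = 1:numel(runs)
    Z{k} = (runs{k} - mu) ./ sigma;
end
```

After this step all four variables are on a comparable scale, so distances between runs in the four-dimensional space are not dominated by whichever variable has the largest numeric values.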


William Rose, 10 May 2024


sampleData.mat

@Jake,

load('sampleData');   % provides time and p1..p5 (row vectors)
beta=[.2,.3,.4,.5,.6];
p=[p1;p2;p3;p4;p5];   % one row per beta value
pzm=p-mean(p);        % deviation from the mean across beta at each instant
fs=(length(time)-1)/(time(end)-time(1)); % sampling rate
% Find peaks in each trace:
% For p1, p2: Find peak heights and locations, to make an illustrative plot.
% For p3, p4, p5: Find locs only, since only need locs to compute instantaneous freq.
% With fs supplied, findpeaks returns locations in time units, measured from the first sample.
[pks1,locs1]=findpeaks(p1,fs);
[pks2,locs2]=findpeaks(p2,fs);
[~,locs3]=findpeaks(p3,fs);
[~,locs4]=findpeaks(p4,fs);
[~,locs5]=findpeaks(p5,fs);
% compute instantaneous frequency (reciprocal of the time between successive peaks)
instFreq={1./diff(locs1); 1./diff(locs2); 1./diff(locs3); 1./diff(locs4); 1./diff(locs5)};
% Next: time associated with each estimate of instFreq
tInstFreq={locs1(2:end);locs2(2:end);locs3(2:end);locs4(2:end);locs5(2:end)};
% plot results
figure
subplot(311)
plot(time,p1,'-r',time,p2,'-g',time,p3,'-b',time,p4,'-c',time,p5,'-m')
title('Raw p(t)');
for i=1:5, legstr{i}=sprintf('b=%.1f',beta(i)); end
legend(legstr)
subplot(312)
plot(time,p1,'-r',locs1,pks1,'r*',time,p2,'-b',locs2,pks2,'bx')
legend('p1','p1 peaks','p2','p2 peaks')
%plot(time,pzm(1,:),'-r',time,pzm(2,:),'-g',time,pzm(3,:),'-b',time,pzm(4,:),'-c',time,pzm(5,:),'-m')
%legend('\beta=1','2','3','4','5')
title('p(t) with peaks')
subplot(313)
plot(tInstFreq{1},instFreq{1},'-r.',tInstFreq{2},instFreq{2},'-g.',tInstFreq{3},instFreq{3},'-b.',...
    tInstFreq{4},instFreq{4},'-c.',tInstFreq{5},instFreq{5},'-m.')
legend(legstr)
title('Instantaneous Frequency'); xlabel('Time');

The middle plot above shows that findpeaks() is working as we hope it will. I have defined instantaneous frequency as the reciprocal of the time between successive peaks. The plot of instantaneous frequency versus time confirms what I said in my earlier post: instFreq is initially the same for all values of beta, but then instFreq diverges, with instFreq being higher when beta is smaller. The plot also shows that instFreq oscillates slowly.

With appropriate smoothing, you will be able to show how the mean value of p(t) (mean over approximately one cycle) is different for different values of beta.

William Rose, 10 May 2024


sampleData.mat

@Jake,

Here are plots which show more about how p(t) is affected by the value of beta.

load('sampleData');   % provides time and p1..p5 (row vectors)
beta=[.2,.3,.4,.5,.6];
% Compute smoothed versions of p
% smooth() is in Curve Fitting Toolbox; its default method is a moving average.
% It returns column vectors, so ps is N-by-5 (one column per beta value).
ps=[smooth(p1,220),smooth(p2,220),smooth(p3,220),smooth(p4,220),smooth(p5,220)];
% Compute pzm=p_zeromean and smoothed version of pzm
p=[p1;p2;p3;p4;p5];
pzm=p-mean(p);        % subtract the mean across beta at each instant
pzms=[smooth(pzm(1,:),220),smooth(pzm(2,:),220),smooth(pzm(3,:),220),...
    smooth(pzm(4,:),220),smooth(pzm(5,:),220)];
% plot results
figure
subplot(211)
plot(time,p1,'-r',time,p2,'-g',time,p3,'-b',time,p4,'-c',time,p5,'-m')
for i=1:5, legstr{i}=sprintf('b=%.1f',beta(i)); end
legend(legstr,Location='southwest'); title('Raw p(t)')
subplot(212)
plot(time,ps(:,1),'-r',time,ps(:,2),'-g',time,ps(:,3),'-b',time,ps(:,4),'-c',time,ps(:,5),'-m')
legend(legstr,Location='southwest'); title('Smoothed p(t)'); xlabel('Time')
figure
subplot(211)
plot(time,pzm(1,:),'-r',time,pzm(2,:),'-g',time,pzm(3,:),'-b',time,pzm(4,:),'-c',time,pzm(5,:),'-m')
legend(legstr,Location='southwest'); title('p_{zm}(t)');
subplot(212)
plot(time,pzms(:,1),'-r',time,pzms(:,2),'-g',time,pzms(:,3),'-b',time,pzms(:,4),'-c',time,pzms(:,5),'-m')
legend(legstr,Location='southwest'); title('Smoothed p_{zm}(t)'); xlabel('Time');

The code above uses smooth() with a width of 220 points. I chose this width because it is about two cycles long, so it does a moving average over approximately two cycles of data. The third plot in the previous post showed that the mean frequency (mean across all times and across all five values of beta) is in the ballpark of 0.09, which means the duration of one cycle is about 11, and two cycles is about 22. The sampling rate is 10, so that is 220 points per two cycles, which is only an approximate value. You could try to get fancier, for example, by taking the mean value between successive peaks on each separate trace.
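The "fancier" option mentioned above, averaging p between successive peaks, might look like the sketch below. It is not the code used for the plots in this thread: the signal is synthetic, the names winPts, cycleMean and cycleTime are hypothetical, and findpeaks requires Signal Processing Toolbox. The first lines also redo the window-width arithmetic.

```matlab
% Synthetic stand-in for one trace (hypothetical): oscillation near 0.09 with slow drift
fs = 10;                                   % samples per unit time, as stated in the post
t  = 0:1/fs:200;
p  = 0.5*sin(2*pi*0.09*t) + 0.002*t;       % made-up signal, not the real data

% Window arithmetic: 2 cycles * (1/0.09 time units per cycle) * fs samples per time unit
winPts = round(2 * (1/0.09) * fs);         % close to the 220 points used in the post

% Peak indices (sample numbers, since no sampling rate is passed here)
[~, idx] = findpeaks(p);

% Mean of p over each interval between successive peaks
nCycles   = numel(idx) - 1;
cycleMean = zeros(1, nCycles);
cycleTime = zeros(1, nCycles);
for k = 1:nCycles
    seg = idx(k):(idx(k+1) - 1);           % one cycle: from a peak to just before the next
    cycleMean(k) = mean(p(seg));
    cycleTime(k) = mean(t(seg));           % mid-cycle time, used as the x coordinate
end

plot(cycleTime, cycleMean, '-o')
xlabel('Time'); ylabel('Cycle mean of p')
```

Unlike a fixed-width moving average, this adapts to the oscillation period of each trace, at the cost of producing only one value per cycle.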

The top figure shows that the smoothed p(t) traces are together initially, then diverge, with smoothed p(t) being higher when beta is greater. The bottom figure shows the same thing, but the differences are more obvious than in the top figure, because the bottom figure shows the zero-mean version of p. In both figures, the smoothing is not perfect, because the width of the smoothing window does not exactly equal the oscillation period, which varies over time and from trace to trace. The smoothed signals are less smooth as time approaches 200, because the width of the smoothing window decreases at the edge.
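The edge effect can be made concrete with a small sketch. Note the substitution: it uses movmean (base MATLAB, R2016a or later) rather than smooth(), so the exact edge handling may differ from the plots above. The signal and the names w, half and nInWindow are hypothetical.

```matlab
fs = 10;
t  = 0:1/fs:200;
p  = sin(2*pi*0.09*t) + 0.002*t;     % made-up signal, not the real data

w = 221;                             % odd window length, close to the 220 points used above
pSmooth = movmean(p, w);             % default 'Endpoints','shrink' truncates the window at the edges

% Number of samples actually averaged at each index
half = (w - 1)/2;
n = numel(p);
nInWindow = min((1:n) + half, n) - max((1:n) - half, 1) + 1;

subplot(211); plot(t, p, t, pSmooth); legend('raw', 'moving average')
subplot(212); plot(t, nInWindow); xlabel('Time'); ylabel('Samples in window')
```

Near both ends the window covers only about one cycle instead of two, so the oscillation is no longer averaged out as well there.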

William Rose, 10 May 2024

@Jake,

You wrote: "this is very nice, and I can understand the differences of the approach. One question though, in the middle plot (p(t) with peaks vs Time), you have chosen p1 and p2 and not p1,p2,p3,... (all). Was there a specific reason for this, or did you simply chose 2 to convey that findpeaks() work in this context?"

Yes, I only showed p1 and p2, to show that findpeaks() is working in a reasonable way. If the data were not so smooth, then findpeaks() would probably find spurious peaks.

And you wrote: "I'm not sure if I understood what you meant by the last sentence ("With appropriate smoothing, you will be able to show how the mean value of p(t) (mean over approximately one cycle) is different for different values of beta.") though."

See my recent comment, which demonstrates smoothing by finding the moving average. I ended up using a moving average width of approximately two cycles, rather than the one cycle I had originally suggested.
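On the remark about spurious peaks: if the traces were noisy, one possible remedy (not needed for the data in this thread) is to constrain findpeaks with its 'MinPeakProminence' and 'MinPeakDistance' options. The sketch below uses a synthetic noisy signal, and the threshold values 0.5 and 5 are assumptions chosen for that signal only.

```matlab
fs = 10;
t  = 0:1/fs:200;
rng(0);                                              % repeatable noise
pNoisy = sin(2*pi*0.09*t) + 0.05*randn(size(t));     % made-up noisy trace

% Unconstrained: every noise-induced local maximum counts as a peak
[~, locsRaw] = findpeaks(pNoisy, fs);

% Constrained: a peak must stand out by at least 0.5 and be at least
% 5 time units from any larger peak (the period here is about 11)
[~, locsOK] = findpeaks(pNoisy, fs, 'MinPeakProminence', 0.5, 'MinPeakDistance', 5);

fprintf('Peaks found: %d unconstrained, %d constrained\n', numel(locsRaw), numel(locsOK));

% Instantaneous frequency from the cleaned-up peak times
instFreqOK = 1 ./ diff(locsOK);
```

With the constraints in place, the reciprocal of the peak-to-peak interval should again sit near the true oscillation frequency rather than being dominated by noise.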
