I have timeseries data in an array which I want to compare in order to build clusters of similar time series.
Generate sample data using the following piece of code:
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
Here we have 7 time series, where each row represents a time series and each column represents a timestamp.
First I compute the Euclidean distance between every pair of time series generated above. This can be done with:
distance = squareform(pdist(timeseries));
From the above distance matrix we can find the unique distances with the code below:
unique_distances = unique(distance);
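I want to create an n-by-m matrix, where n is the number of time series (i.e. 7) and m is the number of unique distances (i.e. 8).
In this matrix the rows t1, t2, ... represent time series 1, 2 and so on, and the columns correspond to the unique distances.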
The entry in the first row and first column of the matrix shows how many time series have zero distance to the first time series, and so on.
The entry in the first row and second column shows how many time series have a distance of 1 to the first time series, and so on.
I am new to MATLAB. I got the desired result using the code below:
dist = nan(size(timeseries, 1), size(unique_distances, 1));
for i = 1:size(timeseries, 1)
    for j = 1:size(unique_distances, 1)
        % count the series whose distance to series i equals the j-th unique distance
        dist(i, j) = sum(distance(i, :) == unique_distances(j));
    end
end
I am looking for a vectorised approach for the above code.
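One vectorised form that might work, as a sketch I have not benchmarked: compare the distance matrix against every unique distance at once with `bsxfun`, then sum along the columns. The names `targets`, `matches` and `counts` are placeholders of my own, and `pdist`/`squareform` need the Statistics and Machine Learning Toolbox, as in the code above.

```matlab
% Sample data: one time series per row
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
distance = squareform(pdist(timeseries));   % n-by-n Euclidean distance matrix
unique_distances = unique(distance);        % m-by-1 column, ascending

% Move the unique distances into the third dimension (1-by-1-by-m)
targets = permute(unique_distances, [3 2 1]);

% n-by-n-by-m logical array: page k marks entries equal to unique_distances(k)
matches = bsxfun(@eq, distance, targets);

% Count matches along each row, then drop the singleton dimension (n-by-m)
counts = squeeze(sum(matches, 2));
```

On R2016b or later, implicit expansion should allow `distance == targets` in place of the `bsxfun` call. The intermediate array has n*n*m elements, so for a large number of series the loop may well use less memory.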
I also need to cluster around the time series that has zero distance to the largest number of other time series, so I need to sort the matrix on that count as well. In this example it is already sorted: t1 has a distance of zero to 3 time series, as can be seen from the matrix, and 3 is also the maximum value.
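For the sorting step, a minimal sketch, assuming the first column of the count matrix holds the zero-distance counts (which should hold here, since `unique` returns its values in ascending order and every series has distance 0 to itself). The names `order` and `sorted_dist` are placeholders of my own.

```matlab
% Sample data and distances, as above
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
distance = squareform(pdist(timeseries));
unique_distances = unique(distance);

% Count matrix, built with the same loop as above
dist = nan(size(timeseries, 1), size(unique_distances, 1));
for i = 1:size(timeseries, 1)
    for j = 1:size(unique_distances, 1)
        dist(i, j) = sum(distance(i, :) == unique_distances(j));
    end
end

% Sort the rows by the zero-distance count (column 1), largest first
[~, order] = sort(dist(:, 1), 'descend');
sorted_dist = dist(order, :);
% order(k) is the original index of the time series now in row k
```

Keeping `order` around lets the sorted rows be mapped back to the original series; `sortrows(dist, -1)` would be a shorter alternative if only the sorted matrix is needed.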