Different performances of STL decomposition in MATLAB and Python
I used the trenddecomp function in MATLAB and the STL function in Python to decompose a time series, and the results from the two are quite different. Is something different in how the two functions process the data?
Here are the scripts.
MATLAB
data = readtable('stldata.txt');
index = data.index;
datas = data.data;
[LT,ST,R] = trenddecomp(datas,'stl',12);
figure(1);
plot(index,LT)
hold on
plot(index,ST)
plot(index,R)
plot(index,datas)
legend('Long term','Seasonal','Residual','Original')
Python
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import STL
file_path = './stldata.txt'
data = pd.read_csv(file_path, delim_whitespace=True)
stl = STL(data['data'],period=12)
result = stl.fit()
fig = result.plot()
fig.set_size_inches(10, 6)
plt.show()
Answers (1)
Pavl M.
19 Nov 2024, 16:37
Edited: Pavl M.
about 3 hours ago
Are they really that different? Your plots look fine to me: the curves are not very different. Either rescale the y-axis in Python, or plot the three or four curves in one figure in different colours, as in MATLAB, and compare.
The differences you perceive are mainly due to the two plot layouts: MATLAB draws all curves in one axes, while Python draws a separate subplot for each curve, each with its own y-axis scaling. A further difference is that the Python STL(...) algorithm applied less low-pass filtering with its moving-average filtering (less rejection of high-frequency components, i.e. a higher cut-off frequency) than MATLAB's internal STL implementation, whose output is smoother (more high-frequency components rejected, i.e. a lower cut-off frequency).
The differences in this question come down to: 1) how your input data is sampled (spaced), which needs to be uniform for MATLAB; 2) there may be two seasonal trends, and you have found only one so far; 3) the scaling/zoom of the plots. For Python matplotlib.pyplot, try a different plot initialization, fig, ax = plt.subplots(layout='constrained'), and adjust the vertical y-axis size and scaling with fig.set_size_inches(), a subplot mosaic and ax.set_yscale(...).
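To check point 1), a quick test of whether the index is uniformly spaced can be sketched as below. This is a minimal sketch: the helper name is_uniformly_spaced and the sample lists are made up for illustration, and it assumes the index is a plain list of numbers.

```python
import math

def is_uniformly_spaced(index, rel_tol=1e-9):
    """Return True if consecutive index values are equally spaced."""
    if len(index) < 3:
        return True  # fewer than two steps: nothing to compare
    steps = [b - a for a, b in zip(index, index[1:])]
    first = steps[0]
    return all(
        math.isclose(step, first, rel_tol=rel_tol, abs_tol=1e-12)
        for step in steps
    )

regular = [0, 1, 2, 3, 4]
gappy = [0, 1, 2, 4, 5]  # one sample is missing between 2 and 4

print(is_uniformly_spaced(regular))  # True
print(is_uniformly_spaced(gappy))    # False
```

If the check fails, the series would need to be resampled or the gaps filled before either tool can treat a fixed period of 12 samples as one season.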
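Should I look up the original STL and SSA algorithm implementations in MATLAB and Python for comparison? Are they implemented in MATLAB as Fortran or C++ code, or as prebuilt libraries?
In Python, STL decomposition is implemented with LOESS smoothing and a moving-average low-pass filter; the decomposition based purely on moving averages and a convolution filter is the separate seasonal_decompose function.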
What about Matlab?
Hope this will help.
Can you accept my answer?
3 Comments
Pavl M.
about 2 hours ago
Edited: Pavl M.
about 1 hour ago
Please see the updates I have made to the answer:
I meant that the differences between the visualizations of the two LOESS-based STL implementations are relatively minor, not drastic.
I found the following Python argument settings, which give results closer to MATLAB's:
stl = STL(data['data'],period=12, seasonal=9, trend=None, low_pass=19, seasonal_deg=1, trend_deg=1, low_pass_deg=1, robust=True, seasonal_jump=2, trend_jump=2, low_pass_jump=2)
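or you can use the more advanced MSTL:
from statsmodels.tsa.seasonal import MSTL
stl_kwargs = {"seasonal_deg": 0}
# the periods below suit hourly data (daily, weekly, ...); adapt them to your sampling
model = MSTL(data['data'], periods=(24, 24 * 7, 24*7*4, 24*7*4*30), stl_kwargs=stl_kwargs)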
So the differences you perceive are mainly due to two reasons.
First:
The MATLAB and Python plot layouts differ: MATLAB draws all curves in one axes, while Python draws a separate subplot for each curve, each with its own y-axis scaling, which can be adjusted.
Second:
Both MATLAB and Python rely on some Fortran routines for the STL algorithm.
Python uses:
the NETLIB Fortran code written by [1]. The original code contains a bug in the determination of the median that is used in the robust weighting. The Python version matches the fixed version, which uses a correct partitioned sort to determine the median.
References
[1] R. B. Cleveland, W. S. Cleveland, J. E. McRae, and I. Terpenning (1990). STL: A Seasonal-Trend Decomposition Procedure Based on LOESS. Journal of Official Statistics, 6, 3-73.
Which Fortran library MATLAB uses for the same purpose is left as a subject for separate research.
In addition, the Python STL(...) algorithm applied less low-pass filtering with its moving-average filtering (less rejection of high-frequency components, i.e. a higher cut-off frequency) than MATLAB's internal STL implementation, whose output is smoother (more high-frequency components rejected, i.e. a lower cut-off frequency).
This can be adjusted with the following arguments of the Python STL function:
low_pass: Length of the low-pass filter. Must be an odd integer >=3. If not provided, uses the smallest odd integer > period.
low_pass_deg: Degree of low pass LOESS. 0 (constant) or 1 (constant and trend).
robust: Flag indicating whether to use a weighted version that is robust to some forms of outliers.
seasonal_jump: Positive integer determining the linear interpolation step. If larger than 1, the LOESS is used every seasonal_jump points and values between the fitted points are linearly interpolated. Higher values reduce estimation time.
trend_jump: Positive integer determining the linear interpolation step. If larger than 1, the LOESS is used every trend_jump points and values between the two are linearly interpolated. Higher values reduce estimation time.
low_pass_jump: Positive integer determining the linear interpolation step. If larger than 1, the LOESS is used every low_pass_jump points and values between the two are linearly interpolated. Higher values reduce estimation time.
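Why a longer filter length gives a smoother result with a lower cut-off frequency can be illustrated with a plain centered moving average. This is only a sketch of the general low-pass idea: STL itself combines moving averages with LOESS, and the test signal, the slope constant and the helper names here are made up.

```python
import math

SLOPE = 0.05  # slope of the made-up linear trend

def moving_average(values, window):
    """Centered moving average with an odd window; edge points are dropped."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]

def ripple(values, window):
    """Peak-to-peak oscillation left after smoothing and removing the trend."""
    half = window // 2
    smooth = moving_average(values, window)
    # A centered average of a straight line returns the line itself,
    # so subtracting the line leaves only the filtered oscillation.
    leftover = [s - SLOPE * (i + half) for i, s in enumerate(smooth)]
    return max(leftover) - min(leftover)

# Slow linear trend plus an oscillation with a period of 12 samples.
signal = [SLOPE * t + math.sin(2 * math.pi * t / 12) for t in range(120)]

ripple_short = ripple(signal, 3)   # short window: most of the oscillation passes
ripple_long = ripple(signal, 11)   # long window: most of the oscillation is rejected
print(ripple_short, ripple_long)
```

The longer window leaves a much smaller ripple, which is the sense in which a larger smoothing length means more rejection of high-frequency components.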
In Python, use the following template to make the plot match MATLAB's single-axes layout:
...
result = stl.fit()
# set_color_cycle was removed from matplotlib; set_prop_cycle replaces it
plt.gca().set_prop_cycle(color=['red', 'green', 'blue', 'yellow', 'black'])
plt.plot(data['data'])
plt.plot(result.trend)
plt.plot(result.seasonal)
plt.plot(result.resid)
plt.plot(result.observed)
plt.show()
You can also forecast with it as follows:
# For univariate time series:
from statsmodels.tsa.statespace.sarimax import SARIMAX
# For multivariate analysis:
# Vector Autoregressive Moving Average with eXogenous regressors model
from statsmodels.tsa.statespace.varmax import VARMAX
from statsmodels.tsa.forecasting.stl import STLForecast
# pass the data column and the period explicitly
stlf = STLForecast(data['data'], SARIMAX, period=12)
stlf_res = stlf.fit()
horiz = 3
forecast = stlf_res.forecast(horiz)
plt.plot(forecast)
plt.show()
In MATLAB, plot the input and each component in its own subplot:
data = readtable('https://uk.mathworks.com/matlabcentral/answers/uploaded_files/1809698/stldata.txt');
index = data.index;
datas = data.data;
per1 = 12;
[LT,ST,R] = trenddecomp(datas,'stl',per1);
figure;
subplot(4, 1, 1);
plot(index, datas, 'r');
title('Input');
xlabel('time');
ylabel('Val[units]');
grid on;
subplot(4, 1, 2);
plot(index, LT, 'b');
title('LongTermTrend');
xlabel('time');
ylabel('Val[units]');
grid on;
subplot(4, 1, 3);
plot(index, ST, 'g');
title('SeasonalPeriodic12');
xlabel('time');
ylabel('Val[units]');
grid on;
subplot(4, 1, 4);
plot(index, R, 'm');
title('Residual');
xlabel('time');
ylabel('Val[units]');
grid on;