File Exchange

image thumbnail

Inverse Short-Time Fourier Transform (ISTFT) with Matlab

version 5.0.0.0 (383 KB) by Hristo Zhivomirov
Time-Frequency resynthesis via Inverse Short-Time Fourier Transform (ISTFT).

56 Downloads

Updated 28 Feb 2019

View License

The present code is a Matlab function that provides an Inverse Short-Time Fourier Transform (ISTFT) of a given spectrogram STFT(k, l) with time across columns and frequency across rows. The output of the function is:
1) the reconstructed signal in time domain;
2) a time vector.
For convenience, the input and output arguments are given in the beginning of the function

An example is given in order to clarify the usage of the function. It represents the time-frequency analysis and then the perfect reconstruction of a given signal x[n], using Blackman and Hamming windows for the analysis and resynthesis, respectively. Also, a GUI named OLAExam is supplied to represent the perfect reconstruction via overlap-add (OLA) method and to assist the choice of the analysis/resynthesis window(s) (their length and hop size).

The code is based on the theory described in:

[1] H. Zhivomirov. On the Development of STFT-analysis and ISTFT-synthesis Routines and their Practical Implementation. TEM Journal, ISSN: 2217-8309, DOI: 10.18421/TEM81-07, Vol. 8, No. 1, pp. 56-64, Feb. 2019. (http://www.temjournal.com/content/81/TEMJournalFebruary2019_56_64.pdf)

Cite As

H. Zhivomirov. On the Development of STFT-analysis and ISTFT-synthesis Routines and their Practical Implementation. TEM Journal, ISSN: 2217-8309, DOI: 10.18421/TEM81-07, Vol. 8, No. 1, pp. 56-64, Feb. 2019. (http://www.temjournal.com/content/81/TEMJournalFebruary2019_56_64.pdf)

Hristo Zhivomirov (2019). Inverse Short-Time Fourier Transform (ISTFT) with Matlab (https://www.mathworks.com/matlabcentral/fileexchange/45577-inverse-short-time-fourier-transform-istft-with-matlab), MATLAB Central File Exchange. Retrieved .

Comments and Ratings (39)

Thanks a lot for sharing, very very helpful XD

Works perfectly, thank you very much !

I wonder why you use OLA but not griffin lim algorithm to reconstruct the signal?

Hello Raden! Thank you for your interest in my submission! About the issue, at both the beginning and the end of the reconstructed signal the OLA-constraint is not fulfilled because there is not enough overlapped window segments. One way to address this is to pad appropriate number of zeros the beginning and the end of the original signal and then to remove the same number of zeros at the both ends of the reconstructed signal.

All best,
Hristo Zhivomirov

Raden

Thank you for your submission Hristo,
I have a problem, if the beginning and the end of the signal is 0 the reconstructed signal is perfect. But if the boundaries are not 0 the code converts them to 0 (so the beginning and the end of the signal looked like a cone). How can i fix this, I've tried changing the window and hop size?

Thanks, great implementation of the overlap and add method.

tormii

Huan Zhong

Hello Benjamin,
In the real-world time-frequency analysis and resynthesis two types of windows are used. The STFT analysis window is chosen based on its frequency resolution and sidelobe behavior. The ISTFT synthesis window minimizes the reconstruction artifacts (if any) via time domain smoothing. Both windows should be carefully selected in order to provide perfect reconstruction in conjunction with each other. In cases where modifications of the STFT are carried out prior to the synthesis, the synthesis window is a must. In my ISTFT function I use the Hamming window as a synthesis window (and also for the analysis in my STFT function). Please carefully check [1] Eqs. (12.8) – (12.11) and [3] Eqs. (5.5) – (5.8).

Best regards,
Hristo Zhivomirov

Hello Hristo,

Thank you for making the STFT and ISTFT functions, but upon reading the "Handbook of Speech Processing" and checking the code, I believe there may be a slight mistake in the istft.m function. Please Correct me if I'm wrong.

In the STFT a Hamming window is used. Let's say the window length is 1024 samples and the hop size is wlen/4, as in your example. The sum of the hamming windows in time make a constant (besides the very beginning and very end). In the STFT code, a window w[n] is applied at each frame index l, to give formula 12.1 x_l[n]=w[n].*x[n+lL] 0<=n<=N-1.

When performing the ifft at each time frame l , this should then return x_l[n]=w[n].*x[n+lL] ;or with respect to global time t=n+lL:
x_l[t-lL]=w[t-lL]x[t] as in formula 12.7.

As the sum of the windows w[t-lL] in time is a constant, reconstruction should be given by simply adding each signal frame at the correct location in time after the ifft, then dividing by the constant originally given by the addition of the hamming windows in the stft. This constant is the synthesis window v[t-lL]. This gives the reconstructed signal as in 12.8.

On lines 42 and 58 of the istft.m code however, you have multiplied by an additional hamming window. It seems from the Handbook of Speech Processing and your stft.m code that the ".*win" should be removed, and W0 should also be changed.

Do you agree? If not, please tell me where I have gone wrong.

Thank you very much again for writing the code.

rabbasi

I checked my input to the ISTFT. you know, I filtered the data, so the frequency dimension in changed. that is why it gives me that error. when I changed the input to the main STFT output, everything goes through.

but now,my problem is that,I should denoise tihs data in time-frequency scale, and then return them to the time scale. what can I do????

rabbasi

I mean I should convert this frequency dimension data to time scale. what can I do with this error?

rabbasi

Dear Hristo Zhivomirov,
I have the same problem as "Paul C".
I used the same features that I used in STFT.
I have a signal with 498900 samples, fs=250KHZ, h=125,wlen=nfft=625.
and as I said I used these parameters again for ISTFT, but I got the error "Matrix dimensions must agree."
as I checked, that is because of this multiplication:
(xprim.*win)
and also I have another problem, that the length of main signal is not equal to the length, that is produced in the first line of ISTFT code.
xlen = nfft + (coln-1)*h;
Thanks in advance for taking your time

Hello Hristo,
Thanks for your explanation and the reference was really helpful. If the window chosen as follows, then it satisfies OLA constraint for hop=wlen/2 too [chapter 12 of Springer handbook]
w=hamming(wlen, 'periodic');
K=sum(w)/h;
w=sqrt(w/K);

Could you please give me a reference which explains the effect of time domain interpolation on resynthesis (one of the things you explained in the comment below)?

Thanks,
Pramod

Hi Pramod, thanks for your interest in my submission! A fulfillment of the OLA constraint is a must in order to achieve a perfect reconstruction of the signal in the time domain. In practice, the STFT analysis window is chosen based on its frequency resolution and sidelobe behavior and the ISTFT synthesis window is selected to provide perfect reconstruction in conjunction with the analysis window. In cases where modifications of the STFT are carried out prior to synthesis, the synthesis window minimizes the reconstruction artifacts. In my ISTFT function I used the Hamming window as a synthesis window (and also for the analysis). I believe that you have some issues because perhaps you use different analysis/synthesis windows that not meet OLA constraint. For further reading, please check the Springer Handbook of Speech Processing: Chapter 12.

All best,
Hristo Zhivomirov

Hi Hendrik, thank you for your interest! My STFT function programmatically is not the same as a Matlab spectrogram function, but gives an identical result. Take a look at https://www.mathworks.com/matlabcentral/fileexchange/45197-short-time-fourier-transformation--stft--with-matlab-implementation?requestedDomain=www.mathworks.com

Regards,
Hristo Zhivomirov

Hi Hristo,
Is this code (stft) same with spectrogram function in Matlab?
Thx

Hello Hristo, thanks for your codes.
When I observe and listen to the waveform reconstructed using weighted OLA, its not as good as the original wavfile. But when I remove the window in istft.m (x((b+1):(b+nfft)) = x((b+1):(b+nfft)) + (xprim)';), then the reconstruction is exactly same as the original (both waveform and quality). So my question is why you have used the weighted OLA for reconstruction?

J.R

Thanks Hristo!But if I hope that the x is a complex signal,how should I do to recover the complex signal from s with ISTFT? Please give me some ideas,thank you!

Hi J.R! Thank you for the comment and for your interest in my submission.

1) choose h = wlen/(4*n), where n = 1, 2, 3, ... and also nfft = wlen.
2) the function assumes that the x is a real signal, not complex.

Regards,
Hristo Zhivomirov

adia woo

hi
i was trying to get time-freq map first ,and the main frequenccy of the sigal is about 900MHz ,but it is about 1300MHz in the result,

J.R

Hi Hristo!
1、why i need 130*x_istft to get the correct signal
2、when x is complex-value(% x = hilbert(x);),but x_istft is not correct and x_istft is real-value

clear, clc, close all
% define signal parameters
% sine-wave signal (stationary signal)

fs = 10*10^6;
t = 0:1/fs:4999/fs;
x = 10*sin(2*pi*t*100000);
% x = hilbert(x);

% define analysis and synthesis parameters
wlen = 256;
h = 8;
nfft = 4095;
% perform analysis and resynthesis
[s, f, t_stft] = stft(x, wlen, h, nfft, fs);
[x_istft, t_istft] = istft(s, h, nfft, fs);
% plot the original signal
figure(1)
plot(t, x, 'b')
grid on
axis([0 0.0005 -15 15])
set(gca, 'FontName', 'Times New Roman', 'FontSize', 14)
xlabel('Time, s')
ylabel('Amplitude, V')
title('Original and reconstructed signal')

% plot the resynthesized signal
hold on
plot(t_istft, 130*x_istft, '-.r')
legend('Original signal', 'Reconstructed signal')

Hi Paul C! Thanks for the interest! It's hard to adrress the problem from distance, but I'll try :) First of all: check your STFT-matrix - it must contain only unique fft points with time across columns and frequency across rows. Furthermore, the h and nfft parameters must be the same as in STFT-analysis.

Paul C

Hi Hristo
I'm trying to use istft but it keeps giving me:
Error using .*
Matrix dimensions must agree.

Error in istft (line 49)
x((b+1):(b+nfft)) = x((b+1):(b+nfft)) + (xprim.*win)';

I have changed the parameters h, nfft as you have suggested before but doesn't solve it. I can't figure where's the problem coming from. Can you please assist?
I'm applying the function on a SxN matrix.

Many thanks!

Paul

TS Sharma

ok. Thanks Hristo!

Hi TS Sharma! Thanks for the interest in my submission! At both the beginning and the end of the reconstructed signal the OLA-constraint is not fulfilled because there is not enough overlapped window segments. One way to address this issue is to choose a smaller hop size h = wlen/(4*n) (i.e. shorter window or bigger n).

Best Regards,
Hristo Zhivomirov

TS Sharma

Hi Hristo!
Thanks a lot for your code. I observe attenuation in the reconstructed signal only at the boundries. In the middle area, the reconstruction is perfect. I observe the same effect even in the examole provided. How can I address this issue?
Thanks and regards,
Tanushree

Hi Indi_u,

Thank you for the comment and for your interest in my submission.
The possible reason for your troubles is because you don't meet the OLA-constraint. Try this:

1) choose h = wlen/(4*n), where n = 1, 2, 3, ...;
2) choose nfft = wlen.

You can check whether your analysis/synthesis settings meet OLA-constraint with WindowChoice.m

Best Regards,
Hristo Zhivomirov

Indi_u

Hi Hristo,

Thanks a lot for your codes on both stft and istft. I used your stft project to convert my time domain signal to the frequency domain and then used this isft project to convert it back to time domain. I have used the same h, nfft, fs parameters I used when calling stft. However the results is different for the actual signals. Cannot think of a reason for this difference and any of your opinion is going to be highly useful.

Darik

Yousef

Yousef

Hi Hristo,
Your comment was very helpful.
Thank you so much.

Hi Yousef,

First of all thank you for the comment and for your interest in my submission.
Now here is the answer:

1) choose h = wlen/(4*n), where n = 1, 2, 3, ... (h = wlen/2 does not meet OLA-constraint, check with WindowChoice.m).
2) choose nfft = wlen (nfft > wlen cause interpolation in the time domain of every signal segment = errors in resynthesis, so if you really want interpolation you must use zero-padding technique in time-frequency domain).

Try this:

clear, clc, close all

% define signal parameters
% sine-wave signal (stationary signal)

fs = 48000;
t = 0:1/fs:1-1/fs;
x = 10*sin(2*pi*t*10);

% define analysis and synthesis parameters
wlen = 64;
h = wlen/4;
nfft = wlen;

% perform analysis and resynthesis
[stft, f, t_stft] = stft(x, wlen, h, nfft, fs);
[x_istft, t_istft] = istft(stft, h, nfft, fs);

% plot the original signal
figure(1)
plot(t, x, 'b')
grid on
axis([0 1 -15 15])
set(gca, 'FontName', 'Times New Roman', 'FontSize', 14)
xlabel('Time, s')
ylabel('Amplitude, V')
title('Original and reconstructed signal')

% plot the resynthesized signal
hold on
plot(t_istft, x_istft, '-.r')
legend('Original signal', 'Reconstructed signal')

Best Regards,

Hristo Zhivomirov

Yousef

Hi everyone
I have transformed my signal of length 1024 into time-frequency domain using stft function. but, when I use istft function to convert my signal into original space, it gives me a signal of length 1984, with a big difference in amplitude.
my pseudo code is:
U=sinusoidal
nfft=1024;
wlen=64;
h=32; fs=1;
[stft, f, t] = stft(U, wlen, h, nfft, fs);
[x, t] = istft(stft, h, nfft, fs);
I will appreciate if you comment.

Updates

5.0.0.0

A new reference literature has been added.

4.0.0.0

A new version of the code has been uploaded.

3.0.0.0

A new version of the code has been uploaded.

2.0.0.0

A new version of the code has been uploaded.

MATLAB Release Compatibility
Created with R2014b
Compatible with any release
Platform Compatibility
Windows macOS Linux