audioTimeScaler

Apply time scaling to streaming audio

Description

The audioTimeScaler object performs audio time scale modification (TSM) independently across each input channel.

To modify the time scale of streaming audio:

  1. Create the audioTimeScaler object and set its properties.

  2. Call the object with arguments, as if it were a function.
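The two steps above can be sketched as follows. This is a minimal sketch, assuming Audio Toolbox is installed. The random frame is a placeholder for real audio, and the window and overlap values are set explicitly only so the hop length is visible in the code.

```matlab
% Step 1: create the object and set its properties.
aTS = audioTimeScaler( ...
    'SpeedupFactor',1.25, ...
    'Window',sqrt(hann(1024,'periodic')), ...
    'OverlapLength',768);

% Step 2: call the object with arguments, as if it were a function.
% For time-domain input, the frame must not be longer than the hop length.
hopLength = numel(aTS.Window) - aTS.OverlapLength;
audioIn = randn(hopLength,1);   % placeholder frame, not real audio
audioOut = aTS(audioIn);

release(aTS)
```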

To learn more about how System objects work, see What Are System Objects? (MATLAB).

Creation

Description

aTS = audioTimeScaler creates an object, aTS, that performs audio time scale modification independently across each input channel over time.

aTS = audioTimeScaler(speedupFactor) sets the SpeedupFactor property to speedupFactor.

aTS = audioTimeScaler(___,'Name',Value) sets each property Name to the specified Value. Unspecified properties have default values.

Example: aTS = audioTimeScaler(1.2,'Window',sqrt(hann(1024,'periodic')),'OverlapLength',768) creates an object, aTS, that increases the tempo of audio by 1.2 times its original speed using a periodic 1024-point Hann window and a 768-point overlap.

Properties

Unless otherwise indicated, properties are nontunable, which means you cannot change their values after calling the object. Objects lock when you call them, and the release function unlocks them.

If a property is tunable, you can change its value at any time.

For more information on changing property values, see System Design in MATLAB Using System Objects (MATLAB).
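As an illustration of locking, the sketch below calls the object once, then releases it before changing a nontunable property. It assumes Audio Toolbox is installed and uses a frame of zeros as a placeholder input. isLocked is the standard System object query for the locked state.

```matlab
aTS = audioTimeScaler(1.2, ...
    'Window',sqrt(hann(1024,'periodic')), ...
    'OverlapLength',768);

aTS(zeros(256,1));                   % first call locks the object
lockedAfterCall = isLocked(aTS);

release(aTS)                         % unlock before changing nontunable properties
aTS.OverlapLength = 512;
lockedAfterRelease = isLocked(aTS);
```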

SpeedupFactor

Speedup factor, specified as a positive real scalar.

Tunable: Yes

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
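Because the speedup factor is tunable, it can be changed between calls without releasing the object. The loop below is a minimal sketch of that pattern, assuming Audio Toolbox is installed. The random frames and the point at which the factor changes are arbitrary placeholders.

```matlab
aTS = audioTimeScaler(1, ...
    'Window',sqrt(hann(1024,'periodic')), ...
    'OverlapLength',768);
hopLength = numel(aTS.Window) - aTS.OverlapLength;

for k = 1:20
    if k == 11
        % Tunable property: changing it while the object is locked is allowed.
        aTS.SpeedupFactor = 1.5;
    end
    audioOut = aTS(randn(hopLength,1));   % placeholder frames, not real audio
end

release(aTS)
```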

InputDomain

Domain of the input signal, specified as "Time" or "Frequency".

Data Types: char | string

Window

Analysis window, specified as a real vector.

Note

If using audioTimeScaler with frequency-domain input, you must specify Window as the same window used to transform audioIn to the frequency domain.

Data Types: single | double

OverlapLength

Overlap length of adjacent analysis windows, specified as a nonnegative integer.

Note

If using audioTimeScaler with frequency-domain input, you must specify OverlapLength as the same overlap length used to transform audioIn to a time-frequency representation.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

FFTLength

FFT length, specified as a positive integer. The default, [], means that the FFT length is equal to the number of rows in the input signal.

Dependencies

To enable this property, set InputDomain to 'Time'.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

LockPhase

Apply identity phase locking, specified as true or false.

Data Types: logical

Usage

Description

audioOut = aTS(audioIn) applies time-scale modification to the input, audioIn, and returns the time-scaled output, audioOut.

Input Arguments

audioIn

Input audio, specified as a column vector or matrix. How audioTimeScaler interprets audioIn depends on the InputDomain property.

  • If InputDomain is set to "Time", audioIn must be a real N-by-1 column vector or N-by-C matrix. The number of rows, N, must be equal to or less than the hop length (size(audioIn,1) <= numel(Window)-OverlapLength). Columns of a matrix are interpreted as individual channels.

  • If InputDomain is set to "Frequency", specify audioIn as a real or complex NFFT-by-1 column vector or NFFT-by-C matrix. The number of rows, NFFT, is the number of points in the DFT calculation, and is set on the first call to the audio time scaler. NFFT must be greater than or equal to the window length (size(audioIn,1) >= numel(Window)). Columns of a matrix are interpreted as individual channels.

Data Types: single | double
Complex Number Support: Yes
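When a time-domain buffer is longer than the hop length, one way to satisfy the size constraint above is to feed it to the object in hop-sized chunks. The following is a minimal sketch under that assumption, with Audio Toolbox installed. The buffer length is chosen as a multiple of the hop length so that every chunk has the same size, and the random stereo buffer is a placeholder.

```matlab
aTS = audioTimeScaler(1.5, ...
    'Window',sqrt(hann(1024,'periodic')), ...
    'OverlapLength',768);
hopLength = numel(aTS.Window) - aTS.OverlapLength;   % 1024 - 768 = 256

buffer = randn(4*hopLength,2);   % placeholder stereo audio, 4 hops long
out = [];
for startIdx = 1:hopLength:size(buffer,1)
    chunk = buffer(startIdx:startIdx+hopLength-1,:);  % N-by-C with N <= hopLength
    out = [out; aTS(chunk)]; %#ok<AGROW>
end

release(aTS)
```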

Output Arguments

audioOut

Time-stretched audio, returned as a column vector or matrix.

Data Types: single | double

Object Functions

To use an object function, specify the System object™ as the first input argument. For example, to release system resources of a System object named obj, use this syntax:

release(obj)

step      Run System object algorithm
release   Release resources and allow changes to System object property values and input characteristics
reset     Reset internal states of System object
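The three object functions can be exercised together as in the sketch below. It assumes Audio Toolbox is installed and uses a random placeholder frame whose length equals the hop length (1024 - 768).

```matlab
aTS = audioTimeScaler(0.8, ...
    'Window',sqrt(hann(1024,'periodic')), ...
    'OverlapLength',768);
x = randn(256,1);      % placeholder frame, not real audio

y1 = step(aTS,x);      % equivalent to y1 = aTS(x)
reset(aTS)             % reset internal states; property values are kept
y2 = aTS(x);           % processing starts again from the reset state

release(aTS)           % unlock so nontunable properties can be changed
```

Resetting between unrelated streams is one way to keep state from the previous stream out of the next one.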

Examples

To minimize artifacts caused by windowing, create a square root Hann window capable of perfect reconstruction. Use iscola to verify the design.

win = sqrt(hann(1024,'periodic'));
overlapLength = 896;
iscola(win,overlapLength)
ans = logical
   1

Create an audioTimeScaler with a speedup factor of 1.5. Change the value of alpha to hear the effect of the speedup factor.

alpha = 1.5;
aTS = audioTimeScaler( ...
    'SpeedupFactor',alpha, ...
    'Window',win, ...
    'OverlapLength',overlapLength);

Create a dsp.AudioFileReader object to read frames from an audio file. The length of frames input to the audio time scaler must be less than or equal to the analysis hop length defined in audioTimeScaler. To minimize buffering, set the samples per frame of the file reader to the analysis hop length.

hopLength = numel(aTS.Window) - overlapLength;
fileReader = dsp.AudioFileReader('Counting-16-44p1-mono-15secs.wav', ...
    'SamplesPerFrame',hopLength);

Create an audioDeviceWriter to write frames to your audio device. Use the same sample rate as the file reader.

deviceWriter = audioDeviceWriter('SampleRate',fileReader.SampleRate);

In an audio stream loop, read a frame from the file, apply time scale modification, and then write a frame to the device.

while ~isDone(fileReader)
    audioIn = fileReader();
    audioOut = aTS(audioIn);
    deviceWriter(audioOut);
end

As a best practice, release your objects once done.

release(deviceWriter)
release(fileReader)
release(aTS)

Create a window capable of perfect reconstruction. Use iscola to verify the design.

win = kbdwin(512);
overlapLength = 256;
iscola(win,overlapLength)
ans = logical
   1

Create an audioTimeScaler with a speedup factor of 0.8. Set InputDomain to "Frequency" and specify the window and overlap length used to transform time-domain audio to the frequency domain. Set LockPhase to true to increase the fidelity in the time-scaled output.

alpha = 0.8;
timeScaleModification = audioTimeScaler( ...
    "SpeedupFactor",alpha, ...
    "InputDomain","Frequency", ...
    "Window",win, ...
    "OverlapLength",overlapLength, ...
    "LockPhase",true);

Create a dsp.AudioFileReader object to read frames from an audio file. Create a dsp.STFT object to perform a short-time Fourier transform on streaming audio. Specify the same window and overlap length you used to create the audioTimeScaler. Create an audioDeviceWriter object to write frames to your audio device.

fileReader = dsp.AudioFileReader('RockDrums-44p1-stereo-11secs.mp3', ...
    'SamplesPerFrame',numel(win)-overlapLength);

shortTimeFourierTransform = dsp.STFT( ...
    'Window',win, ...
    'OverlapLength',overlapLength, ...
    'FFTLength',numel(win));

deviceWriter = audioDeviceWriter('SampleRate',fileReader.SampleRate);

In an audio stream loop:

  1. Read a frame from the file.

  2. Input the frame to the STFT. The dsp.STFT object performs buffering.

  3. Apply time scale modification.

  4. Write the modified audio to your audio device.

while ~isDone(fileReader)
    x = fileReader();
    X = shortTimeFourierTransform(x);
    y = timeScaleModification(X);
    deviceWriter(y);
end

As a best practice, release your objects once done.

release(fileReader)
release(shortTimeFourierTransform)
release(timeScaleModification)
release(deviceWriter)

Algorithms

audioTimeScaler uses the same phase vocoder algorithm as stretchAudio and is based on the descriptions in [1] and [2].
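For audio that is already fully in memory, the offline counterpart mentioned above can be called directly. The sketch below assumes Audio Toolbox is installed. The 440 Hz test tone and the 16 kHz sample rate are arbitrary placeholders, and the length ratio is only expected to be close to the speedup factor, not exactly equal to it.

```matlab
fs = 16e3;                       % assumed sample rate for the test tone
t = (0:fs-1)'/fs;
x = sin(2*pi*440*t);             % 1 second placeholder tone

alpha = 1.5;
y = stretchAudio(x,alpha, ...
    'Window',sqrt(hann(1024,'periodic')), ...
    'OverlapLength',768, ...
    'LockPhase',true);

ratio = numel(x)/numel(y);       % expected to be close to alpha
```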

References

[1] Driedger, Jonathan, and Meinard Müller. "A Review of Time-Scale Modification of Music Signals." Applied Sciences. Vol. 6, Issue 2, 2016.

[2] Driedger, Jonathan. "Time-Scale Modification Algorithms for Music Audio Signals." Master's thesis, Saarland University, 2011.

Introduced in R2019b