stft

diffsptk.STFT

alias of ShortTimeFourierTransform

class diffsptk.ShortTimeFourierTransform(frame_length: int, frame_period: int, fft_length: int, *, center: bool = True, zmean: bool = False, mode: str = 'constant', window: str = 'blackman', norm: str = 'power', eps: float = 1e-09, relative_floor: float | None = None, out_format: str = 'power')

This module is a simple cascade of framing, windowing, and spectrum calculation.

Parameters:
frame_length : int >= 1

The frame length in samples, \(L\).

frame_period : int >= 1

The frame period in samples, \(P\).

fft_length : int >= L

The number of FFT bins, \(N\).

center : bool

If True, pad the input on both sides so that the frame is centered.

zmean : bool

If True, perform mean subtraction on each frame.

mode : ['constant', 'reflect', 'replicate', 'circular']

The padding method.

window : ['blackman', 'hamming', 'hanning', 'bartlett', 'trapezoidal', 'rectangular', 'nuttall']

The window type.

norm : ['none', 'power', 'magnitude']

The normalization type of the window.

eps : float >= 0

A small value added to the power spectrum.

relative_floor : float < 0 or None

The relative floor of the power spectrum in dB.

out_format : ['db', 'log-magnitude', 'magnitude', 'power', 'complex']

The output format.
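
The interaction of eps, relative_floor, and out_format can be illustrated with a minimal pure-Python sketch. It is not the diffsptk implementation: the function name format_power_spectrum is hypothetical, and the order of operations (add eps, then clip relative to the frame maximum, then convert) as well as the conversion formulas (10 log10 for 'db', half the natural log for 'log-magnitude') are assumptions made for illustration. The 'complex' format is left out because it would bypass the power spectrum.

```python
import math


def format_power_spectrum(power, eps=1e-9, relative_floor=None, out_format="power"):
    """Post-process one frame of power-spectrum values (illustrative only)."""
    # Add eps so that the logarithms below never see an exact zero.
    s = [p + eps for p in power]
    if relative_floor is not None:
        # Assumed behavior: clip each bin so that it lies at most
        # |relative_floor| dB below the largest bin of the frame.
        floor = max(s) * 10.0 ** (relative_floor / 10.0)
        s = [max(v, floor) for v in s]
    if out_format == "power":
        return s
    if out_format == "magnitude":
        return [math.sqrt(v) for v in s]
    if out_format == "db":
        return [10.0 * math.log10(v) for v in s]
    if out_format == "log-magnitude":
        return [0.5 * math.log(v) for v in s]
    raise ValueError(f"unsupported out_format: {out_format}")
```

Under these assumptions, a frame with power values 100 and 1 maps to about 20 dB and 0 dB, and relative_floor=-10 raises the second bin to about 10 dB.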

forward(x: Tensor) -> Tensor

Compute short-time Fourier transform.

Parameters:
x : Tensor [shape=(…, T)]

The input waveform.

Returns:
out : Tensor [shape=(…, T/P, N/2+1)]

The output spectrogram.
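
The trailing two output dimensions can be estimated with a small helper. This is a sketch rather than part of diffsptk: the name stft_output_shape is hypothetical, and it assumes that with center=True one frame starts every \(P\) samples, so that T/P is rounded up when \(T\) is not a multiple of \(P\).

```python
def stft_output_shape(T, frame_period, fft_length):
    """Return (number of frames, number of frequency bins) for a length-T input."""
    # Assumption: center=True yields one frame per frame_period samples,
    # i.e. ceil(T / P) frames (computed here with integer arithmetic).
    num_frames = -(-T // frame_period)
    # A one-sided spectrum of an N-point FFT has N/2 + 1 bins.
    num_bins = fft_length // 2 + 1
    return (num_frames, num_bins)
```

For instance, stft_output_shape(3, 1, 8) gives (3, 5), which agrees with the shape of the example output for this method.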

Examples

>>> x = diffsptk.ramp(1, 3)
>>> x
tensor([1., 2., 3.])
>>> stft = diffsptk.STFT(frame_length=3, frame_period=1, fft_length=8)
>>> y = stft(x)
>>> y
tensor([[1.0000, 1.0000, 1.0000, 1.0000, 1.0000],
        [4.0000, 4.0000, 4.0000, 4.0000, 4.0000],
        [9.0000, 9.0000, 9.0000, 9.0000, 9.0000]])
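
The cascade of framing, windowing, and spectrum calculation can be traced by hand with a pure-Python sketch. It is not the library code: the helper names blackman and stft_power are hypothetical, and the sketch assumes a symmetric Blackman window, 'power' normalization as division by the square root of the window energy, and zero padding of L // 2 samples on the left and (L - 1) // 2 on the right for center=True.

```python
import cmath
import math


def blackman(L):
    """Symmetric Blackman window of length L (assumed window definition)."""
    if L == 1:
        return [1.0]
    return [
        0.42
        - 0.5 * math.cos(2.0 * math.pi * n / (L - 1))
        + 0.08 * math.cos(4.0 * math.pi * n / (L - 1))
        for n in range(L)
    ]


def stft_power(x, frame_length, frame_period, fft_length, eps=1e-9):
    """Power spectrogram via framing, windowing, and a direct DFT."""
    L, P, N = frame_length, frame_period, fft_length
    # Windowing setup: normalize the window to unit energy ('power' norm).
    w = blackman(L)
    scale = math.sqrt(sum(v * v for v in w))
    w = [v / scale for v in w]
    # Framing: zero-pad so that each frame is centered on its sample.
    padded = [0.0] * (L // 2) + list(x) + [0.0] * ((L - 1) // 2)
    out = []
    for start in range(0, len(padded) - L + 1, P):
        frame = [padded[start + n] * w[n] for n in range(L)]
        # Spectrum: one-sided N-point DFT, squared magnitude plus eps.
        row = []
        for k in range(N // 2 + 1):
            X = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(L)
            )
            row.append(abs(X) ** 2 + eps)
        out.append(row)
    return out
```

With frame_length=3 the normalized window is approximately [0, 1, 0], so every frame reduces to its center sample and the spectrum is flat. Calling stft_power([1.0, 2.0, 3.0], 3, 1, 8) therefore yields rows close to 1, 4, and 9, in line with the output shown above.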

diffsptk.functional.stft(x: Tensor, *, frame_length: int = 400, frame_period: int = 80, fft_length: int = 512, center: bool = True, zmean: bool = False, mode: str = 'constant', window: str = 'blackman', norm: str = 'power', eps: float = 1e-09, relative_floor: float | None = None, out_format: str = 'power') -> Tensor

Compute short-time Fourier transform.

Parameters:
x : Tensor [shape=(…, T)]

The input waveform.

frame_length : int >= 1

The frame length in samples, \(L\).

frame_period : int >= 1

The frame period in samples, \(P\).

fft_length : int >= L

The number of FFT bins, \(N\).

center : bool

If True, pad the input on both sides so that the frame is centered.

zmean : bool

If True, perform mean subtraction on each frame.

mode : ['constant', 'reflect', 'replicate', 'circular']

The padding method.

window : ['blackman', 'hamming', 'hanning', 'bartlett', 'trapezoidal', 'rectangular', 'nuttall']

The window type.

norm : ['none', 'power', 'magnitude']

The normalization type of the window.

eps : float >= 0

A small value added to the power spectrum.

relative_floor : float < 0 or None

The relative floor of the power spectrum in dB.

out_format : ['db', 'log-magnitude', 'magnitude', 'power', 'complex']

The output format.

Returns:
out : Tensor [shape=(…, T/P, N/2+1)]

The output spectrogram.

See also

frame, window, spec, istft