fbank

diffsptk.FBANK: alias of MelFilterBankAnalysis

class diffsptk.MelFilterBankAnalysis(*, fft_length, n_channel, sample_rate, f_min=0, f_max=None, floor=1e-05, use_power=False, out_format='y', device=None, dtype=None)

See this page for details.

Parameters:

fft_length (int >= 2): The number of FFT bins, \(L\).
n_channel (int >= 1): The number of mel filter banks, \(C\).
sample_rate (int >= 1): The sample rate in Hz.
f_min (float >= 0): The minimum frequency in Hz.
f_max (float <= sample_rate // 2): The maximum frequency in Hz.
floor (float > 0): The minimum mel filter bank output in linear scale.
use_power (bool): If True, use the power spectrum instead of the amplitude spectrum.
out_format (['y', 'yE', 'y,E']): y is the mel filter bank output and E is the energy. If this is 'yE', the two output tensors are concatenated and returned as a single tensor instead of a tuple.
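The parameters above determine where the filters sit on the frequency axis. As a rough illustration only, the pure-Python sketch below builds triangular filters whose edges are equally spaced on the HTK mel scale between f_min and f_max. It is not diffsptk's implementation: the helper names hz_to_mel and mel_filter_weights are hypothetical, and both the default f_max = sample_rate / 2 and the treatment of bins that fall exactly on a filter edge are assumptions.

```python
import math


def hz_to_mel(f):
    # HTK mel scale: mel = 2595 * log10(1 + f / 700)
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_filter_weights(fft_length, n_channel, sample_rate, f_min=0.0, f_max=None):
    """Return a (fft_length // 2 + 1) x n_channel matrix of triangular weights."""
    if f_max is None:
        f_max = sample_rate / 2  # assumption: default to the Nyquist frequency
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)

    # n_channel + 2 edge points, equally spaced on the mel scale.
    edges = [
        mel_min + (mel_max - mel_min) * i / (n_channel + 1)
        for i in range(n_channel + 2)
    ]

    n_bin = fft_length // 2 + 1
    weights = [[0.0] * n_channel for _ in range(n_bin)]
    for k in range(n_bin):
        m = hz_to_mel(k * sample_rate / fft_length)  # mel value of FFT bin k
        for c in range(n_channel):
            lo, mid, hi = edges[c], edges[c + 1], edges[c + 2]
            if lo < m <= mid:  # rising slope of the triangle
                weights[k][c] = (m - lo) / (mid - lo)
            elif mid < m < hi:  # falling slope of the triangle
                weights[k][c] = (hi - m) / (hi - mid)
    return weights


weights = mel_filter_weights(fft_length=32, n_channel=4, sample_rate=8000)
print(len(weights), len(weights[0]))  # 17 4
```

Each row corresponds to one FFT bin and each column to one channel, so multiplying a spectrum of shape (…, L/2+1) by this matrix yields a tensor of shape (…, C).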

References

[1] Young et al., “The HTK Book,” Cambridge University Press, 2006.

forward(x)

Apply mel filter banks to the STFT.

Parameters:

x (Tensor [shape=(…, L/2+1)]): The power spectrum.

Returns:

y (Tensor [shape=(…, C)]): The mel filter bank output.
E (Tensor [shape=(…, 1)], optional): The energy.

Examples

>>> import diffsptk
>>> x = diffsptk.ramp(19)
>>> stft = diffsptk.STFT(frame_length=10, frame_period=10, fft_length=32)
>>> fbank = diffsptk.MelFilterBankAnalysis(fft_length=32, n_channel=4, sample_rate=8000)
>>> y = fbank(stft(x))
>>> y
tensor([[0.1214, 0.4825, 0.6072, 0.3589],
        [3.3640, 3.4518, 2.7717, 0.5088]])
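To make the roles of floor, use_power, and out_format more concrete, the sketch below applies a given weight matrix to a single power-spectrum frame in plain Python. It rests on two assumptions: that the filter bank output is log-compressed after flooring (the phrase "in linear scale" in the description of floor suggests this, but the page does not state it outright), and that the energy E is supplied by the caller, since this page does not define how E is computed. The function names apply_filter_bank and format_output are hypothetical and do not belong to diffsptk.

```python
import math


def apply_filter_bank(power_frame, weights, floor=1e-5, use_power=False):
    """Apply a (n_bin x n_channel) weight matrix to one power-spectrum frame."""
    # Use the power spectrum directly, or convert it to an amplitude spectrum.
    spectrum = list(power_frame) if use_power else [math.sqrt(p) for p in power_frame]
    n_channel = len(weights[0])
    y = []
    for c in range(n_channel):
        acc = sum(s * row[c] for s, row in zip(spectrum, weights))
        # Assumption: floor in the linear domain, then take the logarithm.
        y.append(math.log(max(acc, floor)))
    return y


def format_output(y, energy, out_format="y"):
    """Mimic the three output layouts listed for out_format."""
    if out_format == "y":
        return y  # shape (C,)
    if out_format == "yE":
        return y + [energy]  # single sequence of shape (C + 1,)
    if out_format == "y,E":
        return y, [energy]  # tuple of shapes (C,) and (1,)
    raise ValueError(f"unsupported out_format: {out_format!r}")


# Toy data: 3 frequency bins, 2 channels (values are illustrative only).
toy_weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
toy_frame = [4.0, 16.0, 0.0]

y = apply_filter_bank(toy_frame, toy_weights)
print(len(format_output(y, 0.0, "yE")))  # 3
```

With 'yE' the energy is appended as one extra element, while 'y,E' keeps the two outputs separate, matching the shapes (…, C) and (…, 1) listed under Returns.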

diffsptk.functional.fbank(x, n_channel, sample_rate, f_min=0, f_max=None, floor=1e-05, use_power=False, out_format='y')

Apply mel filter banks to the STFT.

Parameters:

x (Tensor [shape=(…, L/2+1)]): The power spectrum.
n_channel (int >= 1): The number of mel filter banks, \(C\).
sample_rate (int >= 1): The sample rate in Hz.
f_min (float >= 0): The minimum frequency in Hz.
f_max (float <= sample_rate // 2): The maximum frequency in Hz.
floor (float > 0): The minimum mel filter bank output in linear scale.
use_power (bool): If True, use the power spectrum instead of the amplitude spectrum.
out_format (['y', 'yE', 'y,E']): y is the mel filter bank output and E is the energy. If this is 'yE', the two output tensors are concatenated and returned as a single tensor instead of a tuple.

Returns:

y (Tensor [shape=(…, C)]): The mel filter bank output.
E (Tensor [shape=(…, 1)], optional): The energy.
