fbank

class diffsptk.MelFilterBankAnalysis(n_channel, fft_length, sample_rate, f_min=0, f_max=None, floor=1e-05, out_format='y')

See this page for details.

Parameters:
n_channel : int >= 1 [scalar]

Number of mel-filter banks, \(C\).

fft_length : int >= 2 [scalar]

Number of FFT bins, \(L\).

sample_rate : int >= 1 [scalar]

Sample rate in Hz.

f_min : float >= 0 [scalar]

Minimum frequency in Hz.

f_max : float <= sample_rate // 2 [scalar]

Maximum frequency in Hz.

floor : float > 0 [scalar]

Minimum mel-filter bank output in linear scale.

out_format : ['y', 'yE', 'y,E']

y is the mel-filter bank output and E is the energy. If this is 'yE', the two output tensors are concatenated and returned as a single tensor instead of a tuple.
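The three out_format choices differ only in how the two results are packaged. The sketch below mimics that behavior with plain Python lists for a single frame. The helper name pack_outputs is hypothetical and not part of diffsptk, and the y-then-E concatenation order is an assumption read off the name 'yE'.

```python
def pack_outputs(y, E, out_format="y"):
    """Package one frame of results.

    y: list of C filter-bank values, E: list holding one energy value.
    Hypothetical helper for illustration only; not a diffsptk function.
    """
    if out_format == "y":
        return y                 # filter-bank output only
    if out_format == "yE":
        return y + E             # one sequence of length C + 1 (order assumed)
    if out_format == "y,E":
        return y, E              # tuple of two separate sequences
    raise ValueError(f"unsupported out_format: {out_format}")


y, E = [0.1, 0.5, 0.6, 0.4], [2.0]
print(pack_outputs(y, E, "y"))    # [0.1, 0.5, 0.6, 0.4]
print(pack_outputs(y, E, "yE"))   # [0.1, 0.5, 0.6, 0.4, 2.0]
print(pack_outputs(y, E, "y,E"))  # ([0.1, 0.5, 0.6, 0.4], [2.0])
```

With 'yE' the last dimension grows from C to C + 1, whereas 'y,E' keeps the shapes (…, C) and (…, 1) separate.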

forward(x)

Apply mel-filter banks to the power spectrum obtained from STFT.

Parameters:
x : Tensor [shape=(…, L/2+1)]

Power spectrum.

Returns:
y : Tensor [shape=(…, C)]

Mel-filter bank output.

E : Tensor [shape=(…, 1)]

Energy.
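As a rough illustration of the mapping from (…, L/2+1) to (…, C), the following pure-Python sketch builds triangular filters equally spaced on a mel axis and applies them to one power spectrum. It is not diffsptk's implementation. The HTK-style mel formula, the treatment of f_max=None as the Nyquist frequency, the log compression after flooring, and the helper names hz_to_mel, mel_filter_bank, and analyze are all assumptions made for this example. The energy output E is omitted.

```python
import math


def hz_to_mel(f):
    # HTK-style mel scale (assumed; this page does not state the formula).
    return 1127.0 * math.log1p(f / 700.0)


def mel_filter_bank(n_channel, fft_length, sample_rate, f_min=0.0, f_max=None):
    """Return n_channel triangular filters, each with fft_length // 2 + 1 weights."""
    if f_max is None:
        f_max = sample_rate / 2  # assumption: None means the Nyquist frequency
    n_bin = fft_length // 2 + 1
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    # n_channel + 2 edge points, equally spaced on the mel axis.
    step = (mel_max - mel_min) / (n_channel + 1)
    edges = [mel_min + step * i for i in range(n_channel + 2)]
    # Mel value at the center frequency of each FFT bin.
    bin_mel = [hz_to_mel(k * sample_rate / fft_length) for k in range(n_bin)]

    filters = []
    for c in range(n_channel):
        lo, mid, hi = edges[c], edges[c + 1], edges[c + 2]
        row = []
        for m in bin_mel:
            if lo < m <= mid:
                row.append((m - lo) / (mid - lo))   # rising slope
            elif mid < m < hi:
                row.append((hi - m) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        filters.append(row)
    return filters


def analyze(power_spectrum, filters, floor=1e-5):
    """Map one power spectrum (length L/2+1) to C log filter-bank outputs."""
    y = []
    for row in filters:
        acc = sum(w * p for w, p in zip(row, power_spectrum))
        # The floor is applied in linear scale, before the (assumed) log.
        y.append(math.log(max(acc, floor)))
    return y


filters = mel_filter_bank(n_channel=4, fft_length=32, sample_rate=8000)
flat_spectrum = [1.0] * (32 // 2 + 1)
y = analyze(flat_spectrum, filters)
print(len(filters), len(filters[0]), len(y))  # 4 17 4
```

With C = 4 and L = 32, each filter holds L/2 + 1 = 17 weights and one frame yields 4 output values. An all-zero spectrum would produce log(floor) in every channel, which is the role the floor parameter plays in this sketch.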

Examples

>>> import diffsptk
>>> x = diffsptk.ramp(19)
>>> stft = diffsptk.STFT(frame_length=10, frame_period=10, fft_length=32)
>>> fbank = diffsptk.MelFilterBankAnalysis(4, 32, 8000)
>>> y = fbank(stft(x))
>>> y
tensor([[0.1214, 0.4825, 0.6072, 0.3589],
        [3.3640, 3.4518, 2.7717, 0.5088]])

See also

stft, mfcc