fbank
- diffsptk.FBANK
- alias of MelFilterBankAnalysis
- class diffsptk.MelFilterBankAnalysis(*, fft_length, n_channel, sample_rate, f_min=0, f_max=None, floor=1e-05, use_power=False, out_format='y', device=None, dtype=None)
- See this page for details.
- Parameters:
- fft_length : int >= 2
- The number of FFT bins, \(L\).
- n_channel : int >= 1
- The number of mel filter banks, \(C\).
- sample_rate : int >= 1
- The sample rate in Hz.
- f_min : float >= 0
- The minimum frequency in Hz.
- f_max : float <= sample_rate // 2
- The maximum frequency in Hz.
- floor : float > 0
- The minimum mel filter bank output in linear scale.
- use_power : bool
- If True, use the power spectrum instead of the amplitude spectrum.
- out_format : ['y', 'yE', 'y,E']
- y is the mel filter bank output and E is the energy. If this is yE, the two output tensors are concatenated and returned as a single tensor instead of a tuple.
 
- References
- [1] Young et al., “The HTK Book,” Cambridge University Press, 2006.
- forward(x)
- Apply mel filter banks to the STFT.
- Parameters:
- x : Tensor [shape=(…, L/2+1)]
- The power spectrum.
 
- Returns:
- y : Tensor [shape=(…, C)]
- The mel filter bank output.
- E : Tensor [shape=(…, 1)] (optional)
- The energy.
 
- Examples
>>> x = diffsptk.ramp(19)
>>> stft = diffsptk.STFT(frame_length=10, frame_period=10, fft_length=32)
>>> fbank = diffsptk.MelFilterBankAnalysis(fft_length=32, n_channel=4, sample_rate=8000)
>>> y = fbank(stft(x))
>>> y
tensor([[0.1214, 0.4825, 0.6072, 0.3589],
        [3.3640, 3.4518, 2.7717, 0.5088]])
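To make the roles of fft_length, n_channel, sample_rate, f_min, f_max, floor, and use_power more concrete, here is a rough pure-Python sketch of an HTK-style mel filter bank applied to a single frame. It is not diffsptk's implementation and is not expected to reproduce the numbers above. It assumes the HTK mel scale mel(f) = 1127 ln(1 + f / 700), triangular filters equally spaced on the mel axis, f_max defaulting to sample_rate / 2 when it is None, and a natural-log output clipped at floor. The helper names hz_to_mel, mel_filter_weights, and fbank_frame are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    # HTK-style mel scale (assumed here, following the HTK Book convention).
    return 1127.0 * math.log(1.0 + f_hz / 700.0)


def mel_filter_weights(fft_length, n_channel, sample_rate, f_min=0.0, f_max=None):
    """Return an (L/2+1) x C nested list of triangular filter weights."""
    if f_max is None:
        f_max = sample_rate / 2  # assumption: None means the Nyquist frequency
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    # C + 2 equally spaced points on the mel axis: each filter spans three of them.
    step = (mel_max - mel_min) / (n_channel + 1)
    points = [mel_min + step * i for i in range(n_channel + 2)]
    n_bins = fft_length // 2 + 1
    weights = [[0.0] * n_channel for _ in range(n_bins)]
    for k in range(n_bins):
        m = hz_to_mel(k * sample_rate / fft_length)  # mel value of FFT bin k
        for c in range(n_channel):
            lo, mid, hi = points[c], points[c + 1], points[c + 2]
            if lo < m <= mid:
                weights[k][c] = (m - lo) / (mid - lo)  # rising edge
            elif mid < m < hi:
                weights[k][c] = (hi - m) / (hi - mid)  # falling edge
    return weights


def fbank_frame(power_spectrum, weights, floor=1e-5, use_power=False):
    """Apply the filters to one frame of a power spectrum and take the log."""
    # With use_power=False the amplitude spectrum is used instead.
    spec = power_spectrum if use_power else [math.sqrt(p) for p in power_spectrum]
    n_channel = len(weights[0])
    y = []
    for c in range(n_channel):
        s = sum(spec[k] * weights[k][c] for k in range(len(spec)))
        y.append(math.log(max(s, floor)))  # floor is applied in linear scale
    return y


weights = mel_filter_weights(fft_length=32, n_channel=4, sample_rate=8000)
y = fbank_frame([1.0] * 17, weights)
```

With fft_length=32 the input frame has L/2+1 = 17 bins and the result has C = 4 values, matching the shapes documented for forward.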
 
- diffsptk.functional.fbank(x, n_channel, sample_rate, f_min=0, f_max=None, floor=1e-05, use_power=False, out_format='y')
- Apply mel filter banks to the STFT.
- Parameters:
- x : Tensor [shape=(…, L/2+1)]
- The power spectrum.
- n_channel : int >= 1
- The number of mel filter banks, \(C\).
- sample_rate : int >= 1
- The sample rate in Hz.
- f_min : float >= 0
- The minimum frequency in Hz.
- f_max : float <= sample_rate // 2
- The maximum frequency in Hz.
- floor : float > 0
- The minimum mel filter bank output in linear scale.
- use_power : bool
- If True, use the power spectrum instead of the amplitude spectrum.
- out_format : ['y', 'yE', 'y,E']
- y is the mel filter bank output and E is the energy. If this is yE, the two output tensors are concatenated and returned as a single tensor instead of a tuple.
 
- Returns:
- y : Tensor [shape=(…, C)]
- The mel filter bank output.
- E : Tensor [shape=(…, 1)] (optional)
- The energy.
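The three out_format values differ only in how y and E are packaged. The sketch below illustrates this for a single frame using plain Python lists in place of tensors. It assumes that 'yE' concatenates along the last axis (giving C + 1 values) and that 'y,E' returns a two-element tuple. The helper format_output is hypothetical and is not part of diffsptk.

```python
def format_output(y, energy, out_format="y"):
    """Package one frame of filter bank output y (C values) and its energy."""
    if out_format == "y":
        return y  # filter bank output only, shape (C,)
    if out_format == "yE":
        return y + [energy]  # single sequence, shape (C + 1,)
    if out_format == "y,E":
        return y, [energy]  # tuple of shapes (C,) and (1,)
    raise ValueError(f"out_format {out_format!r} is not supported")


y_only = format_output([0.1, 0.2, 0.3, 0.4], 1.5, "y")
y_and_e = format_output([0.1, 0.2, 0.3, 0.4], 1.5, "yE")
y_pair = format_output([0.1, 0.2, 0.3, 0.4], 1.5, "y,E")
```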