mfcc
- diffsptk.MFCC
- class diffsptk.MelFrequencyCepstralCoefficientsAnalysis(*, fft_length, mfcc_order, n_channel, sample_rate, lifter=1, f_min=0, f_max=None, floor=1e-05, out_format='y')
See this page for details.
- Parameters:
- fft_length : int >= 2
The number of FFT bins, \(L\).
- mfcc_order : int >= 1
The order of the MFCC, \(M\).
- n_channel : int >= 1
The number of mel filter banks, \(C\).
- sample_rate : int >= 1
The sample rate in Hz.
- lifter : int >= 1
The liftering coefficient.
- f_min : float >= 0
The minimum frequency in Hz.
- f_max : float <= sample_rate // 2
The maximum frequency in Hz.
- floor : float > 0
The minimum mel filter bank output in linear scale.
- out_format : ['y', 'yE', 'yc', 'ycE']
y is MFCC, c is C0, and E is energy.
References
[1] Young et al., “The HTK Book,” Cambridge University Press, 2006.
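The `f_min`, `f_max`, and `n_channel` parameters determine where the mel filters sit on the frequency axis. As a rough illustration only, the standard-library sketch below places filter centers uniformly on the HTK mel scale described in [1]. The helper names (`hz_to_mel`, `mel_to_hz`, `filter_center_frequencies`) are hypothetical and not part of diffsptk, and treating `f_max=None` as the Nyquist frequency is an assumption; the library's actual filter layout may differ in detail.

```python
import math


def hz_to_mel(f_hz):
    """HTK mel scale: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def filter_center_frequencies(n_channel, sample_rate, f_min=0.0, f_max=None):
    """Return the center frequency (Hz) of each of the C triangular filters."""
    if f_max is None:
        f_max = sample_rate / 2  # assumption: None means the Nyquist frequency
    if not 0 <= f_min < f_max <= sample_rate / 2:
        raise ValueError("require 0 <= f_min < f_max <= sample_rate / 2")
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    # C filters need C + 2 equally spaced mel points; the inner C are centers.
    step = (mel_max - mel_min) / (n_channel + 1)
    return [mel_to_hz(mel_min + step * (c + 1)) for c in range(n_channel)]


centers = filter_center_frequencies(n_channel=8, sample_rate=8000)
print([round(f, 1) for f in centers])
```

The centers are spaced densely at low frequencies and sparsely near `f_max`, which is the usual motivation for the mel warping.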
- forward(x)
Compute the MFCC from the power spectrum.
- Parameters:
- x : Tensor [shape=(…, L/2+1)]
The power spectrum.
- Returns:
- y : Tensor [shape=(…, M)]
The MFCC without C0.
- E : Tensor [shape=(…, 1)] (optional)
The energy.
- c : Tensor [shape=(…, 1)] (optional)
The C0.
Examples
>>> x = diffsptk.ramp(19)
>>> stft = diffsptk.STFT(frame_length=10, frame_period=10, fft_length=32)
>>> mfcc = diffsptk.MFCC(fft_length=32, mfcc_order=4, n_channel=8, sample_rate=8000)
>>> y = mfcc(stft(x))
>>> y
tensor([[-7.7745e-03, -1.4447e-02,  1.6157e-02,  1.1069e-03],
        [ 2.8049e+00, -1.6257e+00, -2.3566e-02,  1.2804e-01]])
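To make the roles of `floor`, `mfcc_order`, and `lifter` more concrete, the steps that follow the mel filter bank can be sketched in plain Python using the HTK formulas: flooring and log, a DCT over the \(C\) channels, then sinusoidal liftering. This is a minimal sketch under the assumption that the HTK definitions apply unchanged; `mfcc_from_fbank` is a hypothetical helper, not the diffsptk implementation, and it omits C0 and the energy.

```python
import math


def mfcc_from_fbank(fbank, mfcc_order, lifter=1, floor=1e-5):
    """Turn linear mel filter bank outputs into M liftered MFCCs (no C0)."""
    n_channel = len(fbank)
    # Clamp each filter output to `floor` before the log (assumed order).
    log_fbank = [math.log(max(v, floor)) for v in fbank]
    scale = math.sqrt(2.0 / n_channel)
    mfcc = []
    for m in range(1, mfcc_order + 1):
        # HTK DCT: c_m = sqrt(2/C) * sum_j log(f_j) * cos(pi * m * (j - 0.5) / C)
        # (j runs over 0..C-1 here, hence j + 0.5).
        c_m = scale * sum(
            log_fbank[j] * math.cos(math.pi * m * (j + 0.5) / n_channel)
            for j in range(n_channel)
        )
        # HTK liftering: c'_m = (1 + (Q / 2) * sin(pi * m / Q)) * c_m
        weight = 1.0 + (lifter / 2.0) * math.sin(math.pi * m / lifter)
        mfcc.append(weight * c_m)
    return mfcc


ramp_fbank = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
plain = mfcc_from_fbank(ramp_fbank, mfcc_order=4)
liftered = mfcc_from_fbank(ramp_fbank, mfcc_order=4, lifter=22)
print(plain)
print(liftered)
```

Under these formulas the default `lifter=1` leaves the coefficients essentially unchanged, because \(\sin(\pi m)\) vanishes for integer \(m\), and a flat filter bank output produces coefficients close to zero.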
- diffsptk.functional.mfcc(x, mfcc_order, n_channel, sample_rate, lifter=1, f_min=0, f_max=None, floor=1e-05, out_format='y')
Compute the MFCC from the power spectrum.
- Parameters:
- x : Tensor [shape=(…, L/2+1)]
The power spectrum.
- mfcc_order : int >= 1
The order of the MFCC, \(M\).
- n_channel : int >= 1
The number of mel filter banks, \(C\).
- sample_rate : int >= 1
The sample rate in Hz.
- lifter : int >= 1
The liftering coefficient.
- f_min : float >= 0
The minimum frequency in Hz.
- f_max : float <= sample_rate // 2
The maximum frequency in Hz.
- floor : float > 0
The minimum mel filter bank output in linear scale.
- out_format : ['y', 'yE', 'yc', 'ycE']
y is MFCC, c is C0, and E is energy.
- Returns:
- y : Tensor [shape=(…, M)]
The MFCC without C0.
- E : Tensor [shape=(…, 1)] (optional)
The energy.
- c : Tensor [shape=(…, 1)] (optional)
The C0.
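Unlike the class, the functional form takes no `fft_length` argument, so \(L\) is implied by the last dimension of `x`, which holds \(L/2+1\) bins. The bookkeeping can be sketched with a small standard-library helper. `describe_mfcc_io` is hypothetical and not part of diffsptk, and the assumption that outputs come back in the order the letters appear in `out_format` is not confirmed by this page.

```python
def describe_mfcc_io(n_bins, mfcc_order, out_format="y"):
    """Infer L from the number of spectral bins and list expected output widths."""
    if n_bins < 2:
        raise ValueError("a power spectrum needs at least 2 bins")
    if out_format not in ("y", "yE", "yc", "ycE"):
        raise ValueError(f"unsupported out_format: {out_format!r}")
    fft_length = 2 * (n_bins - 1)  # inverts n_bins = L / 2 + 1
    widths = {"y": mfcc_order, "c": 1, "E": 1}
    # Assumption: one output per letter, in the order written in out_format.
    outputs = [(name, widths[name]) for name in out_format]
    return fft_length, outputs


fft_length, outputs = describe_mfcc_io(n_bins=17, mfcc_order=4, out_format="ycE")
print(fft_length, outputs)
```

For a spectrum with 17 bins this gives \(L = 32\), matching the `fft_length=32` used with `diffsptk.STFT` in the class example.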