fbank
Functions
-
int main(int argc, char *argv[])
fbank [ option ] [ infile ]
-n int
number of channels \((1 \le C)\)
-l int
FFT length \((2 \le N)\)
-s double
sampling rate in kHz \((0 < F_s)\)
-L double
lowest frequency in Hz \((0 \le F_l < F_h)\)
-H double
highest frequency in Hz \((F_l < F_h \le 500F_s)\)
-q int
input format
0
amplitude spectrum in dB
1
log amplitude spectrum
2
amplitude spectrum
3
power spectrum
4
windowed waveform
-o int
output format
0
fbank
1
fbank and energy
-e double
floor of raw filter-bank output \((0 < \epsilon)\)
infile str
double-type windowed sequence or spectrum
stdout
double-type mel-filter-bank output
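The ranges listed for the options can be checked together. The following is a minimal sketch, not the tool's actual argument parser: the `FbankOptions` struct, its field names, and `AreOptionsValid` are hypothetical names introduced for illustration. Note that the sampling rate is given in kHz, so \(500F_s\) is the Nyquist frequency in Hz.

```cpp
// Hypothetical container for the fbank options (names are illustrative only).
struct FbankOptions {
  int num_channel;           // -n, number of channels C
  int fft_length;            // -l, FFT length N
  double sampling_rate_khz;  // -s, sampling rate F_s in kHz
  double lowest_frequency;   // -L, F_l in Hz
  double highest_frequency;  // -H, F_h in Hz
  double floor;              // -e, floor of raw filter-bank output
};

// Returns true if the options satisfy the ranges documented above.
// Any further checks the real tool may perform are not modeled here.
bool AreOptionsValid(const FbankOptions& o) {
  // 500 * F_s [kHz] is half the sampling rate expressed in Hz.
  const double nyquist_hz = 500.0 * o.sampling_rate_khz;
  return 1 <= o.num_channel && 2 <= o.fft_length &&
         0.0 < o.sampling_rate_khz && 0.0 <= o.lowest_frequency &&
         o.lowest_frequency < o.highest_frequency &&
         o.highest_frequency <= nyquist_hz && 0.0 < o.floor;
}
```

For instance, with a 16 kHz sampling rate the highest frequency may be at most 8000 Hz.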
The example below extracts 20-channel mel-filter-bank outputs from a Hamming-windowed signal.
frame -l 400 -p 160 < data.d | window -l 400 -L 512 -w 1 | \
  fbank -l 512 -n 20 > data.fbank
- Parameters:
argc – [in] Number of arguments.
argv – [in] Argument vector.
- Returns:
0 on success, 1 on failure.
-
class MelFilterBankAnalysis
Perform mel-filter-bank analysis.
The input is the first half of the power spectrum:
\[ \begin{array}{cccc} |X(0)|^2, & |X(1)|^2, & \ldots, & |X(N/2)|^2, \end{array} \]
where \(N\) is the FFT length. The outputs are the \(C\)-channel mel-filter-bank outputs
\[ \begin{array}{cccc} F(1), & F(2), & \ldots, & F(C) \end{array} \]
and the log-signal energy \(E\).
The implementation is based on HTK [1]. The only difference from the HTK implementation is the constant in the mel-scale formula:
\[ m = 1127.01048 \log \left( 1 + \frac{f}{700} \right), \]
where HTK uses \(1127\) instead of \(1127.01048\).
[1] S. Young et al., “The HTK book,” Cambridge University Engineering Department, 2006.
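The mel-scale formula above can be written as a pair of conversion functions. This is a minimal sketch: `HzToMel` and `MelToHz` are illustrative names, not part of the documented API, and the inverse is obtained simply by solving the formula for \(f\). The logarithm is the natural logarithm.

```cpp
#include <cmath>

// Hz -> mel, using the constant quoted in the formula above.
double HzToMel(double hz) {
  return 1127.01048 * std::log(1.0 + hz / 700.0);
}

// mel -> Hz, the algebraic inverse of HzToMel.
double MelToHz(double mel) {
  return 700.0 * (std::exp(mel / 1127.01048) - 1.0);
}
```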
Public Functions
-
MelFilterBankAnalysis(int fft_length, int num_channel, double sampling_rate, double lowest_frequency, double highest_frequency, double floor, bool use_power)
- Parameters:
fft_length – [in] Number of FFT bins, \(N\).
num_channel – [in] Number of channels, \(C\).
sampling_rate – [in] Sampling rate in Hz.
lowest_frequency – [in] Lowest frequency in Hz.
highest_frequency – [in] Highest frequency in Hz.
floor – [in] Floor value of raw filter-bank output.
use_power – [in] If true, use the power spectrum instead of the amplitude spectrum.
-
inline int GetFftLength() const
- Returns:
FFT length.
-
inline int GetNumChannel() const
- Returns:
Number of channels.
-
inline double GetFloor() const
- Returns:
Floor value.
-
inline bool IsPowerUsed() const
- Returns:
Whether to use power spectrum.
-
inline bool IsValid() const
- Returns:
True if this object is valid.
-
bool GetCenterFrequencies(std::vector<double> *center_frequencies) const
- Parameters:
center_frequencies – [out] Center frequencies in Hz.
- Returns:
True on success, false on failure.
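To illustrate what center frequencies in Hz look like, here is a minimal sketch that assumes an HTK-style layout: \(C\) centers spaced uniformly on the mel scale strictly between the lowest and highest frequency. `SketchCenterFrequencies` is a hypothetical helper, and the class may compute its values differently.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: centers equally spaced on the mel scale (assumed layout).
std::vector<double> SketchCenterFrequencies(int num_channel, double lowest_hz,
                                            double highest_hz) {
  const double k = 1127.01048;  // Mel-scale constant from the class description.
  const double mel_low = k * std::log(1.0 + lowest_hz / 700.0);
  const double mel_high = k * std::log(1.0 + highest_hz / 700.0);
  // C centers split the band into C + 1 equal intervals on the mel axis.
  const double step = (mel_high - mel_low) / (num_channel + 1);

  std::vector<double> centers(num_channel);
  for (int c = 1; c <= num_channel; ++c) {
    const double mel = mel_low + step * c;
    centers[c - 1] = 700.0 * (std::exp(mel / k) - 1.0);  // Back to Hz.
  }
  return centers;
}
```

Because the spacing is uniform in mel, the centers are denser at low frequencies and sparser at high frequencies when expressed in Hz.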
-
bool Run(const std::vector<double> &power_spectrum, std::vector<double> *filter_bank_output, double *energy) const
- Parameters:
power_spectrum – [in] \((N/2+1)\)-length power spectrum.
filter_bank_output – [out] \(C\)-channel filter-bank outputs.
energy – [out] Log-signal energy \(E\) (optional).
- Returns:
True on success, false on failure.
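The computation performed by `Run` can be pictured with the following self-contained sketch. It is an assumption-laden illustration, not the library's code: it assumes HTK-style triangular filters equally spaced on the mel scale, takes the square root of the input when `use_power` is false, floors the raw output, and then takes the natural logarithm. `SketchMelFilterBank` is a hypothetical name, and the energy output is omitted.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the documented API.
// Returns C log filter-bank outputs, or an empty vector on a size mismatch.
std::vector<double> SketchMelFilterBank(
    const std::vector<double>& power_spectrum, int fft_length,
    int num_channel, double sampling_rate_hz, double lowest_hz,
    double highest_hz, double floor, bool use_power) {
  if (power_spectrum.size() != static_cast<std::size_t>(fft_length / 2 + 1)) {
    return std::vector<double>();
  }
  const double k = 1127.01048;
  auto hz_to_mel = [k](double hz) { return k * std::log(1.0 + hz / 700.0); };
  const double mel_low = hz_to_mel(lowest_hz);
  const double step = (hz_to_mel(highest_hz) - mel_low) / (num_channel + 1);

  std::vector<double> raw(num_channel, 0.0);
  for (int bin = 0; bin <= fft_length / 2; ++bin) {
    const double mel = hz_to_mel(bin * sampling_rate_hz / fft_length);
    const double value =
        use_power ? power_spectrum[bin] : std::sqrt(power_spectrum[bin]);
    for (int c = 1; c <= num_channel; ++c) {
      const double center = mel_low + step * c;
      // Triangle with weight 1 at its center and 0 at the neighboring centers.
      const double weight = 1.0 - std::fabs(mel - center) / step;
      if (0.0 < weight) raw[c - 1] += weight * value;
    }
  }
  // Floor the raw outputs, then take the logarithm.
  for (double& v : raw) v = std::log(std::max(floor, v));
  return raw;
}
```

With an all-zero spectrum, every output equals the logarithm of the floor value, which shows why the floor must be strictly positive.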