mgcep

Functions

int main(int argc, char *argv[])

mgcep [ option ] [ infile ]

  • -m int

    • order of coefficients \((0 \le M)\)

  • -a double

    • all-pass constant \((|\alpha| < 1)\)

  • -g double

    • gamma \((|\gamma| \le 1)\)

  • -c int

    • gamma \(\gamma = -1 / C\) \((1 \le C)\)

  • -l int

    • FFT length \((2 \le N)\)

  • -q int

    • input format

      • 0 amplitude spectrum in dB

      • 1 log amplitude spectrum

      • 2 amplitude spectrum

      • 3 power spectrum

      • 4 windowed waveform

  • -o int

    • output format

      • 0 mel-cepstrum

      • 1 MLSA filter coefficients

      • 2 gain normalized mel-cepstrum

      • 3 gain normalized MLSA filter coefficients

  • -i int

    • number of iterations \((0 \le J)\)

  • -d double

    • convergence threshold \((0 \le \epsilon)\)

  • -e double

    • small value added to power spectrum

  • -E double

    • relative floor in decibels

  • infile str

    • double-type windowed sequence or spectrum

  • stdout

    • double-type mel-generalized cepstral coefficients

In the example below, mel-cepstral coefficients are extracted from data.d.

frame < data.d | window | mgcep > data.mcep

This is equivalent to the following line.

frame < data.d | window | fftr -o 3 -H | mgcep -q 3 > data.mcep
Parameters:
  • argc[in] Number of arguments.

  • argv[in] Argument vector.

Returns:

0 on success, 1 on failure.
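
The spectral input formats selected by -q (0 to 3) differ only in how each bin is scaled. The following is a minimal sketch of how such values could be mapped to a power spectrum. It is not taken from the SPTK source: the helper name ToPowerSpectrum is hypothetical, and it assumes that "dB" means \(20 \log_{10} |X|\), that "log" means the natural logarithm, and that the -e value is simply added to each bin.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of SPTK): converts one spectrum frame given
// in input format 0-3 into a power spectrum |X(k)|^2.
std::vector<double> ToPowerSpectrum(const std::vector<double>& input,
                                    int input_format, double epsilon = 0.0) {
  std::vector<double> power(input.size());
  for (std::size_t k = 0; k < input.size(); ++k) {
    const double x = input[k];
    switch (input_format) {
      case 0:  // Assumed: x = 20 log10 |X|
        power[k] = std::pow(10.0, x / 10.0);
        break;
      case 1:  // Assumed: x = ln |X|
        power[k] = std::exp(2.0 * x);
        break;
      case 2:  // x = |X|
        power[k] = x * x;
        break;
      case 3:  // x = |X|^2
        power[k] = x;
        break;
      default:
        throw std::invalid_argument("unsupported input format");
    }
    power[k] += epsilon;  // Assumed role of the small value given by -e.
  }
  return power;
}
```

Format 4 (windowed waveform) is left out because it needs a Fourier transform rather than a per-bin mapping.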

class MelCepstralAnalysis

Calculate mel-cepstrum from periodogram.

The input is half of the periodogram:

\[ \begin{array}{cccc} |X(0)|^2, & |X(1)|^2, & \ldots, & |X(N/2)|^2, \end{array} \]
where \(N\) is the FFT length. The output is the \(M\)-th order mel-cepstral coefficients:
\[ \begin{array}{cccc} \tilde{c}(0), & \tilde{c}(1), & \ldots, & \tilde{c}(M). \end{array} \]
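
To make the expected input layout concrete, the sketch below computes the \(N/2+1\) values \(|X(0)|^2, \ldots, |X(N/2)|^2\) from a windowed frame with a naive DFT. The function name HalfPeriodogram is hypothetical and not part of SPTK. The direct \(O(N^2)\) sum is used only for clarity; a real pipeline would presumably use an FFT.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SPTK): returns |X(k)|^2 for k = 0..N/2.
// Frames shorter than fft_length are implicitly zero-padded; longer frames
// are truncated to fft_length samples.
std::vector<double> HalfPeriodogram(const std::vector<double>& frame,
                                    int fft_length) {
  const double pi = std::acos(-1.0);
  const std::size_t length = static_cast<std::size_t>(fft_length);
  std::vector<double> periodogram(length / 2 + 1);
  for (std::size_t k = 0; k < periodogram.size(); ++k) {
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t n = 0; n < frame.size() && n < length; ++n) {
      const double angle = -2.0 * pi * static_cast<double>(k) *
                           static_cast<double>(n) / fft_length;
      sum += frame[n] * std::polar(1.0, angle);  // x(n) e^{-j 2 pi k n / N}
    }
    periodogram[k] = std::norm(sum);  // std::norm gives the squared magnitude.
  }
  return periodogram;
}
```

For an FFT length of 8, the returned vector has 5 elements, which matches the \((N/2+1)\)-length input that the analysis expects.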

In mel-cepstral analysis, the spectrum of a speech signal is modeled by \(M\)-th order mel-cepstral coefficients as follows:

\[ H(z) = \exp \sum_{m=0}^M \tilde{c}(m) \tilde{z}^{-m}, \]
where
\[ \tilde{z}^{-1} = \frac{z^{-1} - \alpha}{1 - \alpha z^{-1}} \]
is the first-order all-pass function. The phase characteristic of the all-pass function is controlled by \(\alpha\). Typical values that approximate the mel scale are summarized below.

Sample rate [kHz]    Alpha
8                    0.31
10                   0.35
12                   0.37
16                   0.42
22.05                0.45
32                   0.50
44.1                 0.53
48                   0.55
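
The effect of \(\alpha\) can be seen by evaluating the all-pass function on the unit circle. Substituting \(z = e^{j\omega}\) gives \(\tilde{z}^{-1} = e^{-j\tilde{\omega}}\) with \(\tilde{\omega} = \omega + 2 \arctan \left( \alpha \sin\omega / (1 - \alpha \cos\omega) \right)\). The sketch below evaluates this warped frequency. The function name WarpFrequency is hypothetical and not part of the documented API.

```cpp
#include <cmath>

// Hypothetical helper (not part of SPTK): warped angular frequency produced
// by the first-order all-pass function for omega in [0, pi] and |alpha| < 1.
double WarpFrequency(double omega, double alpha) {
  // For |alpha| < 1 the second argument is positive, so atan2 reduces to the
  // ordinary arctangent of the ratio.
  return omega + 2.0 * std::atan2(alpha * std::sin(omega),
                                  1.0 - alpha * std::cos(omega));
}
```

The end points 0 and \(\pi\) map to themselves. With a positive \(\alpha\) such as 0.42, the mid-band frequency \(\pi/2\) maps above \(\pi/2\), so low frequencies receive a larger share of the warped axis, as on the mel scale.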

Note that the implementation is based on an unpublished paper.

Public Functions

MelCepstralAnalysis(int fft_length, int num_order, double alpha, int num_iteration, double convergence_threshold)
Parameters:
  • fft_length[in] Number of FFT bins, \(N\).

  • num_order[in] Order of cepstral coefficients, \(M\).

  • alpha[in] All-pass constant, \(\alpha\).

  • num_iteration[in] Number of iterations of Newton method, \(J\).

  • convergence_threshold[in] Convergence threshold, \(\epsilon\).

inline int GetFftLength() const
Returns:

FFT length.

inline int GetNumOrder() const
Returns:

Order of coefficients.

inline double GetAlpha() const
Returns:

All-pass constant.

inline int GetNumIteration() const
Returns:

Number of iterations.

inline double GetConvergenceThreshold() const
Returns:

Convergence threshold.

inline bool IsValid() const
Returns:

True if this object is valid.

bool Run(const std::vector<double> &periodogram, std::vector<double> *mel_cepstrum, MelCepstralAnalysis::Buffer *buffer) const
Parameters:
  • periodogram[in] \((N/2+1)\)-length periodogram.

  • mel_cepstrum[out] \(M\)-th order mel-cepstral coefficients.

  • buffer[out] Buffer.

Returns:

True on success, false on failure.

class Buffer

Buffer for MelCepstralAnalysis class.

class MelGeneralizedCepstralAnalysis

Calculate mel-generalized cepstrum from periodogram.

The input is half of the periodogram:

\[ \begin{array}{cccc} |X(0)|^2, & |X(1)|^2, & \ldots, & |X(N/2)|^2, \end{array} \]
where \(N\) is the FFT length. The output is the \(M\)-th order mel-generalized cepstral coefficients:
\[ \begin{array}{cccc} \tilde{c}_\gamma(0), & \tilde{c}_\gamma(1), & \ldots, & \tilde{c}_\gamma(M). \end{array} \]

In mel-generalized cepstral analysis, the spectrum of a speech signal is modeled by \(M\)-th order mel-generalized cepstral coefficients as follows:

\[ \begin{aligned} H(z) &= s^{-1}_\gamma \left( \sum_{m=0}^M \tilde{c}_\gamma(m) \tilde{z}^{-m} \right) \\ &= \begin{cases} \left( 1 + \gamma \displaystyle\sum_{m=0}^M \tilde{c}_\gamma(m) \tilde{z}^{-m} \right)^{1/\gamma}, & -1 \le \gamma < 0 \\ \exp \displaystyle\sum_{m=0}^M \tilde{c}_\gamma(m) \tilde{z}^{-m}, & \gamma = 0 \end{cases} \end{aligned} \]
where
\[ \tilde{z}^{-1} = \frac{z^{-1} - \alpha}{1 - \alpha z^{-1}}. \]
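
As an illustration of the model, the sketch below evaluates \(\ln |H(e^{j\omega})|\) directly from the two cases of the definition. It uses \(\tilde{z}^{-m} = e^{-jm\tilde{\omega}}\) on the unit circle, where \(\tilde{\omega} = \omega + 2 \arctan \left( \alpha \sin\omega / (1 - \alpha \cos\omega) \right)\). The function name LogMagnitudeResponse is hypothetical and not part of the documented API; this is a direct evaluation of the formula, not the analysis algorithm itself.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SPTK): natural-log magnitude of H(z) at
// z = exp(j * omega) for mel-generalized cepstral coefficients c[0..M].
double LogMagnitudeResponse(const std::vector<double>& c, double alpha,
                            double gamma, double omega) {
  // Phase of the all-pass function on the unit circle (warped frequency).
  const double warped =
      omega + 2.0 * std::atan2(alpha * std::sin(omega),
                               1.0 - alpha * std::cos(omega));
  // S = sum_m c(m) * exp(-j * m * warped)
  std::complex<double> sum(0.0, 0.0);
  for (std::size_t m = 0; m < c.size(); ++m) {
    sum += c[m] * std::polar(1.0, -warped * static_cast<double>(m));
  }
  if (gamma == 0.0) {
    return sum.real();  // ln |exp(S)| = Re(S)
  }
  // ln |(1 + gamma * S)^(1 / gamma)| = ln |1 + gamma * S| / gamma
  return std::log(std::abs(1.0 + gamma * sum)) / gamma;
}
```

With \(\gamma = 0\) this reduces to the mel-cepstral model, and with \(\alpha = 0\) the warped frequency equals \(\omega\), so the frequency axis is left unwarped.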

Public Functions

MelGeneralizedCepstralAnalysis(int fft_length, int num_order, double alpha, double gamma, int num_iteration, double convergence_threshold)
Parameters:
  • fft_length[in] Number of FFT bins, \(N\).

  • num_order[in] Order of cepstral coefficients, \(M\).

  • alpha[in] All-pass constant, \(\alpha\).

  • gamma[in] Exponent parameter, \(\gamma\).

  • num_iteration[in] Number of iterations of Newton method, \(J\).

  • convergence_threshold[in] Convergence threshold, \(\epsilon\).

inline int GetFftLength() const
Returns:

FFT length.

inline int GetNumOrder() const
Returns:

Order of coefficients.

inline double GetAlpha() const
Returns:

All-pass constant.

inline double GetGamma() const
Returns:

Gamma.

inline int GetNumIteration() const
Returns:

Number of iterations.

inline double GetConvergenceThreshold() const
Returns:

Convergence threshold.

inline bool IsValid() const
Returns:

True if this object is valid.

bool Run(const std::vector<double> &periodogram, std::vector<double> *mel_generalized_cepstrum, MelGeneralizedCepstralAnalysis::Buffer *buffer) const
Parameters:
  • periodogram[in] \((N/2+1)\)-length periodogram.

  • mel_generalized_cepstrum[out] \(M\)-th order mel-generalized cepstral coefficients.

  • buffer[out] Buffer.

Returns:

True on success, false on failure.

class Buffer

Buffer for MelGeneralizedCepstralAnalysis class.