mgcep

Functions

int main(int argc, char *argv[])

mgcep [ option ] [ infile ]

  • -m int

    • order of coefficients \((0 \le M)\)

  • -a double

    • all-pass constant \((|\alpha| < 1)\)

  • -g double

    • gamma \((|\gamma| \le 1)\)

  • -c int

    • gamma \(\gamma = -1 / C\) \((1 \le C)\)

  • -l int

    • FFT length \((2 \le N)\)

  • -q int

    • input format

      • 0 amplitude spectrum in dB

      • 1 log amplitude spectrum

      • 2 amplitude spectrum

      • 3 power spectrum

      • 4 windowed waveform

  • -o int

    • output format

      • 0 mel-cepstrum

      • 1 MLSA filter coefficients

      • 2 gain normalized mel-cepstrum

      • 3 gain normalized MLSA filter coefficients

  • -i int

    • number of iterations \((0 \le J)\)

  • -d double

    • convergence threshold \((0 \le \epsilon)\)

  • -e double

    • small value added to power spectrum

  • -E double

    • relative floor in decibels

  • infile str

    • double-type windowed sequence or spectrum

  • stdout

    • double-type mel-generalized cepstral coefficients
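The spectrum input formats listed for -q differ only in how each bin encodes the magnitude, so formats 0 to 3 can all be mapped to a power spectrum before analysis. The following is a minimal sketch of that mapping, not the tool's actual code. The helper name ToPowerSpectrum is hypothetical, and the sketch assumes that format 0 stores \(20 \log_{10} |X(k)|\) and format 1 stores \(\ln |X(k)|\).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SPTK): converts one spectrum frame given in
// input format q (0 to 3, as listed for -q) into a power spectrum |X(k)|^2.
// Assumption: format 0 is 20*log10|X(k)| and format 1 is ln|X(k)|.
std::vector<double> ToPowerSpectrum(const std::vector<double>& input, int q) {
  assert(0 <= q && q <= 3);  // Format 4 (windowed waveform) needs an FFT instead.
  std::vector<double> power(input.size());
  for (std::size_t k = 0; k < input.size(); ++k) {
    const double x = input[k];
    switch (q) {
      case 0: power[k] = std::pow(10.0, x / 10.0); break;  // dB -> |X|^2
      case 1: power[k] = std::exp(2.0 * x); break;         // ln|X| -> |X|^2
      case 2: power[k] = x * x; break;                     // |X| -> |X|^2
      default: power[k] = x; break;                        // already |X|^2
    }
  }
  return power;
}
```

For instance, under these assumptions an amplitude of 20 dB in format 0 corresponds to a power of 100 in format 3.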

In the example below, mel-cepstral coefficients are extracted from data.d.

frame < data.d | window | mgcep > data.mcep

This is equivalent to the following line.

frame < data.d | window | fftr -o 3 -H | mgcep -q 3 > data.mcep

Parameters
  • argc[in] Number of arguments.

  • argv[in] Argument vector.

Returns

0 on success, 1 on failure.

class sptk::MelCepstralAnalysis

Calculate mel-cepstrum from periodogram.

The input is half of the periodogram:

\[ \begin{array}{cccc} |X(0)|^2, & |X(1)|^2, & \ldots, & |X(N/2)|^2, \end{array} \]
where \(N\) is the FFT length. The output is the \(M\)-th order mel-cepstral coefficients:
\[ \begin{array}{cccc} \tilde{c}(0), & \tilde{c}(1), & \ldots, & \tilde{c}(M). \end{array} \]
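To make the input layout concrete, the sketch below computes the \(N/2+1\) periodogram values \(|X(0)|^2, \ldots, |X(N/2)|^2\) from a windowed frame. It is an illustration only: HalfPeriodogram is a hypothetical helper, and it uses a direct \(O(N^2)\) DFT rather than the FFT a real pipeline would use.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SPTK): returns |X(k)|^2 for k = 0..N/2,
// where N is fft_length. Frames shorter than N are implicitly zero-padded.
std::vector<double> HalfPeriodogram(const std::vector<double>& frame,
                                    int fft_length) {
  assert(2 <= fft_length && fft_length % 2 == 0);
  assert(frame.size() <= static_cast<std::size_t>(fft_length));
  const double kPi = std::acos(-1.0);
  std::vector<double> periodogram(fft_length / 2 + 1);
  for (int k = 0; k <= fft_length / 2; ++k) {
    double re = 0.0;
    double im = 0.0;
    for (std::size_t n = 0; n < frame.size(); ++n) {
      // X(k) = sum_n x(n) * exp(-j * 2 * pi * k * n / N)
      const double phase = 2.0 * kPi * k * static_cast<double>(n) / fft_length;
      re += frame[n] * std::cos(phase);
      im -= frame[n] * std::sin(phase);
    }
    periodogram[k] = re * re + im * im;
  }
  return periodogram;
}
```

A unit impulse gives a flat periodogram of ones, which is a convenient sanity check for the \(N/2+1\) length and the scaling.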

In mel-cepstral analysis, the spectrum of a speech signal is modeled by \(M\)-th order mel-cepstral coefficients as follows:

\[ H(z) = \exp \sum_{m=0}^M \tilde{c}(m) \tilde{z}^{-m}, \]
where
\[ \tilde{z}^{-1} = \frac{z^{-1} - \alpha}{1 - \alpha z^{-1}} \]
is the first-order all-pass function. The phase characteristic of the all-pass function is controlled by \(\alpha\). Typical values that approximate the mel scale are summarized below.

Sample rate [kHz]    Alpha
8                    0.31
10                   0.35
12                   0.37
16                   0.42
22.5                 0.45
32                   0.50
44.1                 0.53
48                   0.55
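The effect of \(\alpha\) can be seen by evaluating the all-pass function on the unit circle, where \(\tilde{z}^{-1} = e^{-j\beta(\omega)}\) with \(\beta(\omega) = \omega + 2 \arctan \frac{\alpha \sin \omega}{1 - \alpha \cos \omega}\). The log magnitude of the model then reduces to \(\log |H(e^{j\omega})| = \sum_{m=0}^M \tilde{c}(m) \cos(m \beta(\omega))\). The sketch below illustrates both relations. WarpFrequency and LogMagnitude are hypothetical helper names and are not part of the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: warped frequency beta(omega) of the first-order
// all-pass function, valid for |alpha| < 1 and omega in [0, pi].
double WarpFrequency(double omega, double alpha) {
  assert(std::fabs(alpha) < 1.0);
  return omega + 2.0 * std::atan2(alpha * std::sin(omega),
                                  1.0 - alpha * std::cos(omega));
}

// Hypothetical helper: log|H(e^{j omega})| = sum_m c~(m) * cos(m * beta).
double LogMagnitude(const std::vector<double>& mel_cepstrum, double alpha,
                    double omega) {
  const double beta = WarpFrequency(omega, alpha);
  double sum = 0.0;
  for (std::size_t m = 0; m < mel_cepstrum.size(); ++m) {
    sum += mel_cepstrum[m] * std::cos(static_cast<double>(m) * beta);
  }
  return sum;
}
```

With \(\alpha = 0\) the warping is the identity, and a positive \(\alpha\) stretches the low-frequency region, which is how the values in the table approximate the mel scale.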

Note that the implementation is based on an unpublished paper.

Public Functions

MelCepstralAnalysis(int fft_length, int num_order, double alpha, int num_iteration, double convergence_threshold)
Parameters
  • fft_length[in] Number of FFT bins, \(N\).

  • num_order[in] Order of cepstral coefficients, \(M\).

  • alpha[in] All-pass constant, \(\alpha\).

  • num_iteration[in] Number of iterations of Newton method, \(J\).

  • convergence_threshold[in] Convergence threshold, \(\epsilon\).

inline int GetFftLength() const
Returns

FFT length.

inline int GetNumOrder() const
Returns

Order of coefficients.

inline double GetAlpha() const
Returns

All-pass constant.

inline int GetNumIteration() const
Returns

Number of iterations.

inline double GetConvergenceThreshold() const
Returns

Convergence threshold.

inline bool IsValid() const
Returns

True if this object is valid.

bool Run(const std::vector<double> &periodogram, std::vector<double> *mel_cepstrum, MelCepstralAnalysis::Buffer *buffer) const
Parameters
  • periodogram[in] \((N/2+1)\)-length periodogram.

  • mel_cepstrum[out] \(M\)-th order mel-cepstral coefficients.

  • buffer[out] Buffer.

Returns

True on success, false on failure.

class Buffer

Buffer for MelCepstralAnalysis class.

class sptk::MelGeneralizedCepstralAnalysis

Calculate mel-generalized cepstrum from periodogram.

The input is half of the periodogram:

\[ \begin{array}{cccc} |X(0)|^2, & |X(1)|^2, & \ldots, & |X(N/2)|^2, \end{array} \]
where \(N\) is the FFT length. The output is the \(M\)-th order mel-generalized cepstral coefficients:
\[ \begin{array}{cccc} \tilde{c}_\gamma(0), & \tilde{c}_\gamma(1), & \ldots, & \tilde{c}_\gamma(M). \end{array} \]

In mel-generalized cepstral analysis, the spectrum of a speech signal is modeled by \(M\)-th order mel-generalized cepstral coefficients as follows:

\[ \begin{aligned} H(z) &= s^{-1}_\gamma \left( \sum_{m=0}^M \tilde{c}_\gamma(m) \tilde{z}^{-m} \right) \\ &= \begin{cases} \left( 1 + \gamma \displaystyle\sum_{m=0}^M \tilde{c}_\gamma(m) \tilde{z}^{-m} \right)^{1/\gamma}, & -1 \le \gamma < 0 \\ \exp \displaystyle\sum_{m=0}^M \tilde{c}_\gamma(m) \tilde{z}^{-m}, & \gamma = 0 \end{cases} \end{aligned} \]
where
\[ \tilde{z}^{-1} = \frac{z^{-1} - \alpha}{1 - \alpha z^{-1}}. \]
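The model above can be evaluated numerically at \(z = e^{j\omega}\) to inspect the spectrum that a set of coefficients represents. The following is a minimal sketch under the definitions given here, not the library's implementation. EvaluateSpectrum is a hypothetical helper, and the accepted range of \(\gamma\) follows the two cases of the formula.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SPTK): evaluates H(z) at z = exp(j * omega)
// for mel-generalized cepstral coefficients c, with -1 <= gamma <= 0.
std::complex<double> EvaluateSpectrum(const std::vector<double>& c,
                                      double alpha, double gamma,
                                      double omega) {
  assert(std::fabs(alpha) < 1.0);
  assert(-1.0 <= gamma && gamma <= 0.0);
  const std::complex<double> z_inv = std::polar(1.0, -omega);
  // First-order all-pass function: (z^-1 - alpha) / (1 - alpha * z^-1).
  const std::complex<double> warped = (z_inv - alpha) / (1.0 - alpha * z_inv);
  std::complex<double> sum(0.0, 0.0);
  std::complex<double> power(1.0, 0.0);  // Holds warped^m.
  for (std::size_t m = 0; m < c.size(); ++m) {
    sum += c[m] * power;
    power *= warped;
  }
  if (gamma == 0.0) {
    return std::exp(sum);  // Mel-cepstral case.
  }
  return std::pow(1.0 + gamma * sum, 1.0 / gamma);  // Generalized case.
}
```

When \(\gamma\) is given as \(-1/C\) with an integer \(C \ge 1\), as the -c option of mgcep does, \(C = 1\) yields \(\gamma = -1\), for which the model reduces to the all-pole form \(1 / (1 - \sum_m \tilde{c}_\gamma(m) \tilde{z}^{-m})\).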

Public Functions

MelGeneralizedCepstralAnalysis(int fft_length, int num_order, double alpha, double gamma, int num_iteration, double convergence_threshold)
Parameters
  • fft_length[in] Number of FFT bins, \(N\).

  • num_order[in] Order of cepstral coefficients, \(M\).

  • alpha[in] All-pass constant, \(\alpha\).

  • gamma[in] Exponent parameter, \(\gamma\).

  • num_iteration[in] Number of iterations of Newton method, \(J\).

  • convergence_threshold[in] Convergence threshold, \(\epsilon\).

inline int GetFftLength() const
Returns

FFT length.

inline int GetNumOrder() const
Returns

Order of coefficients.

inline double GetAlpha() const
Returns

All-pass constant.

inline double GetGamma() const
Returns

Gamma.

inline int GetNumIteration() const
Returns

Number of iterations.

inline double GetConvergenceThreshold() const
Returns

Convergence threshold.

inline bool IsValid() const
Returns

True if this object is valid.

bool Run(const std::vector<double> &periodogram, std::vector<double> *mel_generalized_cepstrum, MelGeneralizedCepstralAnalysis::Buffer *buffer) const
Parameters
  • periodogram[in] \((N/2+1)\)-length periodogram.

  • mel_generalized_cepstrum[out] \(M\)-th order mel-generalized cepstral coefficients.

  • buffer[out] Buffer.

Returns

True on success, false on failure.

class Buffer

Buffer for MelGeneralizedCepstralAnalysis class.