plp

Functions

int main(int argc, char *argv[])

plp [ option ] [ infile ]

-n int
- number of channels \((1 \le C)\)
-m int
- order of coefficients \((1 \le M)\)
-l int
- FFT length \((2 \le N)\)
-c int
- liftering parameter \((1 \le L)\)
-f double
- compression factor \((0 < f)\)
-s double
- sampling rate in kHz \((0 < F_s)\)
-L double
- lowest frequency in Hz \((0 \le F_l < F_h)\)
-H double
- highest frequency in Hz \((F_l < F_h \le 500F_s)\)
-q int
- input format
  - 0 amplitude spectrum in dB
  - 1 log amplitude spectrum
  - 2 amplitude spectrum
  - 3 power spectrum
  - 4 windowed waveform
-o int
- output format
  - 0 PLP
  - 1 PLP and energy
  - 2 PLP and C0
  - 3 PLP, C0, and energy
-e double
- floor value of raw filter-bank output \((0 < \epsilon)\)
infile str
- double-type windowed sequence or spectrum
stdout
- double-type PLP features
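
The ranges listed above can be collected into a single check. The following is a minimal sketch, not the tool's actual validation code: the `PlpOptions` struct and the `SatisfiesListedRanges` function are hypothetical names introduced for illustration, and the real command may enforce additional conditions that are not listed here.

```cpp
// Hypothetical container for the plp options; the field names are
// illustrative and are not part of SPTK.
struct PlpOptions {
  int num_channel;              // -n, C
  int num_order;                // -m, M
  int fft_length;               // -l, N
  int liftering_parameter;      // -c, L
  double compression_factor;    // -f, f
  double sampling_rate_khz;     // -s, F_s in kHz
  double lowest_frequency_hz;   // -L, F_l in Hz
  double highest_frequency_hz;  // -H, F_h in Hz
  double floor;                 // -e, epsilon
};

// Returns true if every option lies in the range given in the option list.
// Only the listed ranges are checked (assumption: nothing else is required).
bool SatisfiesListedRanges(const PlpOptions& o) {
  // F_s is in kHz, so 500 * F_s is half the sampling rate expressed in Hz.
  const double upper_limit_hz = 500.0 * o.sampling_rate_khz;
  return 1 <= o.num_channel && 1 <= o.num_order && 2 <= o.fft_length &&
         1 <= o.liftering_parameter && 0.0 < o.compression_factor &&
         0.0 < o.sampling_rate_khz && 0.0 <= o.lowest_frequency_hz &&
         o.lowest_frequency_hz < o.highest_frequency_hz &&
         o.highest_frequency_hz <= upper_limit_hz && 0.0 < o.floor;
}
```

Note that the bound \(F_h \le 500F_s\) mixes units: \(F_s\) is given in kHz while \(F_h\) is given in Hz, so at a 16 kHz sampling rate the highest frequency may not exceed 8000 Hz.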

The example below extracts 12th-order PLP features from data.short. The analysis conditions are: frame length 25 ms, frame shift 10 ms, and sampling rate 16 kHz. A pre-emphasis filter and a Hamming window are applied to the input signal.

x2x +sd data.short |
  frame -l 400 -p 160 -n 1 |
  dfs -b 1 -0.97 |
  window -l 400 -L 512 -w 1 -n 0 |
  plp -l 512 -n 40 -c 22 -m 12 -L 64 -H 4000 -f 0.33 -o 2 > data.plp

The corresponding HTK config file is shown below.

SOURCEFORMAT = NOHEAD
SOURCEKIND   = WAVEFORM
SOURCERATE   = 625.0
TARGETKIND   = PLP_0
TARGETRATE   = 100000.0
WINDOWSIZE   = 250000.0
USEHAMMING   = T
USEPOWER     = T
RAWENERGY    = F
ENORMALIZE   = F
PREEMCOEF    = 0.97
COMPRESSFACT = 0.33
NUMCHANS     = 40
CEPLIFTER    = 22
NUMCEPS      = 12
LOFREQ       = 64
HIFREQ       = 4000
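
HTK expresses SOURCERATE, TARGETRATE, and WINDOWSIZE in units of 100 ns, so the values above follow from the sample counts passed to frame and the 16 kHz sampling rate. The sketch below shows the arithmetic; the function names `HtkSourceRate` and `HtkDuration` are hypothetical helpers introduced for illustration only.

```cpp
#include <cassert>

// HTK time parameters are given in units of 100 ns, i.e. 1e7 units per second.
constexpr double kHtkUnitsPerSecond = 1.0e7;

// Sample period in HTK units for a sampling rate given in Hz (SOURCERATE).
double HtkSourceRate(double sampling_rate_hz) {
  assert(sampling_rate_hz > 0.0);
  return kHtkUnitsPerSecond / sampling_rate_hz;
}

// Duration of num_samples samples in HTK units (TARGETRATE, WINDOWSIZE).
double HtkDuration(int num_samples, double sampling_rate_hz) {
  return num_samples * HtkSourceRate(sampling_rate_hz);
}
```

With these helpers, `HtkSourceRate(16000.0)` gives 625, `HtkDuration(160, 16000.0)` gives 100000 (a 10 ms frame shift), and `HtkDuration(400, 16000.0)` gives 250000 (a 25 ms frame length), matching the config file.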

Parameters:

argc – [in] Number of arguments.
argv – [in] Argument vector.

Returns:

0 on success, 1 on failure.

See also

fbank mfcc

class PerceptualLinearPredictiveCoefficientsAnalysis

Perform perceptual linear predictive (PLP) coefficients analysis.

The input is the first half of the power spectrum:

\[ \begin{array}{cccc} |X(0)|^2, & |X(1)|^2, & \ldots, & |X(N/2)|^2, \end{array} \]

where \(N\) is the FFT length. The outputs are the \(M\)-th order PLP features with the zeroth cepstral parameter:

\[ \begin{array}{ccccc} c(0), & \bar{c}(1), & \bar{c}(2), & \ldots, & \bar{c}(M) \end{array} \]

and the log-signal energy \(E\).

[1] S. Young et al., “The HTK book,” Cambridge University Engineering Department, 2006.
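
The vector sizes implied by the description above can be sketched as follows. This is an illustration under stated assumptions, not library code: `InputLength` and `OutputLength` are hypothetical helpers, the FFT length is assumed to be even, and the per-frame counts for the command's output formats 0 to 3 are inferred from the option list (one extra value each for C0 and for energy).

```cpp
#include <cassert>

// Number of bins in the half power spectrum |X(0)|^2 ... |X(N/2)|^2.
// Assumption: the FFT length N is even.
int InputLength(int fft_length) {
  assert(fft_length >= 2 && fft_length % 2 == 0);
  return fft_length / 2 + 1;
}

// Number of values per frame for the output formats of the plp command:
// 0 PLP, 1 PLP and energy, 2 PLP and C0, 3 PLP, C0, and energy.
// Assumption: C0 and energy each contribute exactly one value.
int OutputLength(int num_order, int output_format) {
  assert(num_order >= 1);
  assert(0 <= output_format && output_format <= 3);
  const bool has_c0 = (output_format == 2 || output_format == 3);
  const bool has_energy = (output_format == 1 || output_format == 3);
  return num_order + (has_c0 ? 1 : 0) + (has_energy ? 1 : 0);
}
```

For example, with \(N = 512\) and \(M = 12\), the input has 257 values per frame and output format 2 yields 13 values per frame.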

Public Functions

PerceptualLinearPredictiveCoefficientsAnalysis(int fft_length, int num_channel, int num_order, int liftering_coefficient, double compression_factor, double sampling_rate, double lowest_frequency, double highest_frequency, double floor)

Parameters:

fft_length – [in] Number of FFT bins, \(N\).
num_channel – [in] Number of channels, \(C\).
num_order – [in] Order of cepstral coefficients, \(M\).
liftering_coefficient – [in] A parameter of liftering, \(L\).
compression_factor – [in] Amplitude compression factor.
sampling_rate – [in] Sampling rate in Hz.
lowest_frequency – [in] Lowest frequency in Hz.
highest_frequency – [in] Highest frequency in Hz.
floor – [in] Floor value of raw filter-bank output.

inline int GetFftLength() const

Returns: FFT size.

inline int GetNumChannel() const

Returns: Number of channels.

inline int GetNumOrder() const

Returns: Order of cepstral coefficients.

inline int GetLifteringCoefficient() const

Returns: Liftering coefficient.

inline double GetCompressionFactor() const

Returns: Compression factor.

inline bool IsValid() const

Returns: True if this object is valid.

bool Run(const std::vector<double> &power_spectrum, std::vector<double> *plp, double *energy, PerceptualLinearPredictiveCoefficientsAnalysis::Buffer *buffer) const

Parameters:

power_spectrum – [in] \((N/2+1)\)-length power spectrum.
plp – [out] \(M\)-th order PLP features.
energy – [out] Signal energy \(E\) (optional).
buffer – [out] Buffer.

Returns:

True on success, false on failure.

class Buffer: Buffer for PerceptualLinearPredictiveCoefficientsAnalysis class.