mlsacheck

class diffsptk.MLSADigitalFilterStabilityCheck(cep_order: int, *, alpha: float = 0, pade_order: int = 4, strict: bool = True, threshold: float | None = None, fast: bool = True, n_fft: int = 256, warn_type: str = 'warn', mod_type: str = 'scale', device: device | None = None, dtype: dtype | None = None)

See this page for details.

Parameters:

cep_order (int >= 0): The order of the mel-cepstrum, \(M\).
alpha (float in (-1, 1)): The frequency warping factor, \(\alpha\).
pade_order (int in [4, 7]): The order of the Pade approximation.
strict (bool): If True, prioritizes maintaining the maximum log approximation error over MLSA filter stability.
threshold (float > 0 or None): The threshold value. If None, it is automatically computed.
fast (bool): If True, enables fast mode, which does not use the FFT.
n_fft (int > M): The number of FFT bins. Used only in non-fast mode.
warn_type (['ignore', 'warn', 'exit']): The warning type.
mod_type (['clip', 'scale']): The modification method.
device (torch.device or None): The device of this module.
dtype (torch.dtype or None): The data type of this module.

References

[1] S. Imai et al., “Mel log spectrum approximation (MLSA) filter for speech synthesis,” Electronics and Communications in Japan, vol. 66, no. 2, pp. 11-18, 1983.

forward(mc: Tensor) → Tensor

Check the stability of the MLSA digital filter.

Parameters:

mc (Tensor [shape=(…, M+1)]): The input mel-cepstrum.

Returns:

out (Tensor [shape=(…, M+1)]): The modified mel-cepstrum.

Examples

>>> c1 = diffsptk.nrand(4, stdv=10)
>>> c1
tensor([ 1.8963,  7.6629,  4.4804,  8.0669, -1.2768])
>>> mlsacheck = diffsptk.MLSADigitalFilterStabilityCheck(4, warn_type="ignore")
>>> c2 = mlsacheck(c1)
>>> c2
tensor([ 1.3336,  1.7537,  1.0254,  1.8462, -0.2922])

diffsptk.functional.mlsacheck(c: Tensor, *, alpha: float = 0, pade_order: int = 4, strict: bool = True, threshold: float | None = None, fast: bool = True, n_fft: int = 256, warn_type: str = 'warn', mod_type: str = 'scale') → Tensor

Check the stability of the MLSA digital filter.

Parameters:

c (Tensor [shape=(…, M+1)]): The input mel-cepstrum.
alpha (float in (-1, 1)): The frequency warping factor, \(\alpha\).
pade_order (int in [4, 7]): The order of the Pade approximation.
strict (bool): If True, prioritizes maintaining the maximum log approximation error over MLSA filter stability.
threshold (float > 0 or None): The threshold value. If None, it is automatically computed.
fast (bool): If True, enables fast mode, which does not use the FFT.
n_fft (int > M): The number of FFT bins. Used only in non-fast mode.
warn_type (['ignore', 'warn', 'exit']): The warning type.
mod_type (['clip', 'scale']): The modification method.

Returns:

out (Tensor [shape=(…, M+1)]): The modified mel-cepstrum.