gmm#
- diffsptk.GMM#
alias of GaussianMixtureModeling
- class diffsptk.GaussianMixtureModeling(order: int, n_mixture: int, *, n_iter: int = 100, eps: float = 1e-05, weight_floor: float = 1e-05, var_floor: float = 1e-06, var_type: str = 'diag', block_size: list[int] | tuple[int, ...] | ndarray | None = None, ubm: tuple[Tensor, Tensor, Tensor] | None = None, alpha: float = 0, batch_size: int | None = None, verbose: bool | int = False)[source]#
See this page for details. Note that the forward method is not differentiable.
- Parameters:
- orderint >= 0
The order of the vector, M.
- n_mixtureint >= 1
The number of mixture components, K.
- n_iterint >= 1
The number of iterations.
- epsfloat >= 0
The convergence threshold.
- weight_floorfloat >= 0
The floor value for mixture weights.
- var_floorfloat >= 0
The floor value for variance.
- var_type[‘diag’, ‘full’]
The type of covariance matrix.
- block_sizelist[int] or None
The block size of the covariance matrix.
- ubmtuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))]
The GMM parameters of a universal background model.
- alphafloat in [0, 1]
The smoothing parameter for adaptation toward the UBM (see the sketch after the references).
- batch_sizeint >= 1 or None
The batch size.
- verbosebool or int
If 1, shows the likelihood at each iteration; if 2, shows progress bars.
References
[1] J.-L. Gauvain et al., “Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains,” IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994.
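The following is a minimal sketch of UBM-based adaptation via the ubm and alpha arguments. The data, sizes, and the reading of alpha as the weight given to the UBM statistics are illustrative assumptions, not taken from this page.
>>> import torch
>>> import diffsptk
>>> M, K = 1, 2
>>> x_bg = torch.randn(1000, M + 1)   # hypothetical background data
>>> x_tgt = torch.randn(100, M + 1)   # hypothetical (smaller) target data
>>> # Step 1: train a universal background model with plain EM.
>>> ubm_params, _ = diffsptk.GMM(M, K, n_iter=100)(x_bg)
>>> # Step 2: adapt toward the UBM; alpha in [0, 1] is assumed to control
>>> # how strongly the UBM statistics are mixed into the update.
>>> adapted = diffsptk.GMM(M, K, n_iter=20, ubm=ubm_params, alpha=0.1)
>>> params, log_likelihood = adapted(x_tgt)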
- forward(x: Tensor | DataLoader, return_posterior: bool = False) tuple[tuple[Tensor, Tensor, Tensor], Tensor] | tuple[tuple[Tensor, Tensor, Tensor], Tensor, Tensor] [source]#
Train Gaussian mixture models.
- Parameters:
- xTensor [shape=(T, M+1)] or DataLoader
The input vectors or a DataLoader that yields the input vectors (see the DataLoader example below).
- return_posteriorbool
If True, return the posterior probabilities.
- Returns:
- paramstuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))]
The estimated GMM parameters.
- posteriorTensor [shape=(T, K)] (optional)
The posterior probabilities, returned only if return_posterior is True.
- log_likelihoodTensor [scalar]
The total log-likelihood.
Examples
>>> x = diffsptk.nrand(10, 1)
>>> gmm = diffsptk.GMM(1, 2)
>>> params, log_likelihood = gmm(x)
>>> w, mu, sigma = params
>>> w
tensor([0.1917, 0.8083])
>>> mu
tensor([[ 1.2321,  0.2058],
        [-0.1326, -0.7006]])
>>> sigma
tensor([[[3.4010e-01, 0.0000e+00],
         [0.0000e+00, 6.2351e-04]],

        [[3.0944e-01, 0.0000e+00],
         [0.0000e+00, 8.6096e-01]]])
>>> log_likelihood
tensor(-19.5235)
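The forward method also accepts a DataLoader, as noted above. Below is a sketch under the assumption that a DataLoader built directly over a (T, M+1) tensor, which yields mini-batches of shape (B, M+1), satisfies that interface; the data and sizes are made up.
>>> import torch
>>> import diffsptk
>>> from torch.utils.data import DataLoader
>>> M, K = 3, 8
>>> x = torch.randn(10000, M + 1)       # hypothetical training vectors
>>> loader = DataLoader(x, batch_size=256)
>>> gmm = diffsptk.GMM(M, K, n_iter=50)
>>> params, log_likelihood = gmm(loader)
>>> w, mu, sigma = params               # shapes: (K,), (K, M+1), (K, M+1, M+1)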
- set_params(params: tuple[Tensor | None, Tensor | None, Tensor | None]) None [source]#
Set model parameters.
- Parameters:
- paramstuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))]
The GMM parameters.
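A sketch of saving trained parameters and restoring them into a fresh instance with set_params, e.g. before calling transform without retraining. The file name is hypothetical, and the idea that a None element leaves the corresponding parameter unchanged is only an assumption based on the type hints.
>>> import torch
>>> import diffsptk
>>> M, K = 1, 2
>>> gmm = diffsptk.GMM(M, K)
>>> params, _ = gmm(torch.randn(500, M + 1))
>>> torch.save(params, "gmm.pt")                 # hypothetical file name
>>> restored = diffsptk.GMM(M, K)
>>> restored.set_params(torch.load("gmm.pt"))    # (w, mu, sigma) tuple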
- transform(x: Tensor) tuple[Tensor | None, Tensor, Tensor] [source]#
Transform the input vectors based on a single mixture sequence.
- Parameters:
- xTensor [shape=(T, N+1)]
The input vectors, where N < M.
- Returns:
- yTensor [shape=(T, M-N)]
The output vectors.
- indicesTensor [shape=(T,)]
The selected mixture indices.
- log_probTensor [shape=(T,)]
The log probabilities.
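A sketch of GMM-based mapping with transform: a model of order M is trained on joint vectors whose leading N+1 entries form the known part, and transform then predicts the remaining M-N entries of new (T, N+1) inputs from the single most likely mixture per frame. It assumes the parameters estimated by forward are kept inside the module (otherwise set_params can be used first); the data and sizes are illustrative.
>>> import torch
>>> import diffsptk
>>> N, M, K = 1, 3, 4
>>> xy = torch.randn(500, M + 1)   # hypothetical joint vectors (known part + rest)
>>> gmm = diffsptk.GMM(M, K, n_iter=50)
>>> _ = gmm(xy)                    # train; parameters assumed to be stored in the module
>>> x = torch.randn(20, N + 1)     # new input vectors (known part only)
>>> y, indices, log_prob = gmm.transform(x)
>>> # expected shapes: y (20, M - N), indices (20,), log_prob (20,)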