gmm
- diffsptk.GMM
alias of GaussianMixtureModeling
- class diffsptk.GaussianMixtureModeling(order: int, n_mixture: int, *, n_iter: int = 100, eps: float = 1e-05, weight_floor: float = 1e-05, var_floor: float = 1e-06, var_type: str = 'diag', block_size: list[int] | tuple[int, ...] | ndarray | None = None, ubm: tuple[Tensor, Tensor, Tensor] | None = None, alpha: float = 0, batch_size: int | None = None, verbose: bool | int = False, device: device | None = None, dtype: dtype | None = None)
See this page for details. Note that the forward method is not differentiable.
- Parameters:
- order : int >= 0
The order of the vector, \(M\).
- n_mixture : int >= 1
The number of mixture components, \(K\).
- n_iter : int >= 1
The number of iterations.
- eps : float >= 0
The convergence threshold.
- weight_floor : float >= 0
The floor value for mixture weights.
- var_floor : float >= 0
The floor value for variance.
- var_type : ['diag', 'full']
The type of covariance matrix.
- block_size : list[int]
The block size of the covariance matrix.
- ubm : tuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))]
The GMM parameters of a universal background model.
- alpha : float in [0, 1]
The smoothing parameter.
- batch_size : int >= 1 or None
The batch size.
- verbose : bool or int
If 1, shows the likelihood at each iteration; if 2, shows progress bars.
- device : torch.device or None
The device of this module.
- dtype : torch.dtype or None
The data type of this module.
References
[1] J.-L. Gauvain et al., “Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains,” IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994.
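To make the role of weight_floor and var_floor concrete, here is a minimal pure-Python sketch of one common way to apply such floors: clamp from below, then renormalize the weights. The helper name apply_floors and the renormalization step are assumptions for illustration; the page does not state how the module applies its floors internally.

```python
def apply_floors(weights, variances, weight_floor=1e-5, var_floor=1e-6):
    """Clamp mixture weights and diagonal variances from below.

    Hypothetical helper, not part of diffsptk. Renormalizing after the
    clamp is an assumption; it keeps the weights summing to one, at the
    cost of leaving floored weights marginally below the floor.
    """
    floored_w = [max(w, weight_floor) for w in weights]
    total = sum(floored_w)
    floored_w = [w / total for w in floored_w]
    floored_v = [[max(v, var_floor) for v in row] for row in variances]
    return floored_w, floored_v


# Made-up degenerate estimate: one empty mixture, two collapsed variances.
w, v = apply_floors([1.0, 0.0], [[0.5, 0.0], [2.0, 1e-9]])
print(w)
print(v)
```

Floors of this kind keep an empty mixture or a collapsed variance from producing a division by zero or a log of zero in later iterations.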
- forward(x: Tensor | DataLoader, return_posterior: bool = False) -> tuple[tuple[Tensor, Tensor, Tensor], Tensor] | tuple[tuple[Tensor, Tensor, Tensor], Tensor, Tensor]
Train Gaussian mixture models.
- Parameters:
- x : Tensor [shape=(T, M+1)] or DataLoader
The input vectors or a DataLoader that yields the input vectors.
- return_posterior : bool
If True, return the posterior probabilities.
- Returns:
- params : tuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))]
The estimated GMM parameters.
- posterior : Tensor [shape=(T, K)] (optional)
The posterior probabilities.
- log_likelihood : Tensor [scalar]
The total log-likelihood.
Examples
>>> import diffsptk
>>> import torch
>>> gmm = diffsptk.GMM(1, 2)
>>> x = torch.tensor([
...     [-0.5, 0.3], [0.0, 0.7], [0.2, -0.1], [3.4, 2.0], [-2.8, 1.0],
...     [2.9, -3.0], [2.2, -2.5], [1.5, -1.6], [1.8, 0.5], [1.3, 0.0],
... ])
>>> gmm.warmup(x)
>>> params, log_likelihood = gmm(x)
>>> w, mu, sigma = params
>>> w
tensor([0.5471, 0.4529])
>>> mu
tensor([[-0.1507, 0.4112],
        [ 2.3901, -1.0930]])
>>> print(sigma)
tensor([[[2.1197, 0.0000],
         [0.0000, 0.1536]],
        [[0.5578, -0.0000],
         [-0.0000, 3.6378]]])
>>> log_likelihood
tensor(-32.5925)
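The posterior and log_likelihood outputs follow the standard GMM definitions: the posterior of mixture \(k\) for frame \(t\) is the weighted density of that mixture divided by the sum over all mixtures, and the total log-likelihood is the sum over frames of the log of that denominator. The following is a minimal pure-Python sketch of those definitions for diagonal covariances. It does not use diffsptk, the function names and toy parameters are made up for illustration, and it is not claimed to reproduce the module's numerical output.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posterior_and_log_likelihood(xs, weights, means, variances):
    """Return a (T, K) posterior table and the total log-likelihood."""
    posterior = []
    total = 0.0
    for x in xs:
        # Log of (weight * density) for each mixture.
        joint = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Log-sum-exp for numerical stability.
        peak = max(joint)
        log_px = peak + math.log(sum(math.exp(j - peak) for j in joint))
        posterior.append([math.exp(j - log_px) for j in joint])
        total += log_px
    return posterior, total


# Made-up toy parameters: K = 2 mixtures, M + 1 = 2 dimensions.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, -2.0]]
variances = [[1.0, 1.0], [0.5, 2.0]]
xs = [[0.1, -0.2], [2.8, -1.5], [1.5, -1.0]]

posterior, log_likelihood = posterior_and_log_likelihood(
    xs, weights, means, variances
)
for row in posterior:
    print([round(p, 4) for p in row])
print(log_likelihood)
```

Each row of the posterior table sums to one, which is a quick sanity check when inspecting the (T, K) tensor returned with return_posterior=True.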
- set_params(params: tuple[Tensor | None, Tensor | None, Tensor | None]) -> None
Set model parameters.
- Parameters:
- params : tuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))]
The GMM parameters.
- transform(x: Tensor) -> tuple[Tensor | None, Tensor, Tensor]
Transform the input vectors based on a single mixture sequence.
- Parameters:
- x : Tensor [shape=(T, N+1)]
The input vectors.
- Returns:
- y : Tensor [shape=(T, M-N)]
The output vectors.
- indices : Tensor [shape=(T,)]
The selected mixture indices.
- log_prob : Tensor [shape=(T,)]
The log probabilities.
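The shapes above suggest that the model is a joint GMM over (M+1)-dimensional vectors whose first N+1 dimensions are the input and whose remaining M-N dimensions are the output. Under that reading, a textbook single-mixture mapping picks, per frame, the mixture that best explains the input part and returns the conditional mean of the output part. The sketch below shows that idea for the smallest case (N = 0, M = 1) in pure Python. The function name and toy parameters are hypothetical, the selection rule is an assumption rather than the module's documented procedure, and log_prob is left out because the page does not define it precisely.

```python
import math


def transform_frame(x, weights, means, covs):
    """Map a scalar input x to a scalar output with one selected mixture.

    means[k] = (mu_x, mu_y); covs[k] = ((s_xx, s_xy), (s_yx, s_yy)).
    Hypothetical helper, not part of diffsptk.
    """
    # Score each mixture by its weight times the marginal density of x.
    scores = []
    for w, mu, cov in zip(weights, means, covs):
        s_xx = cov[0][0]
        log_marginal = -0.5 * (
            math.log(2.0 * math.pi * s_xx) + (x - mu[0]) ** 2 / s_xx
        )
        scores.append(math.log(w) + log_marginal)
    k = max(range(len(scores)), key=scores.__getitem__)
    # Conditional mean of y given x under the selected Gaussian.
    mu, cov = means[k], covs[k]
    y = mu[1] + cov[1][0] / cov[0][0] * (x - mu[0])
    return y, k


# Made-up joint GMM with K = 2 mixtures.
weights = [0.5, 0.5]
means = [(0.0, 0.0), (4.0, 10.0)]
covs = [((1.0, 0.5), (0.5, 1.0)), ((1.0, -0.5), (-0.5, 1.0))]

y_a, k_a = transform_frame(1.0, weights, means, covs)
y_b, k_b = transform_frame(3.0, weights, means, covs)
print(y_a, k_a)
print(y_b, k_b)
```

With a diagonal covariance the cross term is zero, so this mapping reduces to returning the output mean of the selected mixture; a full covariance is what lets the output track the input within a mixture.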
See also