gmm#

diffsptk.GMM#

alias of GaussianMixtureModeling

class diffsptk.GaussianMixtureModeling(order: int, n_mixture: int, *, n_iter: int = 100, eps: float = 1e-05, weight_floor: float = 1e-05, var_floor: float = 1e-06, var_type: str = 'diag', block_size: list[int] | tuple[int, ...] | ndarray | None = None, ubm: tuple[Tensor, Tensor, Tensor] | None = None, alpha: float = 0, batch_size: int | None = None, verbose: bool | int = False, device: device | None = None, dtype: dtype | None = None)[source]#

See the SPTK gmm documentation for details. Note that the forward method is not differentiable.

Parameters:
order : int >= 0

The order of the vector, \(M\).

n_mixture : int >= 1

The number of mixture components, \(K\).

n_iter : int >= 1

The number of iterations.

eps : float >= 0

The convergence threshold.

weight_floor : float >= 0

The floor value for mixture weights.

var_floor : float >= 0

The floor value for variance.

var_type : ['diag', 'full']

The type of covariance matrix.

block_size : list[int] or None

The block size of the covariance matrix.

ubm : tuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))] or None

The GMM parameters of a universal background model.

alpha : float in [0, 1]

The smoothing parameter.

batch_size : int >= 1 or None

The batch size.

verbose : bool or int

If 1, shows the likelihood at each iteration; if 2, shows progress bars.

device : torch.device or None

The device of this module.

dtype : torch.dtype or None

The data type of this module.
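
When a universal background model (UBM) is available, the ubm and alpha parameters enable MAP-style adaptation in the sense of [1]. The following is a minimal sketch under that assumption; the random data and the choice alpha=0.1 are illustrative only, and initialization from the UBM is assumed, so warmup is not called on the adapted model:

>>> import diffsptk
>>> import torch
>>> x_ubm = torch.randn(100, 2)      # pooled background data (illustrative)
>>> ubm_gmm = diffsptk.GMM(1, 2)
>>> ubm_gmm.warmup(x_ubm)            # K-means initialization
>>> ubm_params, _ = ubm_gmm(x_ubm)   # train the UBM
>>> x_target = torch.randn(20, 2)    # small amount of target data (illustrative)
>>> adapted_gmm = diffsptk.GMM(1, 2, ubm=ubm_params, alpha=0.1)
>>> adapted_params, log_likelihood = adapted_gmm(x_target)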

References

[1]

J.-L. Gauvain et al., “Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains,” IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 291-298, 1994.

forward(x: Tensor | DataLoader, return_posterior: bool = False) → tuple[tuple[Tensor, Tensor, Tensor], Tensor] | tuple[tuple[Tensor, Tensor, Tensor], Tensor, Tensor][source]#

Train Gaussian mixture models.

Parameters:
x : Tensor [shape=(T, M+1)] or DataLoader

The input vectors or a DataLoader that yields the input vectors.

return_posterior : bool

If True, return the posterior probabilities.

Returns:
params : tuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))]

The estimated GMM parameters.

posterior : Tensor [shape=(T, K)] (optional)

The posterior probabilities.

log_likelihood : Tensor [scalar]

The total log-likelihood.

Examples

>>> import diffsptk
>>> import torch
>>> gmm = diffsptk.GMM(1, 2)
>>> x = torch.tensor([
...     [-0.5, 0.3], [0.0, 0.7], [0.2, -0.1], [3.4, 2.0], [-2.8, 1.0],
...     [2.9, -3.0], [2.2, -2.5], [1.5, -1.6], [1.8, 0.5], [1.3, 0.0],
... ])
>>> gmm.warmup(x)
>>> params, log_likelihood = gmm(x)
>>> w, mu, sigma = params
>>> w
tensor([0.5471, 0.4529])
>>> mu
tensor([[-0.1507,  0.4112],
        [ 2.3901, -1.0930]])
>>> sigma
tensor([[[2.1197, 0.0000],
         [0.0000, 0.1536]],

        [[0.5578, -0.0000],
         [-0.0000, 3.6378]]])
>>> log_likelihood
tensor(-32.5925)
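
The posterior probabilities can also be requested; based on the Returns section above, they are returned between the parameters and the log-likelihood:

>>> params, posterior, log_likelihood = gmm(x, return_posterior=True)
>>> posterior.shape
torch.Size([10, 2])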

set_params(params: tuple[Tensor | None, Tensor | None, Tensor | None]) → None[source]#

Set model parameters.

Parameters:
params : tuple of Tensors [shape=((K,), (K, M+1), (K, M+1, M+1))]

The GMM parameters.
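
A minimal sketch of seeding a model with known parameters; the values below are illustrative, and a None entry presumably leaves the corresponding parameter unchanged:

>>> gmm = diffsptk.GMM(1, 2)
>>> w = torch.tensor([0.5, 0.5])                 # weights, shape (K,)
>>> mu = torch.tensor([[0.0, 0.0], [1.0, 1.0]])  # means, shape (K, M+1)
>>> sigma = torch.eye(2).repeat(2, 1, 1)         # covariances, shape (K, M+1, M+1)
>>> gmm.set_params((w, mu, sigma))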

transform(x: Tensor) → tuple[Tensor | None, Tensor, Tensor][source]#

Transform the input vectors based on a single mixture sequence.

Parameters:
x : Tensor [shape=(T, N+1)]

The input vectors.

Returns:
y : Tensor [shape=(T, M-N)]

The output vectors.

indices : Tensor [shape=(T,)]

The selected mixture indices.

log_prob : Tensor [shape=(T,)]

The log probabilities.
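
The shapes imply that the model is trained on joint vectors whose first N+1 elements are the source part and whose remaining M-N elements are the target part. A minimal sketch under that assumption, with illustrative random data (M = 3, N = 1):

>>> gmm = diffsptk.GMM(3, 2)
>>> xy = torch.randn(50, 4)   # joint [source; target] vectors, shape (T, M+1)
>>> gmm.warmup(xy)
>>> params, _ = gmm(xy)
>>> x = torch.randn(50, 2)    # source vectors only, shape (T, N+1)
>>> y, indices, log_prob = gmm.transform(x)
>>> y.shape
torch.Size([50, 2])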

warmup(x: Tensor | DataLoader, **lbg_params) → None[source]#

Initialize the model parameters by K-means clustering.

Parameters:
x : Tensor [shape=(T, M+1)] or DataLoader

The training data.

lbg_params : additional keyword arguments

The parameters for the Linde-Buzo-Gray algorithm.
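
For example, assuming the lbg module accepts an n_iter keyword (see the lbg documentation linked below), it could be forwarded as follows:

>>> gmm = diffsptk.GMM(1, 2)
>>> x = torch.randn(100, 2)          # illustrative training data
>>> gmm.warmup(x, n_iter=50)         # n_iter is forwarded to the LBG module (assumed keyword)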

See also

lbg