f0eval

class diffsptk.F0Evaluation(reduction: str = 'mean', out_format: str = 'f0-rmse-cent')

See this page for details. Note that gradients cannot be computed when the output format is based on the voiced/unvoiced decision (the `vuv-*` formats).

Parameters:

reduction (['none', 'mean', 'sum']): The reduction type.
out_format (['f0-rmse-hz', 'f0-rmse-cent', 'f0-rmse-semitone', 'vuv-error-rate', 'vuv-error-percent', 'vuv-macro-f1-score']): The output format.

forward(x: Tensor, y: Tensor) → Tensor

Calculate F0 metric.

Parameters:

x (Tensor [shape=(…, N)]): The input F0 in Hz.
y (Tensor [shape=(…, N)]): The target F0 in Hz.

Returns:

out (Tensor [shape=(…,) or scalar]): The F0 metric.

Examples

>>> import torch
>>> import diffsptk
>>> f0eval = diffsptk.F0Evaluation(out_format="f0-rmse-cent")
>>> x = torch.tensor([100, 200, 300, 0, 400])
>>> y = torch.tensor([110, 180, 0, 290, 410])
>>> error = f0eval(x, y)
>>> error
tensor(144.1353)
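
The value in the example above can be reproduced by hand. The sketch below is plain Python and does not call diffsptk. It assumes that a frame with F0 equal to 0 is unvoiced and that the cent-scale RMSE is taken only over frames voiced in both sequences. Both assumptions are inferred from the example output, not from the library source, and the helper name `f0_rmse_cent` is hypothetical.

```python
import math


def f0_rmse_cent(x, y):
    """RMSE in cents over frames where both x and y are voiced (nonzero).

    Assumption: F0 == 0 marks an unvoiced frame, and such frames are skipped.
    """
    # 1200 * log2(a / b) is the interval between a and b in cents.
    cents = [1200.0 * math.log2(a / b) for a, b in zip(x, y) if a > 0 and b > 0]
    if not cents:
        raise ValueError("no frame is voiced in both sequences")
    return math.sqrt(sum(c * c for c in cents) / len(cents))


x = [100, 200, 300, 0, 400]
y = [110, 180, 0, 290, 410]
rmse = f0_rmse_cent(x, y)
print(round(rmse, 2))  # 144.14
```

Only three of the five frames (indices 0, 1, and 4) contribute here, because the other two are unvoiced in one of the sequences.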

diffsptk.functional.f0eval(x: Tensor, y: Tensor, reduction: str = 'mean', out_format: str = 'f0-rmse-cent') → Tensor

Calculate F0 metric.

Parameters:

x (Tensor [shape=(…, N)]): The input F0 in Hz.
y (Tensor [shape=(…, N)]): The target F0 in Hz.
reduction (['none', 'mean', 'sum']): The reduction type.
out_format (['f0-rmse-hz', 'f0-rmse-cent', 'f0-rmse-semitone', 'vuv-error-rate', 'vuv-error-percent', 'vuv-macro-f1-score']): The output format.

Returns:

out (Tensor [shape=(…,) or scalar]): The F0 metric.
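
To illustrate why the voiced/unvoiced formats carry no gradient, here is a minimal plain-Python sketch of one plausible reading of `vuv-error-rate`: the fraction of frames whose voicing decision differs between input and target. The definition, the F0 > 0 voicing rule, and the helper name `vuv_error_rate` are assumptions made for illustration and may differ from what diffsptk actually computes.

```python
def vuv_error_rate(x, y):
    """Fraction of frames where the voiced/unvoiced decision differs.

    Assumption: a frame is voiced when its F0 is greater than 0.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    # The comparison (a > 0) is a hard, discrete decision, so the result is
    # piecewise constant in the F0 values and has no useful gradient.
    mismatches = sum((a > 0) != (b > 0) for a, b in zip(x, y))
    return mismatches / len(x)


x = [100, 200, 300, 0, 400]
y = [110, 180, 0, 290, 410]
rate = vuv_error_rate(x, y)
print(rate)  # 0.4
```

Two of the five frames disagree on voicing, giving 0.4. Small changes to any F0 value leave this count unchanged unless a value crosses zero.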