pitch#

class diffsptk.Pitch(frame_period, sample_rate, algorithm='fcnf0', out_format='pitch', **kwargs)[source]#

Pitch extraction module using external neural models.

Parameters:
frame_period : int >= 1

Frame period, \(P\).

sample_rate : int >= 1

Sample rate in Hz.

algorithm : ['crepe', 'fcnf0']

Pitch extraction algorithm.

out_format : ['pitch', 'f0', 'log-f0', 'prob', 'embed']

Output format.

f_min : float >= 0

Minimum frequency in Hz.

f_max : float <= sample_rate // 2

Maximum frequency in Hz.

voicing_threshold : float

Voiced/unvoiced threshold.
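The optional f_min, f_max, and voicing_threshold settings are documented above as constructor parameters; the following is a minimal construction sketch, assuming they are forwarded as keyword arguments via **kwargs (the sample rate and threshold values shown are illustrative, not defaults):

>>> import diffsptk
>>> # Restrict the pitch search range to 80-500 Hz and set a custom voicing threshold.
>>> pitch = diffsptk.Pitch(
...     frame_period=80,
...     sample_rate=16000,
...     algorithm="crepe",
...     out_format="f0",
...     f_min=80,
...     f_max=500,
...     voicing_threshold=0.5,
... )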

References

[1]

J. W. Kim et al., “CREPE: A convolutional representation for pitch estimation,” Proceedings of ICASSP, pp. 161-165, 2018.

[2]

M. Morrison et al., “Cross-domain neural pitch and periodicity estimation,” arXiv preprint, arXiv:2301.12258, 2023.

forward(x)[source]#

Compute pitch representation.

Parameters:
x : Tensor [shape=(B, T) or (T,)]

Waveform.

Returns:
out : Tensor [shape=(B, N, C) or (N, C) or (B, N) or (N,)]

Pitch probability, embedding, or pitch, where N is the number of frames and C is the number of pitch classes or the embedding dimension.

Examples

>>> x = diffsptk.sin(1000, 80)
>>> pitch = diffsptk.Pitch(160, 8000, out_format="f0")
>>> y = pitch(x)
>>> y
tensor([  0.0000,  99.7280,  99.7676,  99.8334,  99.8162, 100.1602,   0.0000])
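For the frame-level formats ('prob' and 'embed'), the output gains a class or embedding axis per the Returns description above; a minimal sketch reusing the same input (the number of pitch classes C depends on the chosen algorithm and is not fixed here):

>>> pitch_prob = diffsptk.Pitch(160, 8000, out_format="prob")
>>> p = pitch_prob(x)
>>> p.shape  # (N, C): per-frame probability over C pitch classes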

See also

excite