pitch
- class diffsptk.Pitch(frame_period, sample_rate, algorithm='fcnf0', out_format='pitch', **kwargs)[source]
Pitch extraction module using external neural models.
- Parameters:
- frame_period : int >= 1
Frame period, \(P\).
- sample_rate : int >= 1
Sample rate in Hz.
- algorithm : ['crepe', 'fcnf0']
Pitch extraction algorithm, CREPE [1] or FCNF0 [2].
- out_format : ['pitch', 'f0', 'log-f0', 'prob', 'embed']
Output format.
- f_min : float >= 0
Minimum frequency in Hz.
- f_max : float <= sample_rate // 2
Maximum frequency in Hz.
- voicing_threshold : float
Threshold for the voiced/unvoiced decision.
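The optional parameters above are supplied via **kwargs. A minimal construction sketch; the numeric values below are illustrative, not library defaults:

>>> pitch = diffsptk.Pitch(
...     frame_period=160,
...     sample_rate=8000,
...     algorithm="fcnf0",
...     out_format="f0",
...     f_min=60,  # illustrative search floor in Hz
...     f_max=400,  # illustrative search ceiling in Hz
...     voicing_threshold=0.5,  # illustrative
... )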
References
[1] J. W. Kim et al., “CREPE: A convolutional representation for pitch estimation,” Proceedings of ICASSP, pp. 161-165, 2018.
[2] M. Morrison et al., “Cross-domain neural pitch and periodicity estimation,” arXiv preprint, arXiv:2301.12258, 2023.
- forward(x)[source]
Compute pitch representation.
- Parameters:
- x : Tensor [shape=(B, T) or (T,)]
Waveform.
- Returns:
- out : Tensor [shape=(B, N, C) or (N, C) or (B, N) or (N,)]
Pitch probability, embedding, or pitch, where N is the number of frames and C is the number of pitch classes or the dimension of the embedding.
Examples
>>> x = diffsptk.sin(1000, 80)
>>> pitch = diffsptk.Pitch(160, 8000, out_format="f0")
>>> y = pitch(x)
>>> y
tensor([  0.0000,  99.7280,  99.7676,  99.8334,  99.8162, 100.1602,   0.0000])
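The other output formats follow the shapes documented in forward() above; for instance, out_format="prob" returns a frame-by-class matrix whose class count C depends on the chosen algorithm, so only the shape is noted here:

>>> pitch_prob = diffsptk.Pitch(160, 8000, out_format="prob")
>>> p = pitch_prob(x)  # shape=(N, C), with C set by the backend model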
See also