pitch#
- class diffsptk.Pitch(frame_period, sample_rate, algorithm='crepe', out_format='pitch', **kwargs)[source]#
Pitch extraction module using external neural models.
- Parameters:
- frame_period : int >= 1 [scalar]
Frame period, \(P\).
- sample_rate : int >= 1 [scalar]
Sample rate in Hz.
- algorithm : [‘crepe’]
Algorithm.
- out_format : [‘pitch’, ‘f0’, ‘log-f0’, ‘prob’, ‘embed’]
Output format.
- f_min : float >= 0 [scalar]
Minimum frequency in Hz.
- f_max : float <= sample_rate // 2 [scalar]
Maximum frequency in Hz.
- voicing_threshold : float [scalar]
Voiced/unvoiced threshold.
- silence_threshold : float [scalar]
Silence threshold in dB.
- filter_length : int >= 1 [scalar]
Window length of the median and moving-average filters.
- model : [‘tiny’, ‘full’]
Model size.
- forward(x)[source]#
Compute pitch representation.
- Parameters:
- x : Tensor [shape=(B, T) or (T,)]
Waveform.
- Returns:
- y : Tensor [shape=(B, N, C) or (N, C) or (B, N) or (N,)]
Pitch probability, embedding, or pitch, where N is the number of frames and C is the number of pitch classes or the dimension of the embedding.
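For orientation, the number of frames N can be sketched from the waveform length T and the frame period P. The exact boundary handling depends on the underlying extractor, so the ceiling form below is an assumption, not the library's documented behavior:

```python
import math

def num_frames(T: int, P: int) -> int:
    # Assumed relation: one frame every P samples, with a trailing
    # partial frame kept (ceiling division). The extractor's actual
    # padding policy may differ slightly at the edges.
    return math.ceil(T / P)

# A 160-sample waveform at frame period 80 yields 2 frames.
print(num_frames(160, 80))
```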
Examples
>>> x = diffsptk.sin(100, 10)
>>> pitch = diffsptk.Pitch(80, 16000)
>>> y = pitch(x)
>>> y
tensor([10.0860, 10.0860])
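The out_format options are related by simple conversions. The sketch below assumes ‘pitch’ denotes the F0 period in samples (sample_rate / F0) and ‘log-f0’ the natural logarithm of F0, consistent with the example output above; verify against the library before relying on these conventions:

```python
import math

sample_rate = 16000
pitch = 10.0860  # assumed F0 period in samples, as in the example above

# Convert the assumed pitch period back to F0 in Hz, then to log-F0.
f0 = sample_rate / pitch   # F0 in Hz
log_f0 = math.log(f0)      # natural log of F0

print(round(f0, 2), round(log_f0, 4))
```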
See also