pitch#

class diffsptk.Pitch(frame_period, sample_rate, f_min=0, f_max=None, algorithm='crepe', out_format='f0', **option)[source]#

Pitch extraction module using external neural models.

Parameters:
frame_period : int >= 1 [scalar]

Frame period, \(P\).

sample_rate : int >= 1 [scalar]

Sample rate in Hz.

f_min : float >= 0 [scalar]

Minimum frequency in Hz.

f_max : float <= sample_rate // 2 [scalar]

Maximum frequency in Hz.

algorithm : ['crepe']

Algorithm.

**option : str -> Any [dict]

Algorithm-dependent options.

decode(prob)[source]#

Get appropriate pitch contour from pitch probabilities.

Parameters:
prob : Tensor [shape=(B, N, C) or (N, C)]

Pitch probability.

Returns:
pitch : Tensor [shape=(B, N) or (N,)]

Pitch in seconds, Hz, or log Hz.

Examples

>>> x = diffsptk.sin(100, 10)
>>> pitch = diffsptk.Pitch(80, 16000)
>>> prob = pitch(x)
>>> result = pitch.decode(prob)
>>> result
tensor([1586.6013, 1593.9536])
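The three documented return representations (seconds, Hz, and log Hz) are simple transforms of one another. A minimal sketch of the relationship, assuming the format names `'pitch'`, `'f0'`, and `'log-f0'` correspond to the `out_format` argument (only the `'f0'` default is visible in the signature above, so the other names are illustrative):

```python
import math

def convert_f0(f0_hz, out_format):
    """Illustrative helper (not part of diffsptk): relate a single F0
    value in Hz to the documented output representations."""
    if out_format == "f0":
        return f0_hz                # fundamental frequency in Hz
    if out_format == "pitch":
        return 1.0 / f0_hz          # pitch period in seconds
    if out_format == "log-f0":
        return math.log(f0_hz)      # natural log of F0 in Hz
    raise ValueError(f"unknown out_format: {out_format}")

print(convert_f0(100.0, "pitch"))   # a 100 Hz tone has a 0.01 s period
```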
forward(x, embed=False)[source]#

Compute pitch representation.

Parameters:
x : Tensor [shape=(B, T) or (T,)]

Waveform.

embed : bool [scalar]

If True, return embedding instead of probability.

Returns:
y : Tensor [shape=(B, N, C) or (N, C)]

Pitch probability or embedding, where N is the number of frames and C is the number of classes or the embedding dimension.

Examples

>>> x = diffsptk.sin(100, 10)
>>> pitch = diffsptk.Pitch(80, 16000)
>>> prob = pitch(x)
>>> prob.shape
torch.Size([2, 360])
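To make the (N, C) probability output concrete, here is a naive sketch of how per-frame class probabilities could be mapped to frequencies: pick the argmax bin per frame and look up its frequency on a log-spaced grid. The bin layout below (a hypothetical 20-cent grid starting at 32.70 Hz, loosely modeled on CREPE's 360 bins) is an assumption, and the actual `decode()` may use weighted averaging or Viterbi smoothing rather than a plain argmax:

```python
def naive_decode(prob, f_min=32.70, bins_per_octave=60):
    """Simplified probability-to-pitch decoding (not diffsptk's decode()):
    each of the C classes is treated as a log-spaced frequency bin."""
    pitches = []
    for frame in prob:                 # frame: list of C probabilities
        k = max(range(len(frame)), key=frame.__getitem__)  # argmax bin
        pitches.append(f_min * 2.0 ** (k / bins_per_octave))
    return pitches

# Two frames, four classes each; the argmax bins are 2 and 0.
prob = [[0.1, 0.2, 0.6, 0.1],
        [0.7, 0.1, 0.1, 0.1]]
print(naive_decode(prob))
```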