pitch#

class diffsptk.Pitch(frame_period, sample_rate, f_min=0, f_max=None, algorithm='crepe', out_format='f0', **option)[source]#

Pitch extraction module using external neural models.

Parameters:
frame_period : int >= 1 [scalar]

Frame period, \(P\).

sample_rate : int >= 1 [scalar]

Sample rate in Hz.

f_min : float >= 0 [scalar]

Minimum frequency in Hz.

f_max : float <= sample_rate // 2 [scalar]

Maximum frequency in Hz.

algorithm : ['crepe']

Algorithm.

**option : str -> Any [dict]

Algorithm-dependent options.

decode(prob)[source]#

Get appropriate pitch contour from pitch probabilities.

Parameters:
prob : Tensor [shape=(B, N, C) or (N, C)]

Pitch probability.

Returns:
pitch : Tensor [shape=(B, N) or (N,)]

Pitch in seconds, Hz, or log Hz.

Examples

>>> x = diffsptk.sin(100, 10)
>>> pitch = diffsptk.Pitch(80, 16000)
>>> prob = pitch(x)
>>> result = pitch.decode(prob)
>>> result
tensor([1586.6013, 1593.9536])
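The three documented return representations (seconds, Hz, and log Hz) are simple transforms of one another. A minimal sketch of the relationship, assuming the format names `'pitch'`, `'f0'`, and `'log-f0'` correspond to the `out_format` argument (only the `'f0'` default is visible in the signature above, so the other names are illustrative):

```python
import math

def convert_f0(f0_hz, out_format):
    """Illustrative helper (not part of diffsptk): relate a single F0
    value in Hz to the documented output representations."""
    if out_format == "f0":
        return f0_hz                # fundamental frequency in Hz
    if out_format == "pitch":
        return 1.0 / f0_hz          # pitch period in seconds
    if out_format == "log-f0":
        return math.log(f0_hz)      # natural log of F0 in Hz
    raise ValueError(f"unknown out_format: {out_format}")

print(convert_f0(100.0, "pitch"))   # a 100 Hz tone has a 0.01 s period
```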
forward(x, embed=False)[source]#

Compute pitch representation.

Parameters:
x : Tensor [shape=(B, T) or (T,)]

Waveform.

embed : bool [scalar]

If True, return embedding instead of probability.

Returns:
y : Tensor [shape=(B, N, C) or (N, C)]

Pitch probability or embedding, where N is the number of frames and C is the number of classes or the embedding dimension.

Examples

>>> x = diffsptk.sin(100, 10)
>>> pitch = diffsptk.Pitch(80, 16000)
>>> prob = pitch(x)
>>> prob.shape
torch.Size([2, 360])
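To make the (N, C) probability output concrete, here is a naive sketch of how per-frame class probabilities could be mapped to frequencies: pick the argmax bin per frame and look up its frequency on a log-spaced grid. The bin layout below (a hypothetical 20-cent grid starting at 32.70 Hz, loosely modeled on CREPE's 360 bins) is an assumption, and the actual `decode()` may use weighted averaging or Viterbi smoothing rather than a plain argmax:

```python
def naive_decode(prob, f_min=32.70, bins_per_octave=60):
    """Simplified probability-to-pitch decoding (not diffsptk's decode()):
    each of the C classes is treated as a log-spaced frequency bin."""
    pitches = []
    for frame in prob:                 # frame: list of C probabilities
        k = max(range(len(frame)), key=frame.__getitem__)  # argmax bin
        pitches.append(f_min * 2.0 ** (k / bins_per_octave))
    return pitches

# Two frames, four classes each; the argmax bins are 2 and 0.
prob = [[0.1, 0.2, 0.6, 0.1],
        [0.7, 0.1, 0.1, 0.1]]
print(naive_decode(prob))
```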