pitch
- class diffsptk.Pitch(frame_period, sample_rate, f_min=0, f_max=None, algorithm='crepe', out_format='f0', **option)
Pitch extraction module using external neural models.
- Parameters:
- frame_period : int >= 1 [scalar]
Frame period, \(P\).
- sample_rate : int >= 1 [scalar]
Sample rate in Hz.
- f_min : float >= 0 [scalar]
Minimum frequency in Hz.
- f_max : float <= sample_rate // 2 [scalar]
Maximum frequency in Hz.
- algorithm : ['crepe']
Algorithm.
- option : str -> Any [dict]
Algorithm-dependent options.
- decode(prob)
Get an appropriate pitch contour from pitch probabilities.
- Parameters:
- prob : Tensor [shape=(B, N, C) or (N, C)]
Pitch probability.
- Returns:
- pitch : Tensor [shape=(B, N) or (N,)]
Pitch in seconds, Hz, or log Hz, depending on out_format.
Examples
>>> x = diffsptk.sin(100, 10)
>>> pitch = diffsptk.Pitch(80, 16000)
>>> prob = pitch.forward(x)
>>> result = pitch.decode(prob)
>>> result
tensor([1586.6013, 1593.9536])
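To make the decoding step more concrete, here is a minimal sketch of how per-frame probabilities could be turned into a pitch contour by taking the most probable class in each frame. It is not diffsptk's implementation. The function names, the `log_scale` flag, and the bin-to-frequency constants (a CREPE-style grid of bins spaced 20 cents apart) are assumptions made for illustration, and the library may use a smoother decoder than a plain argmax.

```python
import math

# Assumed CREPE-style pitch grid: bins spaced 20 cents apart.
CENTS_PER_BIN = 20.0
# Assumed offset of bin 0, in cents relative to 10 Hz.
CENTS_OFFSET = 1997.3794084376191


def bin_to_hz(index):
    """Map a class index to its center frequency in Hz (assumed grid)."""
    cents = CENTS_PER_BIN * index + CENTS_OFFSET
    return 10.0 * 2.0 ** (cents / 1200.0)


def decode_argmax(prob, log_scale=False):
    """Decode an (N, C) nested list of probabilities into N pitch values.

    Each frame is decoded independently by picking its most probable class.
    Returns Hz by default, or natural-log Hz when log_scale is True.
    """
    contour = []
    for frame in prob:
        best = max(range(len(frame)), key=frame.__getitem__)
        f0 = bin_to_hz(best)
        contour.append(math.log(f0) if log_scale else f0)
    return contour


# Two frames over 360 classes, peaking at the lowest and highest class.
prob = [[0.0] * 360 for _ in range(2)]
prob[0][0] = 1.0
prob[1][359] = 1.0
contour = decode_argmax(prob)
```

Because every frame is decoded on its own, a batched input of shape (B, N, C) could be handled by applying the same function to each item in the batch.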
- forward(x, embed=False)
Compute pitch representation.
- Parameters:
- x : Tensor [shape=(B, T) or (T,)]
Waveform.
- embed : bool [scalar]
If True, return embedding instead of probability.
- Returns:
- y : Tensor [shape=(B, N, C) or (N, C)]
Pitch probability or embedding, where N is the number of frames and C is the number of classes or the dimension of embedding.
Examples
>>> x = diffsptk.sin(100, 10)
>>> pitch = diffsptk.Pitch(80, 16000)
>>> prob = pitch(x)
>>> prob.shape
torch.Size([2, 360])
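The first dimension of the output above is the number of frames, N. As a rough sketch, and assuming one centered frame is produced every frame_period samples (an assumption that agrees with the example but is not confirmed by this page), N can be estimated from the waveform length T as follows. The helper name `num_frames` is hypothetical.

```python
def num_frames(num_samples, frame_period):
    """Estimate the frame count, assuming one centered frame per hop."""
    if num_samples < 1 or frame_period < 1:
        raise ValueError("num_samples and frame_period must be >= 1")
    return num_samples // frame_period + 1


# The example waveform appears to hold 101 samples (a sequence of order 100),
# and the frame period is 80 samples.
n = num_frames(101, 80)
```

Under this assumption a 101-sample waveform with a frame period of 80 yields two frames, which agrees with the first dimension of the shape shown in the example.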