pitch

class diffsptk.Pitch(frame_period, sample_rate, algorithm='crepe', out_format='pitch', **kwargs)
Pitch extraction module using external neural models.

Parameters:

frame_period : int >= 1 [scalar]
    Frame period, \(P\).
sample_rate : int >= 1 [scalar]
    Sample rate in Hz.
algorithm : ['crepe']
    Algorithm.
out_format : ['pitch', 'f0', 'log-f0', 'prob', 'embed']
    Output format.
f_min : float >= 0 [scalar]
    Minimum frequency in Hz.
f_max : float <= sample_rate // 2 [scalar]
    Maximum frequency in Hz.
voicing_threshold : float [scalar]
    Voiced/unvoiced threshold.
silence_threshold : float [scalar]
    Silence threshold in dB.
filter_length : int >= 1 [scalar]
    Window length of the median and moving-average filters.
model : ['tiny', 'full']
    Model size.
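The `out_format` options `'pitch'`, `'f0'`, and `'log-f0'` are different views of the same estimate. A minimal sketch of how they relate, assuming the SPTK convention that `'pitch'` is the period in samples (`sample_rate / f0`) and `'log-f0'` is the natural logarithm of F0 (an assumption about this library; the helper `convert_f0` below is hypothetical, not part of diffsptk):

```python
import math

def convert_f0(f0_hz, sample_rate, out_format):
    # Hypothetical helper illustrating the assumed relationship
    # between the out_format options.
    if out_format == "pitch":
        return sample_rate / f0_hz  # period in samples
    if out_format == "f0":
        return f0_hz  # frequency in Hz
    if out_format == "log-f0":
        return math.log(f0_hz)  # natural log of F0
    raise ValueError(f"unsupported out_format: {out_format}")

# A 200 Hz tone sampled at 16 kHz has a period of 80 samples.
print(convert_f0(200.0, 16000, "pitch"))  # 80.0
```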
 
forward(x)

Compute pitch representation.

Parameters:

x : Tensor [shape=(B, T) or (T,)]
    Waveform.

Returns:

y : Tensor [shape=(B, N, C) or (N, C) or (B, N) or (N,)]
    Pitch probability, embedding, or pitch, where N is the number of frames and C is the number of pitch classes or the dimension of the embedding.
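The number of frames N follows from the input length and the frame period. A small sketch under the assumption that N is roughly the ceiling of T / frame_period (the exact count may depend on the extractor's padding; `num_frames` is an illustrative helper, not a diffsptk function):

```python
import math

def num_frames(T, frame_period):
    # Assumed frame count for a waveform of length T with the
    # given frame period; actual padding behavior may differ.
    return math.ceil(T / frame_period)

# With frame_period=80, a waveform of about a hundred samples
# produces two frames, matching the two values in the Examples.
print(num_frames(101, 80))  # 2
```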
 
Examples

>>> x = diffsptk.sin(100, 10)
>>> pitch = diffsptk.Pitch(80, 16000)
>>> y = pitch(x)
>>> y
tensor([10.0860, 10.0860])
 