pitch
- class diffsptk.Pitch(frame_period, sample_rate, algorithm='fcnf0', out_format='pitch', **kwargs)
- Pitch extraction module using external neural models (a construction sketch follows the parameter list).

- Parameters:
  - frame_period : int >= 1
    Frame period, \(P\).
  - sample_rate : int >= 1
    Sample rate in Hz.
  - algorithm : ['crepe', 'fcnf0']
    Pitch extraction algorithm.
  - out_format : ['pitch', 'f0', 'log-f0', 'prob', 'embed']
    Output format.
  - f_min : float >= 0
    Minimum frequency in Hz.
  - f_max : float <= sample_rate // 2
    Maximum frequency in Hz.
  - voicing_threshold : float
    Voiced/unvoiced decision threshold.
 
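The optional keyword arguments bound the F0 search range and control the voiced/unvoiced decision. A minimal construction sketch in the doctest style of the examples below; the values chosen here are illustrative, not recommended defaults:

>>> import diffsptk
>>> pitch = diffsptk.Pitch(
...     frame_period=80,        # hop size in samples
...     sample_rate=16000,      # sampling rate of the input waveform in Hz
...     algorithm="crepe",      # or the default "fcnf0"
...     out_format="f0",        # per-frame F0 values in Hz
...     f_min=80,               # lower bound of the search range in Hz
...     f_max=400,              # upper bound, at most sample_rate // 2
...     voicing_threshold=0.5,  # frames below this are treated as unvoiced
... )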
- References
  [1] J. W. Kim et al., “CREPE: A convolutional representation for pitch estimation,” Proceedings of ICASSP, pp. 161-165, 2018.
  [2] M. Morrison et al., “Cross-domain neural pitch and periodicity estimation,” arXiv preprint, arXiv:2301.12258, 2023.

- forward(x)
- Compute pitch representation.
- Parameters:
  - x : Tensor [shape=(B, T) or (T,)]
    Waveform.
 
- Returns:
  - out : Tensor [shape=(B, N, C) or (N, C) or (B, N) or (N,)]
    Pitch probability, embedding, or pitch, where N is the number of frames and C is the number of pitch classes or the embedding dimension (a shape sketch follows the examples below).
 
- Examples

>>> x = diffsptk.sin(1000, 80)
>>> pitch = diffsptk.Pitch(160, 8000, out_format="f0")
>>> y = pitch(x)
>>> y
tensor([  0.0000,  99.7280,  99.7676,  99.8334,  99.8162, 100.1602,   0.0000])
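For the other output formats, the return shape follows the description of forward() above. A brief sketch under the same 8 kHz sine input; the shapes noted in the comments are the expected ones, not verified output:

>>> x = diffsptk.sin(1000, 80)
>>> pitch_prob = diffsptk.Pitch(160, 8000, out_format="prob")
>>> p = pitch_prob(x)             # shape (N, C): per-frame pitch class probabilities
>>> pitch_f0 = diffsptk.Pitch(160, 8000, out_format="f0")
>>> y = pitch_f0(x.unsqueeze(0))  # batched (1, T) input yields a (1, N) output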
 