ap
- class diffsptk.Aperiodicity(frame_period: int, sample_rate: int, fft_length: int | None = None, algorithm: str = 'tandem', out_format: str | int = 'a', lower_bound: float = 0.001, upper_bound: float = 0.999, **kwargs)
See this page for details. Note that gradients are not propagated through F0.
- Parameters:
- frame_period : int >= 1
The frame period in samples, \(P\).
- sample_rate : int >= 8000
The sample rate in Hz.
- fft_length : int >= 16 or None
The size of double-sided aperiodicity, \(L\). If None, the band aperiodicity (uninterpolated aperiodicity) is returned as the output.
- algorithm : ['tandem', 'd4c']
The algorithm to estimate aperiodicity.
- out_format : ['a', 'p', 'a/p', 'p/a']
The output format.
- lower_bound : float >= 0
The lower bound of aperiodicity.
- upper_bound : float <= 1
The upper bound of aperiodicity.
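The interaction between the bounds and out_format can be sketched in plain Python. This is an illustration, not the library's implementation: it assumes that the estimate is clipped to [lower_bound, upper_bound] before conversion, that periodicity is defined as 1 - aperiodicity, and that 'a/p' and 'p/a' are the corresponding ratios. The helper name convert_aperiodicity is hypothetical and not part of diffsptk.

```python
def convert_aperiodicity(a, out_format="a", lower_bound=0.001, upper_bound=0.999):
    """Hypothetical helper: clip an aperiodicity value, then convert its format."""
    # Assumption: clipping happens before conversion. With the default bounds,
    # both a and 1 - a stay strictly positive, so the ratios below are finite.
    a = min(max(a, lower_bound), upper_bound)
    p = 1.0 - a  # Assumed definition of periodicity.
    if out_format == "a":
        return a
    if out_format == "p":
        return p
    if out_format == "a/p":
        return a / p
    if out_format == "p/a":
        return p / a
    raise ValueError(f"unsupported out_format: {out_format}")


# Show the same clipped value in each of the four formats.
for fmt in ("a", "p", "a/p", "p/a"):
    print(fmt, convert_aperiodicity(0.25, fmt))
```

Under these assumptions, the values 0.0010 and 0.9990 that appear in the example output on this page are what clipping to the default bounds would produce.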
References
[1] H. Kawahara et al., “Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems,” Proceedings of Interspeech, pp. 38-41, 2010.
[2] M. Morise, “D4C, a band-aperiodicity estimator for high-quality speech synthesis,” Speech Communication, vol. 84, pp. 57-65, 2016.
- forward(x: Tensor, f0: Tensor) -> Tensor
Compute aperiodicity measure.
- Parameters:
- x : Tensor [shape=(B, T) or (T,)]
The input waveform.
- f0 : Tensor [shape=(B, T/P) or (T/P,)]
The F0 in Hz.
- Returns:
- out : Tensor [shape=(B, T/P, L/2+1) or (T/P, L/2+1)]
The aperiodicity.
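The output shape can be checked by hand with a small sketch. The helper expected_ap_shape is hypothetical and not part of diffsptk. It assumes that the number of frames is T/P rounded up, which is consistent with the 7 frames shown in the Examples on this page if the input there has 1001 samples (an assumption about the length of diffsptk.sin(1000, 80)).

```python
import math


def expected_ap_shape(num_samples, frame_period, fft_length, batch_size=None):
    """Hypothetical helper: predict the shape of the aperiodicity output."""
    # Assumption: the number of frames is T / P rounded up.
    num_frames = math.ceil(num_samples / frame_period)
    # A double-sided size L corresponds to L/2+1 one-sided bins.
    num_bins = fft_length // 2 + 1
    shape = (num_frames, num_bins)
    return shape if batch_size is None else (batch_size, *shape)


print(expected_ap_shape(1001, 160, 8))  # (7, 5)
```

With P = 160 and L = 8 this gives 7 frames of 5 bins each, matching the 7 x 5 tensor printed in the Examples.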
Examples
>>> x = diffsptk.sin(1000, 80)
>>> pitch = diffsptk.Pitch(160, 8000, out_format="f0")
>>> f0 = pitch(x)
>>> f0.shape
torch.Size([7])
>>> aperiodicity = diffsptk.Aperiodicity(160, 16000, 8)
>>> ap = aperiodicity(x, f0)
>>> ap
tensor([[0.1010, 0.9948, 0.9990, 0.9990, 0.9990],
        [0.0010, 0.8419, 0.3644, 0.5912, 0.9590],
        [0.0010, 0.5316, 0.3091, 0.5430, 0.9540],
        [0.0010, 0.3986, 0.1930, 0.4222, 0.9234],
        [0.0010, 0.3627, 0.1827, 0.4106, 0.9228],
        [0.0010, 0.3699, 0.1827, 0.4106, 0.9228],
        [0.0010, 0.7659, 0.7081, 0.8378, 0.9912]])
See also