ap

class diffsptk.Aperiodicity(frame_period, sample_rate, fft_length=None, algorithm='tandem', out_format='a', lower_bound=0.001, upper_bound=0.999, **kwargs)

See this page for details.

Parameters:
frame_period : int >= 1

Frame period, \(P\).

sample_rate : int >= 8000

Sample rate in Hz.

fft_length : int >= 16 or None

Size of the double-sided aperiodicity, \(L\). If None, the band aperiodicity (that is, the uninterpolated aperiodicity) is returned instead.

algorithm : ['tandem', 'd4c']

Estimation algorithm: 'tandem' for TANDEM-STRAIGHT [1] or 'd4c' for D4C [2].

out_format : ['a', 'p', 'a/p', 'p/a']

Output format: aperiodicity ('a'), periodicity ('p'), or one of their ratios ('a/p', 'p/a').

lower_bound : float >= 0

Lower bound of aperiodicity.

upper_bound : float <= 1

Upper bound of aperiodicity.
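
The interplay of `out_format`, `lower_bound`, and `upper_bound` can be sketched in plain Python. This is an illustration under stated assumptions, not diffsptk's implementation: it assumes that the bounds are applied by clipping, that periodicity is `1 - a`, and that 'a/p' and 'p/a' are the corresponding ratios. The helper name `convert_aperiodicity` is hypothetical.

```python
def convert_aperiodicity(a, out_format="a", lower_bound=0.001, upper_bound=0.999):
    """Clip an aperiodicity value and convert it to the requested format.

    Hypothetical helper; assumes periodicity p = 1 - a.
    """
    # Assumed behaviour: keep the value inside [lower_bound, upper_bound].
    a = min(max(a, lower_bound), upper_bound)
    p = 1.0 - a
    if out_format == "a":
        return a
    if out_format == "p":
        return p
    if out_format == "a/p":
        return a / p
    if out_format == "p/a":
        return p / a
    raise ValueError(f"unsupported out_format: {out_format!r}")


# Show the same input value in each of the four formats.
for fmt in ("a", "p", "a/p", "p/a"):
    print(fmt, convert_aperiodicity(0.25, fmt))
```

With the default bounds, a clipped value stays strictly between 0 and 1, so neither ratio divides by zero in this sketch.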

References

[1]

H. Kawahara et al., “Simplification and extension of non-periodic excitation source representations for high-quality speech manipulation systems,” Proceedings of Interspeech, pp. 38-41, 2010.

[2]

M. Morise, “D4C, a band-aperiodicity estimator for high-quality speech synthesis,” Speech Communication, vol. 84, pp. 57-65, 2016.

forward(x, f0)

Compute aperiodicity measure.

Parameters:
x : Tensor [shape=(B, T) or (T,)]

Waveform.

f0 : Tensor [shape=(B, T/P) or (T/P,)]

F0 in Hz.

Returns:
out : Tensor [shape=(B, T/P, L/2+1) or (T/P, L/2+1)]

Aperiodicity.

Examples

>>> x = diffsptk.sin(1000, 80)
>>> pitch = diffsptk.Pitch(160, 8000, out_format="f0")
>>> f0 = pitch(x)
>>> f0.shape
torch.Size([7])
>>> aperiodicity = diffsptk.Aperiodicity(160, 16000, 8)
>>> ap = aperiodicity(x, f0)
>>> ap
tensor([[0.1010, 0.9948, 0.9990, 0.9990, 0.9990],
        [0.0010, 0.8419, 0.3644, 0.5912, 0.9590],
        [0.0010, 0.5316, 0.3091, 0.5430, 0.9540],
        [0.0010, 0.3986, 0.1930, 0.4222, 0.9234],
        [0.0010, 0.3627, 0.1827, 0.4106, 0.9228],
        [0.0010, 0.3699, 0.1827, 0.4106, 0.9228],
        [0.0010, 0.7659, 0.7081, 0.8378, 0.9912]])
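
The 7-by-5 result above is consistent with the documented output shape: seven F0 frames, and \(L/2+1 = 5\) bins for \(L = 8\). A minimal sketch of that shape rule follows. The helper `aperiodicity_shape` is hypothetical and not part of diffsptk, and it does not cover `fft_length=None`, for which the number of bands is not stated here.

```python
def aperiodicity_shape(f0_shape, fft_length):
    """Return the expected output shape for a given F0 shape.

    Hypothetical helper based on the documented shapes:
    f0 is (B, T/P) or (T/P,), and the output appends L/2+1 bins.
    """
    if fft_length is None:
        raise ValueError("band aperiodicity (fft_length=None) is not covered")
    if fft_length % 2 != 0:
        raise ValueError("this sketch assumes an even fft_length")
    return tuple(f0_shape) + (fft_length // 2 + 1,)


print(aperiodicity_shape((7,), 8))    # (7, 5)
print(aperiodicity_shape((2, 7), 8))  # (2, 7, 5)
```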