drc

diffsptk.DRC: alias of DynamicRangeCompression

class diffsptk.DynamicRangeCompression(*, sample_rate: int, threshold: float = -20, ratio: float = 2, attack_time: float = 1, release_time: float = 500, makeup_gain: float = 0, abs_max: float = 1, learnable: bool = False, device: device | None = None, dtype: dtype | None = None)

See this page for details.

Parameters:

sample_rate (int >= 1): The sample rate in Hz.
threshold (float <= 0): The threshold in dB.
ratio (float > 1): The input/output ratio.
attack_time (float > 0): The attack time in msec.
release_time (float > 0): The release time in msec.
makeup_gain (float >= 0): The make-up gain in dB.
abs_max (float > 0): The absolute maximum value of input.
learnable (bool): Whether to make the DRC parameters learnable.
device (torch.device or None): The device of this module.
dtype (torch.dtype or None): The data type of this module.
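
To illustrate how threshold, ratio, and makeup_gain relate to one another, the following is a minimal sketch of a conventional hard-knee static gain curve in the dB domain. It is a textbook formulation written in plain Python, not diffsptk's implementation; the function name static_gain_db is hypothetical, and the library may use a different knee or level detector.

```python
def static_gain_db(level_db, threshold=-20.0, ratio=2.0, makeup_gain=0.0):
    """Return the gain in dB applied to an input level given in dB.

    Hypothetical helper for illustration only (not part of diffsptk).
    """
    # Below the threshold nothing is reduced; above it, only the excess counts.
    over = max(level_db - threshold, 0.0)
    # With ratio R, an excess of R dB at the input becomes 1 dB at the output,
    # so the gain reduction is the excess scaled by (1 - 1/R).
    reduction = over * (1.0 - 1.0 / ratio)
    # The make-up gain is a constant offset applied after compression.
    return makeup_gain - reduction


for level in (-40.0, -20.0, -10.0, 0.0):
    gain = static_gain_db(level, threshold=-20.0, ratio=2.0, makeup_gain=10.0)
    print(f"input {level:6.1f} dB -> output {level + gain:6.1f} dB")
```

Under these assumptions, levels below the threshold are only shifted by the make-up gain, while levels above it grow at 1/ratio of the input slope.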

References

[1] C.-Y. Yu et al., “Differentiable all-pole filters for time-varying audio systems,” Proceedings of DAFx, pp. 345-352, 2024.

forward(x: Tensor) → Tensor

Perform dynamic range compression.

Parameters:

x (Tensor [shape=(…, T)]): The input waveform.

Returns:

out (Tensor [shape=(…, T)]): The compressed waveform.

Examples

>>> import torch
>>> import diffsptk
>>> drc = diffsptk.DynamicRangeCompression(
...     sample_rate=8000, threshold=-20, ratio=2, makeup_gain=10
... )
>>> x = diffsptk.sin(8000)
>>> torch.var(x, correction=0)
tensor(0.5000)
>>> y = drc(x)
>>> torch.var(y, correction=0)
tensor(0.5651)

diffsptk.functional.drc(x: Tensor, *, sample_rate: int, threshold: float, ratio: float, attack_time: float, release_time: float, makeup_gain: float = 0, abs_max: float = 1) → Tensor

Perform dynamic range compression.

Parameters:

x (Tensor [shape=(…, T)]): The input waveform.
sample_rate (int >= 1): The sample rate in Hz.
threshold (float <= 0): The threshold in dB.
ratio (float > 1): The input/output ratio.
attack_time (float > 0): The attack time in msec.
release_time (float > 0): The release time in msec.
makeup_gain (float >= 0): The make-up gain in dB.
abs_max (float > 0): The absolute maximum value of input.

Returns:

out (Tensor [shape=(…, T)]): The compressed waveform.
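
The attack_time and release_time parameters are given in msec, so they have to be mapped to per-sample coefficients at the given sample_rate. The sketch below shows one common way to do this with a one-pole smoother applied to a linear gain signal. The coefficient formula and the names time_to_coef and smooth_gain are assumptions made for illustration; diffsptk may define its time constants differently.

```python
import math


def time_to_coef(time_ms, sample_rate):
    """Map a time in msec to a one-pole smoothing coefficient in (0, 1).

    Assumed convention: the smoother covers about 63% of a step in `time_ms`.
    """
    return 1.0 - math.exp(-1.0 / (time_ms * 1e-3 * sample_rate))


def smooth_gain(target, sample_rate, attack_time=1.0, release_time=500.0):
    """Smooth a sequence of linear target gains with attack/release ballistics."""
    attack = time_to_coef(attack_time, sample_rate)
    release = time_to_coef(release_time, sample_rate)
    state = 1.0  # start from unity gain (no compression)
    out = []
    for g in target:
        # A falling gain means compression is setting in: use the attack rate.
        # Otherwise the gain is recovering: use the slower release rate.
        coef = attack if g < state else release
        state = (1.0 - coef) * state + coef * g
        out.append(state)
    return out


# The target gain drops to 0.5 for 100 samples, then returns to 1.0.
target = [0.5] * 100 + [1.0] * 100
smoothed = smooth_gain(target, sample_rate=8000)
print(f"end of attack phase:  {smoothed[99]:.4f}")
print(f"end of release phase: {smoothed[-1]:.4f}")
```

With the default values of 1 msec and 500 msec, the smoothed gain in this sketch settles quickly when compression sets in and recovers far more slowly once the target returns to unity.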