drc

diffsptk.DRC

alias of DynamicRangeCompression

class diffsptk.DynamicRangeCompression(*, sample_rate: int, threshold: float = -20, ratio: float = 2, attack_time: float = 1, release_time: float = 500, makeup_gain: float = 0, abs_max: float = 1, learnable: bool = False, device: device | None = None, dtype: dtype | None = None)

See this page for details.

Parameters:
sample_rate : int >= 1

The sample rate in Hz.

threshold : float <= 0

The threshold in dB.

ratio : float > 1

The input/output ratio.

attack_time : float > 0

The attack time in msec.

release_time : float > 0

The release time in msec.

makeup_gain : float >= 0

The make-up gain in dB.

abs_max : float > 0

The absolute maximum value of input.

learnable : bool

Whether to make the DRC parameters learnable.

device : torch.device or None

The device of this module.

dtype : torch.dtype or None

The data type of this module.
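
To make the roles of `threshold`, `ratio`, and `makeup_gain` concrete, the sketch below evaluates a textbook hard-knee static compression curve in plain Python. It is an illustration under that assumption and is not taken from diffsptk's implementation, which may differ (for example in knee shape or level detection). The helper name `static_gain_db` is hypothetical.

```python
def static_gain_db(level_db, threshold=-20.0, ratio=2.0, makeup_gain=0.0):
    """Gain in dB applied by a textbook hard-knee compressor (assumed model)."""
    # Amount by which the input level exceeds the threshold (0 if below it).
    over = max(level_db - threshold, 0.0)
    # Above the threshold, `ratio` dB of input change yields 1 dB of output
    # change, so the excess is reduced by a factor of (1 - 1/ratio).
    reduction = over * (1.0 - 1.0 / ratio)
    # Make-up gain is added uniformly, regardless of the input level.
    return makeup_gain - reduction


# A 0 dB peak is 20 dB over a -20 dB threshold; with ratio=2 it is reduced
# by 10 dB, and 10 dB of make-up gain brings the net gain back to 0 dB.
print(static_gain_db(0.0, threshold=-20.0, ratio=2.0, makeup_gain=10.0))
```

Under this model, levels below the threshold receive only the make-up gain, which is why a positive `makeup_gain` can raise the overall signal power even though peaks are compressed.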

References

[1] C.-Y. Yu et al., “Differentiable all-pole filters for time-varying audio systems,” Proceedings of DAFx, pp. 345-352, 2024.

forward(x: Tensor) → Tensor

Perform dynamic range compression.

Parameters:
x : Tensor [shape=(…, T)]

The input waveform.

Returns:
out : Tensor [shape=(…, T)]

The compressed waveform.

Examples

>>> import torch
>>> import diffsptk
>>> drc = diffsptk.DynamicRangeCompression(
...     sample_rate=8000, threshold=-20, ratio=2, makeup_gain=10
... )
>>> x = diffsptk.sin(8000)
>>> torch.var(x, correction=0)
tensor(0.5000)
>>> y = drc(x)
>>> torch.var(y, correction=0)
tensor(0.5651)

diffsptk.functional.drc(x: Tensor, *, sample_rate: int, threshold: float, ratio: float, attack_time: float, release_time: float, makeup_gain: float = 0, abs_max: float = 1) → Tensor

Perform dynamic range compression.

Parameters:
x : Tensor [shape=(…, T)]

The input waveform.

sample_rate : int >= 1

The sample rate in Hz.

threshold : float <= 0

The threshold in dB.

ratio : float > 1

The input/output ratio.

attack_time : float > 0

The attack time in msec.

release_time : float > 0

The release time in msec.

makeup_gain : float >= 0

The make-up gain in dB.

abs_max : float > 0

The absolute maximum value of input.

Returns:
out : Tensor [shape=(…, T)]

The compressed waveform.
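
The `attack_time` and `release_time` parameters are given in msec, and the functional form has no defaults for them. As a rough illustration of what such time constants control, the sketch below smooths a gain trajectory with a one-pole filter that switches between an attack and a release coefficient. The mapping `exp(-1 / (time * sample_rate))` and the helper names `time_to_coef` and `smooth_gain` are assumptions made for this example; diffsptk may convert times to coefficients differently.

```python
import math


def time_to_coef(time_ms, sample_rate):
    """Map a time constant in msec to a one-pole coefficient (assumed mapping)."""
    return math.exp(-1.0 / (time_ms * 1e-3 * sample_rate))


def smooth_gain(target_gain, sample_rate, attack_time, release_time):
    """Smooth a per-sample linear gain with separate attack/release speeds."""
    a_att = time_to_coef(attack_time, sample_rate)
    a_rel = time_to_coef(release_time, sample_rate)
    out = []
    g = 1.0  # start from unity gain
    for t in target_gain:
        # A falling target means compression is engaging: use the attack
        # coefficient. Otherwise the gain recovers with the release one.
        a = a_att if t < g else a_rel
        g = a * g + (1.0 - a) * t
        out.append(g)
    return out


# The target gain drops to 0.5 for 80 samples (10 ms at 8 kHz), then
# returns to 1.0 for another 80 samples.
target = [0.5] * 80 + [1.0] * 80
g = smooth_gain(target, sample_rate=8000, attack_time=1, release_time=500)
print(g[79], g[-1])
```

With a 1 msec attack the smoothed gain settles close to the target within the first 10 ms, whereas a 500 msec release lets it recover only slightly over the following 10 ms.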