cqt

diffsptk.CQT: alias of ConstantQTransform

class diffsptk.ConstantQTransform(frame_period: int, sample_rate: int, *, f_min: float = 32.7, n_bin: float = 84, n_bin_per_octave: int = 12, tuning: float = 0, filter_scale: float = 1, norm: float = 1, sparsity: float = 0.01, window: str = 'hann', scale: bool = True, res_type: str | None = 'kaiser_best', device: device | None = None, dtype: dtype | None = None, **kwargs)

Perform the constant-Q transform (CQT), based on the librosa implementation.

Parameters:

frame_period (int >= 1): The frame period in samples, \(P\).
sample_rate (int >= 1): The sample rate in Hz.
f_min (float > 0): The minimum center frequency in Hz.
n_bin (int >= 1): The number of CQ-bins, \(K\).
n_bin_per_octave (int >= 1): The number of bins per octave, \(B\).
tuning (float): The tuning offset in fractions of a bin.
filter_scale (float > 0): The filter scale factor.
norm (float): The type of norm used in the basis function normalization.
sparsity (float in [0, 1)): The sparsification factor.
window (str): The window function for the basis.
scale (bool): If True, scale the CQT response by the length of the filter.
res_type (['kaiser_best', 'kaiser_fast'] or None): The resampling type.
device (torch.device or None): The device of this module.
dtype (torch.dtype or None): The data type of this module.
**kwargs (additional keyword arguments): See torchaudio.transforms.Resample.
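
The frequency-related parameters determine where the \(K\) bins are placed. The following is a minimal sketch, assuming the geometric spacing used by librosa's CQT, in which bin k is centered at f_min * 2^((k + tuning) / B). The helper name `cqt_center_frequencies` is hypothetical and is not part of diffsptk; it only illustrates how `f_min`, `n_bin`, `n_bin_per_octave`, and `tuning` interact.

```python
def cqt_center_frequencies(f_min=32.7, n_bin=84, n_bin_per_octave=12, tuning=0.0):
    """Hypothetical helper (not a diffsptk function): list the bin center frequencies in Hz."""
    # Assumption: tuning shifts f_min by a fraction of a bin, as librosa does.
    base = f_min * 2.0 ** (tuning / n_bin_per_octave)
    # Each bin is one step of 2^(1/B) above the previous one.
    return [base * 2.0 ** (k / n_bin_per_octave) for k in range(n_bin)]


# Four bins starting at the default f_min, as in a small n_bin=4 setup.
freqs = cqt_center_frequencies(n_bin=4)
print([round(f, 2) for f in freqs])

# With the default 84 bins, check that the top bin stays below Nyquist at 8 kHz.
top = cqt_center_frequencies()[-1]
print(top < 8000 / 2)
```

A check like the last line may be useful before constructing the module, since a bin centered above half the sample rate cannot be represented in the input signal.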

forward(x: Tensor) → Tensor

Compute the constant-Q transform.

Parameters:

x (Tensor [shape=(…, T)]): The input waveform.

Returns:

out (Tensor [shape=(…, T/P, K)]): The complex-valued CQT output.
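
The shape relation above can be sketched in plain Python. This is only an illustration under the assumption that T is an exact multiple of the frame period P; how the module handles other lengths is not stated here. The helper name `expected_cqt_shape` is hypothetical and is not part of diffsptk.

```python
def expected_cqt_shape(input_shape, frame_period, n_bin):
    """Hypothetical helper: map an input shape (..., T) to (..., T/P, K)."""
    *batch, t = input_shape
    if t % frame_period != 0:
        # Assumption: this sketch only covers T divisible by P.
        raise ValueError("this sketch only covers T divisible by frame_period")
    return (*batch, t // frame_period, n_bin)


# A hypothetical 100-sample waveform with P=100 and K=4 gives one frame of four bins.
print(expected_cqt_shape((100,), 100, 4))

# Leading batch dimensions pass through unchanged.
print(expected_cqt_shape((2, 800), 100, 84))
```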

Examples

>>> x = diffsptk.sin(99)
>>> cqt = diffsptk.CQT(100, 8000, n_bin=4)
>>> c = cqt(x).abs()
>>> c
tensor([[1.1259, 1.2069, 1.3008, 1.3885]])