Operations¶
Low-level DSP building blocks: delay, envelopes, FFT, convolution, sample rates, mixing, panning, normalization, cross-correlation, Hilbert transform, median filter, LMS adaptive filter.
Usage examples¶
Delay¶
from nanodsp import ops
from nanodsp.buffer import AudioBuffer
buf = AudioBuffer.from_file("input.wav")
# Fixed delay (100 samples)
delayed = ops.delay(buf, delay_samples=100)
# Cubic interpolation for fractional delays
delayed = ops.delay(buf, delay_samples=50.5, interpolation="cubic")
# Time-varying delay (vibrato effect)
import numpy as np
t = np.arange(buf.frames, dtype=np.float32) / buf.sample_rate
delay_curve = 20 + 10 * np.sin(2 * np.pi * 5 * t) # 5 Hz modulation
vibrato = ops.delay_varying(buf, delays=delay_curve)
Envelopes¶
# Smooth an envelope with a box filter
smoothed = ops.box_filter(buf, length=64)
# Multi-stage smoothing (cascaded box filters approximate Gaussian)
smooth = ops.box_stack_filter(buf, size=32, layers=4)
# Peak tracking
peaks = ops.peak_hold(buf, length=128)
decayed = ops.peak_decay(buf, length=256)
FFT¶
buf = AudioBuffer.sine(440.0, frames=1024, sample_rate=44100)
# Forward FFT (returns list of complex arrays, one per channel)
spectra = ops.rfft(buf)
# Inverse FFT back to time domain (output is unscaled; divide by N to recover the original amplitude)
reconstructed = ops.irfft(spectra, size=1024, sample_rate=44100)
Convolution¶
signal = AudioBuffer.from_file("dry.wav")
ir = AudioBuffer.from_file("impulse_response.wav")
# Convolve (applies reverb IR to signal)
wet = ops.convolve(signal, ir, normalize=True, trim=True)
Mixing and panning¶
a = AudioBuffer.sine(440.0, frames=44100)
b = AudioBuffer.sine(880.0, frames=44100)
# Crossfade between two buffers
blended = ops.crossfade(a, b, x=0.5) # 50/50 mix
# Mix multiple buffers with gains
mixed = ops.mix_buffers(a, b, gains=[1.0, 0.5])
# Equal-power panning (mono -> stereo)
panned = ops.pan(a, position=0.3) # slightly right
# Stereo widening (panned is stereo, since pan() turns mono input into stereo)
wide = ops.stereo_widen(panned, width=1.5)
Mid-side processing¶
stereo = AudioBuffer.noise(channels=2, frames=44100)
# Encode to mid-side
ms = ops.mid_side_encode(stereo)
# Process mid and side independently...
# Decode back to left-right
lr = ops.mid_side_decode(ms)
Normalization and fades¶
buf = AudioBuffer.from_file("input.wav")
# Peak normalize to -1 dBFS
normalized = ops.normalize_peak(buf, target_db=-1.0)
# Trim leading/trailing silence
trimmed = ops.trim_silence(buf, threshold_db=-60.0, pad_frames=100)
# Apply fades
faded = ops.fade_in(buf, duration_ms=10.0)
faded = ops.fade_out(faded, duration_ms=50.0, curve="ease_out")
LFO¶
# Generate a 2 Hz LFO signal (rate is in cycles per sample, so divide Hz by the sample rate)
lfo_signal = ops.lfo(frames=44100, low=0.0, high=1.0, rate=2.0 / 44100, sample_rate=44100)
Cross-correlation and Hilbert transform¶
# Cross-correlation between two signals (buf_a and buf_b are existing AudioBuffers)
corr = ops.xcorr(buf_a, buf_b)
# Autocorrelation
auto = ops.xcorr(buf_a)
# Analytic signal envelope via Hilbert transform
env = ops.hilbert(buf)
env = ops.envelope(buf) # alias
Adaptive filtering¶
# LMS adaptive noise cancellation (buf is the noisy signal, ref is an AudioBuffer holding the noise reference)
output, error = ops.lms_filter(
    buf, ref, filter_len=32, step_size=0.01, normalized=True
)
API reference¶
ops
¶
Core DSP building blocks: delays, envelopes, FFT, convolution, rates, mix, LFO, numpy utils.
delay
¶
    delay(
        buf: AudioBuffer,
        delay_samples: float,
        capacity: int | None = None,
        interpolation: Literal["linear", "cubic"] = "linear",
    ) -> AudioBuffer

Apply a fixed delay (in samples) per channel.

| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| delay_samples | Delay amount in samples, >= 0 (fractional for interpolated delay). |
| capacity | Delay line capacity. If None, auto-sized from delay_samples. |
| interpolation | Interpolation mode: 'linear' or 'cubic'. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Delayed audio. |
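
To make the fractional `delay_samples` case concrete, here is a minimal plain-Python sketch of a linearly interpolated delay. It is not nanodsp's implementation: `fractional_delay` is a hypothetical helper, and treating everything before the start of the signal as silence is an assumption.

```python
import math

def fractional_delay(samples, delay_samples):
    """Delay a list of samples by a non-negative, possibly fractional, amount.

    Uses linear interpolation between the two nearest past samples.
    Samples before the start of the signal are treated as 0.0 (assumption).
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be >= 0")
    whole = math.floor(delay_samples)
    frac = delay_samples - whole

    def past(i):
        return samples[i] if 0 <= i < len(samples) else 0.0

    out = []
    for n in range(len(samples)):
        newer = past(n - whole)      # sample `whole` steps back
        older = past(n - whole - 1)  # one step further back
        out.append((1.0 - frac) * newer + frac * older)
    return out

impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
print(fractional_delay(impulse, 2))    # [0.0, 0.0, 1.0, 0.0, 0.0]
print(fractional_delay(impulse, 1.5))  # [0.0, 0.5, 0.5, 0.0, 0.0]
```

With a whole-number delay the impulse simply moves; with a delay of 1.5 samples it is split evenly between the two neighbouring positions, which is the smoothing that linear interpolation introduces.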
delay_varying
¶
    delay_varying(
        buf: AudioBuffer,
        delays,
        interpolation: Literal["linear", "cubic"] = "linear",
    ) -> AudioBuffer

Apply time-varying delay per channel.

| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| delays | 1D (broadcast to all channels) or 2D [channels, frames]. |
| interpolation | Interpolation mode: 'linear' or 'cubic'. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Delayed audio. |
box_filter
¶
Apply a BoxFilter (moving average) per channel.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| length | Window size in samples. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Smoothed audio. |
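
As an illustration of what a moving average does to a signal, the following plain-Python sketch keeps a running sum over the last `length` samples. `moving_average` is a hypothetical helper, and the warm-up behaviour (earlier samples counted as zero) is an assumption rather than a statement about BoxFilter.

```python
def moving_average(samples, length):
    """Running-sum moving average over the last `length` samples."""
    if length < 1:
        raise ValueError("length must be >= 1")
    out = []
    running = 0.0
    for n, x in enumerate(samples):
        running += x
        if n >= length:
            running -= samples[n - length]  # drop the sample leaving the window
        out.append(running / length)
    return out

step = [0.0, 0.0, 4.0, 4.0, 4.0, 4.0]
print(moving_average(step, 4))  # [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
```

The step input turns into a linear ramp that takes `length` samples to reach its final value, which is why a box filter is a convenient envelope smoother.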
box_stack_filter
¶
Apply a BoxStackFilter (stacked moving average) per channel.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| size | Window size in samples per layer. |
| layers | Number of stacked box filter layers. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Smoothed audio. |
peak_hold
¶
Apply PeakHold per channel.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| length | Hold window size in samples. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Peak-held envelope. |
peak_decay
¶
Apply PeakDecayLinear per channel.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| length | Decay window size in samples. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Peak-decayed envelope. |
rfft
¶
Forward real FFT per channel.
Returns a list of complex64 arrays, one per channel, with N/2 bins each. The FFT size N is chosen with RealFFT.fast_size_above, and the input is zero-padded to N if needed.
irfft
¶
Inverse real FFT from list of spectra to AudioBuffer.
Returns unscaled output (matches C++ convention). Divide by N if needed.
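
The "divide by N" remark can be checked with a small plain-Python sketch. It uses a naive full-length complex DFT, not the packed N/2-bin layout described above, so `dft` and `idft_unscaled` are hypothetical helpers that only illustrate the scaling convention.

```python
import cmath

def dft(x):
    """Naive forward DFT (O(N^2)), no scaling."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft_unscaled(spectrum):
    """Naive inverse DFT without the 1/N factor; returns the real part."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real
        for t in range(n)
    ]

signal = [1.0, 0.5, -0.25, 0.0, 0.75, -1.0, 0.25, 0.5]
n = len(signal)

unscaled = idft_unscaled(dft(signal))  # every sample is N times too large
recovered = [v / n for v in unscaled]  # dividing by N restores the input

print(max(abs(r - s) for r, s in zip(recovered, signal)))  # tiny rounding error
```

An unscaled forward/inverse pair multiplies the signal by N, so a round trip through the two transforms needs one division by N somewhere.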
convolve
¶
    convolve(
        buf: AudioBuffer,
        ir: AudioBuffer,
        normalize: bool = False,
        trim: bool = True,
    ) -> AudioBuffer

FFT-based overlap-add convolution.

| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input signal. |
| ir | Impulse response. |
| normalize | If True, scale IR to unit energy before convolving. |
| trim | If True (default), output has the same length as buf. If False, output is the full convolution (buf.frames + ir.frames - 1). |

| RAISES | DESCRIPTION |
|---|---|
| ValueError | If sample rates differ or channel counts are incompatible. |
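
The effect of `normalize` and `trim` on the output can be sketched with a direct (non-FFT) convolution in plain Python. This swaps in the textbook time-domain sum for the overlap-add method, `convolve_direct` is a hypothetical helper, and reading "unit energy" as "sum of squared IR samples equals 1" is an assumption.

```python
import math

def convolve_direct(signal, ir, normalize=False, trim=True):
    """Direct time-domain convolution of two non-empty sample lists."""
    if normalize:
        energy = sum(h * h for h in ir)
        if energy > 0.0:
            scale = 1.0 / math.sqrt(energy)  # make sum(h^2) == 1
            ir = [h * scale for h in ir]
    full = [0.0] * (len(signal) + len(ir) - 1)
    for i, x in enumerate(signal):
        for j, h in enumerate(ir):
            full[i + j] += x * h
    # trim=True keeps only the first len(signal) samples
    return full[:len(signal)] if trim else full

dry = [1.0, 0.0, 0.0, 0.0]
ir = [0.5, 0.25, 0.125]

print(len(convolve_direct(dry, ir, trim=True)))   # 4
print(len(convolve_direct(dry, ir, trim=False)))  # 6
print(convolve_direct(dry, ir, trim=False))       # [0.5, 0.25, 0.125, 0.0, 0.0, 0.0]
```

With `trim=True` the tail of the impulse response that extends past the input is discarded, so pass `trim=False` when a reverb tail should be kept.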
upsample_2x
¶
    upsample_2x(
        buf: AudioBuffer,
        max_block: int | None = None,
        half_latency: int = 16,
        pass_freq: float = 0.43,
    ) -> AudioBuffer

Upsample by 2x. Returns AudioBuffer with 2x frames and 2x sample rate.

| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| max_block | Maximum block size. If None, uses buf.frames. |
| half_latency | Half-band filter latency in samples. |
| pass_freq | Normalized passband edge frequency. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Upsampled audio at 2x sample rate. |
oversample_roundtrip
¶
    oversample_roundtrip(
        buf: AudioBuffer,
        max_block: int | None = None,
        half_latency: int = 16,
        pass_freq: float = 0.43,
    ) -> AudioBuffer

Upsample then downsample (roundtrip). Same shape and sample rate as input.

| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| max_block | Maximum block size. If None, uses buf.frames. |
| half_latency | Half-band filter latency in samples. |
| pass_freq | Normalized passband edge frequency. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Roundtripped audio (same shape as input). |
hadamard
¶
Apply Hadamard mixing across channels at each frame.
Requires power-of-2 channel count.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio (must have power-of-2 channel count). |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Hadamard-mixed audio. |
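
For readers unfamiliar with Hadamard mixing, this plain-Python sketch applies a fast Walsh-Hadamard transform to the channel values of a single frame. `hadamard_mix` is a hypothetical helper, and the orthonormal 1/sqrt(N) scaling is an assumption; the source does not state which scaling nanodsp uses.

```python
import math

def hadamard_mix(frame):
    """Mix one frame (one value per channel) with an orthonormal Hadamard matrix."""
    n = len(frame)
    if n == 0 or n & (n - 1):
        raise ValueError("channel count must be a power of 2")
    out = list(frame)
    step = 1
    while step < n:
        # Butterfly stage: combine pairs that are `step` channels apart
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                a, b = out[i], out[i + step]
                out[i], out[i + step] = a + b, a - b
        step *= 2
    scale = 1.0 / math.sqrt(n)  # assumed energy-preserving scaling
    return [v * scale for v in out]

frame = [1.0, 0.0, 0.0, 0.0]
mixed = hadamard_mix(frame)
print(mixed)                # [0.5, 0.5, 0.5, 0.5]
print(hadamard_mix(mixed))  # [1.0, 0.0, 0.0, 0.0]
```

A signal present in one channel is spread evenly over all channels, and with this scaling the mix is its own inverse. The butterfly structure is also why the channel count must be a power of 2.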
householder
¶
Apply Householder reflection across channels at each frame.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Householder-mixed audio. |
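
A Householder reflection across channels can be sketched in a few lines of plain Python. The version below reflects about the all-ones direction (the matrix I - (2/N)·1·1ᵀ), a common choice in feedback delay networks. That choice and the helper name `householder_mix` are assumptions, not confirmed details of nanodsp.

```python
def householder_mix(frame):
    """Reflect one frame (one value per channel) about the all-ones direction."""
    n = len(frame)
    if n == 0:
        return []
    shared = 2.0 * sum(frame) / n  # the same amount is subtracted from every channel
    return [v - shared for v in frame]

frame = [1.0, 0.0, 0.0, 0.0]
mixed = householder_mix(frame)
print(mixed)                   # [0.5, -0.5, -0.5, -0.5]
print(householder_mix(mixed))  # [1.0, 0.0, 0.0, 0.0]
```

Unlike the Hadamard mix, this works for any channel count, preserves energy, and costs only one sum per frame.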
crossfade
¶
Crossfade between two buffers using cheap_energy_crossfade coefficients.
| PARAMETER | DESCRIPTION |
|---|---|
| buf_a | First audio buffer (returned when x=0). |
| buf_b | Second audio buffer (returned when x=1). |
| x | Crossfade position, 0.0--1.0 (0.0 = buf_a, 1.0 = buf_b). |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Crossfaded audio. |
lfo
¶
    lfo(
        frames: int,
        low: float,
        high: float,
        rate: float,
        sample_rate: float = 48000.0,
        rate_variation: float = 0.0,
        depth_variation: float = 0.0,
        seed: int | None = None,
    ) -> AudioBuffer

Generate an LFO signal using CubicLfo.

| PARAMETER | DESCRIPTION |
|---|---|
| frames | Number of output samples. |
| low | Lower bound of the output value range. |
| high | Upper bound of the output value range. |
| rate | Base rate in cycles per sample, > 0. Typical: 0.0001--0.01 (e.g. 0.001 = ~48 Hz at 48 kHz sample rate). |
| sample_rate | Sample rate for the returned AudioBuffer metadata, > 0. |
| rate_variation | Random variation of the rate, >= 0 (0 = deterministic). |
| depth_variation | Random variation of the depth, >= 0 (0 = deterministic). |
| seed | Random seed for reproducibility. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Mono buffer containing the LFO waveform. |
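
Because `rate` is expressed in cycles per sample rather than Hz, a small conversion step is usually needed. The helper below is a hypothetical convenience, not part of nanodsp; it only applies the relationship rate = frequency / sample rate.

```python
def hz_to_cycles_per_sample(freq_hz, sample_rate):
    """Convert a frequency in Hz to a rate in cycles per sample."""
    if freq_hz <= 0 or sample_rate <= 0:
        raise ValueError("freq_hz and sample_rate must be > 0")
    return freq_hz / sample_rate

# A 2 Hz LFO at 44.1 kHz completes one cycle roughly every 22050 samples.
rate = hz_to_cycles_per_sample(2.0, 44100.0)
samples_per_cycle = 1.0 / rate
print(rate, samples_per_cycle)
```

The resulting value is what would be passed as `rate`, for example `ops.lfo(..., rate=hz_to_cycles_per_sample(2.0, 44100.0), sample_rate=44100)`.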
normalize_peak
¶
Normalize peak amplitude to target_db dBFS.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| target_db | Target peak level in dBFS, <= 0. Typical: -6 to 0. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Peak-normalized audio. |
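
The gain applied by peak normalization follows from the dBFS definition: a target of `target_db` corresponds to a linear peak of 10^(target_db / 20). A minimal plain-Python sketch, with `peak_normalize` as a hypothetical helper and the handling of silent input as an assumption:

```python
def peak_normalize(samples, target_db=-1.0):
    """Scale samples so the largest absolute value sits at target_db dBFS."""
    peak = max((abs(v) for v in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale (assumed behaviour)
    target_linear = 10.0 ** (target_db / 20.0)
    gain = target_linear / peak
    return [v * gain for v in samples]

quiet = [0.1, -0.25, 0.2]
loud = peak_normalize(quiet, target_db=-6.0)
print(max(abs(v) for v in loud))  # about 0.501, i.e. -6 dBFS
```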
trim_silence
¶
Trim leading and trailing silence below threshold_db.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| threshold_db | Silence threshold in dB, typically -80 to -20. |
| pad_frames | Extra frames to keep around non-silent regions, >= 0. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Trimmed audio. |
fade_in
¶
    fade_in(
        buf: AudioBuffer,
        duration_ms: float = 10.0,
        curve: Literal[
            "linear", "ease_in", "ease_out", "smoothstep"
        ] = "linear",
    ) -> AudioBuffer

Apply a fade-in over duration_ms milliseconds.

| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| duration_ms | Fade duration in milliseconds. |
| curve | Fade shape: 'linear', 'ease_in', 'ease_out', or 'smoothstep'. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Audio with fade-in applied. |
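
The relationship between `duration_ms`, the sample rate, and the gain ramp can be sketched as follows. Only the 'linear' and 'smoothstep' shapes are shown, because their formulas are standard; the exact 'ease_in' and 'ease_out' curves are not specified here and are left out. The helper names and the rounding of the fade length are assumptions.

```python
def fade_in_gains(fade_frames, curve="linear"):
    """Gain ramp from 0 towards 1 over fade_frames samples."""
    gains = []
    for n in range(fade_frames):
        t = n / fade_frames  # 0.0 at the first sample, approaching 1.0
        if curve == "linear":
            g = t
        elif curve == "smoothstep":
            g = t * t * (3.0 - 2.0 * t)  # classic 3t^2 - 2t^3
        else:
            raise ValueError("only 'linear' and 'smoothstep' are sketched here")
        gains.append(g)
    return gains

def apply_fade_in(samples, sample_rate, duration_ms=10.0, curve="linear"):
    """Multiply the start of the signal by the gain ramp."""
    fade_frames = min(len(samples), int(round(duration_ms * sample_rate / 1000.0)))
    gains = fade_in_gains(fade_frames, curve)
    faded = [s * g for s, g in zip(samples, gains)]
    return faded + list(samples[fade_frames:])

# 4 ms at 1 kHz is a 4-sample fade
print(apply_fade_in([1.0] * 8, sample_rate=1000.0, duration_ms=4.0))
# [0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0, 1.0]
```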
fade_out
¶
    fade_out(
        buf: AudioBuffer,
        duration_ms: float = 10.0,
        curve: Literal[
            "linear", "ease_in", "ease_out", "smoothstep"
        ] = "linear",
    ) -> AudioBuffer

Apply a fade-out over duration_ms milliseconds.

| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| duration_ms | Fade duration in milliseconds. |
| curve | Fade shape: 'linear', 'ease_in', 'ease_out', or 'smoothstep'. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Audio with fade-out applied. |
pan
¶
Pan a signal using equal-power panning.
Mono input produces stereo output. Stereo input scales L/R gains.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input audio. |
| position | Pan position: -1.0 = hard left, 0.0 = center, 1.0 = hard right. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Panned audio (stereo if mono input). |
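
Equal-power panning keeps the sum of the squared left and right gains constant across the pan range. One common formulation, shown here as a sketch and not necessarily the exact law nanodsp uses, maps `position` to an angle between 0 and π/2 and takes the cosine and sine as gains. `equal_power_gains` is a hypothetical helper.

```python
import math

def equal_power_gains(position):
    """Return (left_gain, right_gain) for a pan position in [-1.0, 1.0]."""
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be within [-1.0, 1.0]")
    angle = (position + 1.0) * math.pi / 4.0  # -1 -> 0, 0 -> pi/4, 1 -> pi/2
    return math.cos(angle), math.sin(angle)

for position in (-1.0, 0.0, 0.3, 1.0):
    left, right = equal_power_gains(position)
    # left^2 + right^2 stays at 1 for every position
    print(position, round(left, 4), round(right, 4), round(left**2 + right**2, 4))
```

At the center both gains are about 0.707 (roughly -3 dB), which avoids the loudness dip that a linear pan law would produce.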
mix_buffers
¶
Sum multiple AudioBuffers with optional per-buffer gains.
All buffers must share the same sample_rate. Shorter buffers are zero-padded to the length of the longest.
| PARAMETER | DESCRIPTION |
|---|---|
| *buffers | Audio buffers to sum. |
| gains | Per-buffer gain multipliers. If None, all gains are 1.0. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Summed audio. |
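
The zero-padding and gain behaviour described above can be illustrated on plain Python lists. `mix` is a hypothetical single-channel helper; the check that `gains` has one entry per signal is an assumption about sensible validation.

```python
def mix(signals, gains=None):
    """Sum several sample lists, padding shorter ones with silence."""
    if gains is None:
        gains = [1.0] * len(signals)
    if len(gains) != len(signals):
        raise ValueError("need exactly one gain per signal")
    longest = max((len(s) for s in signals), default=0)
    out = [0.0] * longest
    for signal, gain in zip(signals, gains):
        for n, value in enumerate(signal):
            out[n] += gain * value  # positions past the end contribute nothing
    return out

a = [1.0, 1.0, 1.0]
b = [2.0, 2.0]
print(mix([a, b], gains=[1.0, 0.5]))  # [2.0, 2.0, 1.0]
```

Note that summing does not rescale the result, so a mix of several full-scale signals can exceed the [-1, 1] range unless gains are chosen accordingly.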
mid_side_encode
¶
Encode stereo [L, R] to mid-side [M, S].
M = (L + R) / 2, S = (L - R) / 2.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Stereo input audio. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Mid-side encoded audio [M, S]. |
mid_side_decode
¶
Decode mid-side [M, S] back to stereo [L, R].
L = M + S, R = M - S.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Mid-side input audio [M, S]. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Stereo audio [L, R]. |
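
The encode and decode formulas are exact inverses of each other, which the following plain-Python sketch verifies on a few samples. `ms_encode` and `ms_decode` are hypothetical helpers that operate on lists and apply the formulas stated above.

```python
def ms_encode(left, right):
    """M = (L + R) / 2, S = (L - R) / 2."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

left = [0.5, -0.25, 1.0]
right = [0.25, 0.75, -1.0]

mid, side = ms_encode(left, right)
left_again, right_again = ms_decode(mid, side)

print(mid, side)
print(left_again == left, right_again == right)
```

Substituting the encode formulas into the decode step gives M + S = L and M - S = R, so no extra scaling is needed on the way back.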
xcorr
¶
FFT-based cross-correlation (or autocorrelation).
| PARAMETER | DESCRIPTION |
|---|---|
| buf_a | First signal (mono). Multi-channel buffers are mixed to mono. |
| buf_b | Second signal. If None, computes autocorrelation of buf_a. |

| RETURNS | DESCRIPTION |
|---|---|
| ndarray | 1D cross-correlation array of length |
hilbert
¶
Compute the envelope (magnitude of analytic signal) per channel.
Uses the FFT-based method: zero negative frequencies, IFFT, then take the absolute value.
| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Envelope of the analytic signal (real-valued). |
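
The FFT-based method can be followed step by step in the plain-Python sketch below. It uses a naive O(N²) DFT for readability and includes the usual doubling of the positive-frequency bins, a standard part of building the analytic signal that the description above does not spell out. `analytic_envelope` is a hypothetical helper, not nanodsp's implementation.

```python
import cmath
import math

def analytic_envelope(samples):
    """Envelope via the analytic signal, using a naive DFT."""
    n = len(samples)
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    half = n // 2
    for k in range(1, n):
        if n % 2 == 0 and k == half:
            continue              # Nyquist bin is left unchanged
        if k <= half:
            spectrum[k] *= 2.0    # positive frequencies are doubled
        else:
            spectrum[k] = 0.0     # negative frequencies are zeroed
    envelope = []
    for t in range(n):
        z = sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        envelope.append(abs(z))   # magnitude of the analytic signal
    return envelope

n = 32
tone = [0.8 * math.cos(2.0 * math.pi * 3 * t / n) for t in range(n)]
envelope = analytic_envelope(tone)
print(min(envelope), max(envelope))  # both close to the tone's amplitude
```

For a steady tone that fits a whole number of cycles into the window, the envelope is flat at the tone's amplitude. Real signals show edge effects near the start and end of the buffer.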
envelope
¶
Compute the amplitude envelope (magnitude of analytic signal).
Alias for hilbert.
median_filter
¶
Apply a median filter per channel.
| PARAMETER | DESCRIPTION |
|---|---|
| kernel_size | Window size for the median (must be odd and >= 1). |
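
A median filter replaces each sample with the median of a window centred on it, which removes isolated clicks without smearing them the way an average would. A minimal plain-Python sketch, with `median_filter_1d` as a hypothetical helper and zero padding at the edges as an assumption:

```python
def median_filter_1d(samples, kernel_size):
    """Centred sliding median; positions outside the signal count as 0.0."""
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd and >= 1")
    half = kernel_size // 2
    out = []
    for n in range(len(samples)):
        window = [
            samples[i] if 0 <= i < len(samples) else 0.0
            for i in range(n - half, n + half + 1)
        ]
        window.sort()
        out.append(window[half])  # middle element of the sorted window
    return out

clicky = [0.1, 0.1, 5.0, 0.1, 0.1]
print(median_filter_1d(clicky, 3))  # [0.1, 0.1, 0.1, 0.1, 0.1]
```

The odd `kernel_size` requirement guarantees that the sorted window has a single middle element.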
lms_filter
¶
    lms_filter(
        buf: AudioBuffer,
        ref: AudioBuffer,
        filter_len: int = 32,
        step_size: float = 0.01,
        normalized: bool = True,
    ) -> tuple[AudioBuffer, AudioBuffer]

LMS (Least Mean Squares) adaptive filter.

| PARAMETER | DESCRIPTION |
|---|---|
| buf | Input (desired) signal. |
| ref | Reference (noise) signal to be adaptively filtered and subtracted. |
| filter_len | Number of filter taps, >= 1. Typical: 16--256. |
| step_size | Adaptation step size (mu), > 0 and < 1 for stability. Typical: 0.001--0.1. |
| normalized | If True, use Normalized LMS (step_size normalized by input power). |

| RETURNS | DESCRIPTION |
|---|---|
| tuple[AudioBuffer, AudioBuffer] | (output, error): output is the filtered reference, error is buf - output. |
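
The roles of `buf`, `ref`, `output`, and `error` are easiest to see in a small normalized LMS loop. The sketch below is the textbook NLMS update in plain Python, not nanodsp's code; `nlms` and the small `eps` regularizer are assumptions introduced for illustration.

```python
import random

def nlms(desired, reference, filter_len=32, step_size=0.01, eps=1e-8):
    """Normalized LMS: returns (output, error) as lists."""
    weights = [0.0] * filter_len
    history = [0.0] * filter_len  # most recent reference sample first
    output, error = [], []
    for d, x in zip(desired, reference):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # filtered reference
        e = d - y                                         # desired minus estimate
        power = sum(h * h for h in history) + eps         # input power for normalization
        mu = step_size / power
        weights = [w + mu * e * h for w, h in zip(weights, history)]
        output.append(y)
        error.append(e)
    return output, error

rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
desired = [0.5 * v for v in noise]  # here the "desired" input is pure scaled noise

output, error = nlms(desired, noise, filter_len=4, step_size=0.5)
print(max(abs(e) for e in error[:10]))    # large while the filter is still adapting
print(max(abs(e) for e in error[-100:]))  # near zero once the noise path is learned
```

In a noise-cancellation setup the `error` signal is the cleaned result: once the filter has learned how the reference leaks into `buf`, whatever remains after subtraction is the part of `buf` that the reference cannot explain.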
stereo_widen
¶
Adjust stereo width via mid-side processing.
| PARAMETER | DESCRIPTION |
|---|---|
| buf | Stereo input audio. |
| width | Width factor, >= 0. 0.0 = mono, 1.0 = unchanged, > 1.0 = wider. Typical range: 0.0--3.0. |

| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | Width-adjusted stereo audio. |
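
Width adjustment via mid-side processing amounts to scaling the side signal before decoding. The plain-Python sketch below reproduces the behaviour listed for `width` (0.0 collapses to mono, 1.0 leaves the signal unchanged). `widen` is a hypothetical helper, and whether nanodsp applies any additional gain compensation is not stated in the source.

```python
def widen(left, right, width):
    """Scale the side component by `width` and decode back to left/right."""
    if width < 0:
        raise ValueError("width must be >= 0")
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2.0
        side = (l - r) / 2.0 * width
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right

left = [0.5, -0.25, 1.0]
right = [0.25, 0.75, -1.0]

print(widen(left, right, 1.0))  # unchanged
print(widen(left, right, 0.0))  # both channels equal the mid signal
print(widen(left, right, 1.5))  # difference between channels grows by 1.5x
```

Widths above 1.0 amplify the difference between channels, so peaks can grow; checking the level afterwards (for example with normalize_peak) is a reasonable precaution.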