The Mathematics of Sound Synthesis

Dave Deriso · 2026

[Interactive demo: Dave's Mini Synth, with Wave, Warp, Intensity, and Cycles controls and a keyboard spanning C2 to C4]

A synthesizer builds complex sounds from simple mathematical operations on waveforms. At its core, every technique is a variation on the same idea: take a periodic signal and transform it by adding, multiplying, or reshaping to produce something with more texture. This post walks through five classic synthesis techniques, plus a novel one based on time warp functions. It derives the math behind each and provides interactive demos so you can see and hear the results.

Additive synthesis

The oldest idea in synthesis traces back to Fourier: any periodic sound can be decomposed into a sum of sinusoids. Additive synthesis builds sounds by reversing this process — stacking harmonics on top of a fundamental frequency.

Given a fundamental frequency $f_0$, we sum $N$ harmonics with amplitudes $A_k$:

$$y(t) = \sum_{k=1}^{N} A_k \sin(2\pi k f_0 t)$$

The shape of the resulting waveform depends entirely on the harmonic amplitudes. Setting $A_k = 1/k$ produces a sawtooth-like wave. Setting $A_k = 1/k^2$ rolls off faster, yielding a softer tone. With $A_k = 1$ for all $k$, every harmonic contributes equally, giving a bright, buzzy timbre.

The rolloff exponent $\alpha$ controls the spectral slope via $A_k = 1/k^\alpha$. At $\alpha = 0$ the spectrum is flat; as $\alpha$ increases, higher harmonics decay faster and the sound becomes more mellow. Try adjusting the number of harmonics and the rolloff below.

[Interactive demo: Additive Synthesis]
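The additive formula translates almost directly into code. The following is a minimal sketch in plain Python, not the demo's actual implementation; the function names `additive` and `render`, the sample rate, and the default values are all assumptions made for illustration.

```python
import math

def additive(t, f0, n_harmonics, alpha):
    """Sum n_harmonics sine partials with amplitudes A_k = 1 / k**alpha."""
    return sum(
        math.sin(2 * math.pi * k * f0 * t) / k**alpha
        for k in range(1, n_harmonics + 1)
    )

def render(f0=110.0, n_harmonics=6, alpha=1.0, sample_rate=44100, n_samples=441):
    """Sample the additive waveform at a fixed rate (assumed 44.1 kHz)."""
    return [additive(i / sample_rate, f0, n_harmonics, alpha) for i in range(n_samples)]

# alpha = 0 gives a flat spectrum, alpha = 1 the sawtooth-like 1/k rolloff.
samples = render()
```

Each extra harmonic costs one more sine evaluation per sample, which is the "many oscillators" cost of additive synthesis in concrete form.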

Amplitude modulation (AM)

Amplitude modulation multiplies a high-frequency carrier signal by a slowly varying envelope derived from a lower-frequency modulator. The classic AM formula is:

$$y(t) = \bigl[1 + m \cdot \sin(2\pi f_m t)\bigr] \cdot \sin(2\pi f_c t)$$

Here $f_c$ is the carrier frequency, $f_m$ is the modulator frequency, and $m \in [0, 1]$ is the modulation depth. The constant offset of 1 keeps the envelope non-negative (for $m \leq 1$), preserving the carrier's pitch.

Expanding via the product-to-sum identity reveals what happens in the frequency domain:

$$y(t) = \sin(2\pi f_c t) + \frac{m}{2}\bigl[\cos(2\pi(f_c - f_m)t) - \cos(2\pi(f_c + f_m)t)\bigr]$$

The output contains three frequencies: the original carrier at $f_c$, plus two sidebands at $f_c \pm f_m$. When $f_m$ is low (a few Hz), you hear a tremolo effect: the volume pulses rhythmically. As $f_m$ increases into the audible range, the sidebands become distinct tones, producing the characteristic metallic quality of AM.

[Interactive demo: Amplitude Modulation]
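One way to check the three-component picture numerically is to synthesize the AM signal and measure its amplitude at the carrier and sideband frequencies. This is a rough sketch, assuming an 8 kHz sample rate, one second of signal, and a 440 Hz carrier; `am_sample` and `amplitude_at` are hypothetical helper names, and the single-bin DFT stands in for a proper spectrum analyzer.

```python
import cmath
import math

def am_sample(t, fc, fm, m):
    """AM signal [1 + m sin(2 pi fm t)] * sin(2 pi fc t) at time t in seconds."""
    envelope = 1.0 + m * math.sin(2 * math.pi * fm * t)
    return envelope * math.sin(2 * math.pi * fc * t)

def amplitude_at(samples, freq, sample_rate):
    """Amplitude of the sinusoidal component at freq, from a single-bin DFT."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq * i / sample_rate)
        for i, x in enumerate(samples)
    )
    return 2.0 * abs(acc) / len(samples)

SAMPLE_RATE = 8000                    # assumed; exactly one second is rendered
FC, FM, DEPTH = 440.0, 40.0, 0.8      # assumed carrier, modulator, and depth

signal = [am_sample(i / SAMPLE_RATE, FC, FM, DEPTH) for i in range(SAMPLE_RATE)]

carrier = amplitude_at(signal, FC, SAMPLE_RATE)       # close to 1
lower = amplitude_at(signal, FC - FM, SAMPLE_RATE)    # close to m / 2
upper = amplitude_at(signal, FC + FM, SAMPLE_RATE)    # close to m / 2
```

Probing any other frequency, such as 500 Hz, should return a value near zero, since the carrier and the two sidebands are the only components present.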

Frequency modulation (FM)

While AM varies the amplitude of a carrier, FM varies its instantaneous frequency. The modulator signal is injected directly into the phase argument of the carrier:

$$y(t) = \sin\bigl(2\pi f_c t + \beta \sin(2\pi f_m t)\bigr)$$

The parameter $\beta$ is the modulation index, which controls how far the carrier's instantaneous frequency deviates from $f_c$. The instantaneous frequency is the time derivative of the phase:

$$f_{\text{inst}}(t) = \frac{1}{2\pi}\frac{d}{dt}\bigl[2\pi f_c t + \beta \sin(2\pi f_m t)\bigr] = f_c + \beta f_m \cos(2\pi f_m t)$$

So the carrier frequency oscillates between $f_c - \beta f_m$ and $f_c + \beta f_m$. The frequency deviation $\Delta f = \beta f_m$ determines the bandwidth of the resulting spectrum.
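The instantaneous-frequency formula can be sanity-checked by differentiating the phase numerically. The sketch below uses a central difference with an assumed step size; `fm_phase` and `instantaneous_frequency` are illustrative names, and the parameter values are arbitrary.

```python
import math

def fm_phase(t, fc, fm, beta):
    """Phase argument of the FM carrier, in radians."""
    return 2 * math.pi * fc * t + beta * math.sin(2 * math.pi * fm * t)

def instantaneous_frequency(t, fc, fm, beta, h=1e-6):
    """Central-difference estimate of (1 / (2 pi)) * d(phase)/dt, in Hz."""
    slope = (fm_phase(t + h, fc, fm, beta) - fm_phase(t - h, fc, fm, beta)) / (2 * h)
    return slope / (2 * math.pi)

fc, fm, beta = 440.0, 110.0, 3.0                                # assumed example values
peak = instantaneous_frequency(0.0, fc, fm, beta)               # near fc + beta * fm
trough = instantaneous_frequency(1 / (2 * fm), fc, fm, beta)    # near fc - beta * fm
```

With these values the estimate swings between roughly 770 Hz and 110 Hz, matching $f_c \pm \beta f_m$.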

Using the Jacobi-Anger expansion, the FM signal can be decomposed into an infinite series of sidebands:

$$y(t) = \sum_{n=-\infty}^{\infty} J_n(\beta) \, \sin\bigl(2\pi (f_c + n f_m) t\bigr)$$

where $J_n$ are Bessel functions of the first kind. At low $\beta$, only a few sidebands are significant and the sound is relatively pure. As $\beta$ increases, more sidebands appear and the spectrum becomes richer and brighter. This is the mechanism behind the Yamaha DX7 and much of 1980s electronic music.

[Interactive demo: Frequency Modulation]
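The sideband expansion can be compared against the direct FM formula at a few sample times. This is a sketch under stated assumptions: `bessel_j` evaluates $J_n$ from its power series (adequate for the small $\beta$ used here, not a general-purpose implementation), and the infinite sum is truncated at an assumed $|n| \leq 20$.

```python
import math

def bessel_j(n, x, terms=40):
    """J_n(x) for integer n via the power series; fine for moderate x."""
    if n < 0:
        return (-1) ** (-n) * bessel_j(-n, x, terms)   # J_{-n} = (-1)^n J_n
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + n)) * (x / 2) ** (2 * m + n)
        for m in range(terms)
    )

def fm_direct(t, fc, fm, beta):
    """FM signal computed directly from the phase-modulation formula."""
    return math.sin(2 * math.pi * fc * t + beta * math.sin(2 * math.pi * fm * t))

def fm_sidebands(t, fc, fm, beta, n_max=20):
    """FM signal rebuilt from sidebands at fc + n * fm, truncated at |n| <= n_max."""
    return sum(
        bessel_j(n, beta) * math.sin(2 * math.pi * (fc + n * fm) * t)
        for n in range(-n_max, n_max + 1)
    )

# Sideband weights for beta = 3: the carrier (n = 0) is no longer the strongest.
weights = {n: bessel_j(n, 3.0) for n in range(0, 6)}
```

For $\beta = 3$ the two forms should agree to within floating-point error, and the weights show energy spreading from the carrier into the first few sidebands.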

Ring modulation

Ring modulation is the simplest nonlinear operation: multiply two signals directly, without the DC offset that AM uses:

$$y(t) = \sin(2\pi f_1 t) \cdot \sin(2\pi f_2 t)$$

Applying the product-to-sum trigonometric identity:

$$y(t) = \tfrac{1}{2}\bigl[\cos(2\pi(f_1 - f_2)t) - \cos(2\pi(f_1 + f_2)t)\bigr]$$

The output contains only the sum and difference frequencies — neither original frequency survives. This is what gives ring modulation its distinctively inharmonic, metallic character. When the two frequencies are not harmonically related, the sum and difference tones are not integer multiples of a common fundamental, so the result sounds dissonant and bell-like.

Ring modulation is named after the ring of diodes used in early analog implementations. It was famously used for the Dalek voices in Doctor Who and remains a staple of electronic and experimental music.

[Interactive demo: Ring Modulation]
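To see that neither input frequency survives, we can multiply two sines and probe the result at the input, sum, and difference frequencies. This is an illustrative sketch: the 440 Hz and 300 Hz inputs, the 8 kHz sample rate, and the helper `amplitude_at` are assumptions chosen so that every frequency of interest falls on a whole number of cycles per second.

```python
import cmath
import math

def ring_mod(t, f1, f2):
    """Product of two unit sines at f1 and f2."""
    return math.sin(2 * math.pi * f1 * t) * math.sin(2 * math.pi * f2 * t)

def amplitude_at(samples, freq, sample_rate):
    """Amplitude of the sinusoidal component at freq, from a single-bin DFT."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq * i / sample_rate)
        for i, x in enumerate(samples)
    )
    return 2.0 * abs(acc) / len(samples)

SAMPLE_RATE = 8000          # assumed; exactly one second is rendered
F1, F2 = 440.0, 300.0       # assumed input frequencies

signal = [ring_mod(i / SAMPLE_RATE, F1, F2) for i in range(SAMPLE_RATE)]

spectrum = {
    f: amplitude_at(signal, f, SAMPLE_RATE)
    for f in (F1, F2, F1 - F2, F1 + F2)
}
# Expect about 0 at 440 and 300 Hz, and about 0.5 at 140 and 740 Hz.
```

The surviving tones at 140 Hz and 740 Hz are not integer multiples of a common audible fundamental, which is the inharmonic quality described above.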

Pulse width modulation (PWM)

A square wave alternates between +1 and −1. The duty cycle $d \in (0, 1)$ controls what fraction of each period the wave spends at +1. At $d = 0.5$ we get a symmetric square wave; at other values, we get a pulse wave.

We can express a pulse wave by thresholding a sinusoid:

$$y(t) = \text{sign}\bigl(\sin(2\pi f t) - \cos(\pi d)\bigr)$$
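A quick way to convince ourselves that the threshold $\cos(\pi d)$ really yields duty cycle $d$ is to count samples. In this sketch `pulse` and `measured_duty` are hypothetical helpers; ties at exactly zero are sent to −1 for simplicity, which differs from a strict sign function only on a set of measure zero.

```python
import math

def pulse(t, f, d):
    """Pulse wave in {-1, +1}: threshold a sine at cos(pi * d)."""
    return 1.0 if math.sin(2 * math.pi * f * t) > math.cos(math.pi * d) else -1.0

def measured_duty(f, d, n=10000):
    """Fraction of one period spent at +1, from n midpoint samples."""
    period = 1.0 / f
    high = sum(1 for i in range(n) if pulse((i + 0.5) * period / n, f, d) > 0)
    return high / n

# The measured fraction should track the requested duty cycle.
duties = {d: measured_duty(220.0, d) for d in (0.1, 0.25, 0.5, 0.9)}
```

The sine exceeds $\cos(\pi d)$ over an arc of width $2\pi d$ centered on its peak, so the fraction of the period spent high is $d$.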

Measuring time from the center of the +1 segment (the thresholded form above is the same wave shifted by a quarter period), the Fourier series of a pulse wave with duty cycle $d$ is:

$$y(t) = (2d - 1) + \sum_{k=1}^{\infty} \frac{4}{\pi k} \sin(\pi k d) \, \cos(2\pi k f t)$$

The term $\sin(\pi k d)$ acts as a spectral envelope: it zeros out harmonics where $k d$ is an integer. At $d = 0.5$, every even harmonic vanishes (since $\sin(\pi k / 2) = 0$ for even $k$), producing the hollow, clarinet-like tone of a standard square wave. As the duty cycle shifts, different harmonics are suppressed or emphasized, sweeping through a continuum of timbres.

Slowly modulating the duty cycle with an LFO produces the classic PWM chorus effect heard in analog polysynths — a shimmering, animated pad sound that never quite sits still.

[Interactive demo: Pulse Width Modulation]
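The spectral-envelope claim can be checked by measuring harmonic amplitudes of a sampled pulse wave. This is a sketch, assuming 1000 samples per period and a DFT over exactly one period; `pulse` and `harmonic_amplitudes` are illustrative names, and a sampled pulse only approximates the continuous-time series.

```python
import cmath
import math

def pulse(theta, d):
    """Pulse wave in {-1, +1} at phase theta (radians) with duty cycle d."""
    return 1.0 if math.sin(theta) > math.cos(math.pi * d) else -1.0

def harmonic_amplitudes(d, k_max=8, n=1000):
    """Amplitudes of harmonics 1..k_max from a DFT over one period of n samples."""
    period = [pulse(2 * math.pi * (i + 0.5) / n, d) for i in range(n)]
    amps = []
    for k in range(1, k_max + 1):
        acc = sum(
            y * cmath.exp(-2j * math.pi * k * i / n)
            for i, y in enumerate(period)
        )
        amps.append(2.0 * abs(acc) / n)
    return amps

square = harmonic_amplitudes(0.5)     # harmonics 2, 4, 6, 8 should be near zero
quarter = harmonic_amplitudes(0.25)   # harmonics 4 and 8 should be near zero
```

Relative to the first harmonic, the surviving amplitudes follow $|\sin(\pi k d)| / k$, so the third harmonic of the square wave comes out at about one third of the fundamental.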

Time warping

All the techniques above manipulate a signal's amplitude or frequency. Time warping takes a different approach: instead of changing what a signal does, it changes when it does it. We reparametrize time itself, stretching and compressing different parts of the waveform. This is a new approach to sound synthesis based on some of the ideas in my general dynamic time warping paper.

A time warp function $\phi: [0,1] \to [0,1]$ is a mapping from original time to “warped” time. The identity $\phi(t) = t$ means no warping has occurred: time proceeds normally. Any other monotonically increasing function that preserves the endpoints, $\phi(0) = 0$ and $\phi(1) = 1$, produces a valid time warp, where time still moves forward, but at a varying rate.

Given a signal $x$, we compose it with the warp function to obtain a time-warped signal:

$$\tilde{x}(t) = (x \circ \phi)(t)$$

Where $\phi'(t) > 1$, time moves faster than normal and the signal is compressed, so frequencies increase locally. Where $\phi'(t) < 1$, time slows down and the signal stretches out. The instantaneous frequency at time $t$ is scaled by $\phi'(t)$, so a single warp function can create pitch sweeps, chirps, and rhythmic distortions that would be difficult to achieve with conventional modulation.
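As a short worked case (a sketch, assuming a pure sine input $x(t) = \sin(2\pi f t)$ with $t$ normalized to $[0, 1]$ and the example warp $\phi(t) = t^2$), the frequency scaling follows from differentiating the warped phase:

$$\tilde{x}(t) = \sin\bigl(2\pi f \, \phi(t)\bigr), \qquad f_{\text{inst}}(t) = \frac{1}{2\pi}\frac{d}{dt}\bigl[2\pi f \, \phi(t)\bigr] = f \, \phi'(t)$$

$$\phi(t) = t^2 \;\Rightarrow\; f_{\text{inst}}(t) = 2 f t$$

Under these assumptions the pitch rises linearly from 0 at $t = 0$ to $2f$ at $t = 1$, passing through the unwarped frequency $f$ at $t = 1/2$. In other words, this particular warp turns a steady sine into a linear chirp.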

To ensure musically useful results, we typically impose three constraints on $\phi$:

  • Boundary conditions: $\phi(0) = 0$ and $\phi(1) = 1$, so the signal starts and ends at the same points.
  • Monotonicity: $\phi'(t) \geq 0$, so time never runs backwards.
  • Slope bounds: $s_{\min} \leq \phi'(t) \leq s_{\max}$, which limits how extreme the local time-stretching can be.
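The three constraints are easy to test for a candidate warp on a grid. The following is a rough sketch, not a rigorous verifier: `check_warp` is a hypothetical helper, the slopes are finite-difference estimates on an assumed grid of 1000 intervals, and the tolerance is arbitrary.

```python
def check_warp(phi, s_min=0.0, s_max=float("inf"), n=1000, tol=1e-9):
    """Check boundary, monotonicity, and slope bounds on a uniform grid over [0, 1]."""
    # Finite-difference slope on each of the n grid intervals.
    slopes = [(phi((i + 1) / n) - phi(i / n)) * n for i in range(n)]
    return {
        "boundary": abs(phi(0.0)) <= tol and abs(phi(1.0) - 1.0) <= tol,
        "monotone": all(s >= -tol for s in slopes),
        "slope_bounds": all(s_min - tol <= s <= s_max + tol for s in slopes),
    }

ok = check_warp(lambda t: t * t)                     # valid with default bounds
too_steep = check_warp(lambda t: t * t, s_max=1.5)   # slope approaches 2 near t = 1
reversed_time = check_warp(lambda t: 1.0 - t)        # fails boundary and monotonicity
```

A grid check like this can miss violations between grid points, so it is best treated as a quick screen rather than a proof.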

In practice, we control the warp strength with an intensity parameter $\alpha \in [0, 1]$ that blends between the identity and the full warp:

$$\phi_\alpha(t) = (1 - \alpha) \cdot t + \alpha \cdot \phi(t)$$

At $\alpha = 0$, we recover the identity and time proceeds normally. At $\alpha = 1$, the full warp is applied. Values in between produce a smooth interpolation, allowing continuous control over the warping effect.

The interactive demo below lets you explore different warp functions. The left chart shows $\phi(t)$ against the identity (dashed line); the right chart shows the original and time-warped signals. Try selecting different functions and adjusting the intensity to see how each reshapes the waveform.

[Interactive demo: warp function explorer, shown with $\phi(t) = t^2$ selected]
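The blend and the composition can be sketched in a few lines. This is an illustration rather than the demo's code: `blend`, `warp_cycle`, `sine_cycle`, and `square_warp` are assumed names, and the signal is one sine cycle parametrized on $[0, 1]$.

```python
import math

def blend(phi, alpha):
    """Return phi_alpha(t) = (1 - alpha) * t + alpha * phi(t)."""
    return lambda t: (1.0 - alpha) * t + alpha * phi(t)

def warp_cycle(x, phi, alpha, n=256):
    """Sample one warped cycle x(phi_alpha(t)) at n evenly spaced t in [0, 1)."""
    phi_a = blend(phi, alpha)
    return [x(phi_a(i / n)) for i in range(n)]

def sine_cycle(u):
    """One sine cycle over u in [0, 1]."""
    return math.sin(2 * math.pi * u)

def square_warp(t):
    """Example warp phi(t) = t^2."""
    return t * t

plain = warp_cycle(sine_cycle, square_warp, alpha=0.0)    # identity: ordinary sine
warped = warp_cycle(sine_cycle, square_warp, alpha=1.0)   # full t^2 warp
```

With the full warp, the first half of the cycle is stretched and the second half compressed, so the sine's peak arrives later than a quarter of the way through.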

So far we have applied $\phi$ independently to each cycle of the waveform. But we can extend the warp to span $N$ cycles at once. Given a signal with period $T$, we group $N$ consecutive periods into a single block of length $NT$ and apply the warp across the entire block:

$$\tilde{x}(t) = x\!\left(\phi_\alpha\!\left(\frac{t \bmod NT}{NT}\right) \cdot NT\right)$$

At $N = 1$, each cycle is warped identically, so the result is periodic and harmonically rich, but uniform. As $N$ increases, the warp function stretches across more cycles, creating asymmetries between them: some cycles compress while others expand, producing evolving timbral patterns that repeat every $N$ periods rather than every one. Higher $N$ values yield longer-range pitch sweeps and rhythmic irregularities within each repeating unit.

When applied to synthesis, time warping creates timbral effects that sit somewhere between FM and granular techniques. Gentle warps produce vibrato-like pitch undulations; aggressive warps fracture a waveform into something entirely new.

[Interactive demo: Warp Modulation]
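A multi-cycle warp can be sketched by tracking the oscillator phase in cycles, so that each block of $N$ periods advances the phase by exactly $N$ cycles regardless of the warp. The names `warped_phase` and `warped_sine` are assumptions for illustration, and a sine oscillator stands in for whatever waveform is being warped.

```python
import math

def warped_phase(t, period, n_cycles, phi, alpha):
    """Oscillator phase in cycles after warping a block of n_cycles periods."""
    block = n_cycles * period
    u = (t % block) / block                          # position in the block, in [0, 1)
    u_warped = (1.0 - alpha) * u + alpha * phi(u)    # blended warp phi_alpha
    return n_cycles * u_warped                       # 0 to n_cycles over one block

def warped_sine(t, freq, n_cycles, phi, alpha):
    """Sine oscillator at freq with a warp spanning n_cycles periods."""
    phase = warped_phase(t, 1.0 / freq, n_cycles, phi, alpha)
    return math.sin(2 * math.pi * phase)

def square_warp(u):
    """Example warp phi(u) = u^2."""
    return u * u

# 100 Hz oscillator, warp spanning 3 cycles at 80% intensity, sampled at 8 kHz.
samples = [warped_sine(i / 8000, 100.0, 3, square_warp, 0.8) for i in range(240)]
```

At `alpha = 0` this reduces to an ordinary sine, and for any `alpha` the output repeats every `n_cycles` periods, which matches the block structure described above.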

Putting it together

These six techniques — additive synthesis, amplitude modulation, frequency modulation, ring modulation, pulse width modulation, and time warping — form the core vocabulary of sound synthesis. Each is a different way of mapping simple mathematical operations to perceptually rich timbral changes:

  • Additive builds up from harmonics — direct control over spectral content, at the cost of many oscillators.
  • AM creates sidebands by modulating amplitude — efficient but limited to two extra spectral components.
  • FM creates arbitrarily many sidebands from just two oscillators — the most powerful spectral tool per unit of complexity.
  • Ring mod eliminates the original frequencies entirely — ideal for inharmonic, metallic textures.
  • PWM reshapes a single oscillator's waveform — a timbral sweep from a single parameter.
  • Time warping reparametrizes time itself — local pitch and rhythm shifts from a single monotonic function.

Modern synthesizers combine these primitives freely: FM operators feeding into ring-modulated carriers, additive resynthesis driving PWM voices, AM envelopes shaping FM timbres, time warps morphing between presets. The math is always the same — sines, products, sums, and compositions — but the sonic possibilities are vast.