The Mathematics of Sound Synthesis

[Interactive demo: Dave's Mini Synth. Controls: Wave, Warp, Intensity (2.00), Cycles (3).]

A synthesizer builds complex sounds from simple mathematical operations on waveforms. At its core, every technique is a variation on the same idea: take a periodic signal and transform it by adding, multiplying, or reshaping to produce something with more texture. This post walks through five classic synthesis techniques, plus a novel one based on time warp functions, derives the math behind each, and provides interactive demos so you can see and hear the results.

Additive synthesis

The oldest idea in synthesis traces back to Fourier: any periodic sound can be decomposed into a sum of sinusoids. Additive synthesis builds sounds by reversing this process — stacking harmonics on top of a fundamental frequency.

Given a fundamental frequency $f_0$, we sum $N$ harmonics with amplitudes $A_k$:

$$y(t) = \sum_{k=1}^{N} A_k \sin(2\pi k f_0 t)$$

The shape of the resulting waveform depends entirely on the harmonic amplitudes. Setting $A_k = 1/k$ produces a sawtooth-like wave. Setting $A_k = 1/k^2$ rolls off faster, yielding a softer tone. With $A_k = 1$ for all $k$, every harmonic contributes equally, giving a bright, buzzy timbre.

The rolloff exponent $\alpha$ controls the spectral slope via $A_k = 1/k^\alpha$. At $\alpha = 0$ the spectrum is flat; as $\alpha$ increases, higher harmonics decay faster and the sound becomes more mellow. Try adjusting the number of harmonics and the rolloff below.

[Interactive demo: Additive Synthesis. Controls: Octave (0), Harmonics (6), Rolloff (1.0).]
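The harmonic sum with amplitudes $A_k = 1/k^\alpha$ can also be sketched offline in a few lines of Python. This is a minimal illustration using only the standard library, not the demo's actual implementation: the function names, the 220 Hz fundamental, and the 44.1 kHz sample rate are assumptions of the sketch, while the harmonic count and rolloff defaults mirror the demo's initial settings.

```python
import math

def additive_sample(t, f0, num_harmonics, alpha):
    """Evaluate y(t) = sum over k of sin(2*pi*k*f0*t) / k**alpha."""
    return sum(
        math.sin(2.0 * math.pi * k * f0 * t) / k ** alpha
        for k in range(1, num_harmonics + 1)
    )

def render_additive(f0=220.0, num_harmonics=6, alpha=1.0,
                    sample_rate=44100, num_samples=441):
    """Sample the harmonic sum at a fixed rate (hypothetical helper)."""
    # Harmonics at or above sample_rate / 2 would alias, so reject them.
    if num_harmonics * f0 >= sample_rate / 2:
        raise ValueError("highest harmonic must stay below the Nyquist frequency")
    return [
        additive_sample(i / sample_rate, f0, num_harmonics, alpha)
        for i in range(num_samples)
    ]

saw_like = render_additive(alpha=1.0)  # A_k = 1/k
mellow = render_additive(alpha=2.0)    # A_k = 1/k^2, faster rolloff
```

Raising `alpha` shrinks the upper harmonics, so the `mellow` samples stay closer to a plain sine than the `saw_like` ones.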

Amplitude modulation (AM)

Amplitude modulation multiplies a high-frequency carrier signal by a slowly varying envelope derived from a lower-frequency modulator. The classic AM formula is:

$$y(t) = \bigl[1 + m \cdot \sin(2\pi f_m t)\bigr] \cdot \sin(2\pi f_c t)$$

Here $f_c$ is the carrier frequency, $f_m$ is the modulator frequency, and $m \in [0, 1]$ is the modulation depth. The constant term 1 keeps the envelope non-negative (for $m \leq 1$), so the carrier is scaled but never inverted and its pitch is preserved.

Expanding via the product-to-sum identity $\sin a \sin b = \tfrac{1}{2}\bigl[\cos(a - b) - \cos(a + b)\bigr]$ reveals what happens in the frequency domain:

$$y(t) = \sin(2\pi f_c t) + \frac{m}{2}\bigl[\cos(2\pi(f_c - f_m)t) - \cos(2\pi(f_c + f_m)t)\bigr]$$

The output contains three frequencies: the original carrier at $f_c$, plus two sidebands at $f_c \pm f_m$, each with amplitude $m/2$. When $f_m$ is low (a few Hz), you hear a tremolo effect: the volume pulses rhythmically. As $f_m$ increases into the audible range, the sidebands become distinct tones, producing the characteristic metallic quality of AM.
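One way to check the sideband picture numerically is to synthesize an AM signal and measure how much energy sits at a few chosen frequencies. The sketch below does that with a single-bin discrete Fourier transform. The helper names, the 440 Hz carrier, and the 8 kHz sample rate are assumptions of the example; the modulator frequency and depth follow the demo's initial settings.

```python
import cmath
import math

def am_sample(t, fc, fm, m):
    """Classic AM: [1 + m*sin(2*pi*fm*t)] * sin(2*pi*fc*t)."""
    envelope = 1.0 + m * math.sin(2.0 * math.pi * fm * t)
    return envelope * math.sin(2.0 * math.pi * fc * t)

def amplitude_at(samples, freq, sample_rate):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    acc = sum(
        s * cmath.exp(-2j * math.pi * freq * i / sample_rate)
        for i, s in enumerate(samples)
    )
    return 2.0 * abs(acc) / len(samples)

SAMPLE_RATE = 8000  # assumed; one second of audio makes integer-Hz bins exact
FC, FM, DEPTH = 440.0, 40.0, 0.8

signal = [am_sample(i / SAMPLE_RATE, FC, FM, DEPTH) for i in range(SAMPLE_RATE)]

for freq in (FC - FM, FC, FC + FM, FM):
    amp = amplitude_at(signal, freq, SAMPLE_RATE)
    print(f"{freq:6.1f} Hz: amplitude {amp:.3f}")
```

The carrier comes out with amplitude 1 and each sideband with $m/2 = 0.4$, while the modulator frequency itself (40 Hz) is absent from the output.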

[Interactive demo: Amplitude Modulation. Controls: Octave (0), Modulator $f_m$ (40 Hz), Depth $m$ (0.80).]

Frequency modulation (FM)

While AM varies the amplitude of a carrier, FM varies its instantaneous frequency. The modulator signal is injected directly into the phase argument of the carrier:

$$y(t) = \sin\!\bigl(2\pi f_c t + \beta \sin(2\pi f_m t)\bigr)$$

The parameter $\beta$ is the modulation index, which controls how far the carrier's instantaneous frequency deviates from $f_c$. The instantaneous frequency is the time derivative of the phase, divided by $2\pi$:

$$f_{\text{inst}}(t) = \frac{1}{2\pi}\frac{d}{dt}\bigl[2\pi f_c t + \beta \sin(2\pi f_m t)\bigr] = f_c + \beta f_m \cos(2\pi f_m t)$$

So the instantaneous frequency oscillates between $f_c - \beta f_m$ and $f_c + \beta f_m$. The peak frequency deviation $\Delta f = \beta f_m$ determines how wide the resulting spectrum spreads.

Using the Jacobi-Anger expansion, the FM signal can be decomposed into an infinite series of sidebands:

$$y(t) = \sum_{n=-\infty}^{\infty} J_n(\beta) \, \sin\!\bigl(2\pi (f_c + n f_m) t\bigr)$$

where $J_n$ are Bessel functions of the first kind. At low $\beta$, only a few sidebands are significant and the sound is relatively pure. As $\beta$ increases, more sidebands appear and the spectrum becomes richer and brighter. This is the mechanism behind the Yamaha DX7 and much of 1980s electronic music.
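The sideband expansion can be checked numerically by comparing the closed-form FM signal with a truncated Bessel-weighted sum. The sketch below is a rough illustration under stated assumptions: `bessel_j` is a hand-rolled power series (adequate for moderate $\beta$, not a replacement for a numerical library), the 440 Hz carrier is made up for the example, and the modulator frequency and index follow the demo's initial settings.

```python
import math

def bessel_j(n, x, terms=40):
    """Bessel function of the first kind, integer order, via its power series."""
    if n < 0:
        # J_{-n}(x) = (-1)^n J_n(x) for integer n.
        return (-1) ** (-n) * bessel_j(-n, x, terms)
    half = x / 2.0
    return sum(
        (-1) ** m * half ** (2 * m + n)
        / (math.factorial(m) * math.factorial(m + n))
        for m in range(terms)
    )

def fm_sample(t, fc, fm, beta):
    """Closed form: sin(2*pi*fc*t + beta*sin(2*pi*fm*t))."""
    return math.sin(2.0 * math.pi * fc * t + beta * math.sin(2.0 * math.pi * fm * t))

def fm_from_sidebands(t, fc, fm, beta, max_order=20):
    """Truncated sideband sum: J_n(beta) * sin(2*pi*(fc + n*fm)*t)."""
    return sum(
        bessel_j(n, beta) * math.sin(2.0 * math.pi * (fc + n * fm) * t)
        for n in range(-max_order, max_order + 1)
    )

def instantaneous_frequency(t, fc, fm, beta):
    """f_inst(t) = fc + beta*fm*cos(2*pi*fm*t)."""
    return fc + beta * fm * math.cos(2.0 * math.pi * fm * t)

FC, FM, BETA = 440.0, 110.0, 3.0  # FC is an assumed carrier frequency

for n in range(6):
    print(f"sideband order {n}: weight {bessel_j(n, BETA):+.4f}")
```

With these settings the two forms agree to within floating-point error at any sampled time, and the printed weights show how quickly the sidebands fade once $n$ exceeds $\beta$.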

[Interactive demo: Frequency Modulation. Controls: Octave (0), Modulator $f_m$ (110 Hz), Mod Index $\beta$ (3.0).]

Ring modulation

Ring modulation is the simplest nonlinear operation: multiply two signals directly, without the DC offset that AM uses:

$$y(t) = \sin(2\pi f_1 t) \cdot \sin(2\pi f_2 t)$$

Applying the product-to-sum trigonometric identity:

$$y(t) = \tfrac{1}{2}\bigl[\cos(2\pi(f_1 - f_2)t) - \cos(2\pi(f_1 + f_2)t)\bigr]$$

The output contains only the sum and difference frequencies — neither original frequency survives. This is what gives ring modulation its distinctively inharmonic, metallic character. When the two frequencies are not harmonically related, the sum and difference tones are not integer multiples of a common fundamental, so the result sounds dissonant and bell-like.
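To make the harmonic versus inharmonic distinction concrete, the sketch below lists the two output partials for a pair of input frequencies and reports the highest fundamental they share. It assumes integer frequencies in Hz so that `math.gcd` applies; the helper names and the 440 Hz first input are assumptions of the example, while 300 Hz matches the demo's initial second frequency.

```python
import math

def ring_mod_sample(t, f1, f2):
    """Direct product of two sinusoids."""
    return math.sin(2.0 * math.pi * f1 * t) * math.sin(2.0 * math.pi * f2 * t)

def sum_difference_sample(t, f1, f2):
    """Equivalent form: half the difference-tone cosine minus the sum-tone cosine."""
    return 0.5 * (
        math.cos(2.0 * math.pi * (f1 - f2) * t)
        - math.cos(2.0 * math.pi * (f1 + f2) * t)
    )

def output_partials(f1, f2):
    """Frequencies (integer Hz) present in the ring-modulated output."""
    return abs(f1 - f2), f1 + f2

def common_fundamental(f1, f2):
    """Highest shared fundamental and the harmonic number of each partial."""
    low, high = output_partials(f1, f2)
    fundamental = math.gcd(low, high)
    return fundamental, low // fundamental, high // fundamental

print(common_fundamental(440, 220))  # octave-related inputs
print(common_fundamental(440, 300))  # inputs with no simple ratio
```

For 440 and 220 Hz the partials are the 1st and 3rd harmonics of 220 Hz, which sounds like an ordinary pitched tone. For 440 and 300 Hz the partials (140 and 740 Hz) only line up as the 7th and 37th harmonics of a 20 Hz fundamental, at the bottom edge of hearing, so the ear hears them as unrelated tones.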

Ring modulation is named after the ring of diodes used in early analog implementations. It was famously used for the Dalek voices in Doctor Who and remains a staple of electronic and experimental music.

[Interactive demo: Ring Modulation. Controls: Octave (0), Frequency 2 $f_2$ (300 Hz).]

Pulse width modulation (PWM)

A square wave alternates between +1 and −1. The duty cycle $d \in (0, 1)$ controls what fraction of each period the wave spends at +1. At $d = 0.5$ we get a symmetric square wave; at other values, we get a pulse wave.

We can express a pulse wave by thresholding a sinusoid:

$$y(t) = \text{sign}\!\bigl(\sin(2\pi f t) - \cos(\pi d)\bigr)$$

The Fourier series of a pulse wave with duty cycle $d$, with the time origin placed at the center of the +1 segment (for the thresholded form above, that center sits at $t = 1/(4f)$), is:

$$y(t) = (2d - 1) + \sum_{k=1}^{\infty} \frac{4}{\pi k} \sin(\pi k d) \, \cos(2\pi k f t)$$

The term $\sin(\pi k d)$ acts as a spectral envelope: it zeros out harmonics where $k d$ is an integer. At $d = 0.5$, every even harmonic vanishes (since $\sin(\pi k / 2) = 0$ for even $k$), producing the hollow, clarinet-like tone of a standard square wave. As the duty cycle shifts, different harmonics are suppressed or emphasized, sweeping through a continuum of timbres.
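Both claims, that the threshold $\cos(\pi d)$ yields duty cycle $d$ and that harmonics with integer $k d$ vanish, can be checked by sampling one period of the thresholded sinusoid. The sketch below is a minimal check; the helper names and the 4096-sample resolution are assumptions of the example.

```python
import cmath
import math

def pulse_sample(phase, duty):
    """+1 where sin(2*pi*phase) exceeds cos(pi*duty), otherwise -1."""
    threshold = math.cos(math.pi * duty)
    return 1.0 if math.sin(2.0 * math.pi * phase) > threshold else -1.0

def one_period(duty, num_samples=4096):
    """Sample one period at bin centers so no sample lands on a switching edge."""
    return [pulse_sample((i + 0.5) / num_samples, duty) for i in range(num_samples)]

def measured_duty(samples):
    """Fraction of samples spent at +1."""
    return sum(1 for s in samples if s > 0) / len(samples)

def harmonic_amplitude(samples, k):
    """Amplitude of harmonic k, from a single-bin DFT over one period."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * abs(acc) / n

for duty in (0.5, 0.25):
    wave = one_period(duty)
    amps = [round(harmonic_amplitude(wave, k), 3) for k in range(1, 9)]
    print(f"d = {duty}: measured duty {measured_duty(wave)}, harmonics 1-8 {amps}")
```

At $d = 0.5$ the even harmonics come out as zero; at $d = 0.25$ it is every fourth harmonic that disappears, while the second harmonic is clearly present.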

Slowly modulating the duty cycle with an LFO produces the classic PWM chorus effect heard in analog polysynths — a shimmering, animated pad sound that never quite sits still.

[Interactive demo: Pulse Width Modulation. Controls: Octave (0), Duty Cycle (0.50).]

Time warping

All the techniques above manipulate a signal's amplitude or frequency. Time warping takes a different approach: instead of changing what a signal does, it changes when it does it. We reparametrize time itself, stretching and compressing different parts of the waveform. This is a new approach to sound synthesis based on some of the ideas in my general dynamic time warping paper.

A time warp function $\phi:[0,1] \to [0,1]$ is a mapping from original time to “warped” time. The identity $\phi(t) = t$ means no warping has occurred: time proceeds normally. Any other monotonically increasing function that preserves the endpoints, $\phi(0) = 0$ and $\phi(1) = 1$, produces a valid time warp, where time still moves forward but at a varying rate.

Given a signal $x$, we compose it with the warp function to obtain a time-warped signal:

$$\tilde{x}(t) = (x \circ \phi)(t)$$

Where $\phi'(t) > 1$, time moves faster than normal and the signal is compressed, so frequencies increase locally. Where $\phi'(t) < 1$, time slows down and the signal stretches out. The instantaneous frequency at time $t$ is scaled by $\phi'(t)$, so a single warp function can create pitch sweeps, chirps, and rhythmic distortions that would be difficult to achieve with conventional modulation.
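As a small illustration of the composition $x \circ \phi$, the sketch below warps one cycle of a sine with $\phi(t) = t^2$ and estimates the local rate $\phi'(t)$ by finite differences. The function names and the choice of a sine as the input signal are assumptions of the example.

```python
import math

def sine_cycle(u):
    """One cycle of a sine over normalized time u in [0, 1]."""
    return math.sin(2.0 * math.pi * u)

def quadratic_warp(t):
    """Example warp phi(t) = t^2: slow at the start, fast at the end."""
    return t * t

def warp_signal(x, phi, num_samples):
    """Sample the composition (x o phi)(t) on a uniform grid over [0, 1)."""
    return [x(phi(i / num_samples)) for i in range(num_samples)]

def local_rate(phi, t, h=1e-6):
    """Central-difference estimate of phi'(t), the local frequency scale factor."""
    return (phi(t + h) - phi(t - h)) / (2.0 * h)

warped = warp_signal(sine_cycle, quadratic_warp, 200)

for t in (0.25, 0.5, 0.75):
    print(f"t = {t}: phi'(t) ~ {local_rate(quadratic_warp, t):.3f}")
```

The local rate rises from 0.5 at $t = 0.25$ to 1.5 at $t = 0.75$, so the first half of the sine is stretched and the second half compressed: the mid-cycle zero crossing moves from $t = 0.5$ to $t = \sqrt{0.5} \approx 0.707$.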

To ensure musically useful results, we typically impose three constraints on $\phi$:

Boundary conditions: $\phi(0) = 0$ and $\phi(1) = 1$, so the warped cycle starts and ends where the original does.
Monotonicity: $\phi'(t) \geq 0$, so time never runs backwards.
Slope bounds: $s_{\min} \leq \phi'(t) \leq s_{\max}$, which limits how extreme the local time-stretching can be.

In practice, we control the warp strength with an intensity parameter $\alpha \in [0, 1]$ that blends between the identity and the full warp:

$$\phi_\alpha(t) = (1 - \alpha) \cdot t + \alpha \cdot \phi(t)$$

At $\alpha = 0$, we recover the identity and time proceeds normally. At $\alpha = 1$, the full warp is applied. Values in between produce a smooth interpolation, allowing continuous control over the warping effect.
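The blend and the three constraints on $\phi$ can be sketched together. The example below builds $\phi_\alpha$ as a closure and checks the boundary, monotonicity, and slope-bound conditions on a grid. The helper names, the grid-based check, and the tolerance are assumptions of this sketch rather than anything prescribed by the method.

```python
def blend_warp(phi, alpha):
    """Return phi_alpha(t) = (1 - alpha) * t + alpha * phi(t)."""
    def phi_alpha(t):
        return (1.0 - alpha) * t + alpha * phi(t)
    return phi_alpha

def slope_range(phi, num_steps=1000):
    """Smallest and largest finite-difference slope of phi on a uniform grid."""
    slopes = [
        (phi((i + 1) / num_steps) - phi(i / num_steps)) * num_steps
        for i in range(num_steps)
    ]
    return min(slopes), max(slopes)

def is_valid_warp(phi, s_min=0.0, s_max=float("inf"), tol=1e-9):
    """Numerically check boundary conditions, monotonicity, and slope bounds."""
    smallest, largest = slope_range(phi)
    endpoints_ok = abs(phi(0.0)) < tol and abs(phi(1.0) - 1.0) < tol
    monotone_and_bounded = smallest >= max(0.0, s_min) - tol and largest <= s_max + tol
    return endpoints_ok and monotone_and_bounded

def quadratic_warp(t):
    """Example full-strength warp phi(t) = t^2."""
    return t * t

half_strength = blend_warp(quadratic_warp, 0.5)
print(slope_range(half_strength))
print(is_valid_warp(half_strength, s_min=0.5, s_max=1.5))
```

Blending with the identity narrows the slope range: $t^2$ alone has slopes between 0 and 2, while the half-strength blend stays between 0.5 and 1.5, so the intensity parameter also acts as a simple way to satisfy slope bounds.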

The interactive demo below lets you explore different warp functions. The left chart shows $\phi(t)$ against the identity (dashed line); the right chart shows the original and time-warped signals. Try selecting different functions and adjusting the intensity to see how each reshapes the waveform.

[Interactive demo: warp function $\phi(t) = t^2$. Controls: Flip, Intensity (1.00).]

So far we have applied $\phi$ independently to each cycle of the waveform. But we can extend the warp to span $N$ cycles at once. Given a signal with period $T$, we group $N$ consecutive periods into a single block of length $NT$ and apply the warp across the entire block:

$$\tilde{x}(t) = x\!\left(NT \cdot \phi_\alpha\!\left(\frac{t \bmod NT}{NT}\right)\right)$$

The inner ratio maps the position within the block to $[0, 1]$, and the factor $NT$ maps the warped position back to signal time. At $N = 1$, each cycle is warped identically: the result is periodic and harmonically rich, but uniform. As $N$ increases, the warp function stretches across more cycles, creating asymmetries between them: some cycles compress while others expand, producing evolving timbral patterns that repeat every $N$ periods rather than every one. Higher $N$ values yield longer-range pitch sweeps and rhythmic irregularities within each repeating unit.
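A block-wise warp along these lines might look like the sketch below. It writes the oscillator as a function of phase measured in cycles (period 1), so the warped block position is scaled by $N$ before being fed to it; that convention, the helper names, the $t^2$ warp, and the 220 Hz fundamental are all assumptions of the example.

```python
import math

def sine_cycles(u):
    """Oscillator as a function of phase in cycles (period 1)."""
    return math.sin(2.0 * math.pi * u)

def quadratic_warp(t):
    """Example warp phi(t) = t^2."""
    return t * t

def blended(phi, alpha, t):
    """phi_alpha(t) = (1 - alpha) * t + alpha * phi(t)."""
    return (1.0 - alpha) * t + alpha * phi(t)

def multi_cycle_warp(t, period, num_cycles, alpha,
                     phi=quadratic_warp, osc=sine_cycles):
    """Warp a block of `num_cycles` periods as a single unit."""
    block = num_cycles * period
    position = (t % block) / block          # position within the block, in [0, 1)
    warped = blended(phi, alpha, position)  # warped position, still in [0, 1]
    return osc(num_cycles * warped)         # phase in cycles within the block

PERIOD = 1.0 / 220.0  # assumed 220 Hz fundamental

samples = [
    multi_cycle_warp(i / 44100.0, PERIOD, num_cycles=3, alpha=0.8)
    for i in range(600)
]
```

With `alpha=0.0` the function reduces to the plain sine. With a nonzero intensity and `num_cycles=3`, the output repeats every three periods but differs from one period to the next inside each block.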

When applied to synthesis, time warping creates timbral effects that sit somewhere between FM and granular techniques. Gentle warps produce vibrato-like pitch undulations; aggressive warps fracture a waveform into something entirely new.

[Interactive demo: Warp Modulation. Controls: Waveform, Warp, Flip, Octave (0), Intensity (0.80), Cycles (5).]

Putting it together

These six techniques — additive synthesis, amplitude modulation, frequency modulation, ring modulation, pulse width modulation, and time warping — form the core vocabulary of sound synthesis. Each is a different way of mapping simple mathematical operations to perceptually rich timbral changes:

Additive builds up from harmonics — direct control over spectral content, at the cost of many oscillators.
AM creates sidebands by modulating amplitude — efficient but limited to two extra spectral components.
FM creates arbitrarily many sidebands from just two oscillators — the most powerful spectral tool per unit of complexity.
Ring mod eliminates the original frequencies entirely — ideal for inharmonic, metallic textures.
PWM reshapes a single oscillator's waveform — a timbral sweep from a single parameter.
Time warping reparametrizes time itself — local pitch and rhythm shifts from a single monotonic function.

Modern synthesizers combine these primitives freely: FM operators feeding into ring-modulated carriers, additive resynthesis driving PWM voices, AM envelopes shaping FM timbres, time warps morphing between presets. The math is always the same — sines, products, sums, and compositions — but the sonic possibilities are vast.