FFT — radix-2 power spectrum

Cooley–Tukey radix-2 FFT on a 256-sample two-tone signal. The math kernel runs on this device, in WebAssembly. The receipt below is computed against the Landauer lower bound (kT·ln2 × bit-ops) and the TDP envelope (wall_time × assumed_watts). The impedance term μ is how far apparent silicon sits above the physical floor.

view the kernel that wrote this receipt crates/mathground-view/src/fft.rs
//! FFT kernel — Cooley–Tukey radix-2 in-place, then power spectrum in dB.
//!
//! Pure compute. No allocations after the input buffer is taken. Counts its
//! bit-ops so the meter can price it against the Landauer floor.
//!
//! Reference: any textbook (Press et al., NR §12.2). The kernel is
//! deterministic — V-class L0-closed.

use serde::{Deserialize, Serialize};

/// Per-frame report from running the FFT kernel.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct FftKernelReport {
    pub n: u32,
    /// Estimated bit-ops for this FFT pass. Conservative: counts the
    /// butterflies × an f32 multiply-add cost in 32-bit bit-ops.
    pub bit_ops: u64,
}

/// In-place radix-2 Cooley–Tukey FFT on interleaved (re, im) floats.
/// `data.len()` must be `2 * n` where `n` is a power of two.
///
/// Returns the kernel report. Errors only when `data.len()` is not a
/// power-of-two pair count.
pub fn fft_in_place(data: &mut [f32]) -> Result<FftKernelReport, &'static str> {
    let pair_count = data.len() / 2;
    if data.len() != pair_count * 2 {
        return Err("FFT buffer must be interleaved (re, im) pairs");
    }
    if pair_count == 0 || !pair_count.is_power_of_two() {
        return Err("FFT length must be a power of two");
    }
    let n = pair_count;

    // ── Bit-reversal permutation ──────────────────────────────────────
    let mut j = 0usize;
    for i in 0..n {
        if j > i {
            data.swap(2 * i, 2 * j);
            data.swap(2 * i + 1, 2 * j + 1);
        }
        let mut m = n >> 1;
        while m >= 1 && j >= m {
            j -= m;
            m >>= 1;
        }
        j += m;
    }

    // ── Butterflies ───────────────────────────────────────────────────
    let mut len = 2usize;
    while len <= n {
        let half = len >> 1;
        // Twiddle: exp(-i·2π/len)
        let ang = -2.0 * core::f32::consts::PI / (len as f32);
        let wlen_re = ang.cos();
        let wlen_im = ang.sin();
        let mut i = 0usize;
        while i < n {
            let mut w_re = 1.0f32;
            let mut w_im = 0.0f32;
            for k in 0..half {
                let u_idx = 2 * (i + k);
                let t_idx = 2 * (i + k + half);
                let t_re = data[t_idx] * w_re - data[t_idx + 1] * w_im;
                let t_im = data[t_idx] * w_im + data[t_idx + 1] * w_re;
                let u_re = data[u_idx];
                let u_im = data[u_idx + 1];
                data[u_idx] = u_re + t_re;
                data[u_idx + 1] = u_im + t_im;
                data[t_idx] = u_re - t_re;
                data[t_idx + 1] = u_im - t_im;
                let nw_re = w_re * wlen_re - w_im * wlen_im;
                let nw_im = w_re * wlen_im + w_im * wlen_re;
                w_re = nw_re;
                w_im = nw_im;
                let _ = k;
            }
            i += len;
        }
        len <<= 1;
    }

    // ── Bit-op estimate ───────────────────────────────────────────────
    // Each butterfly: 4 mults + 6 adds on f32. Treat each f32 mul/add as
    // ~32 bit-ops (rough, conservative). Total butterflies = (n/2) · log2 n.
    let log2n = (n as u64).trailing_zeros() as u64;
    let butterflies = (n as u64 / 2) * log2n;
    let bit_ops = butterflies * 10 * 32;

    Ok(FftKernelReport {
        n: n as u32,
        bit_ops,
    })
}

/// Compute the power spectrum (in dB, ref = max) of an interleaved buffer
/// that has been through `fft_in_place`. Writes to `out` of length `n/2`.
///
/// Returns the maximum magnitude squared for the page to display.
pub fn fft_power_spectrum(data: &[f32], out: &mut [f32]) -> f32 {
    let half = data.len() / 4; // n/2
    debug_assert_eq!(out.len(), half);
    let mut max_mag2 = 0.0f32;
    for k in 0..half {
        let re = data[2 * k];
        let im = data[2 * k + 1];
        let mag2 = re * re + im * im;
        out[k] = mag2;
        if mag2 > max_mag2 {
            max_mag2 = mag2;
        }
    }
    let floor = 1.0e-12_f32;
    let scale = max_mag2.max(floor);
    for v in out.iter_mut() {
        let r = (*v / scale).max(floor);
        *v = 10.0 * r.log10(); // dB, 0 at peak, negative below
    }
    max_mag2
}

#[cfg(test)]
mod tests {
    use super::*;

    fn make_buf(n: usize) -> Vec<f32> {
        let mut v = vec![0.0; 2 * n];
        for (k, chunk) in v.chunks_mut(2).enumerate() {
            // Pure tone at bin 4 of n=64 samples.
            let t = k as f32 / n as f32;
            let s = (2.0 * core::f32::consts::PI * 4.0 * t).sin();
            chunk[0] = s; // real
            chunk[1] = 0.0; // imag
        }
        v
    }

    #[test]
    fn fft_rejects_non_power_of_two() {
        let mut data = vec![0.0; 2 * 6];
        let err = fft_in_place(&mut data);
        assert!(err.is_err());
    }

    #[test]
    fn fft_recovers_pure_tone() {
        let n = 64;
        let mut buf = make_buf(n);
        let report = fft_in_place(&mut buf).unwrap();
        assert_eq!(report.n, n as u32);
        let mut spec = vec![0.0; n / 2];
        fft_power_spectrum(&buf, &mut spec);
        // Peak should be at bin 4 (the tone we injected).
        let (peak_idx, _) = spec
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
            .unwrap();
        assert_eq!(peak_idx, 4, "FFT should peak at the injected tone bin");
    }

    #[test]
    fn bit_ops_scale_with_n_log_n() {
        let mut a = vec![0.0f32; 2 * 64];
        let mut b = vec![0.0f32; 2 * 128];
        let ra = fft_in_place(&mut a).unwrap();
        let rb = fft_in_place(&mut b).unwrap();
        // Expect rb.bit_ops / ra.bit_ops ≈ (128·7) / (64·6) = 2.333
        let ratio = rb.bit_ops as f64 / ra.bit_ops as f64;
        assert!((ratio - 2.333).abs() < 0.05, "ratio was {ratio}");
    }
}

This is the exact Rust file compiled into the WASM module the page just loaded. Every line that runs on your device is here. The receipt is a function of this code, not a bespoke benchmark.

Notes on the receipt

  • V-class L0-closed: identical inputs produce identical outputs. The receipt is replayable.
  • bit-ops / op: butterflies × ~10 f32 ops × 32 bit-ops. Conservative; real microarchitecture overhead lives in μ.
  • μ apparent: the gap to physics. Typical commodity silicon is ~10⁹–10¹⁰ above the floor for f32 math.