Diffusion Models Through the Linear Lens: Exact Analysis of Sampling, Learning, Receptive Fields, and Consistency

Location
Strickland Hall 109
Speaker
Binxu Wang (Harvard)

Diffusion models are powerful generative systems, yet their internal mechanisms remain difficult to analyze. Taking a physicist's approach, we study the simplest tractable case: a diffusion model with a linear score function.
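The linear-score setting can be checked directly in a few lines. Below is a minimal numerical sketch (my own toy setup, not the talk's) of the fact that noising a Gaussian dataset keeps the marginal Gaussian, so its score is exactly linear in the input: for x_0 ~ N(mu, Sigma) and x_t = x_0 + sigma * eps, the score is -(Sigma + sigma^2 I)^{-1}(x - mu).

```python
import numpy as np

# Toy check, assuming variance-exploding noising x_t = x_0 + sigma * eps
# with x_0 ~ N(mu, Sigma): the noised marginal is N(mu, Sigma + sigma^2 I),
# so its score (gradient of log-density) is linear in x.
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
sigma = 0.7

C = Sigma + sigma**2 * np.eye(2)          # covariance of the noised marginal
Cinv = np.linalg.inv(C)

def log_p(x):
    """Log-density of N(mu, C), up to an additive constant."""
    d = x - mu
    return -0.5 * d @ Cinv @ d

def score_exact(x):
    """Closed-form (linear) score: -C^{-1} (x - mu)."""
    return -Cinv @ (x - mu)

# Compare against a finite-difference gradient of log p at a random point.
x = rng.normal(size=2)
eps = 1e-5
grad_fd = np.array([
    (log_p(x + eps * e) - log_p(x - eps * e)) / (2 * eps)
    for e in np.eye(2)
])
print(np.allclose(grad_fd, score_exact(x), atol=1e-6))  # True
```

Since log p is quadratic here, the central finite difference matches the linear score to rounding error.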

A key duality links architecture and distribution: a Gaussian dataset implies a linear optimal score, and a linear score network implies the learned distribution is the Gaussian approximation of the data. This duality enables a fully analytical treatment of four aspects of diffusion models.

Sampling dynamics. The linear score yields a closed-form, low-dimensional, rotation-like sampling trajectory governed by the data covariance, and it precisely predicts the early phase of pretrained diffusion models, revealing dominant linear structure across a wide range of noise scales.

Learning dynamics. Deep linear networks admit analytical training dynamics, uncovering a spectral bias: structure is learned first along the top eigendimensions of the data.

Receptive field structure. The effective receptive field is shaped by the data covariance rather than by architectural priors; it need not be local or equivariant, yielding predictions that extend recent work by Kamb and Ganguli.

Sample consistency. Using random matrix theory, we predict sensitivity to dataset resampling, identifying which noise seeds yield consistent versus variable outputs.

This work shows how a tractable linear regime provides a rigorous analytical lens on the sampling, learning, receptive field structure, and consistency of diffusion models, with insights that extend surprisingly far into the nonlinear setting.
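The closed-form sampling trajectory can be illustrated numerically. The sketch below uses one particular variance-exploding parameterization (my assumption, not necessarily the talk's exact setup): with a zero-mean Gaussian and linear score, the probability-flow ODE decouples along the eigendirections of the data covariance, and each coefficient is just rescaled in closed form.

```python
import numpy as np

# Sketch of the closed-form sampling trajectory for a linear score, assuming
# zero-mean Gaussian data and variance-exploding noise. The probability-flow
# ODE dx/dsigma = sigma * (Sigma + sigma^2 I)^{-1} x decouples in the
# eigenbasis of Sigma; each coefficient follows
#   c(sigma) = c(sigma_max) * sqrt((lam + sigma^2) / (lam + sigma_max^2)).
Sigma = np.diag([4.0, 1.0, 0.25])           # data covariance (already diagonal)
lam = np.diag(Sigma)                        # eigenvalues of the covariance
sigma_max, sigma_min = 10.0, 1e-3

rng = np.random.default_rng(1)
x = rng.normal(size=3) * sigma_max          # initial noise sample

# Numerically integrate the ODE from sigma_max down to sigma_min (Euler).
sigmas = np.linspace(sigma_max, sigma_min, 50001)
x_num = x.copy()
for s0, s1 in zip(sigmas[:-1], sigmas[1:]):
    dxds = s0 * x_num / (lam + s0**2)       # (Sigma + s^2 I)^{-1} is diagonal
    x_num = x_num + (s1 - s0) * dxds

# Closed-form prediction: a pure rescaling along each eigendirection.
x_closed = x * np.sqrt((lam + sigma_min**2) / (lam + sigma_max**2))
print(np.allclose(x_num, x_closed, rtol=1e-3))  # True
```

Note how low-variance directions (small lam) are attenuated most strongly, so the trajectory collapses toward the top eigendimensions of the data.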
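The spectral bias in learning can likewise be sketched in the simplest case. Here is a toy one-layer linear denoiser trained by gradient descent on the population loss (my simplification; the talk analyzes deep linear networks): error shrinks fastest along the top eigendirections of the data covariance.

```python
import numpy as np

# Sketch of spectral bias, assuming a one-layer linear denoiser
# D(x_t) = W x_t trained on the population loss E||W x_t - x_0||^2 with
# x_t = x_0 + sigma * eps. Per mode the optimum is lam / (lam + sigma^2),
# and gradient descent converges at rate proportional to lam + sigma^2,
# so top eigendirections are learned first.
lam = np.array([9.0, 1.0, 0.04])            # data covariance eigenvalues
sigma = 0.5
C = lam + sigma**2                          # noisy-input covariance, per mode
w_opt = lam / C                             # per-mode optimal denoiser weight
w = np.zeros(3)                             # initialize at zero

eta = 0.01
for _ in range(50):
    grad = w * C - lam                      # population gradient, per mode
    w = w - eta * grad

rel_err = np.abs(w - w_opt) / w_opt         # remaining error per mode
print(bool(rel_err[0] < rel_err[1] < rel_err[2]))  # True: top mode learned first
```

Each mode evolves independently as w_k+1 = w_k (1 - eta*C) + eta*lam, so the residual decays geometrically with ratio 1 - eta*(lam + sigma^2): the larger the data eigenvalue, the faster that direction is learned.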
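The receptive-field claim also has a direct linear-lens reading: for a linear score model the Jacobian ds/dx = -(Sigma + sigma^2 I)^{-1} plays the role of the effective receptive field, so its shape is inherited from the data covariance, not from an architectural prior. A minimal 1D sketch (my own construction) contrasts white data, which gives a purely local receptive field, with correlated data, which spreads weight off-center.

```python
import numpy as np

# Sketch: for a linear score model, the Jacobian ds/dx = -(Sigma + s^2 I)^{-1}
# acts as the effective receptive field. Its locality is set by the data
# covariance Sigma, not by any architectural prior.
n, sigma = 64, 0.3
idx = np.arange(n)
dist = np.abs(idx[:, None] - idx[None, :])  # pairwise pixel distances

def receptive_field(Sigma):
    J = -np.linalg.inv(Sigma + sigma**2 * np.eye(n))
    return J[n // 2]                        # receptive field of the center pixel

rf_white = receptive_field(np.eye(n))       # white (uncorrelated) data
rf_corr = receptive_field(np.exp(-dist / 16.0))  # long-range correlated data

def offcenter_frac(rf):
    a = np.abs(rf)
    return 1.0 - a[n // 2] / a.sum()        # fraction of weight off-center

print(offcenter_frac(rf_white) < 1e-9)      # True: purely local (diagonal)
print(offcenter_frac(rf_corr) > 0.1)        # True: substantial off-center weight
```

With white data the inverse is diagonal, so the model attends only to the pixel itself; once the covariance carries spatial correlations, the same linear model acquires off-center weight with no locality or equivariance built in.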