Steering Diffusion Models

Date and Time
-
Location
Zoom
Speaker
Qingsong Wang (UCSD)

Guidance mechanisms enable controllable generation from diffusion models at inference time. Classifier guidance steers sampling using gradients from a noise-aware classifier, offering principled control but requiring a separately trained network. Classifier-free guidance eliminates the external classifier by interpolating conditional and unconditional predictions, yet requires jointly training conditional and unconditional branches of the model. Training-free methods such as universal guidance repurpose off-the-shelf networks, but rely on per-step gradient optimization that is expensive and often unstable.
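As a point of reference, the classifier-free guidance interpolation mentioned above is commonly written as a linear combination of the model's two noise predictions. The sketch below is illustrative only; the array shapes and function name are hypothetical, not part of the talk's method.

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, w):
    """Combine unconditional and conditional noise predictions.

    w = 0 recovers unconditional sampling, w = 1 recovers the purely
    conditional prediction, and w > 1 extrapolates past it for
    stronger conditioning (the usual guidance-scale regime).
    """
    return eps_uncond + w * (eps_cond - eps_uncond)

# Toy noise predictions standing in for a denoiser's two outputs.
eps_u = np.zeros(4)
eps_c = np.ones(4)
guided = classifier_free_guidance(eps_u, eps_c, 1.5)
```

Note that both predictions come from one network queried with and without the conditioning signal, which is why this avoids the external classifier but still requires conditioning-aware training.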

In this talk, I present a general recipe for efficiently steering unconditional diffusion models without gradient guidance during inference. Our approach rests on two structural observations. First, noise alignment: even at early, highly corrupted stages of the reverse process, coarse semantic steering is possible using a lightweight, offline-computed guidance signal—no per-step or per-sample gradients required. Second, transferable concept vectors: a concept direction in activation space, once learned, transfers across both timesteps and samples. A single fixed steering vector learned near low noise levels remains effective when injected at intermediate noise levels for every generation trajectory, providing refined conditional control at negligible cost. These directions are identified via Recursive Feature Machines (RFM), a backpropagation-free feature learning method. Experiments on CIFAR-10, ImageNet, and CelebA demonstrate improved accuracy and generation quality over gradient-based guidance, with significant inference speedups.
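The transferable-concept-vector idea above amounts to adding one fixed direction to an intermediate activation during a window of the reverse process. The following is a minimal sketch under assumed details: the window bounds, scale, and function name are illustrative, and the vector here is random rather than one learned via Recursive Feature Machines.

```python
import numpy as np

def inject_steering(activation, concept_vec, t, t_lo, t_hi, scale=1.0):
    """Add a fixed concept direction to an activation, but only when
    the current timestep t falls inside the chosen noise-level window
    [t_lo, t_hi]. Outside the window the activation passes through
    unchanged, so no per-step gradients are ever computed."""
    if t_lo <= t <= t_hi:
        return activation + scale * concept_vec
    return activation

rng = np.random.default_rng(0)
act = rng.standard_normal(8)       # stand-in for a layer activation
v = rng.standard_normal(8)         # stand-in for a learned concept vector

steered = inject_steering(act, v, t=500, t_lo=300, t_hi=700)
untouched = inject_steering(act, v, t=900, t_lo=300, t_hi=700)
```

Because the same vector is reused across timesteps and samples, the added cost per generation is a single vector addition per injected step, which is the source of the claimed speedup over per-step gradient guidance.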