Fine-tuning and Steering of Diffusions with Non-Differentiable Rewards

Date and Time
-
Location
Zoom
Speaker
Jakiw Pidstrigach

Abstract: We consider stochastic differential equations (SDEs) that are modified by reward functions or likelihood-based weights in order to promote specific events. This perspective applies both to diffusion-type models used in generative modeling and to SDEs describing physical phenomena such as molecular dynamics or weather. The main emphasis is on rewards that are non-smooth or singular, as they appear in conditioning, threshold objectives, and rare-event simulation. We discuss diffusion bridges as a central example, where one seeks typical trajectories connecting prescribed endpoints, for instance during a molecular transition between stable states or between two atmospheric configurations. We also discuss fine-tuning of diffusion models with non-differentiable rewards, motivated by applications that prioritize tail events and other low-probability regions.

Bio: Jakiw Pidstrigach is an AI Research Scientist at Gridmatic. He earned his PhD from the University of Potsdam, with research on filtering and diffusion models. He subsequently held a postdoctoral position at the University of Oxford, where he worked on theory and optimal control of AI.