publications | Michael Plainer

in review

Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models

Michael Plainer, Hao Wu, Leon Klein, Stephan Günnemann, and Frank Noé

2025

Diffusion models have recently gained significant attention due to their effectiveness in various scientific domains, including biochemistry. When trained on equilibrium molecular distributions, diffusion models provide both a generative procedure to sample equilibrium conformations and associated forces derived from the model’s scores. However, using the forces for coarse-grained molecular dynamics simulations uncovers inconsistencies between the samples generated via classical diffusion inference and those generated via simulation, despite both originating from the same model. Particularly at the small diffusion timesteps required for simulations, diffusion models fail to satisfy the Fokker-Planck equation, which governs how the score should evolve over time. We interpret this deviation as an indication of the observed inconsistencies and propose an energy-based diffusion model with a Fokker-Planck-derived regularization term enforcing consistency. We demonstrate the effectiveness of our approach on toy systems and alanine dipeptide, and introduce a state-of-the-art transferable Boltzmann emulator for dipeptides that supports simulation and demonstrates enhanced consistency and efficient sampling.
@article{plainer2025consistent,
  author = {Plainer, Michael and Wu, Hao and Klein, Leon and G{\"u}nnemann, Stephan and No{\'e}, Frank},
  title = {Consistent Sampling and Simulation: Molecular Dynamics with Energy-Based Diffusion Models},
  eprint = {arXiv:2506.17139},
  year = {2025},
}

2024

NeurIPS Spotlight
Doob’s Lagrangian: A Sample-Efficient Variational Approach to Transition Path Sampling

Yuanqi Du*, Michael Plainer*, Rob Brekelmans*, Chenru Duan, Frank Noé, Carla P. Gomes, Alán Aspuru-Guzik, and Kirill Neklyudov

In Advances in Neural Information Processing Systems, 2024

Rare event sampling in dynamical systems is a fundamental problem arising in the natural sciences, which poses significant computational challenges due to an exponentially large space of trajectories. For settings where the dynamical system of interest follows a Brownian motion with known drift, the question of conditioning the process to reach a given endpoint or desired rare event is definitively answered by Doob’s h-transform. However, the naive estimation of this transform is infeasible, as it requires simulating sufficiently many forward trajectories to estimate rare event probabilities. In this work, we propose a variational formulation of Doob’s h-transform as an optimization problem over trajectories between a given initial point and the desired ending point. To solve this optimization, we propose a simulation-free training objective with a model parameterization that imposes the desired boundary conditions by design. Our approach significantly reduces the search space over trajectories and avoids expensive trajectory simulation and inefficient importance sampling estimators which are required in existing methods. We demonstrate the ability of our method to find feasible transition paths on real-world molecular simulation and protein folding tasks.
@inproceedings{du2024doob,
  author = {Du, Yuanqi and Plainer, Michael and Brekelmans, Rob and Duan, Chenru and No{\'e}, Frank and Gomes, Carla P. and Aspuru-Guzik, Al{\'a}n and Neklyudov, Kirill},
  title = {Doob's Lagrangian: A Sample-Efficient Variational Approach to Transition Path Sampling},
  booktitle = {Advances in Neural Information Processing Systems},
  editor = {Globerson, A. and Mackey, L. and Belgrave, D. and Fan, A. and Paquet, U. and Tomczak, J. and Zhang, C.},
  pages = {65791--65822},
  publisher = {Curran Associates, Inc.},
  volume = {37},
  year = {2024}
}

2023

GenBio Spotlight
Transition Path Sampling with Boltzmann Generator-based MCMC Moves

Michael Plainer*, Hannes Stärk*, Charlotte Bunne, and Stephan Günnemann

In Generative AI and Biology Workshop, 2023

Sampling all possible transition paths between two 3D states of a molecular system has various applications ranging from catalyst design to drug discovery. Current approaches to sample transition paths use Markov chain Monte Carlo and rely on time-intensive molecular dynamics simulations to find new paths. Our approach operates in the latent space of a normalizing flow that maps from the molecule’s Boltzmann distribution to a Gaussian, where we propose new paths without requiring molecular simulations. Using alanine dipeptide, we explore Metropolis-Hastings acceptance criteria in the latent space for exact sampling and investigate different latent proposal mechanisms.
@inproceedings{plainer2023transition,
  author = {Plainer, Michael and St{\"a}rk, Hannes and Bunne, Charlotte and G{\"u}nnemann, Stephan},
  title = {Transition Path Sampling with Boltzmann Generator-based MCMC Moves},
  year = {2023},
  maintitle = {Advances in Neural Information Processing Systems},
  booktitle = {Generative AI and Biology Workshop},
}
MLSB Oral
DiffDock-Pocket: Diffusion for Pocket-Level Docking with Sidechain Flexibility

Michael Plainer, Marcella Toth, Simon Dobers, Hannes Stärk, Gabriele Corso, Céline Marquet, and Regina Barzilay

In Machine Learning in Structural Biology, 2023

When a small molecule binds to a protein, the 3D structure and function of the protein can significantly change. Understanding this process, called molecular docking, is crucial in areas such as drug design. Recent learning-based attempts have shown promising results at this task, yet lack the necessary features that traditional approaches support. In this work, we close this gap by proposing DiffDock-Pocket: a diffusion-based all-atom docking algorithm conditioned on a binding target. Our model supports receptor flexibility by extending the generative diffusion process to the manifold describing the main degrees of freedom of the protein’s side chains. Empirically, we improve the state of the art in site-specific docking on the PDBBind benchmark. In particular, in the realistic scenario where no bound protein structure is available, we double the accuracy of current methods while being 20 times faster than other flexible approaches.
@inproceedings{plainer2023diffdockpocket,
  author = {Plainer, Michael and Toth, Marcella and Dobers, Simon and St{\"a}rk, Hannes and Corso, Gabriele and Marquet, C{\'e}line and Barzilay, Regina},
  title = {{DiffDock-Pocket}: Diffusion for Pocket-Level Docking with Sidechain Flexibility},
  year = {2023},
  maintitle = {Advances in Neural Information Processing Systems},
  booktitle = {Machine Learning in Structural Biology},
}
Transporting Densities Across Dimensions

Michael Plainer, Felix Dietrich, and Ioannis G. Kevrekidis

2023

Even the best scientific equipment can only partially observe reality. Recorded data is often lower-dimensional, e.g., two-dimensional pictures of the three-dimensional world. Combining data from multiple experiments then results in a marginal density. This work shows how to transport such lower-dimensional marginal densities into a more informative, higher-dimensional joint space by leveraging time-delayed measurements from an observation process. This can augment the information from scientific equipment to construct a more coherent view. Classical transportation algorithms can be used when the source and target dimensions match. Our approach allows the transport of samples between spaces of different dimensions by exploiting information from the sample collection process. We reconstruct the surface of an implant from partial recordings of bacteria moving on it and construct a joint space for satellites orbiting the Earth by combining one-dimensional, time-delayed altitude measurements.
@article{plainer2023transporting,
  author = {Plainer, Michael and Dietrich, Felix and Kevrekidis, Ioannis G.},
  title = {Transporting Densities Across Dimensions},
  eprint = {arXiv:2305.16227},
  year = {2023},
}

theses

MSc.
Machine Learning Techniques for Improved Transition Path Sampling

Michael Plainer

Technical University of Munich, Nov 2023

Master’s Thesis

The ability to efficiently, and most importantly accurately, simulate the atoms of molecules has opened many opportunities in various disciplines. In areas such as drug discovery or material science, we are interested not only in the constant small fluctuations captured by molecular simulations but also in finding rare transitions to different states. Transition path sampling (TPS) offers a powerful approach to exploring the pathways of rare events in complex systems, providing a comprehensive landscape of the transitional trajectories that traditional methods often miss. Current algorithms are based on Markov chain Monte Carlo and rely on computationally expensive molecular dynamics simulations. In this thesis, we propose two new methods to overcome the issues of current approaches. In the first approach, we demonstrate how we can sample transition paths in a learned latent space of a Boltzmann generator without the need for molecular dynamics simulations. For this, we reformulate the acceptance criterion of Metropolis-Hastings in the latent space to ensure that paths can be sampled with the correct probability. Additionally, we investigate how we can improve the current state of traditional TPS methodology. For this, we introduce a self-attention-based neural network architecture that uses the entire transition path to determine the optimal point to start molecular simulations from. We demonstrate the capabilities of our approaches on the molecule alanine dipeptide and introduce metrics and evaluation techniques to compare them with existing work. While the introduced latent TPS approach is mathematically correct, the produced results are not convincing and often exhibit unfavorable performance due to low acceptance of paths. Our ideas to improve point selection with context-aware neural networks, on the other hand, seem promising and can improve on the state of the art.
@mastersthesis{plainer2023machine,
  author = {Plainer, Michael},
  title = {Machine Learning Techniques for Improved Transition Path Sampling},
  year = {2023},
  school = {Technical University of Munich},
  month = nov,
  language = {en},
  note = {Master's Thesis},
}
BSc.
Transport of Discontinuous Densities with Artificial Neural Networks

Michael Plainer

Technical University of Munich, Sep 2021

Bachelor’s Thesis

Nearly all real-world measurements can only record a part of the underlying truth due to technical limitations. In many fields, full comprehension of the system requires an understanding of how the unmeasurable inputs or states map to the measurable outputs. In cases where many individual measurements are performed, the density of the observation can be approximated with histograms. They count the frequency at which measurements fall in a given range. Each observed sample corresponds to exactly one unknown point in the input space that has been mapped by a function to produce exactly this recorded output. When the distribution of these points in the original input space is known (e.g., uniformly distributed), a transport function describing this mapping can be found. Identifying this transport function is the main objective of this thesis. The field of transportation theory is dedicated to finding these transportation maps between two (probability) measures that are optimal according to a metric. Those approaches can fail to identify the true underlying transport map, for example, if it is not bijective or when the recorded density is discontinuous. Reconstructing this true underlying transport map can be done by employing an observation process that measures consecutive outputs of moving points. This reconstruction procedure is implemented with artificial neural networks and demonstrated by examples. Separately from the transport of measures, another network is implemented that learns the underlying dynamical system based on the observation process, which allows the movement of the points to be extrapolated. Apart from fictitious examples, the procedure is also applied to reconstruct the shape of a simulated cell by synthesizing image data (e.g., produced by a microscope) and observing moving bacteria on the cell’s surface.
@mastersthesis{plainer2021transport,
  author = {Plainer, Michael},
  title = {Transport of Discontinuous Densities with Artificial Neural Networks},
  year = {2021},
  school = {Technical University of Munich},
  month = sep,
  language = {en},
  note = {Bachelor's Thesis},
  howpublished = {\url{https://mediatum.ub.tum.de/1632332/}},
}

* Equal contribution