A Group Symmetric Stochastic Differential Equation Model
for Molecule Multi-modal Pretraining

  • 1 Mila
  • 2 Université de Montréal
  • 3 Chinese Academy of Sciences
  • 4 National Research Council Canada
  • 5 University of Ottawa
  • 6 HEC Montréal
  • 7 CIFAR AI Chair

  • Co-first

Previous Work (GraphMVP), Main Issue: Its generative loss is only a proxy for the actual conditional generation loss.
Solution (MoleculeSDE): We perform the actual conditional generation in both directions, from 2D to 3D and from 3D to 2D.
  • We start from the following objective, which we maximize in order to maximize the mutual information (MI) between the two modalities: $$\mathcal{L}_{\text{MI}} = \frac{1}{2} \mathbb{E}_{p(x,y)} \big[ \log p(y|x) + \log p(x|y) \big],$$ where x and y denote the 2D topologies and 3D structures, respectively.
  • MoleculeSDE: use diffusion models to estimate the two conditional terms.
    • One diffusion model (SDE) generates 3D structures conditioned on 2D topologies.
    • One diffusion model (SDE) generates 2D topologies conditioned on 3D structures.
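To make the 2D-to-3D term concrete, the following is a minimal sketch of a denoising score matching loss, a standard way to train an SDE-based diffusion model. It is an illustration under stated assumptions, not the MoleculeSDE implementation: the variance-exploding noise schedule, the values of `sigma_min` and `sigma_max`, and the names `sigma`, `dsm_loss`, and `score_fn` are all hypothetical. It also ignores the group symmetry the title refers to; a real score network for coordinates would need to respect rotations and translations.

```python
import numpy as np

def sigma(t, sigma_min=0.01, sigma_max=10.0):
    """Noise scale at time t in [0, 1] for a variance-exploding SDE (assumed schedule)."""
    return sigma_min * (sigma_max / sigma_min) ** t

def dsm_loss(score_fn, pos, cond, rng, eps=1e-5):
    """Denoising score matching loss for a single molecule.

    score_fn: callable (noisy_pos, t, cond) -> array shaped like pos (hypothetical model)
    pos:      (n_atoms, 3) clean 3D coordinates y
    cond:     any encoding of the 2D topology x, passed through to score_fn
    """
    t = rng.uniform(eps, 1.0)              # random diffusion time
    std = sigma(t)
    noise = rng.standard_normal(pos.shape)
    noisy = pos + std * noise              # sample from the Gaussian perturbation kernel
    target = -noise / std                  # score of that kernel at the noisy sample
    pred = score_fn(noisy, t, cond)
    # Weight by std**2 so every noise level contributes on a similar scale.
    return float(np.mean(std**2 * np.sum((pred - target) ** 2, axis=-1)))
```

The 3D-to-2D direction would use the same recipe with the roles swapped: noise is added to the bond representation and the score model is conditioned on the 3D structure.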

Trajectory Demos

3D structures given 2D topologies.
(Generate atom coordinates from atom types and bond types.)
2D topologies given 3D structures.
(Generate bond types from atom coordinates and atom types.)
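A sampling trajectory of this kind can be sketched with a toy reverse-time sampler. The code below is only an illustration under assumptions: it uses a variance-exploding SDE with Euler-Maruyama steps, the learned score network is replaced by an analytic score that pulls toward a fixed, made-up target geometry, and the names `sigma`, `reverse_trajectory`, `toy_score`, and `target` are hypothetical.

```python
import numpy as np

def sigma(t, sigma_min=0.01, sigma_max=10.0):
    """Noise scale at time t in [0, 1] for a variance-exploding SDE (assumed schedule)."""
    return sigma_min * (sigma_max / sigma_min) ** t

def reverse_trajectory(score_fn, shape, rng, n_steps=1000,
                       sigma_min=0.01, sigma_max=10.0):
    """Integrate the reverse-time SDE from t=1 to t=0 and return every frame."""
    log_ratio = np.log(sigma_max / sigma_min)
    dt = 1.0 / n_steps
    x = sigma_max * rng.standard_normal(shape)   # start from broad Gaussian noise
    frames = [x.copy()]
    for i in range(n_steps):
        t = 1.0 - i * dt
        # Squared diffusion coefficient: g(t)^2 = d(sigma^2)/dt for this schedule.
        g2 = 2.0 * sigma(t, sigma_min, sigma_max) ** 2 * log_ratio
        drift = g2 * score_fn(x, t) * dt
        x = x + drift + np.sqrt(g2 * dt) * rng.standard_normal(shape)
        frames.append(x.copy())
    return np.stack(frames)                      # (n_steps + 1, n_atoms, 3)

# Made-up three-atom geometry, standing in for a conformer the model would generate.
target = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
# Analytic stand-in for a trained, 2D-conditioned score network.
toy_score = lambda x, t: -(x - target) / sigma(t) ** 2
frames = reverse_trajectory(toy_score, target.shape, np.random.default_rng(0))
```

Each entry of `frames` is one frame of the trajectory; with the toy score, the atoms drift from noise toward `target` as t approaches 0.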
