NVIDIA Research, in collaboration with the University of Oxford and Mila – Québec AI Institute, has introduced La-Proteina, a new method for atomistic protein design that generates fully atomistic protein structures together with their underlying amino acid sequences.
La-Proteina uses a partially latent protein representation: the backbone structure is modeled explicitly, while sequence and atomistic details are captured by per-residue latent variables of fixed dimensionality. This sidesteps a central difficulty of explicit side-chain representations, namely that side chains differ in size from one residue type to the next.
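To make the fixed-dimensionality point concrete, here is a minimal sketch of what such a per-residue record could look like. It is not La-Proteina's actual data structure: the names `Residue`, `LATENT_DIM`, and `feature_shape` are hypothetical, the latent size of 8 is an illustrative value, and representing the backbone by a single 3D coordinate per residue is an assumption of the sketch. The heavy-atom counts are standard chemistry and show why an explicit all-atom representation has a varying size.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Heavy (non-hydrogen) atoms per residue, backbone included. An explicit
# all-atom representation has to cope with these differing sizes.
HEAVY_ATOMS = {"GLY": 4, "ALA": 5, "TRP": 14}

# Hypothetical latent size, chosen only for illustration.
LATENT_DIM = 8


@dataclass
class Residue:
    """One residue in a partially latent representation (illustrative)."""

    backbone_xyz: Tuple[float, float, float]  # explicit backbone coordinate
    latent: Tuple[float, ...]  # fixed-size code for residue type and side chain

    def __post_init__(self) -> None:
        if len(self.backbone_xyz) != 3:
            raise ValueError("backbone_xyz must have 3 components")
        if len(self.latent) != LATENT_DIM:
            raise ValueError(f"latent must have {LATENT_DIM} components")


def feature_shape(protein: List[Residue]) -> Tuple[int, int]:
    """Return (number of residues, features per residue).

    The second entry is constant, whatever the amino acids are.
    """
    return (len(protein), 3 + LATENT_DIM)
```

With this layout, a protein of any composition becomes a regular array of per-residue features, which is the kind of input a transformer can process without special handling for each residue type.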
Training combines a Variational Autoencoder (VAE), which provides the per-residue latent variables, with a partially latent flow matching model that operates on the backbone and those latents. The resulting model achieves state-of-the-art designability, diversity, and structural validity, and scales particularly well to large proteins.
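For readers unfamiliar with flow matching, the sketch below shows the generic training objective in its simplest common form. It is an illustration under stated assumptions, not La-Proteina's training code: it assumes a straight-line path between Gaussian noise and data and a mean-squared-error loss on the velocity, and the names `interpolate`, `target_velocity`, and `flow_matching_loss` are hypothetical. In a partially latent setting, `x1` would stand for a flattened vector of backbone coordinates and VAE latents.

```python
import random
from typing import Callable, List

Vector = List[float]


def interpolate(x0: Vector, x1: Vector, t: float) -> Vector:
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0: Vector, x1: Vector) -> Vector:
    """Velocity of the straight-line path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(
    predict_velocity: Callable[[Vector, float], Vector],
    x1: Vector,
    t: float,
    rng: random.Random,
) -> float:
    """Mean squared error between predicted and target velocity.

    x0 is sampled from a standard Gaussian (an assumption of this sketch).
    """
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]
    xt = interpolate(x0, x1, t)
    v_pred = predict_velocity(xt, t)
    v_true = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(x1)
```

A denoiser network plays the role of `predict_velocity`. At generation time, one would start from noise and integrate the predicted velocity from t=0 to t=1.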
Architecturally, La-Proteina's neural networks are built on efficient transformer architectures. The denoiser network is conditioned on the interpolation times and has around 160M parameters.
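One common way to condition a transformer denoiser on an interpolation time is to turn the scalar into a sinusoidal feature vector and feed that to the network. The sketch below illustrates the idea; whether La-Proteina uses this particular embedding is not stated here, so treat it as an assumption. The function names, the embedding size, and the choice of one time for the backbone and one for the latents are likewise hypothetical.

```python
import math
from typing import List


def sinusoidal_embedding(
    t: float, dim: int = 8, max_period: float = 10000.0
) -> List[float]:
    """Embed a scalar time as `dim` sine and cosine features."""
    if dim <= 0 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    half = dim // 2
    # Geometrically spaced frequencies, from 1 down towards 1 / max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


def time_conditioning(t_backbone: float, t_latent: float, dim: int = 8) -> List[float]:
    """Concatenate the embeddings of two interpolation times.

    Using separate times for backbone and latents is an assumption of
    this sketch.
    """
    return sinusoidal_embedding(t_backbone, dim) + sinusoidal_embedding(t_latent, dim)
```

The resulting vector would typically be projected and injected into each transformer layer, so the network knows how far along the noise-to-data path its current input is.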