Biology & Genetics

Accurate prediction of protein structures and interactions using a three-track neural network

Minkyung Baek, Frank DiMaio, Ivan Anishchenko, Justas Dauparas, Sergey Ovchinnikov, Gyu Rie Lee, Jue Wang, ... David Baker · A large collaboration led from the University of Washington (Baker lab). David Baker is the corresponding and senior author.

Published 15 July 2021 · Science · Journal article

Summary

This paper presented RoseTTAFold, a three-track neural network that simultaneously processes one-dimensional sequence, two-dimensional residue-pair distances, and three-dimensional atomic coordinate information, with information flowing between the tracks. The method achieved protein structure prediction accuracy approaching that of AlphaFold2 while being more computationally efficient. It also demonstrated rapid generation of accurate models for protein-protein complexes.

Key findings

  • A three-track architecture that exchanges information among sequence, distance, and coordinate representations substantially improves prediction over two-track approaches.
  • RoseTTAFold produced accurate single-chain structures and could model protein complexes, with run times short enough for practical use.
  • The system was used to build models for biologically important proteins, including human targets relevant to function and drug discovery.
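The exchange of information among the three tracks can be pictured with a toy sketch. This is not the RoseTTAFold implementation: the published network uses learned attention layers and an equivariant 3D module, whereas the code below uses hand-written arithmetic on scalar features. The function names, the `mix` parameter, and the update rules are all hypothetical and chosen only to show each track reading from the others in a single round.

```python
import math


def pairwise_distances(coords):
    """Euclidean distance between every pair of 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def exchange_round(seq, pair, coords, mix=0.5):
    """One toy round of information exchange among three tracks.

    seq:    list of L per-residue scalars (1D track)
    pair:   L x L nested list of per-pair scalars (2D track)
    coords: list of L (x, y, z) tuples (3D track)
    mix:    hypothetical blending weight; 0 leaves every track unchanged
    """
    n = len(seq)
    dist = pairwise_distances(coords)

    # 1D -> 2D and 3D -> 2D: blend each pair feature with the product of
    # the two residue features and the current 3D distance.
    new_pair = [
        [(1 - mix) * pair[i][j] + mix * (seq[i] * seq[j] + dist[i][j]) / 2
         for j in range(n)]
        for i in range(n)
    ]

    # 2D -> 1D: blend each residue feature with the mean of its pair row.
    new_seq = [(1 - mix) * seq[i] + mix * sum(new_pair[i]) / n
               for i in range(n)]

    # 2D -> 3D: nudge each residue toward a centroid of all residues,
    # weighted by a positive function of the updated pair features.
    step = 0.1 * mix
    new_coords = []
    for i in range(n):
        weights = [1.0 / (1.0 + new_pair[i][j] ** 2) for j in range(n)]
        total = sum(weights)
        centroid = [sum(w * c[k] for w, c in zip(weights, coords)) / total
                    for k in range(3)]
        new_coords.append(tuple(coords[i][k] + step * (centroid[k] - coords[i][k])
                                for k in range(3)))

    return new_seq, new_pair, new_coords


if __name__ == "__main__":
    seq = [1.0, 2.0, 3.0]
    pair = [[0.0] * 3 for _ in range(3)]
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    for _ in range(3):  # repeated rounds let information keep circulating
        seq, pair, coords = exchange_round(seq, pair, coords)
    print(seq, coords)
```

A two-track variant would simply drop the `coords` argument and the distance term, which is the comparison the first key finding refers to.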

Cite this paper

APA

Baek, M., DiMaio, F., Anishchenko, I., Dauparas, J., Ovchinnikov, S., Lee, G. R., Wang, J., ... Baker, D. (2021). Accurate prediction of protein structures and interactions using a three-track neural network. Science. https://doi.org/10.1126/science.abj8754

BibTeX
@article{baek2021accurate,
  author    = {Baek, Minkyung and DiMaio, Frank and Anishchenko, Ivan and Dauparas, Justas and Ovchinnikov, Sergey and Lee, Gyu Rie and Wang, Jue and others},
  title     = {Accurate prediction of protein structures and interactions using a three-track neural network},
  journal   = {Science},
  year      = {2021},
  doi       = {10.1126/science.abj8754},
  url       = {https://doi.org/10.1126/science.abj8754}
}

Related in Biology & Genetics

Accurate structure prediction of biomolecular interactions with AlphaFold 3

Josh Abramson, Jonas Adler, John M. Jumper, et al.

This paper introduced AlphaFold 3, a unified deep learning model that predicts the joint structure of complexes containing proteins, nucleic acids, small-molecule ligands, ions, and modified residues. It replaces much of the prior architecture with a diffusion-based module that directly generates atomic coordinates. The model achieved substantially improved accuracy over specialized tools across many interaction types, including protein-ligand and protein-nucleic acid complexes.

Nature · Open access

De novo design of protein structure and function with RFdiffusion

Joseph L. Watson, David Juergens, David Baker, et al.

This paper introduced RFdiffusion, a generative diffusion model built on the RoseTTAFold network for de novo protein design. It enables a range of design tasks, including unconditional generation, symmetric oligomer design, functional motif scaffolding, and binder design. Many designs were experimentally validated, with solved structures closely matching the intended models.

Nature · Open access

A draft human pangenome reference

Wen-Wei Liao, Mobin Asri, Jana Ebler, et al.

The Human Pangenome Reference Consortium presents a first draft human pangenome built from 47 phased, diploid genome assemblies of genetically diverse individuals. The assemblies cover more than 99% of the expected sequence per genome at over 99% base-level and structural accuracy, and are combined into a graph-based reference. Relative to GRCh38, the pangenome adds about 119 million base pairs of euchromatic polymorphic sequence and 1,115 gene duplications, improving representation of variation at structurally complex loci.

Nature · Open access