This paper introduced AlphaFold 3, a unified deep learning model that predicts the joint structure of complexes containing proteins, nucleic acids, small-molecule ligands, ions, and modified residues. It replaces much of the prior architecture with a diffusion-based module that directly generates atomic coordinates. The model achieved substantially improved accuracy over specialized tools across many interaction types, including protein-ligand and protein-nucleic acid complexes.
This paper introduced RFdiffusion, a generative diffusion model built on the RoseTTAFold network for de novo protein design. It enables a range of design tasks, including unconditional generation, symmetric oligomer design, functional motif scaffolding, and binder design. Many designs were experimentally validated, with solved structures closely matching the intended models.
The Human Pangenome Reference Consortium presents a first draft human pangenome built from 47 phased, diploid genome assemblies of genetically diverse individuals. The assemblies cover more than 99% of the expected sequence in each genome, are more than 99% accurate at both the structural and base-pair levels, and are combined into a graph-based reference. Relative to GRCh38, the pangenome adds about 119 million base pairs of euchromatic polymorphic sequence and 1,115 gene duplications, improving representation of variation at structurally complex loci.
The Telomere-to-Telomere (T2T) Consortium reports T2T-CHM13, the first essentially gapless assembly of a human genome (all chromosomes except Y), totaling about 3.055 billion base pairs. The assembly resolves previously unfinished heterochromatic and repetitive regions, including centromeric satellite arrays, segmental duplications, and the short arms of the acrocentric chromosomes. It adds nearly 200 million base pairs of new sequence and corrects errors in prior reference assemblies.
This paper introduces Enformer, a transformer-based deep learning model that predicts gene expression and chromatin states directly from DNA sequence by integrating regulatory information from up to ~100 kb away. By using self-attention to capture long-range interactions, it substantially improves prediction accuracy over prior convolutional models. The approach also improves prediction of the effects of non-coding genetic variants on expression.
Kathryn Tunyasuvunakool, John Jumper and Demis Hassabis
This companion paper applied AlphaFold to predict structures for nearly the entire human proteome and for the proteomes of 20 other key organisms, producing a large public database of predicted models. It assessed coverage and confidence across the human proteome, showing that a substantial fraction of residues could be modeled with high or very high confidence. The work created the AlphaFold Protein Structure Database, greatly expanding structural coverage beyond experimentally determined structures.
This paper presented RoseTTAFold, a three-track neural network that simultaneously processes one-dimensional sequence, two-dimensional residue-pair distances, and three-dimensional atomic coordinate information, with information flowing between the tracks. The method achieved protein structure prediction accuracy approaching that of AlphaFold2 while being more computationally efficient. It also demonstrated rapid generation of accurate models for protein-protein complexes.
Longxing Cao, Inna Goreshnik, Brian Coventry and David Baker
The authors used computational de novo protein design to create small, stable miniproteins that bind the SARS-CoV-2 spike receptor-binding domain and block its interaction with ACE2. Two design strategies were used: incorporating the ACE2 helix into a designed scaffold, and building entirely new binders against the RBD. The best designs bound with picomolar affinity and neutralized the virus, with cryo-EM confirming the binding modes matched the computational models.
David E. Gordon, Gwendolyn M. Jang and Mehdi Bouhaddou
The authors expressed 26 of the 29 SARS-CoV-2 proteins in human cells and used affinity-purification mass spectrometry to map 332 high-confidence virus-human protein-protein interactions. They then identified 69 existing drugs and compounds that target the human proteins in this interactome and tested several for antiviral activity. Two pharmacological classes (mRNA translation inhibitors and Sigma1/Sigma2 receptor regulators) showed antiviral effects, providing candidate therapeutics for repurposing.
Daniel Wrapp, Nianshuang Wang, Kizzmekia S. Corbett, Jory A. Goldsmith, Ching-Lin Hsieh, Olubukola Abiona, et al.
The authors determined a 3.5 angstrom cryo-EM structure of the SARS-CoV-2 (2019-nCoV) spike glycoprotein ectodomain stabilized in the prefusion conformation. They showed that the spike binds human ACE2 with roughly 10- to 20-fold higher affinity than the SARS-CoV-1 spike does, helping explain efficient human transmission. They also found that several published monoclonal antibodies specific to the SARS-CoV-1 receptor-binding domain did not appreciably bind the new spike, informing vaccine and therapeutic design.
Andrew V. Anzalone, Peyton B. Randolph, Jessie R. Davis and David R. Liu
This paper introduced prime editing, a versatile genome-editing method that uses a Cas9 nickase fused to a reverse transcriptase guided by a prime editing guide RNA (pegRNA) to write new genetic information directly into a target site. Without requiring double-strand breaks or donor DNA, prime editing can install targeted insertions, deletions, and all 12 types of point mutations. The authors demonstrated correction of disease-relevant mutations in human cells with broad targeting flexibility and relatively low off-target activity.
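The figure of 12 point mutations follows from simple enumeration: each of the four DNA bases can be replaced by any of the other three. A minimal Python sketch of that count is below; the function name and the `A>G` notation are illustrative choices, not taken from the paper.

```python
from itertools import permutations

BASES = "ACGT"

# Transitions swap a purine for a purine or a pyrimidine for a pyrimidine.
TRANSITIONS = {"A>G", "G>A", "C>T", "T>C"}

def point_mutation_types(bases=BASES):
    """Return every ordered substitution 'X>Y' with X different from Y."""
    return [f"{a}>{b}" for a, b in permutations(bases, 2)]

types = point_mutation_types()
transversions = [t for t in types if t not in TRANSITIONS]
print(len(types), "substitution types:", " ".join(types))
print(len(TRANSITIONS), "transitions and", len(transversions), "transversions")
```

Four of the 12 are transitions, the class that the cytosine and adenine base editors summarized elsewhere in this list can install; the remaining eight are transversions.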
This paper presents the Seurat v3 framework for integrating single-cell datasets across different technologies, conditions, and modalities. It introduces 'anchors' — pairs of cells in a shared low-dimensional space representing a common biological state — to harmonize datasets and transfer labels. The methods enable joint analysis of scRNA-seq with other measurements such as protein (CITE-seq), chromatin accessibility, and spatial data.
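The anchor idea can be illustrated with mutual nearest neighbors: a pair of cells, one from each dataset, counts as an anchor when each lies among the other's closest cells in the shared space. The sketch below is a deliberately simplified illustration in plain Python, assuming the cells have already been embedded in a common low-dimensional space; it does not reproduce Seurat's embedding, scoring, or filtering steps, and the function names are hypothetical.

```python
import math

def nearest_neighbors(query, reference, k):
    """For each query point, return the set of indices of its k closest reference points."""
    result = []
    for q in query:
        ranked = sorted(range(len(reference)), key=lambda j: math.dist(q, reference[j]))
        result.append(set(ranked[:k]))
    return result

def find_anchors(dataset_a, dataset_b, k=1):
    """Return (i, j) pairs where cell i of A and cell j of B are mutual k-nearest neighbors."""
    a_to_b = nearest_neighbors(dataset_a, dataset_b, k)
    b_to_a = nearest_neighbors(dataset_b, dataset_a, k)
    return sorted(
        (i, j)
        for i, neighbors in enumerate(a_to_b)
        for j in neighbors
        if i in b_to_a[j]
    )

# Toy 2-D embeddings: the third cell in each dataset has no counterpart in the other.
cells_a = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]
cells_b = [(0.2, 0.1), (5.1, 4.8), (20.0, 20.0)]
print(find_anchors(cells_a, cells_b))
```

Requiring the neighbor relationship to hold in both directions is what keeps cells without a counterpart in the other dataset from being forced into a pairing.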
Clare Bycroft, Colin Freeman and Desislava Petkova
This paper describes the open-access UK Biobank resource of deep genetic and phenotypic data on roughly 500,000 participants. It details the genotyping of ~805,000 markers, imputation to over 90 million variants, and analyses of population structure, relatedness, and genotype quality. The resource has become a foundational dataset for genome-wide association studies and human complex-trait genetics.
Nicholas Schaum, Stephen R. Quake and Tony Wyss-Coray
The Tabula Muris Consortium generated a compendium of single-cell transcriptome data spanning 20 organs and tissues from the mouse (Mus musculus). Cells were profiled with both droplet-based (10x) and FACS-sorted, plate-based (Smart-seq2) methods, enabling characterization of cell types across the organism. The resource provides a cross-tissue reference for defining and comparing mouse cell populations.
Shannon L. Maude, Theodore W. Laetsch and Jochen Buechner
This paper reports the pivotal global phase 2 ELIANA trial of tisagenlecleucel, an anti-CD19 chimeric antigen receptor (CAR) T-cell therapy, in children and young adults with relapsed or refractory B-cell acute lymphoblastic leukemia. It demonstrated high rates of durable remission and supported the first FDA approval of a CAR-T cell therapy. The study also characterized the safety profile, including cytokine release syndrome.
Nicole M. Gaudelli, Alexis C. Komor, Holly A. Rees and David R. Liu
This work developed adenine base editors (ABEs) by evolving a transfer RNA adenosine deaminase to act on DNA, enabling direct conversion of A-T base pairs to G-C in genomic DNA without double-strand breaks. Because no natural DNA adenosine deaminase was available, the authors used directed evolution to create the enzyme, then fused it to Cas9 nickase. ABEs corrected target adenines efficiently and with high product purity and low indel formation in human cells.
Omar O. Abudayyeh, Jonathan S. Gootenberg, Patrick Essletzbichler and Feng Zhang
This study characterized the class 2 type VI CRISPR effector Cas13a (formerly C2c2) as a programmable RNA-targeting tool in mammalian and plant cells. Active Cas13a enabled efficient, specific knockdown of reporter and endogenous transcripts, while a catalytically inactive variant (dCas13a) retained targeted RNA binding. The authors showed that knockdown reached levels comparable to RNA interference with improved specificity, and they used dCas13a for programmable tracking of transcripts in live cells.
Jonathan S. Gootenberg, Omar O. Abudayyeh, Jeong Wook Lee and Feng Zhang
This paper introduced SHERLOCK, a CRISPR-based nucleic acid detection platform built on the collateral RNase activity of Cas13a (C2c2). Upon recognizing a target sequence, Cas13a indiscriminately cleaves nearby reporter RNAs; combined with isothermal amplification, this yields highly sensitive, specific detection. The authors demonstrated attomolar sensitivity and single-base discrimination, with applications in detecting viruses (Zika, dengue), bacteria, and human genotypes.
Salman F. Banani, Hyun O. Lee, Anthony A. Hyman and Michael K. Rosen
This review synthesizes how cells organize biochemistry into membraneless compartments termed biomolecular condensates, which form largely through liquid-liquid phase separation driven by multivalent macromolecular interactions. It describes the physical principles of condensate formation, their compositions, and how they concentrate or sequester molecules to regulate cellular processes. The authors discuss functional roles and the emerging links between aberrant condensate behavior and disease.
Grace X. Y. Zheng, Jessica M. Terry, Phillip Belgrader and Jason H. Bielas
This paper introduced a droplet-based microfluidic platform (the 10x Genomics GemCode/Chromium system) for high-throughput single-cell RNA sequencing using barcoded gel beads. The method enables 3' digital expression profiling of thousands of cells per run at low cost. The authors profiled tens of thousands of cells, including ~68,000 PBMCs, demonstrating the ability to resolve immune cell subpopulations and detect rare cell types.
Monkol Lek, Konrad J. Karczewski and Daniel G. MacArthur
The Exome Aggregation Consortium (ExAC) aggregates and jointly analyzes high-quality exome sequencing data from 60,706 individuals of diverse ancestries, producing the largest catalogue of human protein-coding variation at the time. The dataset reveals roughly one variant per eight exonic bases and provides direct evidence of widespread mutational recurrence. It enables improved estimation of gene-level intolerance to loss-of-function variation and refines the interpretation of pathogenic variants in clinical genetics.
Patrik L. Ståhl, Fredrik Salmén, Joakim Lundeberg and Jonas Frisén
The authors introduce 'spatial transcriptomics,' a method that places thin histological tissue sections onto a glass surface arrayed with barcoded reverse-transcription primers, so that mRNA captured at each position retains its two-dimensional spatial coordinates. Sequencing the barcoded cDNA reconstructs genome-wide expression maps directly on the tissue image. They demonstrate the approach on mouse brain and human breast cancer sections, recovering spatially resolved transcriptomes that align with tissue morphology.
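The barcode-to-position step can be sketched as a lookup followed by counting. The snippet below is a minimal illustration, assuming each sequenced read has already been reduced to a (spatial barcode, gene) pair; the barcodes, coordinates, and function name are made up for the example and do not reflect the actual array layout.

```python
from collections import Counter, defaultdict

# Hypothetical barcode -> (x, y) array positions; real arrays have many more spots.
SPOT_COORDS = {
    "AAAC": (0, 0),
    "CCGT": (0, 1),
    "GGTA": (1, 0),
}

def build_expression_map(reads, spot_coords=SPOT_COORDS):
    """Group (barcode, gene) reads by spot coordinate and count reads per gene."""
    expression = defaultdict(Counter)
    for barcode, gene in reads:
        coord = spot_coords.get(barcode)
        if coord is None:  # barcode is not on the array, so the read is discarded
            continue
        expression[coord][gene] += 1
    return dict(expression)

reads = [("AAAC", "Actb"), ("AAAC", "Actb"), ("CCGT", "Gfap"), ("TTTT", "Actb")]
for coord, counts in sorted(build_expression_map(reads).items()):
    print(coord, dict(counts))
```

Because each count is keyed by an array coordinate, the resulting table can be overlaid directly on an image of the same tissue section.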
Alexis C. Komor, Yongjoo B. Kim, Michael S. Packer, John A. Zuris and David R. Liu
This paper introduced cytosine base editing (CBE), a strategy that fuses a catalytically impaired Cas9 to a cytidine deaminase to directly convert C-G base pairs to T-A in genomic DNA without inducing double-strand breaks or requiring a donor template. The authors showed that within a programmable target window the deaminase converts cytosine to uracil, which is then read as thymine, and that inhibiting base-excision repair markedly improves editing efficiency. The approach achieved precise single-base correction in human and other mammalian cells.
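The effect of an editing window can be shown with a small string transformation. This is an idealized sketch: it converts every cytosine inside the window, whereas real editing efficiency varies by position and sequence context. The window of protospacer positions 4 to 8 is an assumption made for illustration and is not stated in the summary above; the function name is likewise hypothetical.

```python
def cytosine_base_edit(protospacer, window=(4, 8)):
    """Convert every C inside the 1-based, inclusive window to T."""
    start, end = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if base == "C" and start <= position <= end:
            edited.append("T")  # C is deaminated to U, which is then read as T
        else:
            edited.append(base)
    return "".join(edited)

# Cytosines at positions 5, 6, and 7 fall in the assumed window; the others do not.
print(cytosine_base_edit("ACCACCCGCCAA"))
```

Cytosines outside the window are left untouched, which is why guide placement determines which base in a target site gets edited.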
Bernd Zetsche, Jonathan S. Gootenberg and Feng Zhang
This paper characterizes Cpf1 (later named Cas12a) as a single-RNA-guided DNA endonuclease of a class 2 CRISPR-Cas system, expanding the genome-editing toolbox beyond Cas9. Cpf1 requires only a single crRNA (no tracrRNA), recognizes a T-rich PAM, and produces staggered cuts with overhangs. The authors demonstrate Cpf1-mediated genome editing in human cells.
Kok Hao Chen, Alistair N. Boettiger, Jeffrey R. Moffitt, Siyuan Wang and Xiaowei Zhuang
This paper introduces MERFISH, a single-molecule imaging method that uses error-robust barcoding combined with sequential rounds of hybridization and imaging to identify many RNA species in single cells in situ. The error-correcting binary codes allow misidentification errors to be detected and corrected across thousands of transcripts. The authors imaged up to ~1,000 genes in individual human cells, mapping spatial expression and revealing gene co-variation and subcellular RNA localization.
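Error-robust decoding can be illustrated with a toy codebook in which every pair of codewords differs in at least four bits, so that a single-bit readout error still has a unique nearest codeword and a two-bit error is flagged rather than misassigned. The 8-bit codewords and gene names below are invented for the example; they are not the code used in the paper.

```python
# Hypothetical codebook: each bit is "spot seen / not seen" in one imaging round.
# Every pair of codewords differs in at least 4 positions.
CODEBOOK = {
    "GENE_A": "11110000",
    "GENE_B": "00001111",
    "GENE_C": "11001100",
    "GENE_D": "10101010",
}

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(readout, codebook=CODEBOOK, max_errors=1):
    """Return the single gene within max_errors bits of the readout, else None."""
    matches = [gene for gene, word in codebook.items()
               if hamming(readout, word) <= max_errors]
    return matches[0] if len(matches) == 1 else None

print(decode("11110000"))  # exact match
print(decode("11100000"))  # one bit missed, still assignable
print(decode("11000000"))  # two bits missed, ambiguous, so rejected
```

The design trade-off is that spacing codewords farther apart leaves fewer usable codewords for a given number of imaging rounds.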
Martin Jinek, Krzysztof Chylinski, Ines Fonfara, Michael Hauer, Jennifer A. Doudna and Emmanuelle Charpentier
This study demonstrated that the CRISPR-associated protein Cas9 from Streptococcus pyogenes is an RNA-guided DNA endonuclease whose target specificity is determined by a dual-RNA structure formed by a CRISPR RNA (crRNA) base-paired to a trans-activating crRNA (tracrRNA). The authors showed that Cas9 introduces site-specific double-strand breaks in target DNA, with its HNH domain cleaving the complementary strand and its RuvC-like domain cleaving the noncomplementary strand. Critically, they engineered the two guide RNAs into a single chimeric guide RNA that still directed sequence-specific cleavage, establishing the system as a programmable tool for genome editing.
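Programmable targeting can be sketched as a string search: find where the guide sequence matches the DNA and check for the adjacent motif that Cas9 requires. Two details in the sketch are assumptions not stated in the summary above: the NGG protospacer-adjacent motif immediately 3' of the match, and a blunt cut 3 bp upstream of that motif. The function name and sequences are hypothetical, and only the given strand is scanned.

```python
def find_cas9_sites(dna, guide):
    """Return (start, cut_index) for each guide match followed by an NGG motif.

    cut_index splits the strand as dna[:cut_index] and dna[cut_index:].
    """
    dna = dna.upper()
    guide = guide.upper().replace("U", "T")  # accept an RNA guide sequence
    n = len(guide)
    sites = []
    start = dna.find(guide)
    while start != -1:
        pam = dna[start + n:start + n + 3]
        if len(pam) == 3 and pam[1:] == "GG":    # assumed NGG requirement
            sites.append((start, start + n - 3))  # assumed cut 3 bp from the motif
        start = dna.find(guide, start + 1)
    return sites

guide = "GACGTTACCGGATCAAGCTT"
dna = "TT" + guide + "AGG" + "CC" + guide + "ACT"
print(find_cas9_sites(dna, guide))
```

In the example the guide matches twice, but only the first match is followed by the assumed motif, so only one site is reported. A fuller search would also scan the reverse complement.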
Takahashi and Yamanaka demonstrated that pluripotent stem cells can be generated directly from mouse fibroblast cultures by introducing a defined set of transcription factors. Screening candidate genes associated with pluripotency, they identified four factors—Oct3/4, Sox2, c-Myc, and Klf4—that were sufficient to reprogram both embryonic and adult fibroblasts into cells they termed induced pluripotent stem (iPS) cells. These iPS cells resembled embryonic stem cells in morphology, growth, and marker expression and could form teratomas containing tissues of all three germ layers.
Eric S. Lander, Lauren M. Linton, Bruce Birren, Chad Nusbaum and Michael C. Zody
This paper reported the results of the publicly funded Human Genome Project, presenting and making freely available a draft sequence covering the great majority of the human genome along with an initial analysis. The consortium described the broad genomic landscape—including gene content, repeat elements, GC content, and recombination rates—and estimated a surprisingly low number of protein-coding genes, on the order of roughly 30,000–40,000. The work provided a foundational reference for human biology, medicine, and evolutionary studies.
Randall K. Saiki, Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich, et al.
This paper reported the first published application of in vitro primer-mediated enzymatic amplification of DNA—the technique that became known as the polymerase chain reaction (PCR)—as part of a rapid, sensitive prenatal diagnostic test for sickle cell anemia. The authors amplified specific β-globin target sequences from genomic DNA roughly 220,000-fold and then distinguished the normal (βA) and sickle (βS) alleles by restriction endonuclease digestion of a hybridized end-labeled oligonucleotide probe. The combined procedure allowed genotyping in under a day using far less than one microgram of genomic DNA.
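The reported fold amplification can be related to a per-cycle efficiency with a simple geometric model, in which a fraction of the templates is copied in each cycle. The sketch below is illustrative only: the function names are hypothetical, and the cycle count of 20 is an assumption made for the example, not a figure stated in the summary above.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Fold increase after `cycles` rounds if a fraction `efficiency` is copied per round."""
    return (1.0 + efficiency) ** cycles

def per_cycle_efficiency(fold, cycles):
    """Invert the model: average per-cycle efficiency implied by an observed fold increase."""
    return fold ** (1.0 / cycles) - 1.0

ASSUMED_CYCLES = 20  # assumption for illustration only

print(fold_amplification(ASSUMED_CYCLES))                        # perfect doubling
print(round(per_cycle_efficiency(220_000, ASSUMED_CYCLES), 2))   # implied efficiency
```

Under that assumed cycle count, perfect doubling would give about a million-fold increase, and a 220,000-fold increase corresponds to an average efficiency of roughly 85% per cycle.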
Sanger, Nicklen, and Coulson introduced a new method for determining nucleotide sequences in DNA, building on their earlier 'plus and minus' technique. The method uses 2′,3′-dideoxy and arabinonucleoside analogues of the normal deoxynucleoside triphosphates, which act as specific chain-terminating inhibitors of DNA polymerase, generating a set of partially extended chains that can be size-separated by gel electrophoresis to read the sequence. Applied to bacteriophage φX174 DNA, the approach proved faster and more accurate than the original plus or minus method.
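The read-out logic of chain termination can be sketched in a few lines: each terminating analogue yields fragments ending wherever its base occurs in the newly synthesized strand, and ordering all fragments by length recovers the sequence. The sketch is idealized, assuming a fragment is observed for every position, and the function names are hypothetical.

```python
def chain_termination_fragments(new_strand):
    """Map each terminating base to the lengths of the fragments that end with it."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand.upper(), start=1):
        lanes[base].append(length)  # incorporating the analogue here stops extension
    return lanes

def read_gel(lanes):
    """Read the sequence by ordering all fragments from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)

lanes = chain_termination_fragments("GATTACA")
for base, lengths in lanes.items():
    print(base, lengths)
print(read_gel(lanes))
```

Each dictionary entry plays the role of one gel lane, and sorting by fragment length stands in for the size separation by electrophoresis.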
In this one-page report, Watson and Crick proposed a double-helical structure for the salt of deoxyribose nucleic acid (DNA), consisting of two right-handed helical polynucleotide chains coiled around a common axis and running in antiparallel directions. They proposed that the chains are held together by hydrogen bonding between specific complementary base pairs—adenine with thymine and guanine with cytosine—a feature dictated by the structure. They famously noted that this specific pairing immediately suggested a possible copying mechanism for the genetic material.
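The copying idea follows directly from the pairing rule: given one strand, the partner strand is fully determined. A minimal Python sketch is below; because the chains run antiparallel, the partner is written by complementing each base and reversing the order. The function name is illustrative.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_strand(strand):
    """Return the partner strand, read 5'->3', for a strand given 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(strand.upper()))

original = "GATTACA"
partner = complementary_strand(original)
print(partner)
print(complementary_strand(partner) == original)  # copying the copy restores the original
```

Applying the rule twice returns the starting sequence, which is the sense in which either strand can serve as a template for the other.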