Subject

Biology & Genetics

Life and heredity at the molecular level — DNA, the genome, gene editing and the tools that let us read and rewrite it.

31 papers in this field

Biology · Nature · May 2024 · Open access

Accurate structure prediction of biomolecular interactions with AlphaFold 3

Josh Abramson, Jonas Adler and John M. Jumper

This paper introduced AlphaFold 3, a unified deep learning model that predicts the joint structure of complexes containing proteins, nucleic acids, small-molecule ligands, ions, and modified residues. It replaces much of the prior architecture with a diffusion-based module that directly generates atomic coordinates. The model achieved substantially improved accuracy over specialized tools across many interaction types, including protein-ligand and protein-nucleic acid complexes.

alphafold3 biomolecular interactions diffusion model protein-ligand

Biology · Nature · Jul 2023 · Open access

De novo design of protein structure and function with RFdiffusion

Joseph L. Watson, David Juergens and David Baker

This paper introduced RFdiffusion, a generative diffusion model built on the RoseTTAFold network for de novo protein design. It enables a range of design tasks, including unconditional generation, symmetric oligomer design, functional motif scaffolding, and binder design. Many designs were experimentally validated, with solved structures closely matching the intended models.

protein design de novo design diffusion model rfdiffusion

Biology · Nature · May 2023 · Open access

A draft human pangenome reference

Wen-Wei Liao, Mobin Asri and Jana Ebler

The Human Pangenome Reference Consortium presents a first draft human pangenome built from 47 phased, diploid genome assemblies of genetically diverse individuals. The assemblies cover more than 99% of the expected sequence in each genome, are more than 99% accurate at both the base-pair and structural levels, and are combined into a graph-based reference. Relative to GRCh38, the pangenome adds about 119 million base pairs of euchromatic polymorphic sequence and 1,115 gene duplications, improving representation of variation at structurally complex loci.

pangenome human genome structural variation genome assembly

Biology · Science · Apr 2022 · Open access

The complete sequence of a human genome

Sergey Nurk, Sergey Koren and Adam M. Phillippy

The Telomere-to-Telomere (T2T) Consortium reports T2T-CHM13, the first essentially gapless assembly of a human genome (all chromosomes except Y), totaling about 3.055 billion base pairs. The assembly resolves previously unfinished heterochromatic and repetitive regions, including centromeric satellite arrays, segmental duplications, and the short arms of the acrocentric chromosomes. It adds nearly 200 million base pairs of new sequence and corrects errors in prior reference assemblies.

human genome t2t genome assembly telomere-to-telomere

Biology · Nature Methods · Oct 2021 · Open access

Effective gene expression prediction from sequence by integrating long-range interactions

Žiga Avsec, Vikram Agarwal and David R. Kelley

This paper introduces Enformer, a transformer-based deep learning model that predicts gene expression and chromatin states directly from DNA sequence by integrating regulatory information from up to ~100 kb away. By using self-attention to capture long-range interactions, it substantially improves prediction accuracy over prior convolutional models. The approach also improves prediction of the effects of non-coding genetic variants on expression.

deep learning gene expression genomics transformer

Biology · Nature · Jul 2021 · Open access

Highly accurate protein structure prediction for the human proteome

Kathryn Tunyasuvunakool, John Jumper and Demis Hassabis

This companion to the AlphaFold methods paper applied AlphaFold to predict structures for nearly the entire human proteome and for the proteomes of 20 other key organisms, producing a large public database of predicted models. It assessed coverage and confidence across the human proteome, showing that a substantial fraction of residues could be modeled with high or very high confidence. The work created the AlphaFold Protein Structure Database, greatly expanding structural coverage beyond experimentally determined structures.

human proteome alphafold protein structure prediction structural genomics

Biology · Science · Jul 2021 · Open access

Accurate prediction of protein structures and interactions using a three-track neural network

Minkyung Baek, Frank DiMaio and David Baker

This paper presented RoseTTAFold, a three-track neural network that simultaneously processes one-dimensional sequence, two-dimensional residue-pair distances, and three-dimensional atomic coordinate information, with information flowing between the tracks. The method achieved protein structure prediction accuracy approaching that of AlphaFold2 while being more computationally efficient. It also demonstrated rapid generation of accurate models for protein-protein complexes.

protein structure prediction rosettafold deep learning protein interactions

Biology · Science · Oct 2020 · Open access

De novo design of picomolar SARS-CoV-2 miniprotein inhibitors

Longxing Cao, Inna Goreshnik, Brian Coventry and David Baker

The authors used computational de novo protein design to create small, stable miniproteins that bind the SARS-CoV-2 spike receptor-binding domain and block its interaction with ACE2. Two design strategies were used: incorporating the ACE2 helix into a designed scaffold, and building entirely new binders against the RBD. The best designs bound with picomolar affinity and neutralized the virus, with cryo-EM confirming the binding modes matched the computational models.

protein design sars-cov-2 miniprotein inhibitors ace2

Biology · Nature · Apr 2020 · Open access

A SARS-CoV-2 protein interaction map reveals targets for drug repurposing

David E. Gordon, Gwendolyn M. Jang and Mehdi Bouhaddou

The authors expressed 26 of the 29 SARS-CoV-2 proteins in human cells and used affinity-purification mass spectrometry to map 332 high-confidence virus-human protein-protein interactions. They then identified 69 existing drugs and compounds that target the human proteins in this interactome and tested several for antiviral activity. Two pharmacological classes (mRNA translation inhibitors and Sigma1/Sigma2 receptor regulators) showed antiviral effects, providing candidate therapeutics for repurposing.

sars-cov-2 protein-interaction drug-repurposing covid-19

Biology · Science · Feb 2020 · Open access

Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation

Daniel Wrapp, Nianshuang Wang, Kizzmekia S. Corbett, Jory A. Goldsmith, Ching-Lin Hsieh, Olubukola Abiona, et al.

The authors determined a 3.5 angstrom cryo-EM structure of the SARS-CoV-2 (2019-nCoV) spike glycoprotein ectodomain stabilized in the prefusion conformation. They showed the receptor-binding domain engages human ACE2 with roughly 10- to 20-fold higher affinity than the SARS-CoV-1 spike, helping explain efficient human transmission. They also found that several antibodies specific to the SARS-CoV-1 receptor-binding domain did not appreciably bind the new spike, informing vaccine and therapeutic design.

sars-cov-2 spike-protein cryo-em prefusion

Biology · Nature · Oct 2019 · Open access

Search-and-replace genome editing without double-strand breaks or donor DNA

Andrew V. Anzalone, Peyton B. Randolph, Jessie R. Davis and David R. Liu

This paper introduced prime editing, a versatile genome-editing method that uses a Cas9 nickase fused to a reverse transcriptase guided by a prime editing guide RNA (pegRNA) to write new genetic information directly into a target site. Without requiring double-strand breaks or donor DNA, prime editing can install targeted insertions, deletions, and all 12 types of point mutations. The authors demonstrated correction of disease-relevant mutations in human cells with broad targeting flexibility and relatively low off-target activity.

prime editing crispr genome editing reverse transcriptase
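
As a quick check on the "all 12 types of point mutations" figure in the prime editing entry above: each of the four DNA bases can be changed to any of the other three, giving 4 × 3 = 12 substitutions. The sketch below enumerates them and splits them into transitions and transversions. The function name is illustrative and not taken from the paper.

```python
from itertools import permutations

PURINES = {"A", "G"}


def point_mutations():
    """Enumerate every ordered base substitution and classify it."""
    result = {"transition": [], "transversion": []}
    for ref, alt in permutations("ACGT", 2):
        # A transition stays within purines (A, G) or within pyrimidines (C, T);
        # a transversion swaps between the two classes.
        same_class = (ref in PURINES) == (alt in PURINES)
        kind = "transition" if same_class else "transversion"
        result[kind].append(f"{ref}>{alt}")
    return result


if __name__ == "__main__":
    muts = point_mutations()
    print(len(muts["transition"]), "transitions:", muts["transition"])
    print(len(muts["transversion"]), "transversions:", muts["transversion"])
```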

Biology · Cell · Jun 2019 · Open access

Comprehensive Integration of Single-Cell Data

Tim Stuart, Andrew Butler and Rahul Satija

This paper presents the Seurat v3 framework for integrating single-cell datasets across different technologies, conditions, and modalities. It introduces 'anchors' — pairs of cells in a shared low-dimensional space representing a common biological state — to harmonize datasets and transfer labels. The methods enable joint analysis of scRNA-seq with other measurements such as protein (CITE-seq), chromatin accessibility, and spatial data.

single-cell data integration seurat anchors
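
To make the idea of cross-dataset cell pairs in the Seurat v3 entry above more concrete, here is a minimal sketch of mutual nearest neighbor matching, one simple way to pair cells between two datasets that already share a low-dimensional space. It assumes plain Euclidean coordinates and omits the dimensionality reduction, scoring, and filtering steps that a full integration pipeline would need; the function name and toy coordinates are hypothetical.

```python
from math import dist


def mutual_nearest_neighbors(a, b, k=1):
    """Return (i, j) pairs where b[j] is among the k nearest points to a[i]
    and a[i] is also among the k nearest points to b[j]."""

    def knn(src, dst):
        # For each source point, the indices of its k closest points in dst.
        return [
            set(sorted(range(len(dst)), key=lambda j: dist(p, dst[j]))[:k])
            for p in src
        ]

    a_to_b = knn(a, b)
    b_to_a = knn(b, a)
    return [
        (i, j)
        for i in range(len(a))
        for j in sorted(a_to_b[i])
        if i in b_to_a[j]
    ]


if __name__ == "__main__":
    dataset_a = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]  # toy cell embeddings
    dataset_b = [(0.2, 0.1), (5.1, 4.8)]
    print(mutual_nearest_neighbors(dataset_a, dataset_b))
```

In this toy example the third cell of the first dataset has no counterpart in the second, so it is left unpaired rather than forced onto its nearest neighbor.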

Biology · Nature · Oct 2018 · Open access

The UK Biobank resource with deep phenotyping and genomic data

Clare Bycroft, Colin Freeman and Desislava Petkova

This paper describes the open-access UK Biobank resource of deep genetic and phenotypic data on roughly 500,000 participants. It details the genotyping of ~805,000 markers, imputation to over 90 million variants, and analyses of population structure, relatedness, and genotype quality. The resource has become a foundational dataset for genome-wide association studies and human complex-trait genetics.

biobank human genetics gwas population genomics

Biology · Nature · Oct 2018 · Open access

Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris

Nicholas Schaum, Stephen R. Quake and Tony Wyss-Coray

The Tabula Muris Consortium generated a compendium of single-cell transcriptome data spanning 20 organs and tissues from the mouse Mus musculus. Both droplet-based (10x) and FACS-sorted plate-based (Smart-seq2) methods were used to profile cells, enabling characterization of cell types across the organism. The resource provides a cross-tissue reference for defining and comparing mouse cell populations.

single-cell rna-seq mouse atlas tabula muris cell types

Biology · New England Journal of Medicine · Feb 2018 · Open access

Tisagenlecleucel in Children and Young Adults with B-Cell Lymphoblastic Leukemia

Shannon L. Maude, Theodore W. Laetsch and Jochen Buechner

This paper reports the pivotal global phase 2 ELIANA trial of tisagenlecleucel, an anti-CD19 chimeric antigen receptor (CAR) T-cell therapy, in children and young adults with relapsed or refractory B-cell acute lymphoblastic leukemia. It demonstrated high rates of durable remission and supported the first FDA approval of a CAR-T cell therapy. The study also characterized the safety profile, including cytokine release syndrome.

car-t immunotherapy leukemia clinical trial

Biology · Nature · Oct 2017 · Open access

Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage

Nicole M. Gaudelli, Alexis C. Komor, Holly A. Rees and David R. Liu

This work developed adenine base editors (ABEs) by evolving a transfer RNA adenosine deaminase to act on DNA, enabling direct conversion of A-T base pairs to G-C in genomic DNA without double-strand breaks. Because no natural DNA adenosine deaminase was available, the authors used directed evolution to create the enzyme, then fused it to Cas9 nickase. ABEs converted targeted A-T base pairs to G-C efficiently, with high product purity and low indel formation in human cells.

base editing adenine base editor crispr genome editing

Biology · Nature · Oct 2017 · Open access

RNA targeting with CRISPR-Cas13

Omar O. Abudayyeh, Jonathan S. Gootenberg, Patrick Essletzbichler and Feng Zhang

This study characterized the class 2 type VI CRISPR effector Cas13a (formerly C2c2) as a programmable RNA-targeting tool in mammalian and plant cells. Active Cas13a enabled efficient, specific knockdown of endogenous transcripts, while a catalytically inactive Cas13a (dCas13a) retained programmable RNA binding. The authors showed knockdown levels comparable to RNA interference with improved specificity, and used dCas13a to track transcripts in live cells.

cas13 rna targeting crispr rna knockdown

Biology · Science · Apr 2017 · Open access

Nucleic acid detection with CRISPR-Cas13a/C2c2

Jonathan S. Gootenberg, Omar O. Abudayyeh, Jeong Wook Lee and Feng Zhang

This paper introduced SHERLOCK, a CRISPR-based nucleic acid detection platform built on the collateral RNase activity of Cas13a (C2c2). Upon recognizing a target sequence, Cas13a indiscriminately cleaves nearby reporter RNAs; combined with isothermal amplification, this yields highly sensitive, specific detection. The authors demonstrated attomolar sensitivity and single-base discrimination, with applications in detecting viruses (Zika, dengue), bacteria, and human genotypes.

cas13 diagnostics crispr nucleic acid detection

Biology · Nature Reviews Molecular Cell Biology · Feb 2017 · Open access

Biomolecular condensates: organizers of cellular biochemistry

Salman F. Banani, Hyun O. Lee, Anthony A. Hyman and Michael K. Rosen

This review synthesizes how cells organize biochemistry into membraneless compartments termed biomolecular condensates, which form largely through liquid-liquid phase separation driven by multivalent macromolecular interactions. It describes the physical principles of condensate formation, their compositions, and how they concentrate or sequester molecules to regulate cellular processes. The authors discuss functional roles and the emerging links between aberrant condensate behavior and disease.

biomolecular-condensates phase-separation liquid-liquid cell-biology

Biology · Nature Communications · Jan 2017 · Open access

Massively parallel digital transcriptional profiling of single cells

Grace X. Y. Zheng, Jessica M. Terry, Phillip Belgrader and Jason H. Bielas

This paper introduced a droplet-based microfluidic platform (the 10x Genomics GemCode/Chromium system) for high-throughput single-cell RNA sequencing using barcoded gel beads. The method enables 3' digital expression profiling of thousands of cells per run at low cost. The authors profiled tens of thousands of cells, including ~68,000 PBMCs, demonstrating the ability to resolve immune cell subpopulations and detect rare cell types.

single-cell rna-seq droplet microfluidics 10x genomics transcriptomics

Biology · Nature · Aug 2016 · Open access

Analysis of protein-coding genetic variation in 60,706 humans

Monkol Lek, Konrad J. Karczewski and Daniel G. MacArthur

The Exome Aggregation Consortium (ExAC) aggregates and jointly analyzes high-quality exome sequencing data from 60,706 individuals of diverse ancestries, producing the largest catalogue of human protein-coding variation at the time. The dataset reveals roughly one variant per eight exonic bases and provides direct evidence of widespread mutational recurrence. It enables improved estimation of gene-level intolerance to loss-of-function variation and refines the interpretation of pathogenic variants in clinical genetics.

exac exome genetic variation loss-of-function

Biology · Science · Jul 2016

Visualization and analysis of gene expression in tissue sections by spatial transcriptomics

Patrik L. Ståhl, Fredrik Salmén, Joakim Lundeberg and Jonas Frisén

The authors introduce 'spatial transcriptomics,' a method that places thin histological tissue sections onto a glass surface arrayed with barcoded reverse-transcription primers, so that mRNA captured at each position retains its two-dimensional spatial coordinates. Sequencing the barcoded cDNA reconstructs genome-wide expression maps directly on the tissue image. They demonstrate the approach on mouse brain and human breast cancer sections, recovering spatially resolved transcriptomes that align with tissue morphology.

spatial transcriptomics gene expression rna-seq tissue sections

Biology · Nature · Apr 2016 · Open access

Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage

Alexis C. Komor, Yongjoo B. Kim, Michael S. Packer, John A. Zuris and David R. Liu

This paper introduced cytosine base editing (CBE), a strategy that fuses a catalytically impaired Cas9 to a cytidine deaminase to directly convert C-G base pairs to T-A in genomic DNA without inducing double-strand breaks or requiring a donor template. The authors showed that within a programmable target window the deaminase converts cytosine to uracil, which is then read as thymine, and that inhibiting base-excision repair markedly improves editing efficiency. The approach achieved precise single-base correction in human and other mammalian cells.

base editing crispr genome editing deaminase
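
The editing-window idea in the cytosine base editing entry above can be illustrated with a toy model. The sketch below converts every C inside a window of the protospacer to T. The default window of positions 4 to 8 (counted from the PAM-distal end) is an assumption made for illustration, as are the function and parameter names; real editing outcomes depend on sequence context, and not every cytosine in the window is necessarily converted.

```python
def base_edit(protospacer, window=(4, 8), target="C", product="T"):
    """Idealized base-editing outcome: replace `target` with `product`
    at 1-based positions inside `window` (inclusive), leave the rest alone."""
    start, end = window
    edited = [
        product if base == target and start <= pos <= end else base
        for pos, base in enumerate(protospacer.upper(), start=1)
    ]
    return "".join(edited)


if __name__ == "__main__":
    site = "AAACCCAAACCCAAAAAAAA"  # hypothetical 20-nt protospacer
    print(site)
    print(base_edit(site))  # only the Cs at positions 4-6 fall in the window
```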

Biology · Cell · Sep 2015 · Open access

Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System

Bernd Zetsche, Jonathan S. Gootenberg and Feng Zhang

This paper characterizes Cpf1 (later named Cas12a) as a single-RNA-guided DNA endonuclease of a class 2 CRISPR-Cas system, expanding the genome-editing toolbox beyond Cas9. Cpf1 requires only a single crRNA (no tracrRNA), recognizes a T-rich PAM, and produces staggered cuts with overhangs. The authors demonstrate Cpf1-mediated genome editing in human cells.

crispr cas12a cpf1 genome editing

Biology · Science · Apr 2015 · Open access

Spatially resolved, highly multiplexed RNA profiling in single cells

Kok Hao Chen, Alistair N. Boettiger, Jeffrey R. Moffitt, Siyuan Wang and Xiaowei Zhuang

This paper introduces MERFISH, a single-molecule imaging method that uses error-robust barcoding combined with sequential rounds of hybridization and imaging to identify many RNA species in single cells in situ. The error-robust binary barcodes allow readout errors to be detected and, with sufficiently redundant codes, corrected, which keeps misidentification rates low as the number of RNA species grows. The authors imaged up to ~1,000 genes in individual human cells, mapping spatial expression and revealing gene co-variation and subcellular RNA localization.

merfish spatial transcriptomics single-molecule fish rna imaging
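
The error-robust barcoding in the MERFISH entry above rests on a general coding idea: if every pair of barcodes differs in enough bits, a readout with a single flipped bit still has a unique closest barcode. The sketch below shows that idea with a hypothetical 8-bit codebook whose words all differ from each other in at least four positions. The gene names, code length, and codebook are illustrative and are not the codes used in the paper.

```python
# Hypothetical codebook: every pair of words differs in at least 4 bits.
CODEBOOK = {
    "geneA": "11110000",
    "geneB": "00001111",
    "geneC": "11001100",
    "geneD": "00110011",
    "geneE": "10101010",
}


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(word, codebook=CODEBOOK, max_errors=1):
    """Return the gene whose barcode is within `max_errors` bits of `word`,
    or None when no barcode (or more than one) is that close."""
    hits = [g for g, code in codebook.items() if hamming(word, code) <= max_errors]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    print(decode("11110000"))  # exact match
    print(decode("11110001"))  # one bit flipped: still decodable
    print(decode("11000000"))  # two bits flipped: detected, not corrected
```

With a minimum distance of four, single-bit errors are corrected and double-bit errors are flagged as undecodable rather than silently assigned to the wrong gene.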

Biology · Science · Aug 2012 · Open access

A Programmable Dual-RNA–Guided DNA Endonuclease in Adaptive Bacterial Immunity

Martin Jinek, Krzysztof Chylinski, Ines Fonfara, Michael Hauer, Jennifer A. Doudna and Emmanuelle Charpentier

This study demonstrated that the CRISPR-associated protein Cas9 from Streptococcus pyogenes is an RNA-guided DNA endonuclease whose target specificity is determined by a dual-RNA structure formed by a CRISPR RNA (crRNA) base-paired to a trans-activating crRNA (tracrRNA). The authors showed that Cas9 introduces site-specific double-strand breaks in target DNA, with its HNH domain cleaving the complementary strand and its RuvC-like domain cleaving the noncomplementary strand. Critically, they engineered the two guide RNAs into a single chimeric guide RNA that still directed sequence-specific cleavage, establishing the system as a programmable tool for genome editing.

crispr cas9 genome editing molecular biology

Biology · Cell · Aug 2006

Induction of Pluripotent Stem Cells from Mouse Embryonic and Adult Fibroblast Cultures by Defined Factors

Kazutoshi Takahashi and Shinya Yamanaka

Takahashi and Yamanaka demonstrated that pluripotent stem cells can be generated directly from mouse fibroblast cultures by introducing a defined set of transcription factors. Screening candidate genes associated with pluripotency, they identified four factors—Oct3/4, Sox2, c-Myc, and Klf4—that were sufficient to reprogram both embryonic and adult fibroblasts into cells they termed induced pluripotent stem (iPS) cells. These iPS cells resembled embryonic stem cells in morphology, growth, and marker expression and could form teratomas containing tissues of all three germ layers.

induced pluripotent stem cells ips cells reprogramming stem cells

Biology · Nature · Feb 2001

Initial sequencing and analysis of the human genome

Eric S. Lander, Lauren M. Linton, Bruce Birren, Chad Nusbaum and Michael C. Zody

This paper reported the results of the publicly funded Human Genome Project, presenting and making freely available a draft sequence covering the great majority of the human genome along with an initial analysis. The consortium described the broad genomic landscape—including gene content, repeat elements, GC content, and recombination rates—and estimated a surprisingly low number of protein-coding genes, on the order of roughly 30,000–40,000. The work provided a foundational reference for human biology, medicine, and evolutionary studies.

human genome genomics dna sequencing human genome project

Biology · Science · Dec 1985

Enzymatic Amplification of β-Globin Genomic Sequences and Restriction Site Analysis for Diagnosis of Sickle Cell Anemia

Randall K. Saiki, Stephen Scharf, Fred Faloona, Kary B. Mullis, Glenn T. Horn, Henry A. Erlich, et al.

This paper reported the first published application of in vitro primer-mediated enzymatic amplification of DNA, the technique that became known as the polymerase chain reaction (PCR), as part of a rapid, sensitive prenatal diagnostic test for sickle cell anemia. The authors amplified specific β-globin target sequences from genomic DNA roughly 220,000-fold and then distinguished the normal (βA) and sickle (βS) alleles by restriction endonuclease digestion of a hybridized end-labeled oligonucleotide probe. The combined procedure allowed genotyping in under a day using less than one microgram of genomic DNA.

pcr polymerase chain reaction dna amplification molecular diagnostics
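
To put the roughly 220,000-fold amplification in the PCR entry above in perspective, the arithmetic of exponential amplification can be sketched as follows. The model assumes each cycle multiplies the target by (1 + efficiency), with efficiency 1.0 meaning perfect doubling; the function names and the per-cycle efficiency parameter are illustrative simplifications, not values reported in this summary.

```python
import math


def fold_amplification(cycles, efficiency=1.0):
    """Fold increase after `cycles` rounds when each round multiplies
    the target by (1 + efficiency)."""
    return (1.0 + efficiency) ** cycles


def cycles_needed(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least `fold`."""
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    target = 220_000
    n = cycles_needed(target)
    print(f"perfect doubling: {n} cycles give {fold_amplification(n):,.0f}-fold")
    print(f"at 85% efficiency: {cycles_needed(target, 0.85)} cycles")
```

Under perfect doubling, 17 cycles fall short (131,072-fold) and 18 exceed the target (262,144-fold); any real reaction with lower per-cycle efficiency needs more cycles.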

Biology · Proceedings of the National Academy of Sciences · Dec 1977 · Open access

DNA sequencing with chain-terminating inhibitors

F. Sanger, S. Nicklen and A. R. Coulson

Sanger, Nicklen, and Coulson introduced a new method for determining nucleotide sequences in DNA, building on their earlier 'plus and minus' technique. The method uses 2′,3′-dideoxy and arabinonucleoside analogues of the normal deoxynucleoside triphosphates, which act as specific chain-terminating inhibitors of DNA polymerase, generating a set of partially extended chains that can be size-separated by gel electrophoresis to read the sequence. Applied to bacteriophage φX174 DNA, the approach proved faster and more accurate than the original plus or minus method.

dna sequencing sanger sequencing molecular biology methods
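
The read-out step in the chain-termination entry above can be sketched in a few lines. In an idealized experiment, each of four reactions yields fragments that end at one particular base, and sorting all fragments by length spells out the sequence of the newly synthesized strand. The function name and input layout are hypothetical, and the model ignores gel artifacts, ambiguous bands, and missing fragments.

```python
def read_sequence(lanes):
    """Reconstruct a sequence from chain-terminated fragment lengths.

    `lanes` maps a base to the lengths of fragments that terminated at
    that base, e.g. {"A": [1, 4], "C": [2]}. Lengths are counted in
    nucleotides added after the primer.
    """
    by_length = {}
    for base, lengths in lanes.items():
        for n in lengths:
            if n in by_length:
                raise ValueError(f"two lanes claim fragment length {n}")
            by_length[n] = base
    # Shortest fragments run furthest on the gel and are read first.
    return "".join(by_length[n] for n in sorted(by_length))


if __name__ == "__main__":
    gel = {"A": [1, 4], "C": [2], "G": [3], "T": [5]}
    print(read_sequence(gel))
```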

Biology · Nature · Apr 1953

Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid

J. D. Watson and F. H. C. Crick

In this one-page report, Watson and Crick proposed a double-helical structure for the salt of deoxyribose nucleic acid (DNA), consisting of two right-handed helical polynucleotide chains coiled around a common axis and running in antiparallel directions. They proposed that the chains are held together by hydrogen bonding between specific complementary base pairs—adenine with thymine and guanine with cytosine—a feature dictated by the structure. They famously noted that this specific pairing immediately suggested a possible copying mechanism for the genetic material.

dna double helix molecular biology structural biology
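
The copying mechanism hinted at in the Watson and Crick entry above follows directly from the pairing rule: given one strand, the other is fully determined. A minimal sketch, assuming an unambiguous DNA string over A, C, G, and T (the function name is illustrative):

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand):
    """Return the partner strand, written 5'->3'.

    The two chains run antiparallel, so the partner is the complement
    of the input read in reverse order.
    """
    return "".join(PAIR[base] for base in reversed(strand.upper()))


if __name__ == "__main__":
    template = "ATGC"
    partner = complementary_strand(template)
    print(template, "pairs with", partner)
    # Copying the copy regenerates the original strand.
    print(complementary_strand(partner) == template)
```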