Research
A full list of publications can be found here.
2024
Democratizing protein language models with parameter-efficient fine-tuning
Proceedings of the National Academy of Sciences
·
20 Jun 2024
·
doi:10.1073/pnas.2405840121
Scanorama: integrating large and diverse single-cell transcriptomic datasets
Nature Protocols
·
06 Jun 2024
·
doi:10.1038/s41596-024-00991-3
Causal gene regulatory analysis with RNA velocity reveals an interplay between slow and fast transcription factors
Cell Systems
·
01 May 2024
·
doi:10.1016/j.cels.2024.04.005
AlphaFold Meets Flow Matching for Generating Protein Ensembles
arXiv
·
01 Jan 2024
·
doi:10.48550/arXiv.2402.04845
Secure Discovery of Genetic Relatives Across Large-Scale and Distributed Genomic Datasets
Lecture Notes in Computer Science
·
01 Jan 2024
·
doi:10.1007/978-1-0716-3989-4_19
2023
TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions
Bioinformatics
·
28 Oct 2023
·
doi:10.1093/bioinformatics/btad663
SCA: recovering single-cell heterogeneity through information-based dimensionality reduction
Genome Biology
·
25 Aug 2023
·
doi:10.1186/s13059-023-02998-7
Assessing transcriptomic reidentification risks using discriminative sequence models
Genome Research
·
04 Aug 2023
·
doi:10.1101/gr.277699.123
Efficient mapping of accurate long reads in minimizer space with mapquik
Genome Research
·
30 Jun 2023
·
doi:10.1101/gr.277679.123
Contrastive learning in protein language space predicts interactions between drugs and protein targets
Proceedings of the National Academy of Sciences
·
08 Jun 2023
·
doi:10.1073/pnas.2220778120
sfkit: a web-based toolkit for secure and federated genomic analysis
Nucleic Acids Research
·
29 May 2023
·
doi:10.1093/nar/gkad464
Scalable and Privacy-Preserving Federated Principal Component Analysis
2023 IEEE Symposium on Security and Privacy (SP)
·
01 May 2023
·
doi:10.1109/SP46215.2023.10179350
Learning the Language of Antibody Hypervariability
Cold Spring Harbor Laboratory
·
28 Apr 2023
·
doi:10.1101/2023.04.26.538476
Unveiling causal regulatory mechanisms through cell-state parallax
Cold Spring Harbor Laboratory
·
03 Mar 2023
·
doi:10.1101/2023.03.02.530529
Sequre: a high-performance framework for secure multiparty computation enables biomedical data sharing
Genome Biology
·
11 Jan 2023
·
doi:10.1186/s13059-022-02841-5
EigenFold: Generative Protein Structure Prediction with Diffusion Models
arXiv
·
01 Jan 2023
·
doi:10.48550/arXiv.2304.02198
Equivariant Scalar Fields for Molecular Docking with Fast Fourier Transforms
arXiv
·
01 Jan 2023
·
doi:10.48550/arXiv.2312.04323
2022
Learning the Drug-Target Interaction Lexicon
Cold Spring Harbor Laboratory
·
10 Dec 2022
·
doi:10.1101/2022.12.06.519374
Navigating bottlenecks and trade-offs in genomic data analysis
Nature Reviews Genetics
·
07 Dec 2022
·
doi:10.1038/s41576-022-00551-z
Secure and Federated Genome-Wide Association Studies for Biobank-Scale Datasets
Cold Spring Harbor Laboratory
·
02 Dec 2022
·
doi:10.1101/2022.11.30.518537
Uncovering structural ensembles from single-particle cryo-EM data using cryoDRGN
Nature Protocols
·
14 Nov 2022
·
doi:10.1038/s41596-022-00763-x
Adapting protein language models for rapid DTI prediction
Cold Spring Harbor Laboratory
·
04 Nov 2022
·
doi:10.1101/2022.11.03.515084
Contrasting drugs from decoys
Cold Spring Harbor Laboratory
·
04 Nov 2022
·
doi:10.1101/2022.11.03.515086
CryoDRGN2: Ab Initio Neural Reconstruction of Dynamic Protein Complexes
Microscopy and Microanalysis
·
01 Aug 2022
·
doi:10.1017/S1431927622005062
Prioritizing transcription factor perturbations from single-cell transcriptomics
Cold Spring Harbor Laboratory
·
30 Jun 2022
·
doi:10.1101/2022.06.27.497786
Genome-wide mapping of somatic mutation rates uncovers drivers of cancer
Nature Biotechnology
·
20 Jun 2022
·
doi:10.1038/s41587-022-01353-8
Secure and federated linear mixed model association tests
Cold Spring Harbor Laboratory
·
24 May 2022
·
doi:10.1101/2022.05.20.492837
Cellular and transcriptional diversity over the course of human lactation
Proceedings of the National Academy of Sciences
·
04 Apr 2022
·
doi:10.1073/pnas.2121720119
Deep learning guided optimization of human antibody against SARS-CoV-2 variants with broad neutralization
Proceedings of the National Academy of Sciences
·
01 Mar 2022
·
doi:10.1073/pnas.2122954119
2021
Deciphering the species-level structure of topologically associating domains
Cold Spring Harbor Laboratory
·
29 Oct 2021
·
doi:10.1101/2021.10.28.466333
Scalable Multimer Structure Prediction using Diffusion Models
NeurIPS 2023 AI for Science Workshop
·
28 Oct 2021
Minimizer-space de Bruijn graphs: Whole-genome assembly of long reads in minutes on a personal computer
Cell Systems
·
01 Oct 2021
·
doi:10.1016/j.cels.2021.08.009
D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions
Cell Systems
·
01 Oct 2021
·
doi:10.1016/j.cels.2021.08.010
A Python-based programming language for high-performance computational genomics
Nature Biotechnology
·
19 Jul 2021
·
doi:10.1038/s41587-021-00985-6
Bayesian information sharing enhances detection of regulatory associations in rare cell types
Bioinformatics
·
01 Jul 2021
·
doi:10.1093/bioinformatics/btab269
Levenshtein Distance, Sequence Comparison and Biological Database Search
IEEE Transactions on Information Theory
·
01 Jun 2021
·
doi:10.1109/TIT.2020.2996543
Learning the protein language: Evolution, structure, and function
Cell Systems
·
01 Jun 2021
·
doi:10.1016/j.cels.2021.05.017
Schema: metric learning enables interpretable synthesis of heterogeneous single-cell modalities
Genome Biology
·
03 May 2021
·
doi:10.1186/s13059-021-02313-2
CryoDRGN: reconstruction of heterogeneous cryo-EM structures using neural networks
Nature Methods
·
01 Feb 2021
·
doi:10.1038/s41592-020-01049-4
Assessing single-cell transcriptomic variability through density-preserving data visualization
Nature Biotechnology
·
18 Jan 2021
·
doi:10.1038/s41587-020-00801-7
Learning the language of viral evolution and escape
Science
·
15 Jan 2021
·
doi:10.1126/science.abd7331
Exploring generative atomic models in cryo-EM reconstruction
arXiv
·
01 Jan 2021
·
doi:10.48550/arXiv.2107.01331
2020
Learning mutational semantics
Advances in Neural Information Processing Systems
·
12 Dec 2020
Leveraging Uncertainty in Machine Learning Accelerates Biological Discovery and Design
Cell Systems
·
01 Nov 2020
·
doi:10.1016/j.cels.2020.09.007
Topaz-Denoise: general deep denoising models for cryoEM and cryoET
Nature Communications
·
15 Oct 2020
·
doi:10.1038/s41467-020-18952-1
Improved haplotype inference by exploiting long-range linking and allelic imbalance in RNA-seq datasets
Nature Communications
·
16 Sep 2020
·
doi:10.1038/s41467-020-18320-z
Computational Methods for Single-Cell RNA Sequencing
Annual Review of Biomedical Data Science
·
20 Jul 2020
·
doi:10.1146/annurev-biodatasci-012220-100601
Hopper: a mathematically optimal algorithm for sketching biological data
Bioinformatics
·
01 Jul 2020
·
doi:10.1093/bioinformatics/btaa408
Privacy-Preserving Biomedical Database Queries with Optimal Privacy-Utility Trade-Offs
Cell Systems
·
01 May 2020
·
doi:10.1016/j.cels.2020.03.006
Carnelian uncovers hidden functional patterns across diverse study populations from whole metagenome sequencing reads
Genome Biology
·
24 Feb 2020
·
doi:10.1186/s13059-020-1933-7
A Randomized Parallel Algorithm for Efficiently Finding Near-Optimal Universal Hitting Sets
Lecture Notes in Computer Science
·
01 Jan 2020
·
doi:10.1007/978-3-030-45257-5_3
2019
Meta-analysis of Caenorhabditis elegans single-cell developmental data reveals multi-frequency oscillation in gene activation
Bioinformatics
·
20 Dec 2019
·
doi:10.1093/bioinformatics/btz864
Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs
Nature Methods
·
07 Oct 2019
·
doi:10.1038/s41592-019-0575-8
Emerging technologies towards enhancing privacy in genomic data sharing
Genome Biology
·
02 Jul 2019
·
doi:10.1186/s13059-019-1741-0