Lei Xiong

Postdoctoral Associate
Computer Science and Artificial Intelligence Laboratory (CSAIL)
Massachusetts Institute of Technology (MIT)
Broad Institute of MIT and Harvard

My research centers on developing and applying deep learning approaches to solve complex biological questions, with a primary focus on advancing our understanding of gene regulation and cellular diversity. To this end, I build models that capture and interpret complex features from biological datasets, providing new insights into fundamental biological processes. Through my work, I aim to drive advances in single-cell multiomics, a field that can contribute to a better understanding of human health and disease.

Feel free to reach out by email (jsxlei at gmail.com) or through the social media listed above!

Education

2015-2020

Tsinghua University
Ph.D. in Computational Biology
Advisor: Prof. Qiangfeng Cliff Zhang
Thesis: Artificial intelligence method for single-cell ATAC-seq data via feature extraction

2011-2015

University of Science and Technology of China (USTC)
B.S. in Biology, Shitsan Pai Talent Program in Life Sciences
Advisor: Prof. Nieng Yan
Thesis: Structure basis and transport mechanism of membrane protein GLUT3

News

Dec 16, 2023

Our workshop paper, scCLIP: Multi-modal Single-cell Contrastive Learning Integration Pre-training, was presented at the NeurIPS 2023 AI4Science workshop.

Dec 1, 2021

Joined Prof. Manolis Kellis's lab in CSAIL at MIT and the Broad Institute of MIT and Harvard.

Jun 30, 2021

Awarded the Outstanding Doctoral Dissertation Award of Tsinghua University.

Jan 13, 2021

Awarded Outstanding Graduate of Beijing.

Sep 14, 2020

Ph.D. thesis defense.

Feb 15, 2020

Our work, SCALE method for single-cell ATAC-seq analysis via latent feature extraction, was selected as one of the Top 10 Advances in Bioinformatics in China in 2019 and one of the Top 10 Algorithms and Tools for Bioinformatics in China in 2019 by Genomics, Proteomics & Bioinformatics.

Oct 8, 2019

Our work, SCALE method for single-cell ATAC-seq analysis via latent feature extraction, was published in Nature Communications.

Publications

2024

  1. Deep learning modeling of ribosome profiling reveals regulatory underpinnings of translatome and interprets disease variants.
    Jialin He, Lei Xiong, Shaohui Shi, Chengyu Li, Kexuan Chen, Qianchen Fang, Jiuhong Nan, Ke Ding, Jingyun Li, Yuanhui Mao, Carles A Boix, Xinyang Hu, Manolis Kellis, and Xushen Xiong.

    bioRxiv 2024.

    Gene expression involves transcription and translation. Despite large datasets and increasingly powerful methods devoted to calculating genetic variants’ effects on transcription, discrepancy between mRNA and protein levels hinders the systematic interpretation of the regulatory effects of disease-associated variants. Accurate models of the sequence determinants of translation are needed to close this gap and to interpret disease-associated variants that act on translation. Here, we present Translatomer, a multimodal transformer framework that predicts cell-type-specific translation from mRNA expression and gene sequence. We train Translatomer on 33 tissues and cell lines, and show that the inclusion of sequence substantially improves the prediction of ribosome profiling signal, indicating that Translatomer captures sequence-dependent translational regulatory information. Translatomer achieves accuracies of 0.72 to 0.80 for de novo prediction of cell-type-specific ribosome profiling. We develop an in silico mutagenesis tool to estimate mutational effects on translation and demonstrate that variants associated with translation regulation are evolutionarily constrained, both within the human population and across species. Notably, we identify cell-type-specific translational regulatory mechanisms independent of eQTLs for 3,041 non-coding and synonymous variants associated with complex diseases, including Alzheimer’s disease, schizophrenia, and congenital heart disease. Translatomer accurately models the genetic underpinnings of translation, bridging the gap between mRNA and protein levels, and providing valuable mechanistic insights toward mapping “missing regulation” in disease genetics.

2023

  1. scCLIP: Multi-modal Single-cell Contrastive Learning Integration Pre-training.
    Lei Xiong, Tianlong Chen, and Manolis Kellis

    NeurIPS AI for Science workshop 2023.

    Recent advances in multi-modal single-cell sequencing technologies enable the simultaneous profiling of chromatin accessibility and transcriptome in individual cells. Integration analysis of multi-modal single-cell data offers a more comprehensive understanding of the regulatory mechanisms linking chromatin status and gene expression, driving cellular processes and diseases. In order to acquire features that align peaks and genes within the same embedding space and facilitate seamless zero-shot transfer to new data, we introduced scCLIP (single-cell Contrastive Learning Integration Pretraining), a generalized multi-modal transformer model with contrastive learning. We show that this model outperforms other competing methods, and beyond this, scCLIP learns transferable features across modalities and generalizes to unseen datasets, which shows great potential to bridge the vast number of unpaired unimodal datasets, both existing data and new data generated in the future. Specifically, we propose the first large-scale transformer model designed for single-cell ATAC-seq data by patching peaks across the genomes and representing each patch as a token. This innovative approach enables us to effectively address the scalability challenges posed by scATAC-seq, even when dealing with datasets of up to one million dimensions. Code is available at: https://github.com/jsxlei/scCLIP.

2022

  1. Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space.
    Lei Xiong, Kang Tian, Yuzhe Li, Weixi Ning, Xin Gao, and Qiangfeng Cliff Zhang

    Nat. Commun. 2022.

    Computational tools for integrative analyses of diverse single-cell experiments are facing formidable new challenges including dramatic increases in data scale, sample heterogeneity, and the need to informatively cross-reference new data with foundational datasets. Here, we present SCALEX, a deep-learning method that integrates single-cell data by projecting cells into a batch-invariant, common cell-embedding space in a truly online manner (i.e., without retraining the model). SCALEX substantially outperforms online iNMF and other state-of-the-art non-online integration methods on benchmark single-cell datasets of diverse modalities, (e.g., single-cell RNA sequencing, scRNA-seq, single-cell assay for transposase-accessible chromatin use sequencing, scATAC-seq), especially for datasets with partial overlaps, accurately aligning similar cell populations while retaining true biological differences. We showcase SCALEX’s advantages by constructing continuously expandable single-cell atlases for human, mouse, and COVID-19 patients, each assembled from diverse data sources and growing with every new data. The online data integration capacity and superior performance makes SCALEX particularly appropriate for large-scale single-cell applications to build upon previous scientific insights.
  2. CD127 imprints functional heterogeneity to diversify monocyte responses in inflammatory diseases.
    Bin Zhang, Yuan Zhang, Lei Xiong, Yuzhe Li, Yunliang Zhang, Jiuliang Zhao, Hui Jiang, Can Li, Yunqi Liu, Xindong Liu, Haofei Liu, Yi-Fang Ping, Qiangfeng Cliff Zhang, Zheng Zhang, Xiu-Wu Bian, Yan Zhao, and Xiaoyu Hu

    J. Exp. Med. 2022.

    Inflammatory monocytes are key mediators of acute and chronic inflammation; yet, their functional diversity remains obscure. Single-cell transcriptome analyses of human inflammatory monocytes from COVID-19 and rheumatoid arthritis patients revealed a subset of cells positive for CD127, an IL-7 receptor subunit, and such positivity rendered otherwise inert monocytes responsive to IL-7. Active IL-7 signaling engaged epigenetically coupled, STAT5-coordinated transcriptional programs to restrain inflammatory gene expression, resulting in inverse correlation between CD127 expression and inflammatory phenotypes in a seemingly homogeneous monocyte population. In COVID-19 and rheumatoid arthritis, CD127 marked a subset of monocytes/macrophages that retained hypoinflammatory phenotypes within the highly inflammatory tissue environments. Furthermore, generation of an integrated expression atlas revealed unified features of human inflammatory monocytes across different diseases and different tissues, exemplified by those of the CD127high subset. Overall, we phenotypically and molecularly characterized CD127-imprinted functional heterogeneity of human inflammatory monocytes with direct relevance for inflammatory diseases.

2019

  1. SCALE method for single-cell ATAC-seq analysis via latent feature extraction.
    Lei Xiong, Kui Xu, Kang Tian, Yanqiu Shao, Lei Tang, Ge Gao, Michael Zhang, Tao Jiang, and Qiangfeng Cliff Zhang

    Nat. Commun. 2019.

    Single-cell ATAC-seq (scATAC-seq) profiles the chromatin accessibility landscape at single cell level, thus revealing cell-to-cell variability in gene regulation. However, the high dimensionality and sparsity of scATAC-seq data often complicate the analysis. Here, we introduce a method for analyzing scATAC-seq data, called Single-Cell ATAC-seq analysis via Latent feature Extraction (SCALE). SCALE combines a deep generative framework and a probabilistic Gaussian Mixture Model to learn latent features that accurately characterize scATAC-seq data. We validate SCALE on datasets generated on different platforms with different protocols, and having different overall data qualities. SCALE substantially outperforms the other tools in all aspects of scATAC-seq data analysis, including visualization, clustering, and denoising and imputation. Importantly, SCALE also generates interpretable features that directly link to cell populations, and can potentially reveal batch effects in scATAC-seq experiments.

2015

  1. Molecular basis of ligand recognition and transport by glucose transporters.
    Dong Deng, Pengcheng Sun, Chuangye Yan, Meng Ke, Xin Jiang, Lei Xiong, Wenlin Ren, Kunio Hirata, Masaki Yamamoto, Shilong Fan, and Nieng Yan

    Nature 2015.

    The major facilitator superfamily glucose transporters, exemplified by human GLUT1-4, have been central to the study of solute transport. Using lipidic cubic phase crystallization and microfocus X-ray diffraction, we determined the structure of human GLUT3 in complex with D-glucose at 1.5 Å resolution in an outward-occluded conformation. The high-resolution structure allows discrimination of both α- and β-anomers of D-glucose. Two additional structures of GLUT3 bound to the exofacial inhibitor maltose were obtained at 2.6 Å in the outward-open and 2.4 Å in the outward-occluded states. In all three structures, the ligands are predominantly coordinated by polar residues from the carboxy terminal domain. Conformational transition from outward-open to outward-occluded entails a prominent local rearrangement of the extracellular part of transmembrane segment TM7. Comparison of the outward-facing GLUT3 structures with the inward-open GLUT1 provides insights into the alternating access cycle for GLUTs, whereby the C-terminal domain provides the primary substrate-binding site and the amino-terminal domain undergoes rigid-body rotation with respect to the C-terminal domain. Our studies provide an important framework for the mechanistic and kinetic understanding of GLUTs and shed light on structure-guided ligand design.

Software

Honors and Awards

2021

Outstanding Doctoral Dissertation of Tsinghua University (Top 5%).

2021

Outstanding Graduate of Beijing (Top 5%).

2020

Top 10 Advances in Bioinformatics in China in 2019

2020

Top 10 Algorithms and Tools for Bioinformatics in China in 2019

2019

Outstanding Fellowship (¥ 20k), Advanced Innovation Center of Structural Biology, Tsinghua.

2016

Innovation Fellowship (¥ 30k), Advanced Innovation Center of Structural Biology, Tsinghua.

2013

iGEM Gold Medal, USTC-China.

2013

2012-2013 Student Scholarship, USTC.

2012

2011-2012 Student Scholarship, USTC.

2011

Freshman Scholarship, USTC.