I received my Ph.D. in Computational Biology from Tsinghua University, advised by Prof. Qiangfeng Cliff Zhang. I graduated with a B.A. in Biology from the University of Science and Technology of China, where my B.A. thesis was supervised by Prof. Nieng Yan.
My PhD research focused on developing artificial intelligence tools for the analysis of single-cell RNA-seq and ATAC-seq data to uncover the gene regulation underlying cell heterogeneity. Previously, I did research on structural protein-protein interaction (PPI) networks.
I have a broad interest in applying artificial intelligence algorithms to solve biological questions.
Awarded the Outstanding Doctoral Dissertation Award of Tsinghua University.
Awarded Outstanding Graduate of Beijing.
Received my PhD degree.
Passed my PhD thesis defense.
Our work on SCALE, a method for single-cell ATAC-seq analysis via latent feature extraction, was selected as one of the Top Ten Advances in Bioinformatics in China in 2019 and one of the Top Ten Algorithms and Tools for Bioinformatics in China in 2019 by Genomics, Proteomics & Bioinformatics.
Our work on SCALE, a method for single-cell ATAC-seq analysis via latent feature extraction, was published in Nature Communications.
Construction of continuously expandable single-cell atlases through integration of heterogeneous datasets in a generalized cell-embedding space.
bioRxiv 2021. Single-cell RNA-seq and ATAC-seq analyses have been widely applied to decipher cell-type and regulation complexities. However, experimental conditions often confound biological variations when comparing data from different samples. For integrative single-cell data analysis, we have developed SCALEX, a deep generative framework that maps cells into a generalized, batch-invariant cell-embedding space. We demonstrate that SCALEX accurately and efficiently integrates heterogeneous single-cell data using multiple benchmarks. It outperforms competing methods, especially for datasets with partial overlaps, accurately aligning similar cell populations while retaining true biological differences. We demonstrate the advantages of SCALEX by constructing continuously expandable single-cell atlases for human, mouse, and COVID-19, which were assembled from multiple data sources and can keep growing through the inclusion of new incoming data. Analyses based on these atlases revealed the complex cellular landscapes of human and mouse tissues and identified multiple peripheral immune subtypes associated with COVID-19 disease severity.
Nat Commun 2019. Single-cell ATAC-seq (scATAC-seq) profiles the chromatin accessibility landscape at single cell level, thus revealing cell-to-cell variability in gene regulation. However, the high dimensionality and sparsity of scATAC-seq data often complicate the analysis. Here, we introduce a method for analyzing scATAC-seq data, called Single-Cell ATAC-seq analysis via Latent feature Extraction (SCALE). SCALE combines a deep generative framework and a probabilistic Gaussian Mixture Model to learn latent features that accurately characterize scATAC-seq data. We validate SCALE on datasets generated on different platforms with different protocols, and having different overall data qualities.
SCALE substantially outperforms the other tools in all aspects of scATAC-seq data analysis, including visualization, clustering, and denoising and imputation. Importantly, SCALE also generates interpretable features that directly link to cell populations, and can potentially reveal batch effects in scATAC-seq experiments.