Though distant, the journey leads to arrival (路虽远,行则将至)

Hi, my name is Wei He. I obtained my Ph.D. degree in bioinformatics from the University of Science and Technology of China, and I now work as a senior computational scientist in the Xu lab at MD Anderson Cancer Center. My research field is bioinformatics and computational biology, with a focus on: (1) computational method development for CRISPR-based experimental design and data analysis; (2) structure-based protein function analysis and prediction; (3) high-throughput screening and multi-omics data analysis for cancer therapeutic target discovery. Here is the CV for my detailed professional background and accomplishments.

Personal links: Github, ORCID, Twitter, LinkedIn, ResearchGate

Email: hwkobe.1027@gmail.com; WeChat: hwkeepon

Research Gallery 

CRISPR tiling screen analysis

He W*, Zhang L*, Villarreal OD, Fu R, Bedford E, Dou J, et al. De novo identification of essential protein domains from CRISPR-Cas9 tiling-sgRNA knockout screens. Nat Commun (IF=14.92), 2019;10:4541. [paper link, software]

Base editor tiling screen analysis

He et al. ProTiler-BE: A Comprehensive Computational Framework for Analyzing High-Throughput Base Editor Screens. Manuscript in preparation

2. Computational tools for optimized sgRNA design in CRISPR 

On-target and off-target prediction for CRISPR variants 

Zhang L*, He W*, Fu R, et al. Guide-specific loss of efficiency and off-target reduction with Cas9 variants. Nucleic Acids Res (IF=16.9), 2023; gkad702. [paper link, software]


sgRNA design for CRISPR knockout

He W, Wang H, Wei Y, Jiang Z, Tang Y, Chen Y, et al. GuidePro: A multi-source ensemble predictor for prioritizing sgRNAs in CRISPR/Cas9 protein knockouts. Bioinformatics (IF=6.93), 2021. [paper link, webserver]

CRISPR off-target modeling and prediction

Fu R*, He W*, Dou J, Villarreal OD, Bedford E, et al. Systematic decomposition of sequence determinants governing CRISPR/Cas9 specificity. Nat Commun (IF=14.92), 2022; 13:474. [paper link, software]

3. 3D structure-based protein functional analysis

Metal-binding site prediction

He W, Liang Z, Teng M, Niu L. mFASD: a structure-based algorithm for discriminating different types of metal-binding sites. Bioinformatics (IF=6.93), 2015;31:1938-44. [paper link, software]

Fragment-based protein-ligand interaction prediction

Yang L*, He W*#, Yun Y, Gao Y, Zhu Z, Teng M, Liang Z, Niu L. Defining a global map of functional group based 3D ligand-binding motifs. Genomics, Proteomics & Bioinformatics (IF=11.5), 2022 (in press). [software]


4. Cancer multi-omics data analysis for drug discovery

Essential gene signature

Gene expression signatures (GES) have been extensively used for cancer subtyping and clinical outcome prediction. However, the causal relevance of available expression signatures to the cancer phenotype remains largely unknown, which limits the development of personalized therapy for specific cancer subtypes. Here, we leverage RNA-seq data and RNAi screen data from multiple cell lines of a given cancer type to define gene signatures whose expression dysregulation is essential to regulating the malignant phenotype, which we call essential gene signatures (EGS). The identified EGSs are mostly cancer-type specific and highly enriched for transcription factors and signaling molecules that are critical for cancer progression. Notably, the expression patterns of EGSs derived from cell line data are highly reproducible in patient samples, suggesting their clinical utility. Applying this approach to small cell lung cancer (SCLC), we identified two distinct groups of EGSs corresponding to the classic neuroendocrine (NE) and non-NE subtypes. We further identified CDK13 and CDK4/6 as potential therapeutic targets for these two SCLC subtypes, which were validated experimentally.



He W et al. Essential gene signatures facilitate cancer patient stratification and subtype-specific anticancer drug discovery. Manuscript in preparation

5. Collaborations with experimental biologists 

CRISPR screen for synergistic drug effects

Gao G, Zhang L, Villarreal OD, He W, Su D, Bedford E, et al. PRMT1 loss sensitizes cells to PRMT5 inhibition. Nucleic Acids Res 2019; 47:5038-48. [paper link]

Tiling screen identifies novel domain function

Xu L, Xuan H, He W, Zhang L, Huang M, Li K, Wen H, Xu H, Shi X. TAZ2 truncation confers overactivation of p300 and cellular vulnerability to HDAC inhibition. Nat Commun. 2023;14(1):5362. [paper link]


Structural prediction of the aromatic cage in SART3

Yalong Wang, Jujun Zhou, Wei He, Rongjie Fu, Ngoc Khoi Dang, Bin Liu, Han Xu, Xiaodong Cheng, Mark T. Bedford. SART3 reads methylarginine marked glycine/arginine-rich motifs. Cell Reports (accepted), 2024.