Supervisors: Dr. Ferhat Ay and Dr. Pandurangan Vijayanand
La Jolla Institute for Immunology, La Jolla, CA 92037, USA
Funding:
NIH R-24 resource grant R24AI108564 to Dr. P. Vijayanand and Dr. F. Ay.
NIH/NIGMS R35 MIRA grant to Dr. F. Ay.
NIH R-03 grant 1R03OD034494-01 to Dr. Ferhat Ay.
D-Challenge 2021 research grant from SugarScience for research on Type 1 Diabetes (T1D).
LJI Institute funding to Dr. F. Ay.
Ongoing Project: Deep-learning-based prediction of gene expression using 3C and epigenomic data
Objective: Using a deep-learning framework, predict gene expression using the chromatin contacts from chromatin conformation capture (3C) datasets and various epigenomic 1D tracks (ChIP-seq, ATAC-seq, DNase-seq) for different cell types and conditions.
Project: Novel method to identify QTLs associated with 3D chromatin interactions (iQTLs)
Objective: Identify SNPs associated with 3D chromatin interactions, showing either genotype-specific or allele-specific changes in chromatin contacts.
Using HiChIP data from CD4 naive T cells of 30 individuals, we developed a novel computational framework to derive QTLs associated with HiChIP loops (iQTLs).
Benchmarked these iQTLs against eQTLs from CD4 naive T cells, and from CD4 T cells in general, with respect to their effect on gene regulation and their enrichment for fine-mapped GWAS SNPs of various immune diseases.
Also defined connectivity-QTLs: a set of iQTLs associated with multiple HiChIP loops within a broad genomic region, and affecting the expression of the corresponding genes.
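As a rough illustration of what a genotype-specific change in loop strength means, the sketch below regresses a per-donor loop contact count on genotype dosage (0, 1, or 2 copies of the alternate allele) with ordinary least squares. This is not the iQTL framework of the publication: the function name `dosage_slope` and the toy counts are hypothetical, and a real analysis would also need normalization, covariates, and a significance test.

```python
from statistics import mean

def dosage_slope(dosages, loop_counts):
    """Least-squares slope of loop contact count versus genotype dosage."""
    if len(dosages) != len(loop_counts):
        raise ValueError("one loop count is needed per donor")
    mx, my = mean(dosages), mean(loop_counts)
    sxx = sum((x - mx) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("all donors share one genotype; slope is undefined")
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, loop_counts))
    return sxy / sxx

# Hypothetical toy data: six donors, two per genotype class.
dosages = [0, 0, 1, 1, 2, 2]
loop_counts = [10, 12, 15, 17, 20, 22]
slope = dosage_slope(dosages, loop_counts)
```

A positive slope means the loop gains contacts with each additional copy of the alternate allele; a slope near zero suggests no genotype effect in this toy setting.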
Publication: Bhattacharyya S, and Ay F; Identifying genetic variants associated with chromatin looping and genome function, Nature Communications 2024
(GitHub) (Manuscript)
Project: Identifying differential chromatin contacts from HiChIP data
Objective: Detection of differential HiChIP (and other types of chromatin contact assays) loops between two conditions to contextualize condition-specific activities of genes in connection with such cis-regulatory elements.
Novelty: Implemented DiffHiChIP, the first comprehensive framework to call differential loops from HiChIP and similar 3C protocols.
Supports both DESeq2 and edgeR, using either the complete set or a filtered subset of contacts for background estimation.
Incorporates edgeR with a generalized linear model (GLM), using either the quasi-likelihood F-test or the likelihood ratio test.
Implements independent hypothesis weighting (IHW) and a custom distance stratification technique for modeling the distance decay of contacts.
Findings: edgeR GLM-based models with IHW correction capture robust differential interactions.
Publication: Bhattacharyya S, Figueroa D S, Georgopoulos K, and Ay F; DiffHiChIP: Identifying differential chromatin contacts from HiChIP data, bioRxiv 2025
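The source does not spell out the custom distance stratification, so the sketch below only shows one simple way to respect distance decay during multiple-testing correction: loops are grouped into genomic-distance bins and the Benjamini-Hochberg procedure is applied within each bin. This is neither IHW nor the DiffHiChIP procedure; the function names and the `bin_size` value are assumptions made for illustration.

```python
from collections import defaultdict

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def stratified_bh(loops, bin_size=100_000):
    """Apply BH separately within each genomic-distance bin.

    `loops` is a list of (distance_bp, pvalue) pairs; `bin_size` is a
    hypothetical stratum width, not a DiffHiChIP default.
    """
    strata = defaultdict(list)
    for idx, (distance, _) in enumerate(loops):
        strata[distance // bin_size].append(idx)
    qvalues = [None] * len(loops)
    for indices in strata.values():
        adj = bh_adjust([loops[i][1] for i in indices])
        for i, q in zip(indices, adj):
            qvalues[i] = q
    return qvalues

# Hypothetical loops: two short-range and two longer-range.
loops = [(20_000, 0.01), (50_000, 0.04), (150_000, 0.03), (180_000, 0.5)]
qvalues = stratified_bh(loops)
```

Correcting within strata keeps the many short-range loops from dominating the correction applied to the fewer long-range loops.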
Project: Loop Catalog: HiChIP database for different activation marks and reference genomes
Objective: Construct a database of processed HiChIP contacts with statistical significance, with respect to different cell/tissue types, different proteins/histones, different conditions (healthy / disease), supporting multiple reference genomes (human, mouse).
Features:
Visualization of HiChIP contacts across a wide range of cell/tissue types and conditions.
Identifying cell-type- and disease-specific chromatin contacts, SNP-to-gene links (SGLs), and regulatory regions.
Ongoing Project: Identifying Type 1 Diabetes specific causal variants using chromatin interactions, eQTL and GWAS
Objective: We aim to identify the potential T1D regulatory variants, using HiChIP and other chromatin interaction datasets from various immune cells.
Research grant: theSugarScience (D-challenge 2021)
Chromatin folding brings enhancers and gene promoters into 3D spatial proximity, even if they are distant along the linear (1D) genome (Dekker et al. Science 2002)
(Top) Hi-C protocol, generating high-throughput paired-end sequencing data representing enhancer-promoter contacts (Lieberman-Aiden et al. Science 2009). (Bottom) FitHiChIP loops between the MYC gene and four distal enhancers (about 1 Mb away), which are validated by CRISPRi (Fulco et al. Science 2016; Bhattacharyya et al. Nature Comm. 2019).
Project: Computational methods to identify significant gene regulatory chromatin interactions using chromatin conformation capture (3C) assays
Objective: High-throughput chromatin conformation capture (3C) techniques (Hi-C, ChIA-PET, Promoter capture Hi-C (PCHi-C), HiChIP / PLAC-seq) reveal the 3D genomic structure by identifying the relationships (or links) between gene promoters and their regulatory regions (enhancers).
These links are also referred to as chromatin contacts or loops.
However, only a very small fraction of the chromatin contacts identified by 3C techniques is functional or statistically significant. This is because contact probability across the genome is not uniform: it is affected by various protocol-specific biases and by the genomic distance between enhancers and promoters.
Work: Developed novel computational methods to infer statistically significant chromatin contacts from a wide variety of 3C datasets such as HiChIP / PLAC-seq, Hi-C, PCHi-C.
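To make the role of genomic distance concrete, here is a deliberately simplified sketch: it estimates an expected contact probability per distance bin from the observed counts, then scores each locus pair with an exact binomial upper-tail p-value. It is an assumption-laden toy, not the FitHiChIP or FitHiC2 implementation; fixed-width bins, the absence of bias correction, and all names (`contact_pvalues`, `bin_size`) are choices made here for illustration.

```python
from collections import defaultdict
from math import comb

def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly (toy sizes only)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

def contact_pvalues(contacts, bin_size=10_000):
    """contacts: list of (distance_bp, count). Returns one p-value per pair."""
    total = sum(count for _, count in contacts)
    per_bin_count = defaultdict(int)
    per_bin_pairs = defaultdict(int)
    for distance, count in contacts:
        b = distance // bin_size
        per_bin_count[b] += count
        per_bin_pairs[b] += 1
    pvalues = []
    for distance, count in contacts:
        b = distance // bin_size
        # Expected probability that a single read falls on one pair at
        # this distance: the bin's share of all reads, split evenly
        # across the pairs in that bin.
        prior = per_bin_count[b] / total / per_bin_pairs[b]
        pvalues.append(binomial_sf(count, total, prior))
    return pvalues

# Hypothetical counts: three short-range pairs and two long-range pairs.
contacts = [(12_000, 2), (15_000, 2), (18_000, 14), (95_000, 1), (98_000, 1)]
pvalues = contact_pvalues(contacts)
```

In this toy input, the pair with 14 reads stands out against the other pairs at a similar distance, while a single read at long range is judged against the long-range expectation rather than the genome-wide average.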
Publications:
1: Bhattacharyya S, Chandra V, Vijayanand P and Ay F. Identification of significant chromatin contacts from HiChIP data by FitHiChIP, Nature Communications 2019 (manuscript) (GitHub) (RECOMB 2019 poster) (ISMB 2018 Slides) (ISMB 2018 presentation video)
2: Kaul A*, Bhattacharyya S*, and Ay F, Identifying statistically significant chromatin contacts from Hi-C data with FitHiC2, Nature Protocols 2020 (manuscript) (GitHub)
Schematic of the project: Identifying promoter interacting eQTLs (pieQTLs) using HiChIP interactions, eQTL and gene expression data from 5 immune cell types of DICE database. (Chandra, Bhattacharyya et al. Nature Genetics 2021)
Schematic of the proposed approach to infer novel pieQTLs distant > 1 Mb from the TSS of the target gene, using HiChIP interactions. (Chandra, Bhattacharyya et al. Nature Genetics 2021)
Project: Computational methods for inferring potentially functional (gene-regulatory) SNPs using chromatin contacts, 1D chromatin signals, and gene expression
Objective: Expression quantitative trait loci (eQTLs) depict associations of genetic variants with gene expression.
However, eQTL studies report many SNPs that are often in high linkage disequilibrium (LD) with one another, so many of them may not be functional SNPs.
Work: We proposed a computational method to infer the potentially functional eQTLs using chromatin interactions from H3K27ac HiChIP data.
We declared the eQTLs overlapping active cis-regulatory elements that interact with their target gene promoters (promoter-interacting eQTLs, pieQTLs) as potentially functional.
We also discovered novel eQTLs distant > 1 Mb from the target genes, using HiChIP-based promoter interaction data.
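The pieQTL definition above can be read as an interval-overlap rule. The sketch below encodes only that rule: an eQTL SNP qualifies if it lies in one anchor of a loop whose other anchor overlaps the target gene promoter. Coordinates and function names are hypothetical, and the publication's criteria for active elements and loop calling are not modeled.

```python
def overlaps(position, interval):
    """True if a 1-based position lies inside a closed (start, end) interval."""
    start, end = interval
    return start <= position <= end

def intervals_overlap(a, b):
    """True if two closed (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]

def is_pieqtl(snp_pos, promoter, loops):
    """Check whether an eQTL SNP sits in a loop anchor that contacts the promoter.

    `loops` is a list of (anchor_a, anchor_b) interval pairs on one chromosome.
    """
    for anchor_a, anchor_b in loops:
        if intervals_overlap(anchor_a, promoter) and overlaps(snp_pos, anchor_b):
            return True
        if intervals_overlap(anchor_b, promoter) and overlaps(snp_pos, anchor_a):
            return True
    return False

# Hypothetical promoter and a single loop linking it to a distal anchor.
promoter = (1_000_000, 1_002_500)
loops = [((1_000_000, 1_005_000), (1_400_000, 1_405_000))]
distal_hit = is_pieqtl(1_402_000, promoter, loops)
no_loop = is_pieqtl(1_200_000, promoter, loops)
```

Because the rule depends on loops rather than linear distance, the same check applies to SNPs more than 1 Mb from the target gene.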
Publications:
Chandra V*, Bhattacharyya S*, Schmiedel B, Madrigal A, Gonzalez-Colin C, Fotsing S, Crinklaw A, Seumois G, Mohammadi P, Kronenberg M, Peters B, Ay F and Vijayanand P, Promoter-interacting expression quantitative trait loci are enriched for functional genetic variants, Nature Genetics 2020
The SNP rs4767032 is a cell-type-specific eQTL for the gene OAS1 (an eQTL in non-classical monocytes (NCM) but not in classical monocytes (CM)). The SNP also regulates OAS1 in NCM but not in CM: it overlaps cell-type-specific regulatory regions (ChIP and ATAC-seq peaks), interacts with OAS1 via a cell-type-specific HiChIP loop, and overlaps the binding sites of TFs TXRa, which is known to be associated with increased COVID risk.
Project: Identifying functional SNPs and genes related to COVID-19, using published eQTLs and GWAS summary statistics
Objective: Identify the SNPs associated with COVID-19, their target genes, and the cell-specificity of molecular pathways for COVID-19.
Work: We assessed the effects of COVID-19-risk variants on gene expression in a wide range of immune cell types.
Employed regulatory regions, HiChIP loops, fine-mapped GWAS SNPs, colocalization, and transcriptome-wide association studies (TWAS) to identify the putative causal genes and the specific immune cell types where gene expression is most influenced by COVID-19-risk variants.
Our study highlights the potential of COVID-19 genetic risk variants to impact the function of diverse immune cell types and influence severe disease manifestations.
Example of a SNP (rs4767032) that is a cell-type-specific eQTL (an eQTL in NCM but not in CM) for the gene OAS1.
Publications:
1: Schmiedel B*, Rocha J*, Gonzalez-Colin C*, Bhattacharyya S*, Madrigal A, Ottensmeier C, Ay F, Chandra V and Vijayanand P, COVID-19 genetic risk variants are associated with expression of multiple genes in diverse immune cell types, Nature Communications 2021, Vol 12, No 6760, DOI: https://doi.org/10.1038/s41467-021-26888-3
(manuscript) (GitHub)
Publications:
1: Schmiedel B*, Gonzalez-Colin C*, Fajardo V*, Rocha J, Madrigal A, Ramírez-Suástegui C, Bhattacharyya S, Simon H, Greenbaum J, Peters B, Seumois G, Ay F, Chandra V and Vijayanand P, Single-cell eQTL analysis of activated T cell subsets reveals activation and cell-type-dependent effects of disease-risk variants, Science Immunology 2022 (manuscript)
Project: Computational inference of expression quantitative trait loci (eQTL) from single-cell RNA-seq data of CD4 T cells
Objective: To develop a framework for identifying eQTLs from single-cell RNA-seq data from a wide variety of activated CD4+ T cells.
Data: Activated CD4+ T cells from 89 healthy donors, and corresponding in-house scRNA-seq data from >1 million cells.
Work: Developed a computational framework to infer single-cell eQTLs from these data.
Found that expression of over 4000 genes is significantly associated with common genetic polymorphisms. Most of these genes are cell-type-specific. These eQTLs are also enriched for their overlap with disease-risk GWAS variants.
These sc-eQTLs are integrated into the DICE database (https://dice-database.org).
Project: Single-cell transcriptomic analysis of tissue-resident memory T cells in human lung cancer
Using single-cell RNA-seq, bulk RNA-seq and ATAC-seq data of tissue-resident memory T (TRM) cells and non-TRM cells present in tumor and normal lung tissue from patients with lung cancer, we found that PD-1–expressing TRM cells in tumors are clonally expanded and enriched for transcripts linked to cell proliferation and cytotoxicity when compared with PD-1–expressing non-TRM cells.
Publications:
1: Clarke J, Panwar B, Madrigal A, Singh D, Gujar R, Wood O, Chee S, Eschweiler S, King E, Awad A, Hanley C, McCann K, Bhattacharyya S, Woo E, Alzetani A, Seumois G, Thomas G, Ganesan AP, Friedmann P, Sanchez-Elsner T, Ay F, Ottensmeier C, and Vijayanand P. Single-cell transcriptomic analysis of tissue-resident memory T cells in human lung cancer. Journal of Experimental Medicine, 2019 (manuscript)
Project: Understanding the role of PTPN2 locus in regulatory T cells (Tregs) for autoimmunity
(collaboration with Dr. Nunzio Bottini, UCSD)
Using RNA-seq and ATAC-seq data in FoxP3+ regulatory T cells (Tregs), we identified that reduced expression of Ptpn2 enhanced the severity of autoimmune arthritis in the T-cell-dependent SKG mouse model.
The PTPN2 locus encodes the tyrosine phosphatase PTPN2, and is linked to rheumatoid arthritis (RA) and other autoimmune diseases.
PTPN2 inhibits signaling through the T cell and cytokine receptors, and loss of PTPN2 promotes T cell expansion and CD4- and CD8-driven autoimmunity.
Publications:
1: Svensson MN, Doody KM, Schmiedel BJ, Bhattacharyya S, Panwar B, Wiede F, Yang S, Santelli E, Wu DJ, Sacchetti C, Gujar R, Seumois G, Kiosses WB, Aubry I, Kim G, Mydel P, Sakaguchi S, Kronenberg M, Tiganis T, Tremblay ML, Ay F, Vijayanand P, and Bottini N. Reduced expression of phosphatase PTPN2 promotes pathogenic conversion of Tregs in autoimmunity. Journal of Clinical Investigation. 2019 (manuscript)
Project: Understanding the role of NSD2 overexpression in chromatin conformation
(collaboration with Dr. Jane Skok, NYU Langone Health)
Objective: In 20% of patients with multiple myeloma, a 4;14 translocation induces overexpression of the histone methyltransferase NSD2, resulting in expansion of H3K36me2 and shrinkage of antagonistic H3K27me3 domains. We wanted to understand the alterations in chromatin modifications and gene regulation caused by NSD2 overexpression.
Work: Using isogenic cell lines producing high and low levels of NSD2, we found that oncogene activation is linked to alterations in H3K27ac and CTCF within H3K36me2 enriched chromatin.
Differentially expressed genes are significantly enriched within the same insulated domain as altered H3K27ac and CTCF peaks.
Identified a bidirectional relationship between 2D chromatin and 3D genome organization in gene regulation.
Publications:
1: Lhoumaud P, Badri S, Rodriguez Hernaez J, Sakellaropoulos T, Sethia G, Kloetgen A, Cornwell M, Bhattacharyya S, Ay F, Bonneau R, Tsirigos A and Skok J. NSD2 overexpression drives clustered chromatin and transcriptional changes in a subset of insulated domains. Nature Communications 2019 (manuscript)
Supervisor: Dr. Jayanta Mukhopadhyay, Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur
Funding:
Tata Consultancy Services (TCS) Ph.D. research fellowship, April 2013 - March 2017
Institute fellowship, July 2012 - March 2013
Project 1: Computational algorithms to construct phylogenetic species trees from topologically incongruent gene trees.
A species tree is a phylogenetic tree representing the evolutionary relationship among a group of species.
A gene tree depicts phylogeny constructed from the gene sequences at a particular gene locus of this set of species.
Recent advances in gene sequencing have led to the availability of different gene trees for a group of taxa.
However, these gene trees often exhibit conflicting topologies and branch lengths (representing evolutionary distance).
Such topological discordance among the gene trees occurs due to a combination of one or more of the following three biological processes: (1) horizontal gene transfer (HGT), (2) gene duplication and loss, and (3) Incomplete Lineage Sorting (ILS) or deep coalescence (DC).
We focussed on ILS, which occurs when two or more lineages in a population fail to coalesce due to the rapid speciation and short branches in a gene tree.
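Topological discordance between gene trees can be made concrete by comparing the clades each tree contains. The sketch below represents rooted trees as nested tuples of taxon names and reports the clades found in exactly one of two trees, in the spirit of a Robinson-Foulds comparison. It only illustrates the notion of conflict and is not the IDXL or couplet-supertree algorithm; the tree encoding and function names are assumptions.

```python
def clades(tree):
    """Return the set of leaf sets (clades) under every internal node.

    A tree is a nested tuple of taxon names, e.g. (("A", "B"), "C").
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found

def conflicting_clades(tree_a, tree_b):
    """Clades present in exactly one of two rooted trees on the same taxa."""
    return clades(tree_a) ^ clades(tree_b)

# Two hypothetical gene trees that disagree on the closest relative of A.
gene_tree_1 = ((("A", "B"), "C"), "D")
gene_tree_2 = ((("A", "C"), "B"), "D")
diff = conflicting_clades(gene_tree_1, gene_tree_2)
```

Here the two trees agree on the clades {A, B, C} and {A, B, C, D} but disagree on whether A groups first with B or with C, the kind of conflict that ILS can produce.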
Publications:
1: Bhattacharyya S, and Mukherjee J, IDXL: Species Tree Inference Using Internode Distance and Excess Gene Leaf Count, Journal of Molecular Evolution (Springer), 2017, volume 85, issue 1-2, pp. 57-78, DOI: 10.1007/s00239-017-9807-7
2: Bhattacharyya S, and Mukhopadhyay J, Accumulated Coalescence Rank and Excess Gene Count for Species Tree Inference, proceedings of 3rd International Conference on Algorithms for Computational Biology (AlCoB) 2016, Trujillo, Spain, Springer LNBI 9702, pp. 93-105.
(manuscript) (GitHub) (AlCoB 2016 slides)
3: Bhattacharyya S, and Mukhopadhyay J, Couplet Supertree based Species Tree Estimation from Incongruent Gene Trees with Deep Coalescence, proceedings of 11th International Symposium on Bioinformatics Research and Applications (ISBRA), June 2015, Virginia, USA, Springer LNBI 9096, pages 48-59.
Project 2: Computational algorithms to construct phylogenetic supertrees (consensus trees) from topologically incongruent gene trees.
Here we construct phylogenetic supertrees, the consensus topological relationship (obtained from gene trees) among a group of species.
Publications:
1: Bhattacharyya S, and Mukherjee J, COSPEDTree: COuplet Supertree by Equivalence Partitioning of taxa set and DAG formation, IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2015. (manuscript) (GitHub)
2: Bhattacharyya S, and Mukhopadhyay J, COSPEDTree-II: Improved Couplet Based Phylogenetic Supertree, IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2016, Shenzhen, China, pp. 98-101. (manuscript) (GitHub) (slides)
3: Bhattacharyya S, and Mukhopadhyay J, COuplet Supertree by Equivalence Partitioning of taxa set and DAG formation, Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB), September 2014 (manuscript) (GitHub) (slides)
Supervisors: Dr. Jayanta Mukhopadhyay and Dr. Arun K Majumdar, Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur
Funding:
e-NPCS project, funded by the Department of Information Technology (DIT) and Ministry of Communications and Information Technology (MCIT), Govt of India (currently named the Department of Electronics and Information Technology (DeitY))
Project Collaborators:
Sponsored Research and Industrial Consultancy (SRIC), IIT Kharagpur
Department of Neonatology, Institute of Post Graduate Medical Education and Research (IPGMER), Seth Sukhlal Karnani Memorial (SSKM) Hospital, Kolkata 700020
West Bengal Electronics Industry Development Corporation Limited (WEBEL), Kolkata 700091, India
Project 1: Computational modeling of newborn video-EEG, for epileptic seizure detection.
Computational analysis of newborn EEG signal and video, and automatic detection of newborn epileptic seizures.
Publications:
1: Bhattacharyya S, Biswas A, Mukherjee J, Majumdar A K, Majumdar B, Mukherjee S, and Singh A K, Detection of artifacts from high energy bursts in neonatal EEG, Computers in Biology and Medicine, 2013 (manuscript)
2: Bhattacharyya S, Biswas A, Mukherjee J, Majumdar A K, Majumdar B, Mukherjee S, and Singh A K, Feature Selection for Automatic Burst Detection in Neonatal Electroencephalogram, IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2011 (manuscript)
3: Biswas A, Roy R, Mukhopadhyay J, Khaneja D, Bhattacharya S D, and Bhattacharyya S, Android Application for Therapeutic Feed and Fluid Calculation in Neonatal Care - A Way to Fast, Accurate and Safe Health-care Delivery, International Workshop on Biomedical and Health Informatics (BHI), held in conjunction with IEEE international conference on bioinformatics and biomedicine (BIBM), 2016 (manuscript)
4: Sharma S, Bhattacharyya S, Biswas A, Mukhopadhyay J, Purkait P K, and Deb A K, Automated detection of newborn sleep apnea using video monitoring system, 8th International Conference on Advances in Pattern Recognition (ICAPR), 2015 (manuscript) (slides)
5: Bhattacharyya S, Biswas A, Pandit R, Mukhopadhyay J, Majumdar A K, Majumdar B, Mukherjee S, and Singh A K, Detection of Burst-Suppression in Neonatal EEG, presented in the International Conference on VLSI and Signal Processing (ICVSP), 2014 (manuscript)
6: Bhattacharyya S, Roy A, Dogra D P, Biswas A, Mukhopadhyay J, Majumdar A K, Majumdar B, Mukherjee S, and Singh A K, Summarization of Neonatal Video EEG for Seizure and Artifact Detection, IEEE NCVPRIPG, 2011, pp. 134-137 (manuscript)
7: Bhattacharyya S, Ghoshal G, Biswas A, Mukhopadhyay J, Majumdar A K, Majumdar B, Mukherjee S, and Singh A K, Automatic sleep spindle detection in raw EEG signal of newborn babies, ICECT 2011. (manuscript)
8: Roy S, Dogra D P, Bhattacharyya S, Saha B, Biswas A, Majumdar A K, Mukhopadhyay J, Majumdar B, Singh A K, Paria A, and Mukherjee S, A Web Enabled Health Information System with an Application to Neonatal Patient Care Services, IEEE International Workshop on Web Services in Healthcare and Application (WSHA) 2011. (manuscript)
9: Bhattacharyya S, Mukhopadhyay J, Majumdar A K, Majumdar B, Singh A K, and Saha C, Automated Burst Detection in Neonatal EEG, International Conference on Bio-inspired System and Signal Processing (BIOSIGNALS) 2011. (manuscript)