Welcome to G-DOC!

The Georgetown Database of Cancer (G-DOC) is a precision oncology platform providing access to molecular and clinical data from thousands of patients and cell lines, along with tools for data analysis and visualization. The platform enables cancer researchers to explore multiple data types across multiple cancer types to understand mechanisms of cancer progression and therapy resistance.

The new generation of the G-DOC platform provides access to a large volume of cancer genomics data through the G-DOC Hub. The G-DOC Hub, powered by UCSC Xena Hub, also supports integrative exploratory analysis of multiple molecular profiling data types to understand disease mechanisms, discover new associations with clinical data, and validate investigator-generated findings.

Access the new G-DOC Hub platform here: https://gdochub.georgetown.edu/

Additional computational resources for cancer genomics research are under construction and will include:

  • OpenCRAVAT – a variant analysis platform for WGS and WES data

  • CIBERSORT – a gene expression deconvolution package for RNA-seq analysis of the immunological tumor microenvironment

  • IGV – the Integrative Genomics Viewer, for secure genomic data visualization

Datasets with Open Access

REMBRANDT (REpository for Molecular BRAin Neoplasia DaTa) Dataset

Description: The REMBRANDT dataset was originally created at the National Cancer Institute and funded by the Glioma Molecular Diagnostic Initiative. The data was collected from 2004 to 2006. In 2015, the NCI transferred the dataset to Georgetown. It is accessible for clinical translational research through the open-access Georgetown Database of Cancer (G-DOC) platform. In addition, the raw and processed genomics and transcriptomics data are available from the public NCBI GEO repository as super series GSE108476. Together, these datasets give researchers a unique opportunity to conduct integrative analysis of gene expression and copy number changes in patients alongside clinical outcomes (overall survival) using this large brain cancer study.

Raw data: GSE108476
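For readers who script their data retrieval, the following minimal sketch builds a link to the GEO accession page for the super series named above. The function name `geo_series_url` is hypothetical and used only for illustration; the code only constructs the URL and does not download any data.

```python
import re
from urllib.parse import urlencode

# Base address of the NCBI GEO accession viewer.
GEO_ACCESSION_PAGE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def geo_series_url(accession: str) -> str:
    """Return the GEO accession page URL for a series ID such as GSE108476.

    Hypothetical helper for illustration; it validates the ID format only
    and does not check that the series exists.
    """
    if not re.fullmatch(r"GSE\d+", accession):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return f"{GEO_ACCESSION_PAGE}?{urlencode({'acc': accession})}"


# Link for the REMBRANDT super series.
print(geo_series_url("GSE108476"))
```

The raw and processed files themselves are listed on that accession page.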

Raw MRIs: The Cancer Imaging Archive (TCIA)

precisionFDA Brain Cancer Predictive Modeling Challenge: precisionFDA, the Georgetown Lombardi Comprehensive Cancer Center, and the Innovation Center for Biomedical Informatics at Georgetown University Medical Center launched and executed the Brain Cancer Predictive Modeling and Biomarker Discovery Challenge, which ran from November 2019 to February 2020. The challenge asked participants to develop machine learning and/or artificial intelligence models to identify biomarkers and predict patient outcomes using gene expression, DNA copy number, and clinical data. For more details about the challenge and its results, read here.


  • Gusev Y, Bhuvaneshwar K, Song L, Zenklusen JC, Fine H, Madhavan S. The REMBRANDT study, a large collection of genomic data from brain cancer patients. Nature Scientific Data, Aug 2018. PMID: 30106394

  • Madhavan S, Zenklusen JC, Kotliarov Y, Sahni H, Fine HA, Buetow. Rembrandt: helping personalized medicine become a reality through integrative translational research. Molecular Cancer Research, Feb 2009. PMID: 19208739

Stage II Colorectal Cancer – Multi-omics Molecular Profiling Dataset

Description: Colorectal cancer (CRC) patient biospecimens with extensive clinical and follow-up data were selected from the Indivumed GmbH biobank for 40 patients (20 relapse and 20 no-relapse). The cohort consisted of 12 patients with late stage I and 28 with stage II disease. Four of the 12 late stage I patients had experienced relapse (~33%), and, notably, 12 of the 28 stage II patients were relapse-free (~43%). Therefore, both the relapse-free group and the relapse group contain a mixture of late stage I and stage II patients. Only 9 of the 28 stage II patients had rectal cancer; of these, 6 had relapsed within 5 years. Of more than 180 clinical attributes, 64 were shortlisted based on relevance to clinical outcome and biomarker analysis. The molecular data included gene expression, DNA copy number, microRNA, and serum and urine metabolites.
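The cohort arithmetic in the description above can be checked with a short tabulation. This is only an illustrative sketch: two of the four cell counts are stated in the text, and the other two are derived here by subtraction from the stage totals. The helper names `total` and `fraction` are hypothetical.

```python
# Stage-by-outcome counts for the 40-patient cohort.
# "stated" cells come directly from the description; "derived" cells
# are obtained by subtracting from the stage totals (12 and 28).
cohort = {
    ("late stage I", "relapse"): 4,            # stated
    ("late stage I", "no relapse"): 12 - 4,    # derived
    ("stage II", "relapse"): 28 - 12,          # derived
    ("stage II", "no relapse"): 12,            # stated
}


def total(stage=None, outcome=None):
    """Sum the counts that match the given stage and/or outcome."""
    return sum(
        n
        for (s, o), n in cohort.items()
        if (stage is None or s == stage) and (outcome is None or o == outcome)
    )


def fraction(stage, outcome):
    """Share of patients in a stage who had the given outcome."""
    return cohort[(stage, outcome)] / total(stage=stage)


# Both outcome groups mix late stage I and stage II patients.
for outcome in ("relapse", "no relapse"):
    parts = {s: n for (s, o), n in cohort.items() if o == outcome}
    print(outcome, total(outcome=outcome), parts)

print(f"late stage I relapse rate: {fraction('late stage I', 'relapse'):.0%}")
print(f"stage II relapse-free rate: {fraction('stage II', 'no relapse'):.0%}")
```

The derived cells are consistent with the stated design of 20 relapse and 20 no-relapse patients.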

Data Formats: (Data access coming soon)


  • Subha Madhavan, Yuriy Gusev, Thanemozhi G. Natarajan, Lei Song, Krithika Bhuvaneshwar, Robinder Gauba, Abhishek Pandey, Bassem R. Haddad, David Goerlitz, Amrita K. Cheema, Hartmut Juhl, Bhaskar Kallakury, John L. Marshall, Stephen W. Byers, and Louis M. Weiner. Genome-wide multi-omics profiling of colorectal cancer identifies immune determinants strongly associated with relapse. Frontiers in Genetics, Nov 2013, 4:236. PMC3834519

Cite G-DOC

If you have used G-DOC in your research and would like to cite this informatics platform, please use the following peer-reviewed articles:

  • Madhavan S, Gusev Y, Harris M, Tanenbaum DM, Gauba R, Bhuvaneshwar K, Shinohara A, Rosso K, Carabet L, Song L, Riggins RB, Dakshanamurthy S, Wang Y, Byers SW, Clarke R, Weiner LM. G-DOC®: A Systems Medicine Platform for Personalized Oncology. Neoplasia 13:9. (Sep 2011). PMID: 21969811

  • Krithika Bhuvaneshwar, Anas Belouali, Varun Singh, Robert M Johnson, Lei Song, Adil Alaoui, Michael A Harris, Robert Clarke, Louis M Weiner, Yuriy Gusev, Subha Madhavan. G-DOC Plus – an integrative bioinformatics platform for precision medicine. BMC Bioinformatics (April 2016). PMID: 27130330