1000 Genomes Project
Data collection and a catalog of human variation
A catalog of SNPs and short indels
dbVar and Database of Genomic Variants
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=dgvPlus (browser track)
A catalog of structural variants
Online Mendelian Inheritance in Man
OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. The full-text, referenced overviews in OMIM contain information on all known Mendelian disorders and over 12,000 genes. OMIM focuses on the relationship between phenotype and genotype, and its entries contain copious links to other genetics resources.
Encyclopedia of DNA Elements (ENCODE) Project
Data collection, integrative analysis, and a comprehensive catalog of
all sequence-based functional elements
Epigenomics (NIH Common Fund)
Data collection, integrative analysis and a resource of human epigenomic data
International Human Epigenome Consortium (IHEC)
Data collection and reference maps of human epigenomes for key
cellular states relevant to health and diseases
Data collection on the epigenome of blood cells
Viewable with Ensembl (http://www.ensembl.org/index.html) or the
Integrative Genomics Viewer (http://www.broadinstitute.org/igv/)
Gene expression database from Illumina, from RNA-seq data
Cancer Cell Line Encyclopedia (CCLE)
Array-based expression data, CNV, mutations, and perturbations over a large collection of cell lines
Large collection of CAGE-based expression data across multiple species (time series and perturbations)
Database of gene expression experiments
Gene Expression Atlas
Database supporting queries of condition-specific gene expression on
a curated subset of the ArrayExpress Archive.
GNF Gene Expression Atlas
Viewable at BioGPS (http://biogps.org/#goto=welcome)
GNF (Genomics Institute of the Novartis Research Foundation) human and
mouse gene expression array data.
The Human Protein Atlas
Protein expression profiles based on immunohistochemistry for a large
number of human tissues, cancers and cell lines, subcellular
localization, transcript expression levels
A comprehensive, freely accessible database of protein sequence and
An integrated database of protein classification, functional domains,
and annotation (including GO terms).
Protein Capture Reagents Initiative
Resource generation: renewable, monoclonal antibodies and other
reagents that target the full range of proteins
Knockout Mouse Program (KOMP)
Resource generation: create knockout strains for all mouse genes,
The Connectivity Map (CMAP)
The Connectivity Map (also known as cmap) is a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules, together with simple pattern-matching algorithms. Combined, these enable the discovery of functional connections between drugs, genes and diseases through the transitory feature of common gene-expression changes. More about cmap is described in the project's papers in Science and Nature Reviews Cancer.
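As a rough illustration of the pattern-matching idea described above, the sketch below scores how well a query signature (sets of up- and down-regulated genes) matches a ranked drug-response profile. It is a simplified toy score, not the Connectivity Map's actual algorithm; the function name `connectivity_score` and the gene labels are hypothetical.

```python
def connectivity_score(ranked_genes, up_genes, down_genes):
    """Toy connectivity score in [-1, 1] (an illustrative assumption,
    not the published cmap statistic).

    ranked_genes: genes ordered from most up-regulated to most
                  down-regulated after drug treatment.
    up_genes / down_genes: the query signature.
    """
    n = len(ranked_genes)
    if n < 2:
        return 0.0
    position = {gene: i for i, gene in enumerate(ranked_genes)}

    def mean_scaled_rank(genes):
        # Map rank 0 (top) to +1 and rank n-1 (bottom) to -1, then average.
        hits = [position[g] for g in genes if g in position]
        if not hits:
            return None
        return sum(1 - 2 * p / (n - 1) for p in hits) / len(hits)

    up = mean_scaled_rank(up_genes)
    down = mean_scaled_rank(down_genes)
    if up is None or down is None:
        return 0.0
    # Positive: the drug profile mimics the query; negative: it reverses it.
    return (up - down) / 2


# Hypothetical five-gene profile and query signature.
profile = ["A", "B", "C", "D", "E"]
print(connectivity_score(profile, {"A", "B"}, {"D", "E"}))
print(connectivity_score(profile[::-1], {"A", "B"}, {"D", "E"}))
```

A positive score suggests the treatment induces changes similar to the query signature, while a negative score suggests it reverses them.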
Library of Integrated Network-based Cellular Signatures (LINCS)
Data collection and analysis of molecular signatures that describe how
different types of cells respond to a variety of perturbing agents
Genomics of Drug Sensitivity in Cancer
Mutation, CNV, Affymetrix expression and drug sensitivity data in ~300 cancer cell lines
Papers: http://nar.oxfordjournals.org/content/41/D1/D955.long , http://www.nature.com/nature/journal/v483/n7391/full/nature11005.html
The Drug Gene Interaction database (DGIdb)
Molecular Libraries Program (MLP)
Access to the large-scale screening capacity necessary to identify
small molecules that can be optimized as chemical probes to study the
functions of genes, cells, and biochemical pathways in health and
Allen Brain Atlas
Data collection and an online public resource integrating extensive
gene expression and neuroanatomical data for human and mouse,
including variation of mouse gene expression by strain.
The Human Connectome Project
Data collection and integration to create a complete map of the
structural and functional neural connections, within and across
Geuvadis RNA sequencing project of 1000 Genomes samples
mRNA and small RNA sequencing on 465 lymphoblastoid cell line (LCL) samples from 5 populations of the 1000 Genomes Project: the CEPH (CEU), Finns (FIN), British (GBR), Toscani (TSI) and Yoruba (YRI).
The Achilles Project
Project Achilles is a systematic effort aimed at identifying and cataloging genetic vulnerabilities across hundreds of genomically characterized cancer cell lines. The project uses a genome-wide shRNA library to silence individual genes and identify those genes that affect cell survival. Large-scale functional screening of cancer cell lines provides a complementary approach to those studies that aim to characterize the molecular alterations (mutations, copy number alterations, etc.) of primary tumors, such as The Cancer Genome Atlas. The overall goal of the project is to link cancer genetic dependencies to their molecular characteristics in order to identify molecular targets and guide therapeutic development.
Human Ageing Genomic Resources
The Cancer Genome Atlas (TCGA)
Data collection and a data repository, including cancer genome sequence data
International Cancer Genome Consortium (ICGC)
Data collection and a data repository for a comprehensive description
of genomic, transcriptomic and epigenomic changes of cancer
Genotype-Tissue Expression (GTEx) Project
Data collection, data repository, and sample bank for human gene
expression and regulation in multiple tissues, compared to genetic
Knockout Mouse Phenotyping Program (KOMP2)
Data collection for standardized phenotyping of a genome-wide
collection of mouse knockouts
Database of Genotypes and Phenotypes (dbGaP)
Data repository for results from studies investigating the interaction
of genotype and phenotype
NHGRI Catalog of Published GWAS
Public catalog of published Genome-Wide Association Studies
Clinical Genomic Database
A manually curated database of conditions with known genetic causes, focusing on medically significant genetic data with available interventions.
NHGRI's Breast Cancer Information Core
Breast cancer mutation database
ClinVar is designed to provide a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. ClinVar collects reports of variants found in patient samples, assertions made regarding their clinical significance, information about the submitter, and other supporting data. The alleles described in submissions are mapped to reference sequences, and reported according to the HGVS standard. ClinVar then presents the data for interactive users as well as those wishing to use ClinVar in daily workflows and other local applications. ClinVar works in collaboration with interested organizations to meet the needs of the medical genetics community as efficiently and effectively as possible.
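To make the HGVS-style reporting mentioned above concrete, here is a minimal sketch that parses only the simplest case, a single-nucleotide substitution such as `NM_004006.2:c.4375C>T`. The regular expression and the function `parse_simple_hgvs` are illustrative assumptions, not part of ClinVar or any HGVS library, and they ignore the many other variant types the standard covers.

```python
import re

# Assumed pattern: accession.version, ":", coordinate type (c or g),
# ".", position, reference base, ">", alternate base.
HGVS_SUBSTITUTION = re.compile(
    r"^(?P<reference>[A-Z]{2}_\d+\.\d+):"
    r"(?P<coord_type>[cg])\."
    r"(?P<position>\d+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_simple_hgvs(description):
    """Return a dict of fields for a simple substitution, or None
    if the description is not in the narrow form handled here."""
    match = HGVS_SUBSTITUTION.match(description)
    if match is None:
        return None
    fields = match.groupdict()
    fields["position"] = int(fields["position"])
    return fields


print(parse_simple_hgvs("NM_004006.2:c.4375C>T"))
print(parse_simple_hgvs("NM_004006.2:c.4375_4379del"))  # not handled: None
```

For real work, a dedicated HGVS parsing and validation tool is a safer choice than a hand-written pattern like this one.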
Human Gene Mutation Database (HGMD)
The Human Gene Mutation Database (HGMD®) represents an attempt to collate known (published) gene lesions responsible for human inherited disease.
NHLBI Exome Sequencing Project (ESP) Exome Variant Server
The goal of the NHLBI GO Exome Sequencing Project (ESP) is to discover novel genes and mechanisms contributing to heart, lung and blood disorders by pioneering the application of next-generation sequencing of the protein-coding regions of the human genome across diverse, richly phenotyped populations, and to share these datasets and findings with the scientific community to extend and enrich the diagnosis, management and treatment of heart, lung and blood disorders.
Genetics Home Reference
Genetics Home Reference is the National Library of Medicine's web site for consumer information about genetic conditions and the genes or chromosomes related to those conditions.
GeneReviews are expert-authored, peer-reviewed disease descriptions presented in a standardized format and focused on clinically relevant and medically actionable information on the diagnosis, management, and genetic counseling of patients and families with specific inherited conditions.
Global Alzheimer's Association Interactive Network (GAAIN)
The Global Alzheimer’s Association Interactive Network (GAAIN) is a collaborative project that will provide researchers around the globe with access to a vast repository of Alzheimer’s disease research data and the sophisticated analytical tools and computational power needed to work with that data. Its goal is to transform the way scientists work together to answer key questions related to understanding the causes, diagnosis, treatment and prevention of Alzheimer’s and other neurodegenerative diseases.
In 2013, the network obtained WGS data for 800 Alzheimer's patients, the largest such cohort at the time.
The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium
The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium was formed to facilitate genome-wide association study meta-analyses and replication opportunities among multiple large and well-phenotyped longitudinal cohort studies. The consortium also has DNA methylation data alongside WGS and exome sequencing data.
The NIMH Center for Collaborative Genomic Studies on Mental Disorders
(Includes the Psychiatric Genomics Consortium, https://pgc.unc.edu/)
The NIMH Center, now known as the NIMH Repository and Genomics Resource (NIMH-RGR), plays a key role in facilitating psychiatric genetic research by providing a collection of over 150,000 well-characterized, high-quality patient and control samples from a wide range of mental disorders.
UCSC Genome Bioinformatics
Genome databases displayed through a genome browser for vertebrates,
other eukaryotes, and prokaryotes, including sequence conservation,
transcript maps and expression, functional annotation, genetic
variation, and human disease information
Genome databases displayed through a genome browser for vertebrates
and other eukaryotic species, including sequence conservation,
transcript maps and expression, functional annotation, genetic
variation, and human disease information
Pathway database: open-source, open access, manually curated and peer-reviewed
Molecular Signatures Database (MSigDB)
MSigDB is a collection of annotated gene sets for use with Gene Set
Enrichment Analysis (GSEA) software
KEGG: Kyoto Encyclopedia of Genes and Genomes
Database of pathways, diseases, and drugs
Pathway analysis resource
Proprietary genome annotation and pathway analysis software
GOLD: Genomes Online Database
Information regarding genome and metagenome sequencing projects, and their associated metadata, around the world
ImmPort: Immunology Database and Analysis Portal
The ImmPort system provides advanced information technology support in the production, analysis, archiving, and exchange of scientific data for the diverse community of life science researchers supported by NIAID/DAIT. It serves as a long-term, sustainable archive of data generated by investigators funded through the NIAID/DAIT. The core component of the ImmPort system is an extensive data warehouse containing an integration of experimental data supplied by NIAID/DAIT-funded investigators and genomic, proteomic, and other data relevant to the research of these programs extracted from a variety of public databases. The ImmPort system also provides data analysis tools and an immunology-focused ontology.
Mouse Genome Informatics
Includes genotypes with phenotype annotations, human diseases with one
or more mouse models, expression assays and images, pathways, and
Rat Genome Database (RGD)
Repository of rat genetic and genomic data, as well as mapping,
strain, and physiological information
A Database of Drosophila Genes & Genomes
The genetics, genomics and biology of C. elegans and related nematodes
The Zebrafish Model Organism Database (ZFIN)
Supports integrated zebrafish genetic, genomic and developmental information
Xenopus laevis and Xenopus tropicalis biology and genomics resource
Saccharomyces Genome Database (SGD)
Integrated biological information for budding yeast, along with search
and analysis tools