GTEx Data and Analysis FAQs

A1) How can I access the RNA sequence data?

For privacy reasons, NIH policy prevents us from releasing the raw sequence data via the GTEx Portal. These data are available through dbGaP.

A2) What does a variant ID of "chr8:73811903:D" or "chr8:73809509:I" mean?

All variant IDs are from the 1000 Genomes Project, obtained during imputation using 1000 Genomes as the reference panel. Some of the variants are small insertions and deletions. The IDs of these variants begin with the chromosome and the position of the first base, and end with 'D' for a deletion or 'I' for an insertion. For details on the chromosome position (genome build 37) and the REF and ALT alleles of all variants used in the GTEx eQTL analysis, you can download the file GTEx_var_genot_imputed_info4_maf05_CR95_CHR_POSb37_ID_REF_ALT.txt.zip from the "Datasets" link, under the Reference header.
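The ID layout described above can be sketched in Python. This is a minimal illustration that assumes the three colon-separated fields seen in the two example IDs; the function and type names are hypothetical and are not part of any GTEx tool.

```python
from typing import NamedTuple, Optional


class IndelVariantId(NamedTuple):
    chrom: str  # chromosome label, e.g. "chr8"
    pos: int    # position of the first base (genome build 37)
    kind: str   # "deletion" or "insertion"


# Suffix letters described in the answer above.
_KINDS = {"D": "deletion", "I": "insertion"}


def parse_indel_variant_id(variant_id: str) -> Optional[IndelVariantId]:
    """Split an ID such as "chr8:73811903:D" into its parts.

    Returns None for IDs that do not follow the chrom:pos:D/I layout
    (for example, IDs of variants that are not insertions or deletions).
    """
    parts = variant_id.split(":")
    if len(parts) != 3 or parts[2] not in _KINDS:
        return None
    chrom, pos, suffix = parts
    if not pos.isdigit():
        return None
    return IndelVariantId(chrom, int(pos), _KINDS[suffix])
```

For the REF and ALT alleles themselves, the reference file named above remains the source to consult; the ID only records the variant type.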

A3) What does the sample ID for an RNA-Seq or genotype sample stand for, such as GTEX-14753-1626-SM-5NQ9L?

The sample ID for an RNA-Seq or genotype sample consists of the following three components joined by dashes, illustrated here with "GTEX-14753-1626-SM-5NQ9L":

  1. "GTEX-YYYYY" (e.g., GTEX-14753) represents the GTEx donor ID. This ID should be used to link the various RNA-Seq and genotype samples that come from the same donor.
  2. "YYYY" (e.g., "1626") mostly refers to the tissue site, BUT we do not recommend using it for tissue site designation. Sample mix-ups sometimes occur; when they are corrected, this part of the ID does not change. The accurate tissue site designation for all samples can be obtained from the "Tissue Site Detail" field (encoded as "SMTSD") in the Sample Attributes file [Datasets->Download->GTEx_Data_V6_Annotations_SampleAttributesDS.txt].
  3. "SM-YYYYY" (e.g., SM-5NQ9L) is the RNA or DNA aliquot ID used for sequencing.

'Y' stands for any number or capital letter.
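The three components described above can be pulled apart with a regular expression. The following is a sketch, assuming every component uses only digits and capital letters as stated; the names `GtexSampleId` and `parse_sample_id` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class GtexSampleId(NamedTuple):
    donor_id: str    # e.g. "GTEX-14753"; links samples from the same donor
    site_code: str   # e.g. "1626"; NOT a reliable tissue designation
    aliquot_id: str  # e.g. "SM-5NQ9L"; RNA or DNA aliquot used for sequencing


# 'Y' in the description above is any digit or capital letter.
_SAMPLE_ID = re.compile(r"^(GTEX-[0-9A-Z]+)-([0-9A-Z]+)-(SM-[0-9A-Z]+)$")


def parse_sample_id(sample_id: str) -> Optional[GtexSampleId]:
    """Return the three components of a sample ID, or None if it does not match."""
    match = _SAMPLE_ID.match(sample_id)
    if match is None:
        return None
    return GtexSampleId(*match.groups())
```

Only `donor_id` is meant for linking samples; the tissue site should still be read from the SMTSD field of the Sample Attributes file rather than from `site_code`.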

A4) How can I map GTEx variant IDs to dbSNP rs IDs?

A lookup table is available for versions 6 and 6P in the file GTEx_Analysis_2015-01-12_OMNI_2.5M_5M_450Indiv_chr1-22-X_genot_imput_info04_maf01_HWEp1E6_variant_id_lookup.txt.gz on the GTEx Portal Datasets Page.
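Once downloaded, the lookup table can be loaded into a dictionary for quick ID conversion. The sketch below assumes the file is tab-delimited with a header row; the column names `variant_id` and `rs_id` are placeholders, not confirmed here, so check the actual header of the file and pass the real names.

```python
import csv
import gzip
from typing import Dict, Iterable


def read_rsid_lookup(
    lines: Iterable[str],
    variant_col: str = "variant_id",  # placeholder column name (assumption)
    rsid_col: str = "rs_id",          # placeholder column name (assumption)
) -> Dict[str, str]:
    """Build a {GTEx variant ID: dbSNP rs ID} map from tab-delimited lines."""
    reader = csv.DictReader(lines, delimiter="\t")
    return {row[variant_col]: row[rsid_col] for row in reader}


def load_rsid_lookup(path: str, **column_names: str) -> Dict[str, str]:
    """Open a gzip-compressed lookup file and read it with read_rsid_lookup."""
    with gzip.open(path, "rt", newline="") as handle:
        return read_rsid_lookup(handle, **column_names)
```

A call such as `load_rsid_lookup(path, variant_col=..., rsid_col=...)` returns a plain dictionary, so a missing variant can be handled with `lookup.get(variant_id)`.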

A5) Why are some ischemic times less than zero?

In the sample annotation file, the samples have a SMTSISCH value which indicates minutes of ischemia time. Some of these values are less than zero. Is this time calculated from the time the patient is pronounced dead or when the heart is no longer pumping or when the ventilator is stopped, or all of the above?

Sample-specific ischemic time is defined as the time from death or withdrawal of life support until the time the sample is placed in a fixative solution or frozen. So it is any of those scenarios, depending on the particular patient, although it is NOT measured from when the person is pronounced dead, but rather from the actual time of death (or as close to it as possible for rapid autopsy patients). The negative times should appear only for blood samples and represent samples that were collected pre-mortem. Those were all from organ donor ventilator cases where life support was about to be shut off and the patient would have been perfused prior to organ harvest, so blood was taken before that happened.
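To see which samples have negative ischemic times, the sample annotation rows can be filtered on SMTSISCH. This is a minimal sketch that assumes each row is a dictionary keyed by column name (as produced by `csv.DictReader`) and that missing values may be blank or non-numeric; the function name is hypothetical.

```python
from typing import Dict, Iterable, List


def premortem_rows(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the rows whose SMTSISCH (ischemic time in minutes) is below zero."""
    selected = []
    for row in rows:
        raw = (row.get("SMTSISCH") or "").strip()
        try:
            minutes = float(raw)
        except ValueError:
            # Blank or non-numeric entries carry no ischemic time; skip them.
            continue
        if minutes < 0:
            selected.append(row)
    return selected
```

Given the explanation above, the SMTSD value of every returned row would be expected to be a blood tissue, which makes this a quick consistency check on the annotations.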

A6) Was the RNA-seq protocol for GTEx strand specific?

No. RNA-seq was performed using the Illumina TruSeq library construction protocol. This is a non-strand specific polyA+ selected library. For more details, please visit our documentation page: https://gtexportal.org/home/documentationPage

A7) Where can I find the sample annotations for GTEX-111CU-1826-SM-5GZYN? I searched in the biobank but I could not find that sample identifier.

The GTEx biobank inventory contains information about samples that are currently in our freezers. The sample aliquots that were used for genotyping and RNA-seq were used up during processing, so they will not appear in the biobank inventory. The biobank inventory should contain related parent samples, and searching for the sample identifier GTEX-111CU-1826-SM-5GZYN should return those related samples (provided they have not been depleted).

To find the sample and subject annotations for samples used in an analysis release, please use the sample and subject annotation files. You can download these files here: https://gtexportal.org/home/datasets

A8) Why are there samples in dbGaP for donors that do not have genotypes in the imputed array VCF file?

The RNA-Seq and genotyping processes are run separately. All donors will eventually be genotyped, but their genotypes may not have been produced and QC'd in time for the current release. Also, some donors have been excluded from the imputed array VCF due to reasons such as genetic relatedness or being a biologic

GTEx Portal-specific FAQs

P1) Why are there different numbers of tissues available on the Search eQTLs page?

The "Search Precomputed Significant eQTLs" section allows you to look up cis eQTLs which have been precomputed in a +/- 1Mb cis window around the transcription start site (TSS). Significance was determined using a Q-value threshold. At least 70 samples per tissue are necessary to achieve the statistical power needed for this type of analysis.

In contrast, the "Test Your Own SNP-Gene Associations" section allows you to compute, on demand, an association between a SNP and gene of your choice. The association may be cis or trans. This calculation may be performed in tissues for which we have more than 10 samples. No Q-value filtering is performed and the user is left to interpret the significance of the p-value.
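The two sample-size rules above explain why the tissue lists differ. A short sketch of that logic follows, taking "at least 70" as inclusive and "more than 10" as exclusive, exactly as worded; the function name is hypothetical and this is not the Portal's own code.

```python
from typing import List

MIN_SAMPLES_PRECOMPUTED = 70  # "at least 70 samples per tissue"
MIN_SAMPLES_ON_DEMAND = 10    # "more than 10 samples" (exclusive bound)


def available_eqtl_sections(n_samples: int) -> List[str]:
    """List the Search eQTLs sections in which a tissue would appear."""
    sections = []
    if n_samples >= MIN_SAMPLES_PRECOMPUTED:
        sections.append("Search Precomputed Significant eQTLs")
    if n_samples > MIN_SAMPLES_ON_DEMAND:
        sections.append("Test Your Own SNP-Gene Associations")
    return sections
```

A tissue with, say, 40 samples would therefore be listed only in the on-demand section, while a tissue with 70 or more appears in both.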

P2) What does Ref Allele mean on the eQTL plot?

REF stands for the reference allele, as determined by the hg19/GRCh37 human genome reference. ALT stands for alleles that are alternate in comparison to the reference. The variant IDs are from the 1000 Genomes Project. A file with the chromosome positions (genome build 37) and the REF and ALT alleles of all variants used in the GTEx eQTL analysis can be downloaded from the "Datasets" page, under the Reference header: GTEx_var_genot_imputed_info4_maf05_CR95_CHR_POSb37_ID_REF_ALT.txt.zip. Information on the minor allele, including allele frequencies and nucleotides, can be found on dbSNP. For example, the top SNP in Lung is rs2687967. Its dbSNP entry shows that the reference allele is G and the alternate allele is A.

P3) What browsers does the GTEx portal support?

The GTEx Portal is tested on the latest versions of Chrome and Firefox, Safari version 7+, and Internet Explorer 11+. Please note that the GTEx Portal will not work properly with earlier versions of Internet Explorer.

P4) Why is the GTEx Portal not working for me after the latest update?

While we have taken steps to reduce the chance of this happening, browsers sometimes cache important portal files and do not recognize that they have changed. If you are having problems with the GTEx Portal, please try clearing your cache first. If that doesn't solve the problem, please contact us.

P5) Can I use one of the figures in the GTEx Portal in my paper?

Yes, you are free to use figures in the GTEx Portal in your publications. Most of the figures on the GTEx Portal now have a Download button above and to the left of the figure. This will download the figure in .svg format, which is a vector-based format.

Please acknowledge the GTEx Project and/or Portal. An example acknowledgement statement follows:

The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this manuscript were obtained from: [insert, where appropriate] the GTEx Portal on MM/DD/YY and/or dbGaP accession number phs000424.vN.pN on MM/DD/YYYY.

Google Sign-In FAQs

G1) Why doesn't my username/password work anymore?

We have upgraded to use Google Sign-In. You will need to sign in with your Google ID. If you don't have a Google ID, then you will need to register with Google.

G2) Will the GTEx Portal see my Google password?

No, the GTEx Portal will never see your Google password. When you sign in using your Google ID and password, you will be using a dialog that is controlled by Google, not by the GTEx Portal. Your ID and password are communicated directly to Google, not through the GTEx Portal.

G3) Why did you change to Google Sign-In?

We migrated to Google Sign-In for several reasons. First, this allows us to provide more functionality to users with less effort on our part. Using Google Sign-In, we no longer have to maintain our login, logout, password-maintenance, and forgot-password functionality. Instead, Google handles all of those features. In addition, Google already provides for more secure two-factor authentication, if you choose to enable it. We want to focus our efforts on scientific features, rather than on user account management. Second, using Google Sign-In will allow us to integrate with analytical pipeline engines like FireCloud in the future.

G4) Where can I find out more about Google Sign-In?

You can read more about Google Sign-In here: https://support.google.com/accounts/answer/112802?hl=en