SGD Help: Gene/Sequence Resources

Gene/Sequence Resources (GSR) serves as a central point for accessing much of the information available at SGD for 1) a named DNA sequence, 2) a specified chromosomal region, or 3) a raw DNA or protein sequence. This information includes biological information, table/map displays, and sequence analysis and retrieval options. Once you have specified a sequence name or region, GSR presents only those options that are available for your entry.

Whenever possible, selecting one of the available options for your entered gene name, chromosomal region, or sequence takes you directly to the results for your query. In other cases, it takes you to the resource with the sequence already pasted in (e.g. BLAST or Restriction Analysis).

Contents

  1. Pick a Sequence Query Option
    1. Search a list of genes
    2. Search a specified chromosomal region of S288C genome
    3. Analyze a raw DNA or Protein sequence
  2. Choose Resource to View
    1. Gene/ORF specific resources
    2. Genome Display
    3. Sequence Analysis

Pick a Sequence Query Option

Try YeastMine for flexible queries and fast retrieval of chromosomal features and sequences, plus additional criteria such as GO annotations, interaction data, and/or phenotype annotations. The video tutorial Template Basics describes how to quickly retrieve this type of information in YeastMine, and a comprehensive list of guides describing the many other features available is on SGD's YeastMine Video Tutorials page.

Option 1. Search a list of genes

    • With this input option, you enter a space-separated list of genes/ORFs by standard name (e.g. SIR2), systematic name (e.g. YHR023W), and/or SGDID (e.g. SGD:S000000001). By default, resources are generated for the S288C reference strain, but multiple alternative reference strains may be specified by holding the Control (PC) or Command (Mac) key and selecting from the strains listed to the right under "Pick one or more strains". Click the "Submit Form" button to view resources or the "Reset Form" button to clear all fields.
    • Optional: You may also retrieve information about flanking sequences of an entered gene/ORF. To do this, type the length of the flanking region you would like to retrieve in the boxes (Upstream and/or Downstream) below where you entered the gene name(s).
      • Note: negative numbers are not accepted in these boxes. If you would like to retrieve part of an ORF you should use "Option 2: Search a specified chromosomal region of S288C genome".
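
The input rules for Option 1 can be sketched in a few lines of Python. This is only an illustration, not SGD's own validation code: the function names and the regular expressions are assumptions inferred from the three example identifiers above (SIR2, YHR023W, SGD:S000000001).

```python
import re

# Patterns inferred from the examples on this page (assumptions, not SGD's rules).
SYSTEMATIC = re.compile(r"^Y[A-P][LR]\d{3}[WC](-[A-Z])?$")  # e.g. YHR023W
SGDID = re.compile(r"^SGD:S\d{9}$")                         # e.g. SGD:S000000001


def classify(token):
    """Label one identifier; anything unrecognized is treated as a standard name."""
    token = token.upper()
    if SGDID.match(token):
        return "sgdid"
    if SYSTEMATIC.match(token):
        return "systematic"
    return "standard"


def parse_query(gene_list, upstream=0, downstream=0):
    """Split a space-separated gene list and check the flanking lengths."""
    if upstream < 0 or downstream < 0:
        # The form does not accept negative flanking lengths.
        raise ValueError("flanking lengths must not be negative")
    genes = [(token, classify(token)) for token in gene_list.split()]
    return genes, upstream, downstream


genes, up, down = parse_query("SIR2 YHR023W SGD:S000000001", upstream=500)
```

Here the hypothetical `parse_query` call requests 500 bases of upstream flanking sequence for all three entries; passing a negative length raises an error, mirroring the note above.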

Option 2. Search a specified chromosomal region of S288C genome

    • This entry option allows you to specify a region by entering the desired chromosome and coordinates. Select a chromosome number using the pull-down menu, and then enter the start and end basepair coordinates in the boxes provided below. If you do not specify the basepair coordinates, resources are displayed for the entire chromosome.
    • If you would like to retrieve the reverse complement of the sequence, check the "Use the Reverse Complement" box OR enter the coordinates in descending order. Coordinates entered in ascending order return the Watson strand; coordinates entered in descending order return the Crick strand.
    • Click the "Submit Form" button to view resources or the "Reset Form" button to clear all fields in the section for Option 2.
    • Chromosomal coordinates for ORFs are found on the Locus Summary pages in the Sequence section. The boundaries for introns (if present) and exons are also listed. Nucleotide BLAST reports display the chromosomal coordinates of the 'subject' gene(s) that align to a DNA query sequence with similarity to S. cerevisiae. Coordinates can also be derived from the JBrowse Genome Browser.
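
The relationship between coordinate order and strand can be illustrated with a short Python sketch. It assumes 1-based, inclusive coordinates and treats either descending coordinates or the reverse-complement checkbox as a request for the Crick strand; both are assumptions made for illustration, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_region(chromosome_seq, start, end, use_reverse_complement=False):
    """Return the region between two 1-based, inclusive coordinates (assumed)."""
    low, high = min(start, end), max(start, end)
    if low < 1 or high > len(chromosome_seq):
        raise ValueError("coordinates fall outside the chromosome")
    region = chromosome_seq[low - 1:high]
    # Assumption: descending coordinates or the checkbox select the Crick strand.
    if start > end or use_reverse_complement:
        return reverse_complement(region)
    return region


toy_chromosome = "ATGCCGTA"
watson = extract_region(toy_chromosome, 2, 5)  # ascending coordinates
crick = extract_region(toy_chromosome, 5, 2)   # descending coordinates
```

With the toy sequence above, `watson` holds bases 2 through 5 as written, and `crick` holds the reverse complement of the same span.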

Option 3. Analyze a raw DNA or Protein sequence

    • This entry option allows you to specify a raw sequence from which to retrieve information. First, use the pull-down menu to select the type of sequence to analyze, DNA or protein. Next, type or paste in a sequence in the entry box.
      • Note: sequence entered must be provided in RAW format, without comments (numbers are okay).
    • To retrieve the reverse complement of the sequence, check the "Use the Reverse Complement" box.
    • Click the "Submit Form" button to view resources or the "Reset Form" button to clear all fields in the section for Option 3.
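
As a rough illustration of what RAW format means here, the following Python sketch strips numbers and whitespace from pasted text and rejects anything that still contains non-letter characters, such as a FASTA header line. The function name and the exact checks are assumptions; SGD's form may handle input differently.

```python
import re


def clean_raw_sequence(text):
    """Remove digits and whitespace; reject leftover comment or header characters."""
    cleaned = re.sub(r"[\d\s]+", "", text)
    if not cleaned.isalpha():
        raise ValueError("not a raw sequence: remove comments or header lines")
    return cleaned.upper()


pasted = "1 atgc cgta 10\n11 ggca"
sequence = clean_raw_sequence(pasted)
```

Position numbers copied along with a sequence are dropped, while text beginning with a `>` header line raises an error instead of being analyzed silently.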

Choose Resource to View

After you submit either 1) a gene/ORF list, 2) a chromosomal region, or 3) a raw DNA or protein sequence, a page is returned that lists all available information, displays, analyses, and sequence retrieval options available for your query. Descriptions for each option are below. Please note that some of them may not be available for your selected entry, depending on the type of query you submitted.

If you would like to change your original selection, navigate "Back" in your browser. This will bring back the Gene/Sequence Resources entry page with your original query still present. You can then modify and re-submit the form.

Option 1 (Genes/ORFs) specific resources

A list of genes in your query is displayed (with the current four selected) at the top of the page. After "Download All Sequences" are links to batch files containing all available coding DNA, genomic DNA, and protein sequences for the entire list of genes/strains in your query, in either FASTA or GCG format.

The output table presents resource categories in the first column and displays resource links for each gene/ORF in separate columns headed by the gene's standard/systematic name. Resources are displayed in the table for the current set of four genes selected; you can get to the next set of genes by clicking on either the page numbers or the word “Next” just above the table display.

  • Locus and Homolog Details: This section contains links to both the SGD Locus Summary Page and Alliance of Genome Resources Gene Page. The former presents S. cerevisiae functional annotations, the latter relational data to other model organisms.
  • Alignment/Variation: This section contains links to a variety of alignment tools for gene sequences in S. cerevisiae strains and close species. Variant Viewer highlights sites of variation in the alternative reference strains available for batch sequence download. Strain Alignment and Fungal Alignment display alignments to additional non-S288C strains and non-cerevisiae Saccharomyces species.
  • Sequence Downloads: For each selected ORF's column, Coding Sequence (DNA without introns or flanking regions) and Protein Translation (translated peptide from coding sequence) provide links to download files containing sequences for the ORF in each strain specified on the query page.

Options 1 and 2 (Genes/Regions): Genome Display

The JBrowse link displays the sequence features within a region of the reference (S288C) genome spanning the selected gene/ORF clone or chromosomal coordinates, as well as RNA genes, centromeres, Ty elements, etc. It also allows visualization of sequence-based experimental data, such as nucleosome positioning, mRNAs, and transcription factor binding sites, inside the selected region.

All Options: Sequence Analysis

These links provide access to resources for sequences queried with any of the three options: 1) a gene/ORF list, 2) a chromosomal region, or 3) a raw DNA or protein sequence. If multiple strains are selected for Option 1, a drop-down menu at the top of the Sequence Analysis section allows you to specify the strain of the ORF sequence you would like to analyze.

  • BLAST/Fungal BLAST Search: These link to SGD BLAST and SGD Fungal BLAST forms in which the query sequence is already pasted. One can then select the desired dataset and search options, and then submit the search form.
  • Design Primers: This links to the Design Primer feature, with the selected DNA sequence already pasted in. Design Primers locates primers for PCR or sequencing of an entered sequence based on specified parameters.
  • Genome Restriction Map/Restriction Fragments: This links to the Yeast Genome Restriction Analysis feature, with the sequence you entered already pasted in. You may then select all or a specified set of restriction enzymes with which to generate the Restriction Map. The Restriction Fragments link presents this information in tabular format.
  • 6-Frame Translation (with Restriction Map): This link displays a 6-Frame Translation of the selected sequence in GCG format. This translation is accompanied by the DNA sequence with a restriction map and is generated at SGD.

These additional resources are available when a raw DNA sequence is entered via Option 3.

  • Translated Protein Sequence/Genome Pattern Matching: This link displays the protein translation of the entered DNA sequence (using the first reading frame) if the raw sequence is longer than 20; otherwise, it directs you to the Genome Pattern Matching tool.
    • Note: when a protein sequence is entered via Option 3, the links for the Genome Restriction Map and Design Primers are unavailable.
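
The length-based choice described above, together with a first-frame translation, can be sketched in Python as follows. This is a minimal illustration rather than SGD's implementation: it assumes the standard genetic code, assumes the threshold of 20 counts sequence characters, and uses hypothetical function names.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame1(dna):
    """Translate from the first base; an incomplete trailing codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    # Codons containing non-ACGT characters are shown as X.
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def choose_resource(dna, threshold=20):
    """Pick translation for longer input, pattern matching otherwise (assumed units)."""
    if len(dna) > threshold:
        return "translation", translate_frame1(dna)
    return "pattern_matching", dna


short_result = choose_resource("ATGGCC")
long_result = choose_resource("ATG" * 7)
```

A six-character input is routed to pattern matching, while a 21-character input is translated in the first reading frame; stop codons appear as `*` in the output.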