DNA Sequence Alignment/Mutant Verification:
1. Download the sequence and absorbance files for each construct that was sequenced. Place each corresponding sequence-absorbance pair in a folder named after the construct. The folders should all then be located in their corresponding sequence date folder in Norimatsu Lab\DNA and RNA\Sequencing Results.
2. Open the sequence file with ApE plasmid editor (blue and green loops icon on Start bar). Delete the unreliable region from both ends. Starting at the 5’ end and moving forward, look for the last N in that region and then delete the entire sequence up to that point. Repeat for the 3’ end.
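NOTE: The trimming rule in step 2 can also be sketched in code, which may help when checking many reads. This is a minimal sketch, not part of ApE. The function name and the `window` parameter (how many bases at each end are treated as the unreliable region) are assumptions made for illustration; the protocol itself leaves the size of that region to the user's judgment.

```python
def trim_unreliable_ends(seq: str, window: int = 50) -> str:
    """Trim both ends of a read at the innermost N found near each end.

    `window` is a hypothetical cutoff for the unreliable region; adjust it
    after looking at the actual read.
    """
    seq = seq.upper()

    # 5' end: find the last N within the first `window` bases and
    # drop everything up to and including it.
    last_n = seq[:window].rfind("N")
    start = last_n + 1  # stays 0 when no N is present

    # 3' end: find the N closest to the middle within the last `window`
    # bases and drop it together with everything after it.
    tail_start = max(len(seq) - window, start)
    first_n = seq[tail_start:].find("N")
    end = tail_start + first_n if first_n != -1 else len(seq)

    return seq[start:end]


trimmed = trim_unreliable_ends("NNANGATTACAGATTACANCNN", window=5)
print(trimmed)
```

Any N that remains in the middle of the read is left untouched, so the result should still be inspected by eye.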
3. If the construct was sequenced with a reverse primer, an additional step is required. Files from reverse primers carry the word REV in their names; files from forward primers carry the word FOR. If a reverse primer was used, see step 4; otherwise, go to step 5.
4. Highlight the entire sequence. Under Edit, click Copy Reverse Complement and paste the resulting sequence into a new ApE window. Save the new sequence with the same name as the original sequence but add “…RevCom” at the end to indicate that this sequence is the reverse complement.
5. If a forward primer was used to sequence, you may use the original sequence for the alignment. Otherwise, use the new reverse complement file you created.
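NOTE: For reference, the reverse complement that ApE produces in step 4 can be reproduced with a few lines of Python. This is only an illustrative sketch for double-checking a short stretch of sequence; the function name is an assumption, and only A, C, G, T and N are handled.

```python
# Translation table mapping each base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the order of the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGCN"))
```

Applying the function twice returns the original sequence, which is a quick sanity check that a RevCom file was made from the right read.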
6. Open the wild type DNA sequence of the corresponding protein. You can find it in Norimatsu Lab\DNA and RNA\Construct Sequences\. You will see a list of folders for different proteins. Inside each of those folders, there are two additional folders: ORF and Plasmid. The ORF folder contains the open reading frame (coding sequence for the protein). Open the ORF folder and look for the sequence of interest.
7. In any ApE window, click the Tools tab, then Align sequences. A new window will pop up. Select the wild type sequence as your Reference Sequence at the top, then select the sequence you are aligning from the list in the white box below.
8. Click on Show Alignment Parameters and check that the settings are the following: Blocks = 10, N-W Max = 200, Mismatch Penalty = -1, Gap Penalty = -2, Gap Ext. Penalty = 0, Line Width = 100.
NOTE: The settings may need to be changed to accommodate different sequence alignments and results.
9. Look for the red highlighted mismatches and verify that the mutation in the sequence result file is correct in both identity and location. To do this, multiply the amino acid position in the mutation by 3. The product is the position of the last nucleotide in the codon that encodes that amino acid. Subtracting 2 from that number yields the position of the first nucleotide in the triplet. For example, amino acid 100 is encoded by nucleotides 298-300. Find the start and end nucleotide positions in the wild type sequence, then check that the mutation in the other sequence is correct by translating the mutated codon with a codon table and matching it to the last amino acid in the mutation name.
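NOTE: The arithmetic and codon lookup in step 9 can be sketched as follows. This is a hedged illustration, not a lab tool. It assumes mutation names of the form original amino acid, position, new amino acid (for example D2A), that positions are counted from the first nucleotide of the ORF file, and that the standard genetic code applies. The function names are hypothetical, and the mutated codon is typed in by hand after reading it from the alignment.

```python
import re
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_span(aa_position: int) -> tuple[int, int]:
    """Return the 1-based first and last nucleotide of a codon."""
    last = aa_position * 3
    return last - 2, last


def check_mutation(name: str, wild_type_orf: str, mutated_codon: str) -> dict:
    """Compare a mutation name against the wild type ORF and the observed codon."""
    match = re.fullmatch(r"([A-Z*])(\d+)([A-Z*])", name)
    if match is None:
        raise ValueError(f"Unrecognized mutation name: {name}")
    original_aa, position, mutated_aa = match[1], int(match[2]), match[3]

    first, last = codon_span(position)
    wt_codon = wild_type_orf[first - 1:last].upper()  # convert to 0-based slice
    mutated_codon = mutated_codon.upper()

    return {
        "nucleotides": (first, last),
        "wild_type_codon": wt_codon,
        # Does the ORF really encode the original amino acid at this position?
        "wild_type_matches": CODON_TABLE.get(wt_codon) == original_aa,
        # Does the sequenced codon encode the intended amino acid?
        "mutation_matches": CODON_TABLE.get(mutated_codon) == mutated_aa,
    }


# Toy ORF encoding Met-Asp-Ala, with a made-up D2A mutation (GAT to GCT).
print(check_mutation("D2A", "ATGGATGCT", "GCT"))
```

If `wild_type_matches` comes back False, the numbering of the reference is probably offset (for example by a tag or a missing start codon), and the positions should be rechecked by hand before judging the mutant.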
10. Verify the absence of any other mismatches or deletions. The only mismatch should be the one from the intentional mutation. Spontaneous mutations sometimes occur; when they do, the mutant is considered a failure even if the intentional mutation is present. However, reading errors can arise when the sequence is read at the company, and these corrupt the final sequence. These are reading errors and not mutations per se. The two most common are the following:
Some nucleotides in a repeat might sometimes be omitted by the computer, which will show in the alignment as a single or double deletion.
In aligning the sequences, ApE might match the initial 5’ end nucleotides of the mutated strand to a random (often incorrect) region in the wild type strand, then introduce a huge gap, and match the rest of the mutated strand to a different region in the wild type strand (often the correct region).
Similarly, the program might align the majority of a sequence correctly up to a few nucleotides from the 3’ end, then introduce a huge gap and match the remaining nucleotides to an incorrect region of the sequence.
To determine whether a mutation is an artifact, compare the sequences meticulously. You may need to refer back to the absorbance file to check whether the apparent deletions result from a computer error. Follow the instructions below.
1. Open the absorbance file (.ab1 file) using ApE. You will see a collection of different color peaks that represent the nucleotides that make up the sequence. If the sequence was obtained with a forward primer, the absorbance file and the sequence should be identical. If the sequence was obtained with a reverse primer, you will need to convert the absorbance file to its reverse complement. To do this, on the top left corner click Edit, then Reverse-Complement.
2. Look for the region where the computer-generated mismatches might be and compare the signal peaks to the noise peaks in and around that region. You can make the peaks wider and farther apart from each other to discern individual peaks. To do this, move the buttons on the right up and down for different settings. Watch out for nucleotide repeat omissions, which appear on the absorbance file as peaks that are not sharp enough to be detected by the computer.
11. After the sequence has been analyzed, use the Snipping Tool (scissors icon on the bottom Start bar) to screenshot the alignment window.
12. Copy and paste the image into a Word document. Above the picture, type the following information:
Mutation Name Protein Name
Original Amino Acid (Original Codon) → Mutated Amino Acid (Mutated Codon)
Nucleotide positions: Codon position 1-Codon position 3
Refer to the example in the binder for more clarity.
13. At the top of the window, click Insert, then under Illustrations, click Shapes and insert a Rectangle. Create a rectangle big enough to surround the mutation. Remove the fill so the mutation is visible under the shape and change the outline to red.
14. If there are any additional unintended mutations, mismatches, or gaps, record them below the screen capture and summarize your findings from the sequence and absorbance file analysis. Hypothesize as to what could have caused each mismatch (e.g., polymerase error, unreliable sequencing region, misread of a nucleotide repeat, or computer alignment error), record it, and note whether the DNA needs to be resequenced.
15. Save the Word file in the folder for its corresponding mutation as “Mutation Name” Alignment SUCCESS/FAILED.