Biochemical function analysis of a novel Pseudomonas aeruginosa
reverse transcriptase through high-throughput sequencing
Seung Kuk Park, Shuo-Fu Yuan, YuneSahng Hwang
Institute for Cellular and Molecular Biology, The University of Texas at Austin
Introduction
Over the past decades, many bacterial reverse transcriptases (RTs) have been characterized, including group II intron, Abi, DGR and retron RTs. However, the biological and biochemical functions of many bacterial RTs remain unknown. Among these uncharacterized RTs, we identified an RT from Pseudomonas aeruginosa (designated P.a YIDD RT in this study) that carries a YIDD amino acid motif in the RT active site. It is structurally similar to a group IIC intron RT but is not associated with a group II intron-like element. Compared with the G. stearothermophilus group IIC intron RT (GsI-IIC RT), the P.a YIDD RT has a longer thumb domain but a shorter DNA-binding module (Figure 1A) (1-3). Using Pseudomonas aeruginosa as a model organism, we purified the P.a YIDD RT protein and found that its biochemical activities differ from those of a group II intron RT and more closely resemble those of the eukaryotic DNA-repair polymerase θ (a translesion polymerase): it performs low-fidelity snap-back copying of ssDNA and RNA templates and exhibits a Mn2+-dependent terminal transferase activity that can switch between template-dependent and template-independent modes during DNA synthesis (Figure 1B) (4).
To investigate its biochemical properties, we first performed high-throughput sequencing of the short cDNA extension products synthesized from a 50-nt single-stranded DNA or RNA template by P.a YIDD RT, its mutant YIDD (I238A) RT and GsI-IIC RT (as the control) in the presence of MgCl2 only or a mixture of MgCl2 and MnCl2 (5). Next, we analyzed the initial DNA extension patterns to determine whether P.a YIDD RT behaves differently from YIDD (I238A) RT and GsI-IIC RT (both of which have a YADD active site) in terms of snap-back replication, terminal transferase activity and fidelity.
Through this computational analysis, novel biochemical functions of P.a YIDD RT could be identified, which would further support its potential role in DNA double-strand break repair, a distinctive property of DNA polymerase θ, a translesion polymerase.
Work Flow
Results & Discussion
The length distribution of extended DNA of YIDD RT and other RTs using different oligo templates and buffer compositions
First, single-stranded DNA or RNA oligo extension experiments were conducted using YIDD RT, YIDD (I238A) RT and the control GsI-IIC RT. Sequencing libraries of the extension products were prepared following the TGIRT-seq protocol and sequenced on a MiSeq (PE75), and the raw data were analyzed with Biopython scripts (6, 7) (Table 1).
With the DNA oligo in MgCl2 buffer, the major extended product of YIDD RT is about 11 nt long (Figure 2A, Table 1). In contrast, the major extended product of YIDD (I238A) RT is about 13 nt long, with a more even length distribution of extended DNA. In MgCl2/MnCl2 buffer, YIDD RT extended the DNA about 5 nt further than in MgCl2 buffer, and YIDD (I238A) RT extended it 8 nt further than in MgCl2 buffer (Figure 2B, Table 1). Unexpectedly, the most abundant product of GsI-IIC RT in both MgCl2 and MgCl2/MnCl2 buffers was the DNA oligo with a single adenine nucleotide appended to its 3′ end, which might reflect a bias introduced during sequencing library preparation (Figure 2A and B, Table 1).
None of the RTs showed noticeable extension from the RNA oligo template in the presence of MgCl2, but YIDD RT and YIDD (I238A) RT showed some extension in the presence of MnCl2 (Figure 2C and D, Table 1).
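The length distributions described above can be tabulated with a few lines of Python. The following is a minimal sketch, not the scripts used in this study: it assumes the reads are already available as plain strings (in practice they could be parsed from FASTQ with Biopython), and the template sequence and the function names `extension_lengths` and `major_length` are hypothetical, since the actual 50-nt oligo sequence is not given in the text.

```python
from collections import Counter

# Hypothetical toy template; the real 50-nt oligo sequence is not given in the text.
TEMPLATE = "ACGTACGTAA"


def extension_lengths(reads, template):
    """Count extension lengths for reads that begin with the full template."""
    counts = Counter()
    for read in reads:
        if read.startswith(template):
            # Everything after the template is treated as extended product.
            counts[len(read) - len(template)] += 1
    return counts


def major_length(counts):
    """Return the most frequent non-zero extension length, or None if nothing extended."""
    extended = {n: c for n, c in counts.items() if n > 0}
    if not extended:
        return None
    return max(extended, key=extended.get)


# Toy reads: two 5-nt extensions, one single-A addition, one unextended oligo,
# and one read that does not contain the template (ignored).
reads = [TEMPLATE + "GAATT", TEMPLATE + "GAATT", TEMPLATE + "A", TEMPLATE, "GGGG"]
counts = extension_lengths(reads, TEMPLATE)
print(sorted(counts.items()))
print(major_length(counts))
```

Reads that do not start with the template are skipped rather than counted as zero-length extensions; whether such reads should be rescued by approximate matching is a design choice this sketch leaves open.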
Five different extension patterns were formed during DNA extension of YIDD RT and other RTs
Based on the workflow for processing RNA sequencing results, we first trimmed the primer and input template sequences from the sequencing reads and sorted the reads by length. We then aligned all sequences using the ClustalW sequence alignment tool (8). For all RTs, the extension patterns fell into five distinct types: perfect snap-back replication, occurrence of mismatch, jumping snap-back replication, terminal transferase activity, and terminal transferase activity to snap-back replication (Figure 3). Perfect snap-back replication means that the extended DNA is complementary to the input oligo without mismatches or gaps. Occurrence of mismatch means that 1-2 nucleotides do not base-pair with the input oligo during snap-back replication. Jumping snap-back replication is snap-back replication in which a large gap (around 20 nt) forms within the extended sequence (Figure 3). Terminal transferase activity means that 1-7 nucleotides extended from the 3′ end of the input oligo do not match any input oligo sequence, and terminal transferase activity to snap-back replication means that an RT starts extension with terminal transferase activity and then switches to snap-back replication (Figure 3).
Analysis of the ratio of initial extension patterns between snap-back replication and terminal transferase activity
Next, we asked which type of extension is preferred at the start: snap-back replication or terminal transferase activity. Because both the DNA and RNA oligos end with one or two A nucleotides, and snap-back replication requires microhomology base-pairing mediated by these A nucleotides, we searched the oligo for every T position to which a terminal A could anneal (Figure 4A). We then counted how often reads began with the first 5 nucleotides expected for each possible snap-back replication position starting from an A-T microhomology base-pair, and used these counts to calculate the ratio of snap-back replication for the initial extension. We found that about 82% of the total sequence reads start with snap-back replication for YIDD RT with Mg2+ buffer and the DNA template, and we considered the remaining reads to start with terminal transferase activity (Figure 4B, left panel). In the presence of MnCl2, the ratio of snap-back replication decreased by 60%, and with the RNA oligo YIDD RT mostly started extension with terminal transferase activity, with or without MnCl2. Unexpectedly, YIDD (I238A) RT showed better extension of the DNA oligo (Figure 2A) but a lower ratio of snap-back replication than YIDD RT, indicating that the isoleucine-to-alanine mutation leads to higher terminal transferase activity (Figure 4B, middle panel). Because the major product of GsI-IIC RT is a single A nucleotide appended to the DNA and RNA oligos, it showed a very low ratio of snap-back replication (Figure 4B, right panel).
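The trimming and sorting step that precedes alignment can be illustrated as follows. This is a sketch under stated assumptions rather than the pipeline actually used: the template and adapter sequences are hypothetical placeholders, exact string matching stands in for whatever tolerance the real trimming allowed, and the helper names (`trim_read`, `sorted_extensions`, `to_fasta`) are invented for illustration. The FASTA text it produces is the kind of input a multiple-alignment tool such as ClustalW accepts.

```python
# Hypothetical sequences; the real oligo and adapter/primer sequences are not given in the text.
TEMPLATE = "ACGTACGTAA"
ADAPTER = "AGATCGGAAG"


def trim_read(read, template, adapter):
    """Return only the extended part of a read.

    The 3' adapter (searched for after the template) and the 5' template are
    removed. Returns None when the read does not start with the template.
    """
    if not read.startswith(template):
        return None
    cut = read.find(adapter, len(template))
    if cut != -1:
        read = read[:cut]
    return read[len(template):]


def sorted_extensions(reads, template, adapter):
    """Trim all reads, drop unusable or unextended ones, and sort by length."""
    extensions = [trim_read(r, template, adapter) for r in reads]
    extensions = [e for e in extensions if e]  # drops None and empty strings
    return sorted(extensions, key=lambda e: (len(e), e))


def to_fasta(extensions):
    """Format extensions as FASTA text suitable as alignment input."""
    lines = []
    for i, ext in enumerate(extensions, start=1):
        lines.append(f">ext_{i}_len{len(ext)}")
        lines.append(ext)
    return "\n".join(lines)


reads = [
    TEMPLATE + "GAATTG" + ADAPTER + "CC",  # 6-nt extension
    TEMPLATE + "TT" + ADAPTER,             # 2-nt extension
    TEMPLATE + ADAPTER,                    # no extension
    "GGGG",                                # no template
]
extensions = sorted_extensions(reads, TEMPLATE, ADAPTER)
print(to_fasta(extensions))
```

Sorting by length before alignment keeps products of the same size adjacent, which makes the five extension patterns easier to compare by eye in the alignment.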
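The search for candidate snap-back positions and the resulting ratio can be sketched in Python. This is an illustration under stated assumptions, not the analysis code from this study: the 16-nt template is a hypothetical toy sequence chosen so that one T position yields the GAATT seed, a DNA alphabet is assumed (for the RNA oligo, U would take the place of T), T positions with fewer than 5 upstream nucleotides are ignored, and extensions shorter than 5 nt are left out of the denominator.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical toy template ending in AA; the real 50-nt oligo is not given in the text.
TEMPLATE = "CGTACAATTCTGGCAA"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def snapback_seeds(template, seed_len=5):
    """Map each usable T position to the first seed_len nucleotides expected
    if the 3'-terminal A pairs with that T and copying proceeds toward the 5' end."""
    seeds = {}
    for pos, base in enumerate(template):
        if base == "T" and pos >= seed_len:
            seeds[pos] = reverse_complement(template[pos - seed_len:pos])
    return seeds


def snapback_ratio(extensions, template, seed_len=5):
    """Fraction of extensions (at least seed_len long) that begin with a snap-back seed."""
    seeds = set(snapback_seeds(template, seed_len).values())
    usable = [e for e in extensions if len(e) >= seed_len]
    if not usable:
        return 0.0
    hits = sum(1 for e in usable if e[:seed_len] in seeds)
    return hits / len(usable)


seeds = snapback_seeds(TEMPLATE)
extensions = ["GAATTGTA", "GAATTG", "ATTGTAC", "CCCCCC", "GGGGGGG", "TT"]
ratio = snapback_ratio(extensions, TEMPLATE)
print(seeds)
print(ratio)
```

Extensions that do not begin with any seed are the ones attributed to terminal transferase activity in the text, so their fraction is simply one minus the returned ratio.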
- (A) Left panel: schematic diagram of the possible reactions after the initial snap-back replication. Right panel: example calculation of the ratio of perfect snap-back replication. (B) Ratio of perfect snap-back replication versus mismatches or jumping snap-back replication for each RT using the DNA or RNA oligos in different buffer conditions.
Analysis of the fidelity of YIDD RT and other RTs after GAATT initial snap-back replication
Based on the ClustalW alignment, three extension patterns are possible after the initial snap-back replication: perfect snap-back replication, occurrence of mismatches, and jumping snap-back replication (Figure 3). We therefore investigated the fidelity of YIDD RT by analyzing all reads starting from the GAATT sequence (the GAATT site is the dominant snap-back replication position in our analysis) (Figure 5A). For each extension length, we calculated the ratio of perfect snap-back replication as the number of perfectly complementary reads divided by the total number of reads of the same length that start with the GAATT initial snap-back replication. YIDD RT showed 35% perfect snap-back replication in the Mg/DNA condition, and the ratio decreased dramatically in the presence of MnCl2, suggesting that Mn2+ either increases terminal transferase activity or lowers the fidelity of YIDD RT (Figure 5B, left panel). Interestingly, YIDD RT showed better fidelity in the Mg/RNA condition than in the Mg/DNA condition, indicating that although YIDD RT mostly prefers terminal transferase activity on the RNA oligo at the beginning, cDNA was stably synthesized from the RNA template. YIDD (I238A) RT showed less than 20% perfect snap-back replication in the Mg/DNA condition, a lower fidelity than that of YIDD RT, indicating that the isoleucine-to-alanine mutation increased terminal transferase activity or decreased fidelity (Figure 5B, middle panel). GsI-IIC RT showed the highest fidelity among the RTs in every condition. Specifically, in contrast to the other RTs, once GsI-IIC RT started snap-back replication its fidelity did not change significantly with different metal ions or oligo substrates (Figure 5B, right panel).
Taken together, our sequencing analysis shows that YIDD RT has lower fidelity than GsI-IIC RT, and that a mutation within its active site changing isoleucine to alanine does not improve its fidelity and even worsens it. In addition, extension by YIDD RT in the presence of MnCl2 yields longer cDNA products with either a DNA or an RNA oligo template, but with lower fidelity owing to a higher ratio of terminal transferase activity, mismatches or jumping snap-back replication, suggesting that its fidelity is highly dependent on Mg2+.
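The per-length ratio of perfect snap-back replication can be written out as a short Python sketch. It reflects one reading of the calculation described above (perfect reads divided by all GAATT-initiated reads of the same length) and is not the authors' code. The toy template, the annealing position `t_pos`, and the function names are hypothetical; mismatch and jumping products are not distinguished here, only counted as non-perfect.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical toy template; the real 50-nt oligo is not given in the text.
TEMPLATE = "CGTACAATTCTGGCAA"
T_POS = 10  # assumed index of the T that pairs with the 3'-terminal A


def expected_extension(template, t_pos):
    """Error-free snap-back product when the 3' A pairs with template[t_pos]."""
    return template[:t_pos].translate(COMPLEMENT)[::-1]


def perfect_ratio_by_length(extensions, template, t_pos, seed="GAATT"):
    """For each extension length, return perfect reads / all reads starting with seed."""
    expected = expected_extension(template, t_pos)
    totals = defaultdict(int)
    perfect = defaultdict(int)
    for ext in extensions:
        if not ext.startswith(seed):
            continue  # not initiated at the GAATT site
        totals[len(ext)] += 1
        # A perfect product is a prefix of the error-free snap-back product.
        if expected.startswith(ext):
            perfect[len(ext)] += 1
    return {n: perfect[n] / totals[n] for n in sorted(totals)}


extensions = [
    "GAATTGTA",  # perfect
    "GAATTGTA",  # perfect
    "GAATTGAA",  # mismatch after the seed
    "GAATTCTA",  # mismatch after the seed
    "GAATTGT",   # perfect, shorter
    "ATTGTACG",  # initiated at a different site, ignored
]
ratios = perfect_ratio_by_length(extensions, TEMPLATE, T_POS)
print(ratios)
```

Computing the ratio separately for each length avoids penalizing longer products, which have more positions at which a mismatch or jump can occur.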