eGFP mRNA size distributions were measured using TapeStation CE and compared with read length distributions obtained from ONS data. The CE electropherogram data were exported in .csv format, and the mean readings across replicates were calculated and plotted for each time point.
Using CE, eGFP mRNA showed a primary peak between 900 and 1150 nt at 0 hrs (Figure 10), corresponding to the anticipated mRNA size of 996 nt. This primary peak decreased in intensity as the mRNA was exposed to degrading temperature conditions over time, and a range of smaller mRNA fragments appeared in the electropherograms of the later timepoints. This trend was observed at all temperatures measured using CE. The secondary peak at 25 nt represents an internal standard added during sample preparation for CE.
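The replicate averaging described above can be illustrated with a minimal sketch. It assumes that each replicate trace has already been read from the exported .csv file into a list of intensities sampled on the same size axis. The function names and the example values are hypothetical and are not taken from the TapeStation export.

```python
from statistics import mean


def mean_trace(replicates):
    """Average replicate intensity traces point by point.

    Each replicate is a list of intensities sampled on the same size axis
    (an assumption of this sketch).
    """
    if not replicates:
        raise ValueError("at least one replicate is required")
    n_points = len(replicates[0])
    if any(len(trace) != n_points for trace in replicates):
        raise ValueError("replicates must share the same size axis")
    return [mean(values) for values in zip(*replicates)]


def mean_traces_by_timepoint(traces):
    """Map each timepoint to the mean of its replicate traces."""
    return {timepoint: mean_trace(reps) for timepoint, reps in traces.items()}


# Hypothetical intensities: two replicates at 0 h and two at 6 h.
example = {
    0: [[0.0, 10.0, 2.0], [0.0, 12.0, 4.0]],
    6: [[1.0, 4.0, 1.0], [3.0, 6.0, 1.0]],
}
averaged = mean_traces_by_timepoint(example)
```

The averaged traces can then be plotted against the shared size axis, one curve per timepoint.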
Figure 10A-C shows the CE electropherograms of naked eGFP mRNA samples incubated at 25°C, 35°C, and 50°C over time. Figure 10D-F shows close-ups of the same electropherograms in the range 0-500 nt.
Figure 10: CE electropherograms of naked eGFP mRNA. (A-C) CE of eGFP mRNA over time at 25°C, 35°C, and 50°C respectively. (D-F) Close-up of 0-500 nt regions.
As can be seen in Figure 10, the primary peak of the CE electropherogram decreases over time as the mRNA is degraded at elevated temperatures. As the mRNA is degraded, shorter fragments are generated from the full-length strand. These short fragments are detected in increasing amounts in samples from the later timepoints (Figure 10D-F). This trend is particularly noticeable at the final timepoints of the 35°C and 50°C samples, where the full-length mRNA appears to have been almost completely degraded (Figure 10B-C).
These results agree with the anticipated outcomes of the project and with previous thermal stability studies conducted on naked mRNA (Barros et al., 2025).
Due to the limited number of available ONS flow cells and the limited budget of this project, only one replicate from each of a selected set of timepoints was analysed using Oxford Nanopore sequencing. The replicate selected for analysis was chosen based on the concentration measured in the TapeStation readings.
Read length distributions generated from ONS data using R show a primary peak between 840 and 880 nt (Figure 11A-C). Similar to the CE electropherograms, the primary peak decreases over time and shorter reads become more prominent as the mRNA is subjected to higher temperatures. Increasing numbers of short reads are detected in the samples from the late timepoints across all temperatures, as can be seen in Figure 11D-F.
Figure 11: eGFP mRNA read length distributions measured using ONS. (A-C) Read lengths over time at 25°C, 35°C, and 50°C respectively. (D-F) Close-up of 0-500 nt regions.
The read length distributions from ONS show a sharp primary peak at 860 nt. Inspection of the .fasta files used to generate these graphs revealed that few reads contained the 120 nt poly(A) tail present in eGFP mRNA, which would account for the peak appearing at 860 nt rather than 996 nt. This is a major limitation of the method, as poly(A) tail length is an extremely important CQA which must be monitored. As discussed already, certain tools, such as tailfindr, may be used in conjunction with ONS to improve poly(A) tail length estimation.
As expected, ONS delivers higher-resolution information on mRNA length and stability. The primary peak is sharper in the graphs made from ONS data than in those from CE. In addition to strand lengths, ONS gives the sequence of every mRNA strand, allowing sequence identity to be confirmed in the same measurement. In contrast to the findings of Gunter et al. (2023), the size distributions from the two methods do not match exactly, although they are very similar.
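The manual inspection of the .fasta files could be approximated programmatically. The sketch below is illustrative only and is not the procedure used in this study. It assumes that reads are stored 5’ to 3’, and it counts a read as tailed when it ends in an uninterrupted run of at least `min_tail` adenines. The threshold and all function names are hypothetical. Because Nanopore basecalling often interrupts or shortens homopolymers, a strict run length like this is likely to underestimate tail length, which is why a dedicated tool such as tailfindr is preferable for quantitative work.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trailing_a_run(sequence):
    """Length of the uninterrupted run of A at the 3' end of a read."""
    seq = sequence.upper()
    return len(seq) - len(seq.rstrip("A"))


def fraction_with_tail(sequences, min_tail=100):
    """Fraction of reads whose 3' A-run is at least min_tail long.

    min_tail is an assumed threshold, not a value from the study.
    """
    seqs = list(sequences)
    if not seqs:
        return 0.0
    return sum(trailing_a_run(s) >= min_tail for s in seqs) / len(seqs)


# Hypothetical two-read example: one tailed read, one without a tail.
example_fasta = ">read1\nGCGC" + "A" * 120 + "\n>read2\nGCGCAA\n"
records = list(parse_fasta(example_fasta))
tailed_fraction = fraction_with_tail(seq for _, seq in records)
```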
A major limitation of the data collected in this report is the lack of replicates measured using ONS.
CE was carried out on the LNP-encapsulated eGFP mRNA samples following isolation of the mRNA from LNPs using ammonium acetate precipitation. As can be seen in Figure 12 below, a primary peak is observed at ~1000 nt which decreases in intensity in samples from later timepoints.
Figure 12: CE electropherograms of LNP-encapsulated eGFP mRNA.
A clear distinction can be made between the stability of the naked mRNA samples and that of the LNP-encapsulated mRNA samples. Despite being exposed to 35°C for a longer period of time, the LNP-encapsulated mRNA seems to show a less distinct fragmentation pattern when compared with the 35°C naked mRNA samples. As expected, mRNA is rapidly degraded if it is not encapsulated within a protective carrier.
The results of the characterisation study of mRNA-containing LNPs using the Malvern Zetasizer Ultra are shown below in Table 4.
Table 4: Properties of mRNA-containing LNPs. N/P is defined as the ratio of ionisable amines to the number of phosphates in the RNA backbone. EE% is LNP encapsulation efficiency.
This information on the size and polydispersity index (PDI) of the LNPs is important, as these qualities have been shown to affect the immunogenicity and distribution of the mRNA therapy in vivo (Hassett et al., 2021; Nakamura et al., 2020). An average LNP diameter of 77 nm was recorded for the three samples measured using the Malvern Zetasizer Ultra. This size is within the range of diameters (60-150 nm) which have been shown to provide a robust immune response in primates, according to Hassett et al. (2021), and so the LNPs were considered acceptable for the stability studies conducted here.
The average PDI value of 0.166 indicates that the sample is relatively “monodisperse” or uniform in size. PDI is a measurement of the distribution of particle sizes within the sample and is given as a value between 0 and 1, where 0 represents a completely uniform distribution of particle sizes and values closer to 1 (>0.4) represent an extremely varied spread of particle sizes (Hong, 2025).
Using the data from ONS, the number of unique eGFP reads in each sample with a length within 5% (817-903 nt) of the expected full length (860 nt) was calculated and expressed as a percentage of total reads. This rationale for calculating mRNA integrity was taken from Gunter et al. (2023).
Figure 13: Full-length mRNA reads as a percentage of total reads.
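The integrity metric can be expressed as a short calculation. The sketch below takes the 817-903 nt window from the text as default bounds. The function name and the example read lengths are hypothetical.

```python
def full_length_percentage(read_lengths, lower=817, upper=903):
    """Percentage of reads whose length lies within [lower, upper] nt.

    The default bounds correspond to 860 nt plus or minus 5%.
    """
    lengths = list(read_lengths)
    if not lengths:
        raise ValueError("no reads supplied")
    intact = sum(lower <= n <= upper for n in lengths)
    return 100.0 * intact / len(lengths)


# Hypothetical read lengths: five of the eight fall inside the window.
example_lengths = [860, 845, 903, 816, 400, 120, 880, 870]
integrity = full_length_percentage(example_lengths)
```

The bounds are inclusive here, so a read of exactly 817 nt or 903 nt counts as full-length. Whether the original analysis treated the limits as inclusive is not stated.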
As may be seen in Figure 13 above, ~76% of reads are within 5% of the full length at 0 hours. Because acceptable limits for intact mRNA in mRNA therapies are not publicly available, and regulatory authorities have not published strict guidelines, it is difficult to put this figure into perspective. Leaked documents from the 2020 European Medicines Agency data breach show that the percentage of full-length mRNA transcripts present in the Pfizer/BioNTech COVID-19 vaccine, Comirnaty, varied significantly (55-78%) during the production of clinical batches, with commercial lots exhibiting ~70-75% intact mRNA (Tinari, 2021). These figures suggest that the experimental value falls comfortably within an acceptable range.
This value also closely matches the 77% full-length mRNA transcripts reported by Gunter et al. (2023).
As may be expected, the percentage of intact mRNA decreases over time, with the fastest rate of degradation occurring at 50°C, the highest temperature tested. However, the lack of replicates in this study makes it difficult to draw firm conclusions.
Base coverage is a measurement of how many reads from a Nanopore sequencing run contained a particular base along the reference sequence. The method for acquiring this information from the sequencing data is described in Appendix A. Because each sequencing run recorded a different number of reads, the base coverage maps have been normalised across samples from the same temperature to allow a clearer comparison. The base coverage maps for the Nanopore-sequenced samples are shown below (Figures 14, 15, 16).
Figure 14: Base coverage map - naked mRNA 25°C
Figure 15: Base coverage map - naked mRNA 35°C
Figure 16: Base coverage map - naked mRNA 50°C. Several dips are present throughout the mRNA sequence
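To illustrate what a normalised base coverage map represents, the sketch below counts how many aligned reads span each reference position and divides by the number of reads in the run. It does not reproduce the method in Appendix A. The input format (1-based, inclusive start and end coordinates per read), the choice to normalise by read count, and the function names are all assumptions made for illustration.

```python
def base_coverage(alignments, reference_length):
    """Count the reads covering each reference position.

    alignments: iterable of (start, end) pairs, 1-based and inclusive.
    Returns a list whose index 0 corresponds to reference position 1.
    """
    # Difference array: +1 where a read starts, -1 just after it ends.
    diff = [0] * (reference_length + 2)
    for start, end in alignments:
        if not 1 <= start <= end <= reference_length:
            raise ValueError(f"alignment out of range: {(start, end)}")
        diff[start] += 1
        diff[end + 1] -= 1
    coverage, running = [], 0
    for position in range(1, reference_length + 1):
        running += diff[position]
        coverage.append(running)
    return coverage


def normalise_by_read_count(coverage, n_reads):
    """Express coverage as the fraction of reads containing each base."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return [count / n_reads for count in coverage]


# Hypothetical 10 nt reference with four reads, most of them 5'-truncated.
example_alignments = [(1, 10), (4, 10), (6, 10), (6, 8)]
raw = base_coverage(example_alignments, reference_length=10)
normalised = normalise_by_read_count(raw, n_reads=len(example_alignments))
```

In this toy example the coverage rises towards the 3’ end because three of the four reads are truncated at the 5’ end, which mirrors the pattern described for the sequenced samples.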
The base coverage maps exhibit a similar trend across all measured temperatures: the number of reads containing bases at the 5’ end of the mRNA decreases over time. Several explanations are possible. For example, the 5’ end of the mRNA, which is involved in ribosome binding, may be less protected by secondary structure and therefore more prone to scission, while the 3’ region remains intact and is drawn through the nanopore and sequenced. However, further research is required to establish the cause of this decrease in reads covering the 5’ end of eGFP mRNA.
There are also certain regions which are marked by a sharp decrease in base coverage. These regions are conserved across all samples and temperatures and are not present at the same positions on the coverage maps of the RNA control strand, ENO2, suggesting that they are inherently linked to this particular mRNA sequence.
To determine whether the mRNA was fragmenting randomly, the positions of the 5’ starts of sequencing reads were extracted and their incidence was counted. Under a null hypothesis of random fragmentation, these start positions are expected to follow a Poisson point process, provided that the first 50 positions are omitted to avoid 5’-end bias.
λ was calculated by omitting reads whose 5’ start fell within the first 50 positions of the eGFP mRNA and dividing the number of remaining reads by the number of positions at which a read could start (Equation 2). If certain start positions occur significantly more often than expected for λ, the null hypothesis may be rejected and these sites may be assumed to be more prone to scission.
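Equation 2:

$$\lambda = \frac{N_{\text{reads}}}{N_{\text{positions}}}$$

where $N_{\text{reads}}$ is the number of reads whose 5’ start lies beyond the first 50 positions of the eGFP mRNA and $N_{\text{positions}}$ is the number of positions at which a read could start.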
The significant cleavage sites largely correspond to the dips seen in the base coverage maps. The top 5 most significant 5’ starts and their corresponding positions on eGFP mRNA for one sample are listed in Table 5 below.
Table 5: Significant cleavage sites - 50°C sample at 6 hours.
The table lists the position, the number of reads which start at that position, count/λ, and the bases present at that site.
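The test for non-random start positions can be sketched as follows. This is a minimal illustration, not the analysis code used in this study. It computes λ as described in the text, then reports, for each start position, the count, count/λ, and the Poisson probability of observing at least that many starts. The significance threshold `alpha`, the absence of any multiple-testing correction, and the function names are assumptions.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), by direct summation of the pmf."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def enriched_start_sites(start_positions, reference_length, skip=50, alpha=0.001):
    """Return lambda and the start sites that exceed the Poisson expectation.

    start_positions: 1-based 5' start of each read on the reference.
    skip: number of leading positions omitted to avoid 5'-end bias.
    alpha: assumed significance threshold, with no multiple-testing correction.
    """
    kept = [p for p in start_positions if skip < p <= reference_length]
    n_positions = reference_length - skip
    lam = len(kept) / n_positions
    hits = []
    for position, count in Counter(kept).items():
        p_value = poisson_sf(count, lam)
        if p_value < alpha:
            hits.append((position, count, count / lam, p_value))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return lam, hits


# Hypothetical data: 20 reads start at position 101, 10 reads start at
# scattered positions, and 2 reads start within the skipped region.
example_starts = [101] * 20 + [55, 60, 70, 80, 90, 110, 120, 130, 140, 150] + [10, 20]
lam, hits = enriched_start_sites(example_starts, reference_length=150)
```

Taking `hits[:5]` gives the five most frequent significant sites. Note that `1.0 - cdf` loses precision for very small tail probabilities, so a statistics library would be a better choice when exact p-values matter.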
These cleavage sites remain largely consistent among samples and across temperatures, with the positions 701, 560, 380, and 792 appearing repeatedly. In order to better visualise where these positions are on the eGFP mRNA, RNAfold from ViennaRNA was used to model the secondary structure of the full-length mRNA strand. Inspecting the location of some of the cleavage sites identified earlier reveals that the cleavage sites may be located on single-stranded regions of the mRNA (Figure 17).
Figure 17: RNAfold secondary structure prediction of eGFP mRNA, positions 560 and 701. Significant cleavage sites are circled in red. Note the single-stranded structure of the cleavage sites. Generated using RNAfold from ViennaRNA (Lorenz et al., 2011).
From these results it appears that single-stranded regions are more prone to in-line hydrolysis, which is consistent with the literature (Wayment-Steele et al., 2021). However, some of the other sites which show up in the analysis, such as positions 380 and 792, do not follow this trend and appear to be located on double-stranded regions of the mRNA (Figure 18).
Figure 18: RNAfold secondary structure prediction of eGFP mRNA, positions 380 and 792. Significant cleavage sites are circled in red. Note the double-stranded structure of the cleavage sites. Generated using RNAfold from ViennaRNA (Lorenz et al., 2011).
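The visual inspection of cleavage sites could be complemented by checking the dot-bracket string that RNAfold outputs alongside its structure drawings. The sketch below assumes such a string is available for the full-length sequence and classifies 1-based positions as unpaired or base-paired. The function names and the short example structure are hypothetical, and the result is only as reliable as the predicted minimum free energy structure itself.

```python
def is_unpaired(structure, position):
    """True if the 1-based position is unpaired ('.') in dot-bracket notation."""
    if not 1 <= position <= len(structure):
        raise ValueError(f"position {position} is outside the structure")
    return structure[position - 1] == "."


def classify_sites(structure, positions):
    """Label each position as single-stranded or base-paired."""
    return {
        position: "single-stranded" if is_unpaired(structure, position) else "base-paired"
        for position in positions
    }


# Hypothetical 14 nt hairpin: a 4 bp stem, a 4 nt loop, and a 2 nt 3' overhang.
example_structure = "((((....)))).."
labels = classify_sites(example_structure, [2, 5, 13])
```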
The reason why certain fragments occur more often than others merits further investigation, as it has implications for the design of more stable mRNA products. One possible explanation for cleavages occurring in these unconventional double-stranded regions is the effect of refolding on mRNA secondary structure. As shown in Figure 19, once fragmentation of an RNA polymer occurs, the newly formed fragment adopts a new secondary structure through refolding. This exposes a different region of the RNA polymer, which is hydrolysed in turn. This process continues until there are no remaining sites which may undergo cleavage. This framework is the basis for certain computational models used for predicting RNA degradation (Rybarczyk et al., 2016) and may explain the phenomena observed in this study.
Figure 19: Effect of refolding on tRNA fragmentation. Taken from Rybarczyk et al. (2016).