The Illumina sequencing datasets used in our project were provided by the Whelan Lab and contain the sequencing data from each round of aptamer selection against the ovarian cancer biomarker CA125.
The datasets come from the following study: Scoville, D.J., Uhm, T.K.B., Shallcross, J.A., and Whelan, R.J. (2017). Selection of DNA aptamers for ovarian cancer biomarker CA125 using one-pot SELEX and high-throughput sequencing. J. Nucleic Acids.
In the past, our lab has performed Illumina sequencing of aptamer selections and used the Galaxy software to analyze the sequencing data. However, we lacked a detailed guide on how to use Galaxy, as well as a comparison of the different software available for analyzing our aptamer sequencing data. One part of our research aims to clarify how these software options differ.
Galaxy and AptaSUITE both analyze sequencing data, but they differ in setup and features.
Galaxy:
Web-based platform
Does not require any additional download or installation
Identifies similarities in sequence and levels of enrichment
No built-in secondary structure analysis program
AptaSUITE:
Requires prerequisite programs and code to be installed before the software can be used
Identifies commonalities such as structural components (motifs, etc.), mutations, and levels of enrichment
Built-in tool for secondary structure analysis
In past research projects, we have had issues with Galaxy. Jobs were often held up on the Galaxy cloud server while other users were processing their own jobs at the same time. In addition, as mentioned, Galaxy does not have built-in secondary structure analysis software, which is why we have been looking into other software such as AptaSUITE, which has more programs built in.
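The "levels of enrichment" that both tools report can be illustrated with a small calculation. The sketch below is not how Galaxy or AptaSUITE computes enrichment; it is a minimal example that assumes enrichment means the fold change in a sequence's relative frequency between two selection rounds. The function name `enrichment`, the `pseudocount` parameter, and the example sequences and counts are all made up for illustration.

```python
from collections import Counter


def enrichment(earlier, later, pseudocount=1.0):
    """Fold change in relative frequency for each sequence in the later round.

    `earlier` and `later` map sequence -> raw read count for two rounds.
    The pseudocount (an assumption of this sketch) avoids division by zero
    for sequences that were not observed in the earlier round.
    """
    total_earlier = sum(earlier.values())
    total_later = sum(later.values())
    result = {}
    for seq, n_later in later.items():
        freq_later = (n_later + pseudocount) / total_later
        freq_earlier = (earlier.get(seq, 0) + pseudocount) / total_earlier
        result[seq] = freq_later / freq_earlier
    return result


# Hypothetical counts for two rounds (not real data from the CA125 datasets).
round_a = Counter({"ACGTACGT": 10, "TTGACATT": 90})
round_b = Counter({"ACGTACGT": 60, "TTGACATT": 40})

fold = enrichment(round_a, round_b)

# Rank sequences from most to least enriched.
ranked = sorted(fold, key=fold.get, reverse=True)
for seq in ranked:
    print(seq, round(fold[seq], 2))
```

Dividing by each round's total read count matters because rounds are rarely sequenced to the same depth; comparing raw counts directly would confuse deeper sequencing with real enrichment.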
This is a project previously conducted in our lab, in which aptamers were developed against the RNA-dependent RNA polymerase (RdRP) from poliovirus 1, as well as LoaP and NasR (antiterminator proteins). Its methods closely match those of our current project: aptamer selection, Illumina sequencing of the selection rounds, and analysis of the sequencing data using the Galaxy and Mfold software.
One next step is to select aptamer candidates from the data used in our project and test them with binding affinity assays.
We currently have DNA samples from aptamer selections that have been sequenced using Sanger sequencing. In the future, we hope to also perform Illumina sequencing on these DNA samples and compare the results, looking for more promising aptamer candidates.
We want to improve our skills with our current software: Galaxy, AptaSUITE, MEME, and Mfold.
AptaSUITE has a rather difficult installation process, and we hope to create a simple step-by-step guide on installing the software.
We also want to look into other software available for analyzing aptamer sequencing data, such as FASTAptamer.
We will develop workflows/guides for our lab on using this software.
Our lab will conduct aptamer selections against target proteins, perform Illumina sequencing, and use the workflows we have created to analyze sequencing data and identify aptamer candidates.
Spike protein from SARS-CoV-2 (which causes COVID-19)
Other cancer biomarker proteins
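For the analysis step of the workflow described above, a common first pass is to count how often each unique sequence appears in one selection round. The sketch below shows that idea in plain Python. It is only an illustration, not the method used by Galaxy, AptaSUITE, or FASTAptamer, and it assumes simple four-line FASTQ records with no primer trimming or quality filtering. The function names and the example reads are hypothetical.

```python
import io
from collections import Counter
from itertools import islice


def read_fastq_sequences(handle):
    """Yield the sequence from each 4-line FASTQ record in a text handle."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            break  # end of file or an incomplete trailing record
        header, seq = record[0], record[1].strip()
        if not header.startswith("@"):
            raise ValueError("FASTQ record does not start with '@'")
        yield seq.upper()


def count_sequences(handle):
    """Count how many times each unique sequence occurs."""
    return Counter(read_fastq_sequences(handle))


# Made-up example reads standing in for one round's FASTQ file.
example = (
    "@read1\nACGTAC\n+\nIIIIII\n"
    "@read2\nACGTAC\n+\nIIIIII\n"
    "@read3\nTTGACA\n+\nIIIIII\n"
)

counts = count_sequences(io.StringIO(example))

# List sequences from most to least abundant.
for seq, n in counts.most_common():
    print(seq, n)
```

With a real file, the `io.StringIO(example)` handle would be replaced by an opened FASTQ file. Counts from several rounds can then be compared to see which sequences become more abundant as selection proceeds.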