Directed evolution is a widely used method for engineering proteins and nucleic acids with enhanced or novel functions. By generating a diverse mutant library and applying iterative rounds of selection, researchers can isolate variants with improved biochemical properties such as activity, specificity, or stability. This approach has transformed biotechnology, including enzyme design, therapeutics, and synthetic biology, and was recognized by the 2018 Nobel Prize in Chemistry.
Following one or more rounds of selection, accurate identification of enriched variants is critical for downstream analysis. This step typically involves sequencing individual clones to determine the distribution and identity of beneficial mutations. However, traditional sequencing methods introduce significant limitations.
Sanger sequencing is widely used for its reliability and accuracy, but it has key limitations when applied to high-throughput or full-length variant analysis:
Limited read length (~800–1000 base pairs), which may not capture entire constructs
Low scalability, as each variant must be sequenced individually
Manual processing, including read alignment and mutation mapping
Increased time and cost with growing library size
Poorly suited to pooled or mixed samples, because mixed templates produce overlapping traces that are hard to interpret
These factors make Sanger sequencing inefficient for modern directed evolution workflows, especially when analyzing large mutant libraries.
Long-read platforms such as Oxford Nanopore address many of the limitations inherent to Sanger sequencing by enabling high-throughput, full-length analysis:
Extended read lengths, capable of covering entire plasmids or gene constructs
High-throughput potential, suitable for complex or pooled libraries
Single-molecule resolution, allowing detection of rare variants
Fewer preprocessing steps, with reduced need for tiling or assembly
Plasmidsaurus Premium PCR offers amplification-free sequencing and delivers raw .fastq files directly for flexible downstream use
This approach is ideally suited for directed evolution experiments that require full-length, high-resolution variant identification.
Remaining Challenges in Data Analysis
Despite improvements in sequencing technology, data analysis remains a barrier, particularly for research groups without dedicated computational expertise. Commercial platforms like Geneious offer graphical interfaces for demultiplexing and consensus generation, but they:
Incur substantial licensing costs (e.g., ~$200/year/student)
Require manual processing of individual samples
Are not optimized for high-throughput variant analysis
We built a lightweight, open-source tool that:
Accepts .fastq files and user-supplied barcodes
Filters reads by length
Demultiplexes and assembles full plasmid sequences
Exports final consensus sequences in FASTA format
The goal: make long-read analysis fast, accessible, and scalable without relying on paid software.
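To make the steps listed above concrete, here is a minimal Python sketch of the first stages: reading FASTQ records, filtering by length, and demultiplexing by barcode. It is an illustration under stated assumptions, not the tool's actual code. The function names, the exact-match barcode search, the 50-base search window, and the toy barcodes are all hypothetical. Consensus generation is deliberately left out, since a real consensus needs read alignment or polishing; the sketch only writes one placeholder read per barcode to show the FASTA export step.

```python
import io
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fastq(handle):
    """Yield (read_id, sequence) pairs from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of file (or a blank line)
            return
        seq = handle.readline().strip().upper()
        handle.readline()  # '+' separator line
        handle.readline()  # quality string, unused in this sketch
        yield header[1:].split()[0], seq


def filter_by_length(records, min_len, max_len):
    """Keep only reads whose length lies in [min_len, max_len]."""
    return [(rid, seq) for rid, seq in records if min_len <= len(seq) <= max_len]


def demultiplex(records, barcodes, window=50):
    """Bin reads by exact barcode match within `window` bases of either end.

    Reads matched on the reverse strand are flipped, so every read in a bin
    ends up in the same orientation. Exact matching is an assumption made for
    brevity; noisy long reads usually call for error-tolerant matching.
    """
    bins = defaultdict(list)
    for rid, seq in records:
        flipped = reverse_complement(seq)
        for name, barcode in barcodes.items():
            if barcode in seq[:window]:
                bins[name].append((rid, seq))
                break
            if barcode in flipped[:window]:
                bins[name].append((rid, flipped))
                break
        else:
            bins["unassigned"].append((rid, seq))
    return dict(bins)


def write_fasta(sequences, handle, width=80):
    """Write a {name: sequence} mapping as line-wrapped FASTA."""
    for name, seq in sequences.items():
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


def fastq_record(rid, seq):
    """Build a toy FASTQ record with a constant quality string."""
    return f"@{rid}\n{seq}\n+\n{'I' * len(seq)}\n"


# Toy input: hypothetical barcodes and reads, not real sequencing data.
barcodes = {"variant_A": "AACCGGAT", "variant_B": "TTGGACCA"}
payload = "GATTACA" * 3
fastq_text = (
    fastq_record("r1", barcodes["variant_A"] + payload)
    + fastq_record("r2", reverse_complement(barcodes["variant_B"] + payload))
    + fastq_record("r3", "ACGT")      # too short, removed by the length filter
    + fastq_record("r4", "G" * 25)    # no barcode, ends up unassigned
)

reads = filter_by_length(read_fastq(io.StringIO(fastq_text)), min_len=20, max_len=100)
bins = demultiplex(reads, barcodes)

# Placeholder for the consensus step: take the first read in each bin.
representatives = {n: r[0][1] for n, r in bins.items() if n != "unassigned"}
out = io.StringIO()
write_fasta(representatives, out)
fasta_text = out.getvalue()
print({name: len(members) for name, members in bins.items()})
```

In practice the length bounds would bracket the expected plasmid or amplicon size, and the placeholder step would be replaced by an alignment-based consensus over each bin.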