Extract Fasta in list

This software extracts the Fasta sequences that match a list of keywords or, in v009 and later, a list of sequences. It is designed for computers with little RAM and few CPU cores. For big Fasta files you can use Extract Fasta in list parallel or, better, Extract Fasta in list parallel GO (keywords in the title only).

Manual:

1- install Perl (a free programming language) and GNU parallel.

2- unzip the software archive.

3- copy your Fasta files into the “fasta” directory.

4- copy your reference lists into the “lists” directory, one item per line.

Each line of the reference list is interpreted as a plain string and matches wherever that string occurs:

GEN1 matches GEN1, GEN11, GEN12 etc.

(GEN1) matches (GEN1)

The Fasta block (title + sequence) is flattened before the search, so it is also possible to search for a sequence.
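The matching rule described above can be illustrated with a minimal Python sketch. It is not the tool's Perl code: the function names are hypothetical, and it assumes that flattening means joining the title and the sequence lines of a block into one string (title, a space, then the sequence lines with no separator). The real software may flatten differently.

```python
def load_items(list_text):
    """Return the non-empty lines of a reference list, one item per line."""
    return [line.strip() for line in list_text.splitlines() if line.strip()]


def split_blocks(fasta_text):
    """Split Fasta text into blocks; each block is a list of lines starting with a '>' title."""
    blocks = []
    current = []
    for line in fasta_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith(">") and current:
            blocks.append(current)
            current = []
        current.append(line.strip())
    if current:
        blocks.append(current)
    return blocks


def flatten_block(block):
    """Assumed flattening: title, one space, then all sequence lines concatenated."""
    title, sequence_lines = block[0], block[1:]
    return title + " " + "".join(sequence_lines)


def extract_matching(fasta_text, items):
    """Keep every block whose flattened form contains at least one item as a plain substring."""
    kept = []
    for block in split_blocks(fasta_text):
        flat = flatten_block(block)
        if any(item in flat for item in items):
            kept.append("\n".join(block))
    return kept


SAMPLE = """>seq1 GEN1 protein
ACGT
TTGA
>seq2 GEN11 protein
CCCC
>seq3 (GEN2) protein
GGGG
"""

# "GEN1" is a substring of "GEN11", so seq1 and seq2 are both kept.
by_keyword = extract_matching(SAMPLE, load_items("GEN1\n"))

# "GTTT" spans two sequence lines of seq1; it is found only because the block is flattened.
by_sequence = extract_matching(SAMPLE, load_items("GTTT\n"))
```

Because the comparison is a plain substring test, parentheses in an item such as (GEN2) are matched literally, and a short keyword also selects every longer identifier that contains it.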

5- edit parallel_extract_conf.txt to set the number of CPUs for parallel processing.

6- run the software with the command: perl parallel_extract_fasta-0.1.pl

7- processed files are in the “results” directory.

8- search log files are in the “log” directory.
