Post date: May 04, 2016 7:17:30 PM
1. Assess completeness of the genome assembly with BUSCO (this replaces CEGMA). The run should also generate the output I need to train SNAP, which has to be done as part of the MAKER2 pipeline. All dependencies were downloaded and compiled on my desktop, and I am running BUSCO there from ~/Local/cmacBusco (this seemed easier than installing all of the dependencies on kingspeak). Here is the command for BUSCO:
python ~/Downloads/BUSCO_v1.2/BUSCO_v1.2.py -o cmacbusco -in final.assembly.fasta -l ~/Downloads/arthropoda/ -m genome
Summary of results for the short run (a long run might give better results):
Summarized benchmarks in BUSCO notation:
C:32%[D:0.8%],F:22%,M:44%,n:2675
877 Complete BUSCOs
854 Complete and single-copy BUSCOs
23 Complete and duplicated BUSCOs
615 Fragmented BUSCOs
1183 Missing BUSCOs
2675 Total BUSCO groups searched
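As a quick sanity check on the summary above, the percentages in the BUSCO notation line can be recomputed from the raw counts. This is only a minimal sketch: the counts are copied by hand from the output above, the variable names are my own, and it does not read any BUSCO output file. Note that the notation line appears to drop the decimals rather than round them (32.8% is reported as 32%), which is an inference from these numbers and not something I checked in the BUSCO code.

```python
# Counts copied by hand from the BUSCO summary above (not parsed from a file).
counts = {
    "complete": 877,
    "single_copy": 854,
    "duplicated": 23,
    "fragmented": 615,
    "missing": 1183,
}
total = 2675  # total BUSCO groups searched

# The categories should be internally consistent.
assert counts["single_copy"] + counts["duplicated"] == counts["complete"]
assert counts["complete"] + counts["fragmented"] + counts["missing"] == total

# Percentage of the searched BUSCO groups in each category.
percent = {name: 100.0 * n / total for name, n in counts.items()}

for name, value in percent.items():
    print(f"{name}: {value:.1f}%")
```

Both consistency checks pass for these counts, so the three top-level categories (complete, fragmented, missing) account for all 2675 groups.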
2.