Directory for whole genome paired end reads: /uufs/chpc.utah.edu/common/home/gompert-group1/data/lycaeides/whole_genomes/sequences
In this directory I used the 3kb files.
Directory for alignments and variant calling: /uufs/chpc.utah.edu/common/home/gompert-group1/data/lycaeides/lyc_sv
Directory for the new PacBio corrected genome: /uufs/chpc.utah.edu/common/home/gompert-group3/data/LmelGenome/Lmel_dovetailPacBio_genome.fasta
Doing alignments:
Directory: /uufs/chpc.utah.edu/common/home/gompert-group1/data/lycaeides/lyc_sv/Alignments
Aligned the paired-end sequences to the dovetail melissa genome with bwa mem, using the script wrap_qsub_slurm_bwa_mem.pl, which was modified to align paired-end reads. This step created the sam files.
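For reference, a minimal sketch of what a paired-end bwa mem call for one sample looks like. The sample ID, fastq file names, and thread count are hypothetical placeholders, and the commands actually issued by wrap_qsub_slurm_bwa_mem.pl may differ. The function only prints the command so it can be checked first; the genome must already have been indexed once with bwa index.

```shell
#!/usr/bin/env bash
# Sketch only: sample ID, fastq names, and thread count are hypothetical.
REF=/uufs/chpc.utah.edu/common/home/gompert-group3/data/LmelGenome/Lmel_dovetailPacBio_genome.fasta
THREADS=8   # assumed thread count

# Print the bwa mem command for one sample's paired-end reads.
# Arguments: sample ID, forward-read fastq, reverse-read fastq.
bwa_mem_cmd() {
    local id=$1 r1=$2 r2=$3
    echo "bwa mem -t $THREADS $REF $r1 $r2 > $id.sam"
}

# Prints the command instead of running it.
bwa_mem_cmd sampleA sampleA_R1.fastq sampleA_R2.fastq
```

Passing both fastq files to bwa mem is what makes it a paired-end alignment; the printed line can be pasted into a SLURM job script or piped to bash.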
Converted sam files to bam files using the script wrap_qsub_slurm_sam2bam.pl.
These are the files aligned to the new genome.
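A rough sketch of the samtools steps that take one sam file to a sorted, indexed bam file. The file name is a hypothetical placeholder, and wrap_qsub_slurm_sam2bam.pl may use different options or file names. As above, the function only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: the sam file name is a hypothetical placeholder.

# Print the samtools commands for one sam file:
# convert to bam, sort by coordinate, then index the sorted bam.
sam2bam_cmds() {
    local sam=$1
    local base=${sam%.sam}   # strip the .sam extension
    echo "samtools view -b -o $base.bam $sam"
    echo "samtools sort -o $base.sorted.bam $base.bam"
    echo "samtools index $base.sorted.bam"
}

sam2bam_cmds sampleA.sam
```

By default samtools index writes the index next to the bam file with a .bai suffix added, so sampleA.sorted.bam would get sampleA.sorted.bam.bai.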
I then copied these sorted bam files with their indexes to the sv-callers/snakemake/data_lyc/bam folder.
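One possible way to do this copy while making sure every bam file travels with its index. The *.sorted.bam and *.sorted.bam.bai naming is an assumption, as is the helper name copy_bams; adjust both to match the real files.

```shell
#!/usr/bin/env bash
# Sketch only: assumes files are named <sample>.sorted.bam with a
# matching <sample>.sorted.bam.bai index in the same directory.

# Copy each sorted bam together with its index; skip bams without one.
# Arguments: source directory, destination directory.
copy_bams() {
    local src=$1 dest=$2 bam
    mkdir -p "$dest"
    for bam in "$src"/*.sorted.bam; do
        [ -e "$bam" ] || continue   # glob matched nothing
        if [ -e "$bam.bai" ]; then
            cp "$bam" "$bam.bai" "$dest"/
        else
            echo "missing index for $bam, skipped" >&2
        fi
    done
}

# Example call (paths relative to the lyc_sv directory are assumed):
# copy_bams Alignments Variantcalling/sv-callers/snakemake/data_lyc/bam
```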
Variant calling using sv-callers:
Directory: /uufs/chpc.utah.edu/common/home/gompert-group1/data/lycaeides/lyc_sv/Variantcalling
source activate wf
cd snakemake
Then edit analysis_lyc.yaml and the Snakefile so that all paths and settings are correct.
For running snakemake:
The cluster specifications are in the cluster.yaml file.
To submit the Snakemake jobs, run run_workflow.sh from the command line as ./run_workflow.sh
This will submit all the jobs in parallel to the cluster.
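Before submitting, a quick check that the files mentioned above are in place can save a failed submission. This is only a sketch: it assumes all four files sit in the snakemake directory, and check_workflow_files is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the four files below live in the same directory.

# Report any workflow file missing from the given directory.
# Returns 0 if all are present, 1 otherwise.
check_workflow_files() {
    local dir=$1 f missing=0
    for f in analysis_lyc.yaml Snakefile cluster.yaml run_workflow.sh; do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return $missing
}

# Example, run from inside the snakemake directory:
# check_workflow_files . && ./run_workflow.sh
```

Running snakemake -n (a dry run) in the same directory is another way to preview which jobs would be run without submitting anything.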