Post date: Jan 04, 2021 2:54:55 PM
How do I create a NEXUS file from RADseq data?
Skyline looks like a good model for estimating the TMRCA - https://taming-the-beast.org/tutorials/Skyline-plots/
What should I take away from the last meeting with Zach? I don't remember!
From vcf to nexus with radseq
https://support.bioconductor.org/p/87122/
Step 1 - convert vcf to nexus
I used the Python script available here:
https://github.com/edgardomortiz/vcf2phylip
Import vcf file from cluster:
(base) ipsec-172-16-84-37:~ rozenn$ scp u6028866@kingspeak.chpc.utah.edu:/uufs/chpc.utah.edu/common/home/u6028866/Pando/variants/pando_only_variants/filter_hets_80/pando_80_stringent_filter.vcf /Volumes/Data/Dropbox\ \(GaTech\)/Thesis/Fall2020/4.PandoProject/Analyses/1-vcf2nex/pando_8826.vcf
Add the header to the vcf file
1 - extract the header
2 - add the header to the file
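The two sub-steps above are not spelled out in these notes, so here is only a minimal sketch of one way to do them in Python. It assumes the header is taken from another VCF that lists the same samples, and that the header is every line starting with '#'. The function name and the demo records are hypothetical, not the commands actually used.

```python
def prepend_vcf_header(header_source_text, headerless_text):
    """Return headerless_text with the '#' lines of header_source_text on top."""
    # Step 1: extract the header, i.e. every line starting with '#'
    # (the '##' meta lines plus the '#CHROM' column line).
    header = [ln for ln in header_source_text.splitlines() if ln.startswith("#")]
    # Step 2: keep only the non-empty record lines of the other file
    # and put the extracted header in front of them.
    records = [ln for ln in headerless_text.splitlines()
               if ln and not ln.startswith("#")]
    return "\n".join(header + records) + "\n"


# Tiny in-memory demo with made-up records (not real Pando data).
with_header = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t10\t.\tA\tG\n"
without_header = "chr1\t25\t.\tC\tT\nchr2\t7\t.\tG\tA\n"
merged = prepend_vcf_header(with_header, without_header)
```

For real files, the two inputs would be read from disk and `merged` written out under whatever name the next step expects.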
(base) ipsec-172-16-84-37:1-vcf2nex rozenn$ ls
LICENSE README.md pando_8826.vcf vcf2phylip.py
(base) ipsec-172-16-84-37:1-vcf2nex rozenn$ python vcf2phylip.py -i ../2-addHeader2vcf/merged_8826.vcf -b -p
Converting file '../2-addHeader2vcf/merged_8826.vcf':
Number of samples in VCF: 109
Total of genotypes processed: 8826
Genotypes excluded because they exceeded the amount of missing data allowed: 0
Genotypes that passed missing data filter but were excluded for being MNPs: 0
SNPs that passed the filters: 8826
Biallelic SNPs selected for binary NEXUS: 8826
Sample 1 of 109, 'potr-017-S', added to the binary matrix.
Sample 2 of 109, 'potr-025-B', added to the binary matrix.
.
.
.
Sample 109 of 109, 'potr-277-S', added to the binary matrix.
Done!
This gives a .nexus file coded with 0, 1 and 2: 0 is the homozygous reference genotype, 1 is heterozygous and 2 is the homozygous alternate genotype. However, since we do not know which allele is the reference, we should only see 0s and 1s. I convert all 2s to 1s.
I did this in Excel. I opened the .nexus file, selected the lines containing the potr sample names and genotypes, then split them into columns with the "Text to Columns" option in the "Data" tab. I then replaced all 2s with 0s and saved the file. The file format was preserved.
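The same recoding could be scripted instead of done by hand. The sketch below is not what was actually run. It assumes the genotype rows sit between a MATRIX line and a closing ';' line, with each row being a sample name, whitespace, then the genotype string. Only the genotype string is changed, so the 2s in sample names such as potr-025-B and in the DIMENSIONS line are left alone. The function name and demo data are hypothetical, and the replacement character is a parameter.

```python
import re

# A matrix row: sample name, whitespace, genotype string (assumed layout).
ROW = re.compile(r"^(\s*\S+\s+)(\S+)\s*$")


def recode_nexus_matrix(nexus_text, old="2", new="0"):
    """Replace `old` with `new` in the genotype strings of the MATRIX block only."""
    out, in_matrix = [], False
    for line in nexus_text.splitlines():
        stripped = line.strip()
        if not in_matrix:
            # Header lines are copied unchanged; MATRIX opens the data block.
            in_matrix = stripped.upper() == "MATRIX"
            out.append(line)
        elif stripped.startswith(";"):
            # A line starting with ';' closes the matrix.
            in_matrix = False
            out.append(line)
        else:
            m = ROW.match(line)
            # Keep the name and its padding, recode only the genotypes.
            out.append(m.group(1) + m.group(2).replace(old, new) if m else line)
    return "\n".join(out) + "\n"


# Made-up two-sample example, not real data.
demo = ("#NEXUS\nBEGIN DATA;\n\tDIMENSIONS NTAX=2 NCHAR=4;\nMATRIX\n"
        "potr-017-S 0120\npotr-025-B 2201\n;\nEND;\n")
recoded = recode_nexus_matrix(demo, old="2", new="0")
```

Restricting the replacement to the matrix rows matters here because a global find-and-replace on "2" would also alter sample names and the NTAX/NCHAR counts.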