Post date: Oct 31, 2019 11:58:8 PM
10/31/2019 (Rozenn)
2023 alignments were completed.
We started compressing the alignment files from SAM to BAM.
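A minimal sketch of how the SAM to BAM conversion could be scripted, assuming samtools is installed and on the PATH. The tool we actually used is not recorded in this entry, so samtools is an assumption, and the function names and the directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def sam_to_bam_command(sam_path, extra_threads=0):
    """Build the samtools command that converts one SAM file to BAM."""
    sam = Path(sam_path)
    bam = sam.with_suffix(".bam")  # sample1.sam -> sample1.bam
    # -b: write BAM; -@: additional compression threads; -o: output file
    return ["samtools", "view", "-b", "-@", str(extra_threads),
            "-o", str(bam), str(sam)]


def convert_all(directory, extra_threads=0, dry_run=True):
    """Build (and optionally run) the command for every *.sam in a directory."""
    sam_files = sorted(Path(directory).glob("*.sam"))
    commands = [sam_to_bam_command(p, extra_threads) for p in sam_files]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failed conversion
            subprocess.run(cmd, check=True)
    return commands
```

With dry_run=True the function only returns the commands, which makes it possible to check them before launching the conversion of all the alignments.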
I looked up information about the genome we are using. The assembly is very fragmented, so we would like to compare the alignments against another reference genome.
popgenie.org has all the Populus reference genomes.
For Populus tremuloides:
v1.0 last modified 6/3/15
v1.1 last modified 9/25/16
Also available are:
Populus tremula
v1.1 last modified 2/14/19
v2.2 last modified 10/15/19
Populus tremula x Populus tremuloides
v1.0 last modified 6/3/15
Populus trichocarpa
v1.1 last modified 6/7/15
v2.2 last modified 9/20/17
v3.0 last modified 11/25/18
When I downloaded both releases of the P. tremuloides genome and compared them with cmp, they were exactly the same.
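The same check can be sketched in Python, as an alternative to cmp. This is only an illustration: the function names are hypothetical, and the file names of the two releases are not recorded here, so they are left as parameters.

```python
import filecmp
import hashlib


def files_identical(path_a, path_b):
    """Return True if the two files contain exactly the same bytes."""
    # shallow=False forces a content comparison instead of relying on
    # file size and modification time alone
    return filecmp.cmp(path_a, path_b, shallow=False)


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks, so a large FASTA is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The checksum is convenient for recording in the notebook which exact file was used, since two identical downloads give the same digest.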
According to Hamzeh et al. 2004 (Wiley) and Wang et al. 2014 (PLOS ONE), P. tremula is the closest relative of P. tremuloides.
P. tremula
grep -c "^>" Potra02_genome.fasta gives 1601 scaffolds
The sum of all scaffold characters is 417012181.
Odd: the file was not compressed. Is it smaller than the other genomes?
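grep -v "N" -c tremulaSeq gives 7 639 524, with 537941 Ns. Note that grep -c counts matching lines, so 7 639 524 is the number of lines containing no "N", not a number of characters or base pairs.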
What is the average length of the scaffolds?
Doing the same with the Populus tremuloides genome gives 4 669 075 without the "N", with 127374 Ns.
I don't understand: there are about 360 Mbp, i.e. 360 000 000 bp. Ask Zach tomorrow.
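To cross-check the counts above, here is a minimal sketch that reads a FASTA file once and counts scaffolds, bases and Ns per character rather than per line. The function name fasta_stats and the returned keys are hypothetical. It assumes a plain, uncompressed FASTA where header lines start with ">", and it counts both "N" and "n" as unknown bases.

```python
def fasta_stats(path):
    """Count scaffolds, total bases and N bases in a FASTA file."""
    n_scaffolds = 0
    total_bases = 0
    n_bases = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_scaffolds += 1  # one header line per scaffold
            else:
                total_bases += len(line)  # newlines are already stripped
                n_bases += line.upper().count("N")
    mean_length = total_bases / n_scaffolds if n_scaffolds else 0.0
    return {
        "scaffolds": n_scaffolds,
        "total_bases": total_bases,
        "n_bases": n_bases,
        "non_n_bases": total_bases - n_bases,
        "mean_scaffold_length": mean_length,
    }
```

Calling fasta_stats("Potra02_genome.fasta") would give the scaffold count, the total length and the average scaffold length in one pass. The header lines are excluded from the base counts, which may explain small differences from a raw character count.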