Post date: Nov 07, 2013 7:53:39 PM
There is now a figure projects/timema_wgwild/hmm/fstHmm3state.png that has SNPs colored by HMM state (red = high Fst, black = intermediate Fst, blue = low Fst).
We are pretty much set with the HMM analyses unless we want to run models with pre-defined parameters. Here are some key summaries of the HMM results:
##### proportion of SNPs in each state, 1 = high, 2 = intermediate, 3 = low #####
HVAxHVC: 0.13167292 0.84663295 0.02169413
MR1AxMR1C: 0.07653278 0.90173119 0.02173603
R12AxR12C: 0.19216628 0.78717088 0.02066284
LAxPRC: 0.29863857 0.68264369 0.01871774
So, most SNPs are in the intermediate category, but we have more high Fst SNPs than low Fst SNPs, and we have more high Fst SNPs for some population pairs than other.
##### summary of Fst for each state and population pair #######
HVA x HVC
state 1
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.00000 0.00473 0.01424 0.02200 0.03112 0.32310
state 2
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 0.0000 0.0013 0.0056 0.0121 0.0160 0.3576
state 3
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.000e+00 5.356e-07 2.237e-06 3.161e-06 5.195e-06 2.626e-05
MR1AxMR1C
state 1
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.000000 0.006126 0.018220 0.027810 0.039280 0.407200
state 2
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.00000 0.00159 0.00661 0.01443 0.01891 0.46140
state 3
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.000e+00 6.591e-07 2.710e-06 3.719e-06 6.155e-06 2.689e-05
R12AxR12C
state 1
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.000000 0.007105 0.019430 0.027130 0.038920 0.395100 1
state 2
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.00000 0.00147 0.00604 0.01271 0.01702 0.39750
state 3
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.000e+00 6.441e-07 2.706e-06 3.805e-06 6.204e-06 3.314e-05
LAxPRC
state 1
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.00000 0.01262 0.03497 0.05149 0.07145 0.79550
state 2
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.00000 0.00239 0.01009 0.02326 0.02955 0.73830
state 3
# Min. 1st Qu. Median Mean 3rd Qu. Max.
#0.000e+00 1.051e-06 4.376e-06 6.266e-06 1.033e-05 5.665e-05
One thing you will see here is that even the high Fst mean isn't super high, and that it is the mean not the maximum that really distinguishes high and intermediate Fst regions. Also the mean for high Fst SNPs differs among population pairs.
######## number and size of high Fst regions ##############
Number of regions
HVAxHVC = 7041
MR1AxMR1C = 3800
R12AxR12C = 10,628
LAxPRC = 18,135
Size (in SNPs)
#pops Min. 1stQu. Median Mean 3rdQu. Max.
#HVAxHVC 14 48 69 82.13 102 414
#MR1AxMR1C 19 52 74 88.45 110.2 470
#R12AxR12C 12 43 64 79.4 98 729
#LAxPRC 12 37 56 72.32 90 744
So, we have a large number of modest-sized high Fst regions (even the smallest ones include more than 10 SNPs), with some being rather large but not huge (the biggest ones are bigger for the last two population pairs, but the distributions are otherwise kind of similar).
####### model parameters ###########
mean and standard deviation for each state on the logit scale, followed by the state transistion matrix (probability of going from the row state to the column state)
HVAxHVC
mean -4.790926 -5.681323 -12.945876
sd 2.253342 2.253342 2.253342
[,1] [,2] [,3]
[1,] 0.93829792 0.06170208 0.00000000
[2,] 0.02915673 0.93574621 0.03509706
[3,] 0.00000000 0.92500598 0.07499402
MR1AxMR1C
mean -4.598096 -5.482681 -12.672086
sd 2.246369 2.246369 2.246369
[,1] [,2] [,3]
[1,] 0.94643689 0.05356311 0.00000000
[2,] 0.01765167 0.94949804 0.03285029
[3,] 0.00000000 0.93726652 0.06273348
R12AxR12C
mean -4.405266 -5.552374 -12.688071
sd 2.215803 2.215803 2.215803
[,1] [,2] [,3]
[1,] 0.9562962 0.04370382 0.00000000
[2,] 0.0187736 0.94785519 0.03337121
[3,] 0.0000000 0.92596841 0.07403159
LAxPRC
mean -3.731380 -4.997439 -12.207722
sd 2.251596 2.251596 2.251596
[,1] [,2] [,3]
[1,] 0.95608228 0.04391772 0.00000000
[2,] 0.02697389 0.93897042 0.03405570
[3,] 0.00000000 0.92273475 0.07726525
Notice you usually are most likely to stay in the state you are in, except for state 3 the low Fst state.
I am going to add much of this same information to the Timema genome WIKI.