Code/Scripts

In this section, I am including scripts and calculations from analyses used in my research.

The goal of this is to offer a small tool to new biologists and students interested in learning known methods for the analysis of population genetics and phylogenetics. Posts can be in English and Spanish.

The scripts with the R-code of statistical analysis to run morphological analysis developed in my PhD thesis using crayfish as a study model.

Integrating alpha, beta, taxonomic and phylogenetic diversity on anuran fauna in different environmental gradients of tropical forests in western Ecuador - Supplementary material

by Luis Amador, Juan Manuel Guayasamin and Mauricio Soto-Gamboa (2019)

The scripts with the R-code of statistical, ecological, and phylogenetic analysis and the input files to run and replicate the analysis can be found here.

Analysis of Molecular Variance (AMOVA)

For the first chapter of my PhD dissertation, I performed an Analysis of Molecular Variance (AMOVA; Excoffier et al., 1992) with the function amova from the pegas package (Paradis, 2010).

References

Excoffier L, Smouse PE, Quattro JM. 1992. Analysis of molecular variance inferred from metric distances among dna haplotypes: Application to human mitochondrial dna restriction data. Genetics 131: 479–491

Paradis E. 2010. pegas: an R package for population genetics with an integrated–modular approach. Bioinformatics 26: 419–420. 

####Analysis of MOlecular VAriance####


library(pegas)


data <-read.FASTA("my_fasta.fasta")

d.genetic <- dist.dna(data)


##Reading data frame where to find the hierarchical levels#

my.data<-read.delim("my_dataframe.txt", header=TRUE, sep="\t")


##Suppose that our matrix has four levels (Sample, Site, Region and Basin)#

ID <-my.data$Sample

site <-my.data$Site

clade <-my.data$Region

basin <-my.data$Basin


##Then we make "site", "clade" and ""basin" as factors. Factors are used to represent categorical data. Factors are an important class for statistical analysis like AMOVA#

sit <-factor(site)

cl <-factor(clade)

bas <-factor(basin)


### AMOVA one level

amova(d.genetic ~ cl, nperm = 10000) # 1 level

amova(d.genetic ~ sit, nperm = 10000) # 1 level

amova(d.genetic ~ bas, nperm = 10000) # 1 level

amova(d.genetic ~ ID, nperm = 10000) # 1 level


### AMOVA two levels

#Using clades

d_gp <- amova(d.genetic ~ cl/sit, nperm = 10000)

summary(d_gp)

sig2 <- setNames(d_gp$varcomp$sigma2, rownames(d_gp$varcomp))

getPhi(sig2) # Phi table

write.pegas.amova(d_gp, file = "amova1")


#Using basins

d_bs <- amova(d.genetic ~ bas/sit, nperm = 10000) # 2 levels

summary(d_bs)

sig2.1 <- setNames(d_bs$varcomp$sigma2, rownames(d_bs$varcomp))

getPhi(sig2.1)

write.pegas.amova(d_bs, file = "amova2")


###If you want to include the percentage of variation for each level in the table that will go in your manuscript, you have to do a little additional calculation. You have to divide each individual SSD by the total SSD and then multiply that value by 100###