20 - Choosing a network analysis method
Network analysis is commonly used in microbiome studies to identify keystone species and clusters of co-occurring or mutually exclusive taxa (Faust and Raes, 2012).
Numerous network analysis methods are available, but the most appropriate one can be selected objectively from properties of the data set, such as the effective number of species and matrix sparsity (Weiss et al., 2016). The effective number of species can be calculated from a matrix with OTUs in rows and samples in columns using the inverse Simpson index in R. In the code below, I use the 'diversity' function from the vegan package (Oksanen et al., 2018) and take the median of the resulting values.
library(vegan)
# MARGIN = 2 computes the index per sample (column), because OTUs are in rows
invsimpson <- round(median(diversity(matrix, index = "invsimpson", MARGIN = 2)), 2)
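To make the calculation concrete, here is a minimal base-R sketch of the inverse Simpson index, 1 / sum(p_i^2), computed per sample on a toy table. The object `otu` and its counts are hypothetical and exist only for illustration; the sketch assumes OTUs in rows, samples in columns, and at least one read in every sample.

```r
# Toy OTU table (hypothetical counts): 4 OTUs in rows, 3 samples in columns
otu <- matrix(c(10, 10, 10, 10,   # sample S1: four equally abundant OTUs
                40,  0,  0,  0,   # sample S2: a single OTU
                20, 20,  0,  0),  # sample S3: two equally abundant OTUs
              nrow = 4,
              dimnames = list(paste0("OTU", 1:4), paste0("S", 1:3)))

# Relative abundance of each OTU within each sample (each column sums to 1)
p <- sweep(otu, 2, colSums(otu), "/")

# Inverse Simpson index per sample: 1 / sum(p_i^2)
inv_simpson <- 1 / colSums(p^2)
inv_simpson          # S1 = 4, S2 = 1, S3 = 2

# Median across samples, the summary value used in this post
median(inv_simpson)  # 2
```

A sample with k equally abundant OTUs has an effective number of species of exactly k, which is why the three toy samples give 4, 1 and 2.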
Sparsity can be calculated as the percentage of zero entries in the same matrix.
zeros <- sum(matrix == 0)                      # number of zero entries
tot <- dim(matrix)[1] * dim(matrix)[2]         # total number of entries
zero.pct <- round((zeros / tot) * 100, 2)      # percentage of zeros
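The same percentage can be obtained in one step, because the mean of a logical matrix is the proportion of TRUE values. The sketch below checks this on a toy table; `otu` and its counts are hypothetical, and the sketch assumes the matrix contains no missing values.

```r
# Toy OTU table (hypothetical counts): 4 OTUs in rows, 3 samples in columns
otu <- matrix(c(10, 10, 10, 10,
                40,  0,  0,  0,
                20, 20,  0,  0),
              nrow = 4)

# Count-based calculation, as in the code above
zeros <- sum(otu == 0)
tot <- nrow(otu) * ncol(otu)
zero_pct_counts <- round((zeros / tot) * 100, 2)

# One-step equivalent: proportion of TRUE values in the logical matrix
zero_pct_mean <- round(mean(otu == 0) * 100, 2)

zero_pct_counts  # 41.67 (5 zeros out of 12 entries)
zero_pct_mean    # 41.67
```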
The key in Weiss et al. (2016, Fig. 7) can then be used as a guide to pick the network analysis method with the best accuracy or precision for the data. The code below assumes you are not working with time-series data.
method <- ifelse(invsimpson < 13, "SparCC",
          ifelse(zero.pct < 50, "LSA/MIC", "Ensemble:CoNet+Pearson"))
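For repeated use, the steps above can be wrapped in a single function. The following is only a sketch: the function name `choose_method` and its arguments are hypothetical, the cutoffs simply mirror the values used in the code above, and it assumes non-time-series data with OTUs in rows, samples in columns, and at least one read per sample. It uses base R in place of vegan so that it runs without extra packages.

```r
# Hypothetical helper: suggest a method from an OTU table (OTUs x samples)
choose_method <- function(otu, invsimpson_cutoff = 13, zero_cutoff = 50) {
  otu <- as.matrix(otu)

  # Median effective number of species (inverse Simpson) across samples
  p <- sweep(otu, 2, colSums(otu), "/")
  invsimpson <- median(1 / colSums(p^2))

  # Sparsity: percentage of zero entries in the table
  zero_pct <- mean(otu == 0) * 100

  # Same decision rule as the nested ifelse() above
  if (invsimpson < invsimpson_cutoff) {
    "SparCC"
  } else if (zero_pct < zero_cutoff) {
    "LSA/MIC"
  } else {
    "Ensemble:CoNet+Pearson"
  }
}

# Toy inputs (hypothetical counts)
low_div <- matrix(c(10, 10, 40, 0), nrow = 2)  # few effective species
even    <- matrix(5, nrow = 20, ncol = 3)      # 20 even OTUs, no zeros
sparse  <- rbind(even, matrix(0, 20, 3))       # same diversity, 50% zeros

choose_method(low_div)  # "SparCC"
choose_method(even)     # "LSA/MIC"
choose_method(sparse)   # "Ensemble:CoNet+Pearson"
```

Plain if/else is used here in place of ifelse() because both inputs are single values, so the vectorised form is not needed.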
References
Faust K, Raes J (2012) Microbial interactions: from networks to models. Nat Rev Microbiol 10:538–550. https://doi.org/10.1038/nrmicro2832
Oksanen J, Blanchet FG, Friendly M, et al (2018) vegan: Community Ecology Package. R package version 2.5-2. https://CRAN.R-project.org/package=vegan
Weiss S, Van Treuren W, Lozupone C, et al (2016) Correlation detection strategies in microbial data sets vary widely in sensitivity and precision. ISME J 10:1669–1681. https://doi.org/10.1038/ismej.2015.235