Folder : microbiome_Lmelissa
Folders:
1). qiimeAnalysis_files : QIIME data analysis files
-- qiime_workflow : describes the entire workflow with files generated at each step
sub folders in this folder:
A). open_reference_clust : This folder contains files from open reference clustering in QIIME.
Here are the main folders:
removed_chlor_mito_wolba : has the rarefaction files and main files after removing chloroplast, mitochondria and Wolbachia sequences and the plant samples. This is the folder with the files used in the manuscript.
This analysis started from the otu_no_singleton file (otu_table_no_singletons.biom, listed under more_files_qiime below), on which I performed the following steps:
-- summarize_taxa.py
-- biom summarize-table
-- filter_samples_from_otu_table.py : remove plant samples
-- filter_taxa_from_otu_table.py : remove chloroplast, mitochondria and wolbachia
-- single_rarefaction.py : 500 and 1000 rarefaction
500_rarefaction : we chose a rarefaction depth of 500 based on the biom table summary.
otu_table_table_summarized_alldata : mapping file without rarefaction
biom_table_summaries : summaries of biom tables after choosing different rarefaction levels
otu_table_500.biom : 500 rarefaction file
otu_table_1000.biom : 1000 rarefaction file
otu_table_filtered.biom : file after removing chloroplast, mito and wolbachia
otu_table_filtered_samples.biom : biom table after removing samples
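The taxon filtering and rarefaction steps listed above for removed_chlor_mito_wolba can be illustrated with a minimal Python sketch. This is not the QIIME code that produced these files; it only shows the idea. The table layout (a dict of per-sample OTU counts), the sample and OTU names, and the taxonomy strings are all hypothetical, and matching taxa by substring is an assumption. Samples with fewer reads than the chosen depth are dropped, which is how rarefying to an even depth normally behaves.

```python
import random

# Hypothetical labels; the real strings depend on the reference taxonomy used.
UNWANTED = ("chloroplast", "mitochondria", "wolbachia")


def drop_unwanted_otus(table, taxonomy, unwanted=UNWANTED):
    """Remove OTUs whose lineage string mentions any unwanted label.

    table:    {sample_id: {otu_id: count}}
    taxonomy: {otu_id: lineage string}
    """
    def keep(otu_id):
        lineage = taxonomy.get(otu_id, "").lower()
        return not any(label in lineage for label in unwanted)

    return {
        sample: {otu: n for otu, n in counts.items() if keep(otu)}
        for sample, counts in table.items()
    }


def rarefy(table, depth, seed=0):
    """Subsample every sample to `depth` reads without replacement.

    Samples with fewer than `depth` reads are dropped.
    """
    rng = random.Random(seed)
    rarefied = {}
    for sample, counts in table.items():
        if sum(counts.values()) < depth:
            continue  # too shallow for this depth
        # One list entry per read, then draw `depth` reads at random.
        pool = [otu for otu, n in counts.items() for _ in range(n)]
        sub = {}
        for otu in rng.sample(pool, depth):
            sub[otu] = sub.get(otu, 0) + 1
        rarefied[sample] = sub
    return rarefied


# Made-up example data.
table = {
    "larva1": {"otu1": 400, "otu2": 300, "otu3": 50},
    "frass1": {"otu1": 100, "otu2": 20, "otu3": 500},
}
taxonomy = {
    "otu1": "k__Bacteria; p__Firmicutes",
    "otu2": "k__Bacteria; p__Proteobacteria",
    "otu3": "k__Bacteria; p__Cyanobacteria; c__Chloroplast",
}

filtered = drop_unwanted_otus(table, taxonomy)
rarefied_500 = rarefy(filtered, depth=500)
```

In this made-up example frass1 has only 120 reads left after filtering, so it drops out at a depth of 500. This is the kind of trade-off the biom table summaries are used to judge.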
reanalysis_nochloro_mito :
has the rarefaction files and main files after removing chloroplast and mitochondria.
no chloroplast : has files after we removed chloroplast sequences
-has mapping files after rarefaction.
-main file is otu_table_1311.biom
-also otu_table_no_chlorMito.biom
removed_chloroplast_mitochondria : after removing both
-has rarefaction files
Here are all the other folders, which hold the files that led to the otu_no_singleton file used for further summarizing, filtering and rarefying of the data for the final analysis:
1). rarefied_otu_tables : contains all the rarefaction files
2). mapping_files : contains all mapping files from rarefactions
3). single_rarefaction : single rarefaction files
4). prefilter_otus : seqs_otus.txt, seqs_otus.log, seqs_failures.txt, seqs_clusters.uc, prefiltered_seqs.fna
5). step1_otus : failures.fasta, prefiltered_seqs_clusters.uc, prefiltered_seqs_failures.txt, prefiltered_seqs_otus.log, prefiltered_seqs_otus.txt, step1_rep_Set.fna
6). step2_otus : step2_rep_set.fna, subsampled_failures.fasta, subsampled_failures_clusters.uc, subsampled_failures_otus.log, subsampled_failures_otus.txt
7). step3_otus : failures_clusters.uc, failures_failures.fasta, failures_failures.txt, failures_otus.log, failures_otus.txt
8). step4_otus : failures_failures_clusters.uc, failures_failures_otus.log, failures_failures_otus.txt, step4_rep_set.fna
9). step5_otu_map : final_otu_map.txt, final_otu_map_mc2.txt
10). step6_otu_tables : otu_table_mc2.biom, otu_table_mc2_w_tax.biom, otu_table_mc2_w_tax_no_pynast_failures.biom, rep_set.tre
11). pynast_aligned_seqs : rep_set_aligned.fasta, rep_set_Aligned_pfiltered.fasta, rep_set_failures.fasta, rep_set_log.txt
12). 10kout : mapping file from 10k rarefaction
13). mapping_file_fulldata : mapping file for full data
14). more_files_qiime : otu_table_no_singletons.biom (output of filter_otus_from_otu_table.py), mapping_file_10k.txt (copy of the mapping file_10k rarefaction), mapping_file_fulldata.txt (copy of the mapping file for fulldata), rep_set.fna (file from step 5)
15). uclust_assigned_taxonomy (folder from assign_taxonomy.py) : rep_set_tax_assignments.log, rep_set_tax_assignments.txt
16). taxa_summaries: (alldata),500,1000,1500,2000 : files from summarize_taxa_through_plots.py for all the rarefied data and full data
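The otu_table_no_singletons.biom file under more_files_qiime is the output of filter_otus_from_otu_table.py. As a rough illustration of what removing singletons means, here is a minimal Python sketch. It assumes a singleton is an OTU whose total count across all samples is 1 (this README does not state the threshold that was used), and the table layout and names are hypothetical.

```python
def drop_singletons(table, min_count=2):
    """Keep only OTUs whose total count over all samples is >= min_count.

    table: {sample_id: {otu_id: count}}
    min_count=2 is an assumed threshold (drops OTUs seen exactly once).
    """
    totals = {}
    for counts in table.values():
        for otu, n in counts.items():
            totals[otu] = totals.get(otu, 0) + n

    kept = {otu for otu, total in totals.items() if total >= min_count}
    return {
        sample: {otu: n for otu, n in counts.items() if otu in kept}
        for sample, counts in table.items()
    }


# Made-up example: otuC occurs once in total and is removed;
# otuB occurs once per sample but twice overall, so it stays.
example = {
    "s1": {"otuA": 10, "otuB": 1, "otuC": 1},
    "s2": {"otuA": 4, "otuB": 1},
}
no_singletons = drop_singletons(example)
```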
2). dataAnalysis_results : analysis files, figures, R code
4 folders:
Folder1 : analysis1_finalresults_manuscript
Files in this folder:
PCAs.eps, heatmap_final.eps : final figures for the manuscript
mapping_file_L4_filtered.txt : file after rarefaction, with chloroplast, mitochondria and Wolbachia sequences and the plant samples removed. This is the file used for the PCA figure, the heatmap and all further analysis.
mapping_file_L4_1311_alldata : file after rarefaction, with only chloroplast and mitochondria sequences removed. This file was used for the bacterial composition figure and PCAs.
randomForestSummaries_final
PC_PCoA_all.txt, PC_PCoA_fl.txt : Files with PCA on chord distances data and PCoA data for Zach's analysis.
PCoA.eps : principal coordinates analysis plot
PCACord.eps : PCA on chord distances plot
pcoa_pcaChord_code.R : R code for PCoA analysis and PCA on chord distances analysis and plotting
OTURelativeAbundances_means.txt : Means of relative abundances across samples for OTUs calculated using mapping_file_L4_1311.txt
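For readers unfamiliar with the "PCA on chord distances" mentioned for PC_PCoA_all.txt, PCACord.eps and pcoa_pcaChord_code.R above: the chord transformation scales each sample's abundance vector to unit length, and the Euclidean distance between two transformed samples is the chord distance. A PCA run on the transformed table therefore preserves chord distances. The Python sketch below only illustrates the transformation. The actual analysis is in the R script, and the function names here are hypothetical.

```python
import math


def chord_transform(counts):
    """Scale one sample's abundance vector to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in counts))
    if norm == 0:
        raise ValueError("sample has no counts")
    return [c / norm for c in counts]


def chord_distance(a, b):
    """Euclidean distance between the chord-transformed vectors a and b."""
    ta, tb = chord_transform(a), chord_transform(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ta, tb)))


# Samples with no taxa in common are sqrt(2) apart, the maximum possible.
d_max = chord_distance([1, 0], [0, 1])

# Samples with the same proportions are (numerically) 0 apart,
# regardless of sequencing depth.
d_same = chord_distance([1, 2, 3], [2, 4, 6])
```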
Folder: Bacterial composition
bac_comp.csv : main file with top five bacteria and others
mapping_file_L4_1311.txt : main file to calculate relative abundances for others and take the top 5 bacteria
top_bactria.csv, topBacteria.csv, others : top_bactria.csv and topBacteria.csv contain only the top bacteria (modified to change IDs for the main figure). others is the file without the top 5 bacteria, used to calculate relative abundances.
bacComps.pdf : final figure in the manuscript
bacComps.svg : ImageMagick editing file
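The relative abundance and "top five plus others" files described above can be illustrated with a minimal Python sketch. It assumes the top taxa are ranked by mean relative abundance across samples, which this README does not state. The table layout, taxon names and the "Others" label are hypothetical.

```python
def relative_abundances(table):
    """Convert counts to per-sample proportions.

    table: {sample_id: {taxon: count}}
    """
    rel = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        rel[sample] = {t: n / total for t, n in counts.items()} if total else {}
    return rel


def mean_abundances(rel):
    """Mean relative abundance of each taxon across all samples."""
    taxa = {t for row in rel.values() for t in row}
    n_samples = len(rel)
    return {
        t: sum(row.get(t, 0.0) for row in rel.values()) / n_samples
        for t in taxa
    }


def top_n_with_others(rel, n=5):
    """Keep the n taxa with the highest mean abundance; lump the rest."""
    means = mean_abundances(rel)
    # Sort by descending mean, then by name so ties are deterministic.
    top = sorted(means, key=lambda t: (-means[t], t))[:n]
    result = {}
    for sample, row in rel.items():
        kept = {t: row.get(t, 0.0) for t in top}
        kept["Others"] = sum(v for t, v in row.items() if t not in top)
        result[sample] = kept
    return result


# Made-up counts for seven taxa in two samples.
counts = {
    "larva1": {"A": 50, "B": 20, "C": 10, "D": 8, "E": 6, "F": 4, "G": 2},
    "frass1": {"A": 10, "B": 40, "C": 20, "D": 10, "E": 10, "F": 5, "G": 5},
}
rel = relative_abundances(counts)
composition = top_n_with_others(rel, n=5)
```

Each row of composition has six entries (five taxa plus "Others") that sum to 1, which is the shape needed for a stacked bar chart of bacterial composition.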
Folder2 : analysis2_noChloroplastMitochondria
This folder contains files from the analysis after we removed chloroplast and mitochondria but had not yet accounted for Wolbachia.
Files in this folder :
Figures: PCAs, LDAs, bacterialcomposition, dendrograms, randomForestSummaries
Files:
frass_larvae.csv : main file for the analysis
mapping_file_L4_1311 : mapping file after rarefaction from QIIME
age+frass+larvae.csv : files for test analysis in R
heirClust : heatmaps and dendrograms from trial-and-error runs of this analysis
Folder3 : R_code
Files in this folder :
microbiome_analysis_final.R : final code for analysis for the manuscript and for all the figures
fancyRplots_Plos.R : just for fancy plots with previous data
code.R : this is the code for the first analysis for the manuscript
normalization_edgeR : code for normalization in edgeR
adegenet_code.R : code for analysis in adegenet
Folder4 : qiime_removed_chlo_mito_wolba
This folder contains files from the QIIME analysis. It is described above under the open reference folder (open_reference_clust).
3). fastq_files : raw FASTQ files
4). old_analysis_files
Folder1 : first_analysis_closed_reference
Contains all the files from the old analysis with closed reference OTU picking in QIIME.
Folder2 : final_results_old
Contains files from the analysis before we removed chloroplast and mitochondria. These are very old results, where we did ANOVA etc. There are lots of files and folders here, and I don't think we will need any of them, but I have kept everything so that we can go back to the initial results if we need to.