I believe in open-access science. For each publication, I try to provide clear methods with supplements, and I publicly archive as much data as possible. Links to supplementary material and data repository archives (e.g. Dryad) are provided alongside each publication. Supplements include additional text, tables, figures, analysis input and output files, and command scripts (e.g. R, Unix, Python). I've also uploaded most command scripts and input files to my GitHub page (fvaux).
In case further information or clarification is needed, I summarise the samples and data for some projects below.
Feel free to contact me if there are any questions!
Two papers, one dataset
Please note that the 2023 Rārangi and 2024 Kaikōura papers share the same dataset, and so I've combined the archiving of their data and code.
Samples
All kelp tissue clippings are stored at the University of Otago (contact Jon Waters and Ceridwen Fraser).
Sample details for all specimens in both studies are provided in the Excel files stored in the GitHub repository.
Genetic data
The GBS data is stored as demultiplexed forward and reverse reads within the NCBI sequence read archive (PRJNA780921).
Please contact Jon Waters and Ceridwen Fraser if you want copies of the raw, multiplexed Illumina DNA sequence reads (77 GB).
They intend to archive all multiplexed sequence data for the southern bull kelp projects on NCBI in the near future.
Commands and input files for Stacks, VCFtools, plink, vcf2phylo and other R packages are stored in a repository on GitHub (Zenodo release, kaikoura_d_antarctica_GBS).
SNP PCAs were generated using the adegenet R package, following this tutorial.
Missing data plots were generated using the vcfR R package, following the software documentation.
Genotype files (VCF, genepop, structure), fasta consensus files, and phylogenetic alignments and tree files for both studies are provided in the GitHub repository.
Files for the connectivity analysis in the 2024 Kaikōura paper are stored in the GitHub repository.
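For anyone who wants to locate the demultiplexed reads programmatically, the sketch below builds an NCBI E-utilities search URL for a BioProject accession. It is only an illustration, not part of the archived scripts: the function name `sra_search_url` and the `retmax` default are my own assumptions, and the code only constructs the URL (it does not contact NCBI).

```python
from urllib.parse import urlencode

# NCBI E-utilities esearch endpoint
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def sra_search_url(bioproject, retmax=500):
    """Build an esearch URL listing SRA records linked to a BioProject.

    `sra_search_url` and the `retmax` default are hypothetical choices
    made for this example.
    """
    params = {
        "db": "sra",                          # search the Sequence Read Archive
        "term": f"{bioproject}[BioProject]",  # restrict to one BioProject
        "retmax": retmax,                     # maximum number of record IDs returned
        "retmode": "json",                    # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# BioProject accession for the Rārangi and Kaikōura dataset
url = sra_search_url("PRJNA780921")
print(url)
```

The same function should work for any other BioProject accession; the URL can be opened in a browser or passed to a download tool of your choice.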
Extra
I've made a video walkthrough of the DNA extraction and DNA purification methods used for this research.
I've uploaded a silhouette for Durvillaea antarctica to PhyloPic (here).
Samples
All kelp tissue clippings are stored at the University of Otago (contact Jon Waters and Ceridwen Fraser).
Sample details for all specimens are provided in the Dryad supplement for Vaux et al. 2022 Mol. Ecol.
Genetic data
The GBS data is stored as demultiplexed forward and reverse reads within the NCBI sequence read archive (PRJNA769149).
Please contact Jon Waters and Ceridwen Fraser if you want copies of the raw, multiplexed Illumina DNA sequence reads (56 GB).
They intend to archive all multiplexed sequence data for the southern bull kelp projects on NCBI in the near future.
Commands and input files for Stacks, VCFtools, plink, vcf2phylo, delimitR and other R packages are included in the Dryad supplement, and are also stored in a repository on GitHub (turakirae_d_antarctica_GBS).
SNP PCAs were generated using the adegenet R package, following this tutorial.
Missing data plots were generated using the vcfR R package, following the software documentation.
Genotype files (VCF, genepop, structure), fasta consensus files, and phylogenetic alignments and tree files are provided in the Dryad supplement for Vaux et al. 2022 Mol. Ecol.
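As a rough idea of what the missing data plots summarise, the sketch below counts the fraction of sites with a missing genotype for each sample in a VCF file. This is not the vcfR workflow used for the papers; it is a plain Python stand-in, and the function name `missing_rate_per_sample` and the two sample names are made up for the example.

```python
def missing_rate_per_sample(vcf_lines):
    """Return {sample: fraction of sites with a missing genotype}.

    Hypothetical helper for illustration; expects an iterable of VCF text lines.
    """
    samples = []
    missing = {}
    n_sites = 0
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            missing = {s: 0 for s in samples}
            continue
        n_sites += 1
        for sample, call in zip(samples, fields[9:]):
            gt = call.split(":")[0]  # GT is the first FORMAT subfield when present
            alleles = gt.replace("|", "/").split("/")
            if "." in alleles:  # any missing allele counts as a missing genotype
                missing[sample] += 1
    return {s: (missing[s] / n_sites if n_sites else 0.0) for s in samples}


# Tiny made-up VCF with two samples and two sites
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tkelp_A\tkelp_B",
    "1\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t./.",
    "1\t20\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t1/1",
]
rates = missing_rate_per_sample(example)
print(rates)  # {'kelp_A': 0.0, 'kelp_B': 0.5}
```

To run it on one of the archived genotype files, pass an open file handle instead of the example list.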
Extra
I've made a video walkthrough of the DNA extraction and DNA purification methods used for this research.
I've uploaded a silhouette for Durvillaea antarctica to PhyloPic (here).
Samples
All kelp tissue clippings are stored at the University of Otago (contact Jon Waters and Ceridwen Fraser).
Sample details for all specimens are provided in the Dryad supplement for Vaux et al. 2021 J. Phycol.
Genetic data
The GBS data is stored as demultiplexed forward and reverse reads within the NCBI sequence read archive (PRJNA683976).
Please contact Jon Waters and Ceridwen Fraser if you want copies of the raw, multiplexed Illumina DNA sequence reads (39 GB).
They intend to archive all multiplexed sequence data for the southern bull kelp projects on NCBI in the near future.
Commands and input files for Stacks, VCFtools, plink, vcf2phylo and R packages are included in the Dryad supplement, and are also stored in a repository on GitHub (north_island_d_poha_GBS).
SNP PCAs were generated using the adegenet R package, following this tutorial.
Missing data plots were generated using the vcfR R package, following the software documentation.
Genotype files (VCF, genepop, structure), fasta consensus files, and phylogenetic alignments and tree files are provided in the Dryad supplement for Vaux et al. 2021 J. Phycol.
Sequences for the PCR amplified regions of mtDNA COI, including GenBank accession numbers, are provided in the Dryad supplement for Vaux et al. 2021 J. Phycol.
Extra
I've made a video walkthrough of the DNA extraction and DNA purification methods used for this research.
Samples
Tissue clippings from the North Pacific are held by the NOAA Southwest Fisheries Science Center. I recommend contacting John Hyde.
Tissue clippings from New Caledonia are held in the Pacific Community (SPC) Pacific Marine Specimen Bank.
Tissue clippings from Tasmania were shared by Peter Grewe at CSIRO.
Sample details for all specimens are listed in the Dryad supplement for Vaux et al. 2021 Evol. Appl.
Genetic data
The RAD sequencing data is stored as demultiplexed forward and reverse reads within the NCBI sequence read archive (PRJNA579774).
Linked to this, our samples should appear on GEOME soon.
Please contact Kathleen O'Malley if you want copies of the raw, multiplexed Illumina DNA sequence reads (>1 TB).
Genotype files (VCF, genepop, structure) and fasta consensus files for the 84 putatively adaptive loci are provided in the Dryad supplement for Vaux et al. 2021 Evol. Appl.
Commands and input files for Stacks, VCFtools, plink, paralog-finder, BWA and R packages are included in the Dryad supplement, and are also stored in a repository on GitHub (pacific_albacore_ddRADseq).
SNP PCAs were generated using the adegenet R package, following this tutorial.
Extra
In October 2019, I presented an HMSC seminar covering most of our results, which is available to watch for free.
I've uploaded two albacore silhouettes to PhyloPic (here and here).
Samples
All tissue clippings are held by the State Fisheries Genomics Lab and are recorded in the lab's progeny database. Please contact Kathleen O'Malley.
All otoliths should be held by the Oregon Department of Fish and Wildlife. Please contact Leif Rasmuson.
Sample details for all specimens are listed in the Dryad supplement for Vaux et al. 2019 Ecol. Evol.
Genetic data
The RAD sequencing data is stored as demultiplexed forward reads within the NCBI sequence read archive (PRJNA560239).
Linked to this, our samples should appear on GEOME soon.
Please contact Kathleen O'Malley if you want copies of the raw, multiplexed Illumina DNA sequence reads (>100 GB).
Genotype files (VCF, genepop, structure) and fasta consensus files for the 92 outlier loci are provided in the Dryad supplement for Vaux et al. 2019 Ecol. Evol.
Commands and input files for Stacks, VCFtools, plink, paralog-finder and BWA are stored in a repository on GitHub (deacon_rockfish_RADseq).
SNP PCAs were generated using the adegenet R package, following this tutorial.
Other analyses in R followed the same methods as the R scripts provided for albacore on GitHub.
Morphometric data
Please contact Leif Rasmuson if you want copies of the photographs (~3 GB).
We followed the R script for running the analysis provided in the original ShapeR paper (Libungan and Pálsson 2015 PLOS ONE) and package GitHub page.
Leif or I can also provide our copy of the script, if necessary.
Extra
In January 2020, Leif Rasmuson presented an HMSC seminar on the biology of deacon rockfish, which included some of the findings of this study. Watch for free here.
The deacon rockfish silhouette is uploaded to PhyloPic.
Samples
All samples are held in the Museum of New Zealand Te Papa Tongarewa.
Sample details for all specimens are listed in the Dryad supplement for Vaux et al. 2019 Palaeontology.
The same details should also be listed in the Museum of New Zealand Te Papa Tongarewa database.
Morphometric data
All morphometric data is available in the MorphoJ .csv input format in the Dryad supplement for Vaux et al. 2019 Palaeontology.
Please contact me if you want copies of the photographs (14 GB) or digitisation files (<1 GB).
All R scripts for morphometric analyses are available in a GitHub repository (PhD_morphometrics).
Samples
Sample details for all genetic and morphometric specimens are listed in the Dryad supplement of Vaux et al. 2020 Syst. Biol., although details vary based on the information available from the museum collections.
Taxonomic identifications are likely to be wrong for some samples, given the revisions that occurred during my research. It should be straightforward to trace changes using the publications, but feel free to contact me if you get lost.
The majority of samples are held in the collections listed below, and the museum databases should provide sample details for many specimens:
Additional samples came from GNS, museum collections at the University of Auckland, Massey University (Ecology), and Victoria University of Wellington, as well as the Natural History Museum, London, the Natural History Museum of Los Angeles County, and Nagoya University.
Please organise new loans through these institutions to access specimens, as my colleagues and I do not have the right to share material for most loans - including DNA extractions.
International researchers interested in New Zealand specimens should also be mindful of the representation and involvement of tangata whenua, and should conduct proper indigenous consultation before starting research.
Geological details for fossil sample sites are often provided here:
Genetic data
All of the ddRAD, mitochondrial DNA, and nuclear ribosomal DNA sequence data is archived on GenBank.
All accession numbers are listed in the Dryad supplement of Vaux et al. 2020 Syst. Biol.
Most accession numbers are also listed in the main texts of Vaux et al. 2018. Mol. Phylogenet. Evol. and Kantor et al. 2020 Zoosystema.
Some of the mitochondrial and nuclear ribosomal DNA sequences are unverified on GenBank and so will not show up in NCBI BLAST results.
The ddRAD sequence data for the 18 sequenced Penion individuals is stored as demultiplexed forward reads within the NCBI sequence read archive (PRJNA564825).
Please contact me if you want a copy of the raw, multiplexed Illumina DNA sequence reads (~2 GB).
Genotype files (genepop, structure) for the ddRAD analyses are provided in the Dryad supplement of Vaux et al. 2020 Syst. Biol.
The command scripts (Unix and R) are saved in the Dryad supplement of Vaux et al. 2020 Syst. Biol.
SNP PCAs were generated using the adegenet R package, following this tutorial.
Please contact Simon Hills and Mary Morgan-Richards if you want the raw Illumina DNA sequence reads (<100 GB) for the high-throughput sequenced samples.
Morphometric data
All morphometric data is available in the MorphoJ .csv input format in the data supplements (here and here) for Vaux et al. 2018. Mol. Phylogenet. Evol. and Vaux et al. 2020 Syst. Biol.
These studies used different sets of samples, but the guide Excel files should make it easy to combine everything.
Some specimen photographs are hosted by the museum databases listed above.
Please contact me if you want copies of the photographs (174 GB) or digitisation files (<1 GB).
All R scripts for morphometric analyses are available in a GitHub repository (PhD_morphometrics).
Extra
As part of NZFauna (led by Daniel Thomas), 3D models were produced for some shells (not used for any analysis, but cool!):
I've uploaded silhouettes of Antarctoneptunea, Kelletia and Penion to PhyloPic.