Teresita M. Porter

Posts

21 - [metabarcoding][network analysis] Using CoNet with Cytoscape

CoNet is an ensemble network analysis method that combines several similarity and dissimilarity methods into a single tool.

20 - [metabarcoding][network analysis] Choosing a network analysis method

Network analysis is commonly used in microbiome studies to identify keystone species and clusters of co-occurring or co-exclusionary species.

19 - [metabarcoding] Misunderstanding of BLAST max_target_seqs setting and implications for programs with similar behaviour

A recent paper by Shah et al. (2018) explains the default behaviour of BLAST when the -max_target_seqs setting is used.

18 - [metabarcoding] So what about non-destructive COI metabarcoding?

Non-destructive COI metabarcoding refers to using a sample preparation method where the individual or bulk community is not homogenized prior to DNA extraction or direct PCR.

17 - [metabarcoding][databases] Status of COI records in GenBank and implications for future record re-usability

A new preprint discussing the status of COI records in GenBank shows growth over time, the onset of an increasing proportion of insufficiently identified records, and uneven levels of metadata annotation.

16 - [metabarcoding][normalization] Dealing with different library sizes after high throughput metabarcode sequencing

How to deal with variable sequence library sizes boils down to which objective is being addressed: 1) normalization prior to alpha/beta diversity analyses, or 2) normalization prior to differential abundance analysis (DAA).

15 - [linux] Working with compressed files saves disk space

Bioinformatic processing of high throughput sequencing data uses batches of large files as input and creates large batches of output files at nearly every processing step, which can quickly consume a lot of disk space!
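
As a rough illustration of working with a compressed file directly, here is a minimal Python sketch that counts the records in a gzipped FASTA file without first writing an uncompressed copy to disk. The function name `count_fasta_records` and the demo file name are hypothetical, and the linked post may use different command-line tools.

```python
import gzip
import os
import tempfile


def count_fasta_records(path):
    """Count FASTA header lines in a gzip-compressed file, streaming line by line."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Build a small demo file (hypothetical name and contents) in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.fasta.gz")
with gzip.open(demo_path, "wt") as out:
    out.write(">seq1\nACGT\n>seq2\nGGCC\n")

record_count = count_fasta_records(demo_path)
print(record_count)
```

Because the file is read as a stream, only one line is held in memory at a time and nothing is decompressed to disk.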

14 - [metabarcoding] OTUs versus ESVs

So we've all been using operational taxonomic units (OTUs) since the 2000s but now everyone is talking about exact sequence variants (ESVs).

13 - [linux][Perl] Quickly rename files

Renaming batches of poorly named files can be facilitated using command-line tools, but first we need to clear up confusion between the Linux rename utility and the Perl rename script.

12 - [RDPclassifier] High throughput CO1 metabarcode taxonomic assignments

A comparison of the popular top BLAST hit and RDP classifier methods.

11 - [Excel] Make a fast presence-absence matrix from a matrix with abundance values

You don't want to work with abundance data though so let's convert it quickly into a presence-absence matrix.
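
The post itself works in Excel. For readers who prefer a script, the same idea can be sketched in Python: every abundance greater than zero becomes 1 and everything else becomes 0. The function name `to_presence_absence` and the example matrix are hypothetical, made up purely for illustration.

```python
def to_presence_absence(matrix):
    """Return a new matrix where every value greater than 0 is replaced by 1."""
    return [[1 if value > 0 else 0 for value in row] for row in matrix]


# Hypothetical abundance matrix: rows are samples, columns are taxa.
abundance = [
    [12, 0, 3],
    [0, 0, 7],
    [5, 1, 0],
]

presence_absence = to_presence_absence(abundance)
for row in presence_absence:
    print(row)
```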

10 - [UNIX] Installing bcl2fastq 1.8.4 on Ubuntu 12.04.5

When .bcl files were generated with Illumina RTA < 1.18.54, the older bcl2fastq v1.8.4 needs to be used to convert base calls to fastq files. Since Illumina does not provide a package that can be installed by apt-get on Ubuntu, I've compiled the steps I had to take to get this older software up and running.

9 - [BioPerl] Getting taxonomic lineage information out of MEGAN

The free-to-use community edition of MEGAN6 can parse BLAST output under varying stringency criteria to help separate good taxonomic assignments from not-so-good ones. It is a nice alternative to the widely used top BLAST hit approach to taxonomic assignment.

8 - [BioPerl] A note on mining taxonomic information from GenBank

BioPerl offers modules to easily allow taxonomic information to be mined from GenBank, but what happens when taxids are updated/deleted/merged by GenBank staff?

7 - [UNIX] Tricks for processing massive fasta files for BLAST searches

With the increasing use of next-generation sequencing comes the problem of how to efficiently process massive fasta files for BLAST searches.

6 - [metabarcoding] Mixed-template PCR is different than single template PCR and should be treated as such

Just as the strategy used for single template PCRs can be optimized to account for problems such as GC-content or the presence of PCR inhibitors, so too can mixed template PCRs be optimized to account for problems like the generation of PCR artefacts.

5 - [Perl] Fastq to FASTA conversion

This Perl one-liner will convert a fastq file into a fasta file.
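
The Perl one-liner is in the post itself. As a comparison, the same conversion can be sketched in Python, assuming the common four-line FASTQ layout (header, sequence, `+` separator, quality) with no wrapped sequence lines. The function name `fastq_to_fasta` and the example reads are hypothetical.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Write each 4-line FASTQ record as a 2-line FASTA record; return the record count."""
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # skip stray blank lines between records
        sequence = fastq_handle.readline()
        fastq_handle.readline()  # '+' separator line, not needed for FASTA
        fastq_handle.readline()  # quality line, not needed for FASTA
        header = header.rstrip("\n")
        if not header.startswith("@"):
            raise ValueError("Expected a FASTQ header starting with '@': " + header)
        fasta_handle.write(">" + header[1:] + "\n")
        fasta_handle.write(sequence.rstrip("\n") + "\n")
        count += 1
    return count


# Two made-up reads, held in memory so the example is self-contained.
fastq_text = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\nHHHH\n"
fasta_out = io.StringIO()
converted = fastq_to_fasta(io.StringIO(fastq_text), fasta_out)
print(fasta_out.getvalue(), end="")
```

The function takes file handles instead of paths, so it works equally well with files opened by `open` or `gzip.open` in text mode.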

4 - [R] Resources in [R] for environmental sequence data analysis using ordination

There are many resources available, but they are scattered across the internet. Here is a list of useful resources related to the VEGAN and ECODIST packages in [R].

3 - [R] Create scree plots using the metaMDS function in the VEGAN package

There does not appear to be a built-in function in VEGAN for creating scree plots. Since scree plots are useful for choosing how many dimensions should be used with the metaMDS function, reference to a sample function is provided.

2 - [R] Specify which NMDS dimensions to plot when there are more than two available

Using the nmds function in the ECODIST package, when evaluating more than two dimensions, all possible pair-wise combinations of dimensions are shown by default when using the plot function. In this example, only two specific dimensions are chosen for plotting.

1 - [R] Create a plot legend without symbols

Normally symbols and text are both used in a plot legend. In this example, some legend entries are simply colour coded without showing a corresponding symbol.
