16S sequencing

This document describes the procedure and the costs for sequencing 16S rRNA genes from complex microbial communities, a service run for academic purposes by the NED team (UMR1388 GenPhySE) and the GeT-PlaGe platform in Toulouse using Illumina MiSeq technology. The aim is to provide a non-profit service to academic users who do not always have enough samples to fill a MiSeq plate (300 samples). Here we describe the basic procedure using the V3-V4 or the V4-V5 regions. Please contact Olivier Bouchez for any other fragment.

Who is the person to contact?

The scientific supervisor is Olivier Zemb from INRAE Toulouse (email addresses follow the pattern first_nameDOTsurnameATtoulouseDOTinra.fr).

What should I send?

You should send by email:

- the Excel file that lists the sample names (here)

- a scan of the payment order (i.e. a purchase order, "bon de commande", or a signed quote, "devis signé")

You should send in the parcel:

- 50 µl of the PCR reaction obtained with the primers of your choice (amplicon between 300 and 550 bp, and with the linkers provided by Olivier Bouchez, so please contact us first)

- the picture of the corresponding agarose gel

- a print-out of the Excel file that lists the sample names

- a print-out of the payment order (i.e. a purchase order, "bon de commande", or a signed quote, "devis signé")

What is the procedure to send the samples?

1) you send an email with the submission form to olivierDOTzemb A_ T toulouseDOTinrae.fr

2) you receive a quote number and a sale order, you update the submission form with that quote number and you send the samples along with a print-out of the submission form and the sale order that you signed. We will also tell you when the next run is scheduled. Please call before you send the samples (05 61 28 51 00).

3) you wait 1-2 months for us to collect enough samples to fill a whole plate (i.e. 170 samples), which minimizes the cost per sample. If we do not reach 170 samples after 2 months, we will email you to ask whether you want to go ahead and pay a little more per sample or wait another month.

4) you get the V3-V4 sequences in FASTQ format. Typically this happens 2 months after reception of the samples, but you can have a look at the schedule (estimation of samples for the following month). If you asked for it, you also get the OTU abundance table and one FASTA file per OTU with all the sequences that were clustered in that OTU.

Where to send the samples?

You should use a carrier (Chronopost or similar) to send your samples with cold blocks. In our opinion, dry ice is not required for DNA in water or TE.

Olivier Zemb

INRAE

Equipe NED, UMR1388 GenPhySE

24, Chemin de Borde Rouge CS 52627

31326 Castanet Tolosan Cedex

France

What do I get, when do I get it and at what cost?

Bottom line rate for French academics: 25 EUR (tax exclusive) / sample

For each sample, you get about 30 000 sequences of your favorite 500 bp amplicon in FASTQ and FASTA format -> 25 EUR per sample (excluding VAT*). You have to send 50 µl of the PCR reaction that was performed with the specified primers (or primers approved by Olivier Bouchez). For the approved primers, please look at https://drive.google.com/file/d/0BxFOqcoefSWxU204SGt6S1lkS00/view?usp=sharing. For some user-defined primers, please look at https://docs.google.com/spreadsheets/d/1QSDo0pmMmBysgR2X1GrYlJUsMp_T3hklyKE36niJLyo/edit#gid=0

Bottom line rate for academics outside France and private companies: 45 EUR (tax exclusive) / sample

Price per sample

Sequencing of V3-V4 regions from ribosomal genes as markers for complex microbial communities

Options:

Optional OTU table of abundance using 2000 sequences per sample (limited to 300K sequences) -> 5 EUR per sample (excluding VAT*)**

Optional OTU table of abundance using all the sequences -> under development ***

The cleaned assembled sequences will be downloadable as a zip file as soon as they are processed (5% mismatch allowed for assembly).

We anticipate that the sequences will be available about 2 months after reception, but the delay is not guaranteed.

* Note that CNRS and INSERM will have to pay for VAT. INRA academics do not need to pay this tax.

** You also get one FASTA file per OTU containing all its sequences, and the table of counts for each OTU computed from 500 000 sequences across all samples. Note that two OTUs may have the same RDP affiliation (see example below).

Name of FASTA | Sample1 | Sample2 | Sample3 | Affiliation using the Ribosomal Database Project classifier | Representative sequence
OTU1.fasta | 1 count | 2 counts | 0 counts | Bacteria(100);Firmicutes(64);unclassified;unclassified;unclassified;unclassified | TTGTGTGT
OTU2.fasta | 5 counts | 20 counts | 4 counts | Bacteria(100);Firmicutes(64);unclassified;unclassified;unclassified;unclassified | TTGTGTGT

*** The FROGS pipeline (http://bioinfo.genotoul.fr/fileadmin/BIO_INFO_STAT_2015/oral/FROGS_GenotoulBioinfo.pdf) has a similar output, but you can input more sequences.
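As a rough aid for budgeting, the rates listed above can be combined in a small calculation. This is only a sketch, not an official quote: the function name and its parameters are hypothetical, it assumes that the 5 EUR OTU table option costs the same for every customer category, and the VAT rate is left as an input because it is not stated on this page.

```python
# Rates quoted on this page, in EUR per sample, tax exclusive.
RATE_FRENCH_ACADEMIC = 25
RATE_OTHER = 45          # academics outside France and private companies
OTU_TABLE_OPTION = 5     # optional OTU abundance table


def estimate_cost(n_samples, french_academic=True, otu_table=False, vat_rate=0.0):
    """Return an estimated total in EUR (hypothetical helper).

    vat_rate is a fraction (e.g. 0.2 for 20%); it defaults to 0 because
    the applicable rate depends on your institution.
    """
    if n_samples < 0:
        raise ValueError("n_samples must not be negative")
    per_sample = RATE_FRENCH_ACADEMIC if french_academic else RATE_OTHER
    if otu_table:
        per_sample += OTU_TABLE_OPTION
    return n_samples * per_sample * (1 + vat_rate)


# 20 samples from a French academic lab, with the OTU table, no VAT.
print(estimate_cost(20, otu_table=True))
```

The actual amount is the one on the quote you receive, in particular if the plate is not full and the price per sample is adjusted.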

Can I get the OTU analysis with other parameters?

No. We provide an optional basic bioinformatic analysis based on ESPRIT-TREE, which can cluster up to 300 K sequences. The optimal cutoff for V3-V4 is 0.03 (Sun et al., A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis. Brief. Bioinform. 13 (1), 107, 2011). We adapt the number of sequences per sample to fit this limit. For example, a project with 150 samples will use 2000 sequences per sample (out of the 50K sequences generated). We are currently developing a pipeline able to deal with all the sequences (more information soon).
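Adapting the per-sample depth to the clustering limit comes down to a simple division. The sketch below is illustrative only: the function name is hypothetical, and capping the depth at the roughly 50K sequences generated per sample is our assumption, not a description of the actual pipeline.

```python
CLUSTERING_LIMIT = 300_000       # maximum number of sequences clustered, as quoted above
GENERATED_PER_SAMPLE = 50_000    # approximate sequences generated per sample


def sequences_per_sample(n_samples, limit=CLUSTERING_LIMIT,
                         generated=GENERATED_PER_SAMPLE):
    """Return the per-sample depth that keeps the project under the limit."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    # Split the limit evenly, but never ask for more than was sequenced.
    return min(generated, limit // n_samples)


print(sequences_per_sample(150))  # 2000, matching the example above
```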

The Human Microbiome Project produced guidelines to analyze 16S rRNA data. Some users prefer QIIME, RDP, MOTHUR, ESPRIT-TREE or M-pick. The PEPI-IBIS initiative aims to list all the options.

Do you keep a copy of the data?

We may store the FASTA files and the remaining DNA for at least 1 year. However, you must keep your own copy: some of each sample in your lab and the data on your hard drive.

Do I need to list a member of the NED team or the GeT-PlaGe platform as co-author?

The short answer is no, but you have to mention the GeT-PlaGe platform in the acknowledgments. However, please contact Sylvie Combes or Olivier Zemb (first_nameDOTsurnameATtoulouseDOTinra.fr) if you want to initiate a scientific collaboration, for example to discuss the statistics that could be performed on the abundance table to answer a specific question.

Can I use my own favorite primers to look at other regions of the 16S rRNA?

Yes, but please contact Olivier Bouchez (first_nameDOTsurnameATtoulouseDOTinra.fr) (GeT-PlaGe) to make sure that your favorite primers are compatible with the MiSeq instrument (note that linkers will be added to your PCR product).

What happens if my samples do not generate the expected number of sequences?

If the internal standard also failed, we run the whole plate again as soon as possible. However, the lab performing the extraction is responsible for providing high-quality DNA, so we do not guarantee a positive result for low-quality samples. If you doubt that the quality of your extracted DNA suffices for Illumina sequencing, we encourage you to submit 3-4 test samples before you use the service routinely. Having said that, we have not encountered any major problem yet.

Is the above information up to date?

To check the latest news and pricing, please visit https://sites.google.com/site/olivierzembwebsite/16s-sequencing

Can I see an example of the material and methods section?

The V3-V4 region was amplified from purified genomic DNA with the primers F343 (CTTTCCCTACACGACGCTCTTCCGATCTTACGGRAGGCAGCAG) and R784 (GGAGTTCAGACGTGTGCTCTTCCGATCTTACCAGGGTATCTAATCCT) using 30 amplification cycles with an annealing temperature of 65 °C (an amplicon of 510 bp, although length varies depending on the organism). Because MiSeq enables paired 250-bp reads, the ends of each read overlap and can be stitched together to generate extremely high-quality, full-length reads of the entire V3 and V4 region in a single run. Single multiplexing was performed using a home-made 6-bp index, which was added to R784 during a second PCR with 12 cycles using forward primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGAC) and reverse primer (CAAGCAGAAGACGGCATACGAGAT-index-GTGACTGGAGTTCAGACGTGT). The resulting PCR products were purified and loaded onto the Illumina MiSeq cartridge according to the manufacturer's instructions. The quality of the run was checked internally using PhiX, and each paired-end sequence was then assigned to its sample using the previously integrated index. The paired-end sequences were assembled using FLASH software (Magoc, 2011) with an overlap of at least 10 bp between the forward and reverse sequences, allowing 10% mismatches. The absence of contamination was checked with a negative control during the PCR (water as template). The quality of the stitching procedure was controlled using 4 bacterial samples that are run routinely in the sequencing facility in parallel with the current samples.
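For readers unfamiliar with the stitching step, the idea of overlapping a read pair can be sketched in a few lines. This is a simplified illustration and not the algorithm implemented in FLASH: it accepts the longest overlap of at least 10 bp with at most 10% mismatches, ignores base qualities, and keeps the forward base where the two reads disagree. The function names and the toy fragment are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def merge_pair(forward, reverse, min_overlap=10, max_mismatch=0.10):
    """Merge a read pair, or return None if no acceptable overlap exists."""
    forward = forward.upper()
    # The reverse read is sequenced from the opposite strand.
    rev = reverse_complement(reverse)
    # Try the longest possible overlap first, then shorter ones.
    for overlap in range(min(len(forward), len(rev)), min_overlap - 1, -1):
        tail, head = forward[-overlap:], rev[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch * overlap:
            # Keep the forward read in the overlap, then append the rest.
            return forward + rev[overlap:]
    return None


# Toy 40-bp fragment read from both ends with 28-bp reads (16-bp overlap).
fragment = "ACGTTGCAAGGCTTAACCGGTTAGCATCGATCGGATCCAT"
forward_read = fragment[:28]
reverse_read = reverse_complement(fragment[12:])
print(merge_pair(forward_read, reverse_read) == fragment)
```

In practice the reads should be merged with the dedicated software rather than with a script like this one, which is only meant to show what the overlap and mismatch parameters control.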

Who is doing what on my samples for a routine analysis of microbial operational taxonomic units?

Can I see an example of the quote?

Below is an example:

Devis_Mutua16S_exemple