dbcAmplicons pipeline: Amplicons

Amplicons the ‘old’ way

Single PCR
Long primer sequences (~75bp) that contain barcodes and sequencing adapters
Single or dual barcodes
Inline barcodes (within primary reads)

amplicons_figure1

dbcAmplicons

Originally conceived in late 2012 to lower per sample costs on relatively short, targeted (PCR) regions:

16S, ITS, LSU, 18S, etc.
Community profiling
Extraction of mitochondrial, viral, and chloroplast regions, or plasmids, by PCR.
Genotyping of samples for phylogenomics, genome to phenotype interactions.

Uses the Illumina platform (mainly the MiSeq), capable of pooling thousands, or even tens of thousands, of barcoded samples/targets per sequencing run.

Core Facility friendly, facilitates interactions between and across individual labs, standardizing workflows.

amplicons_figure2

Amplicons: Two Step PCR Approach

A 2-step PCR, where the first PCR amplifies the target-specific region and the second PCR adds adapters and barcodes. The target-specific primers include the universal sequences CS1 and CS2; the second PCR extends these universal sequences with adapters and barcodes.
Adapters and barcodes are not included in the target specific primers which allows for maximum flexibility in target specific primer usage and the ability to swap out targets, or include multiple targets in the same sequencing reaction without needing to purchase a large number of barcoded, target specific primers.
Barcodes are included in both adapters, therefore a pair of barcodes is used to uniquely identify a sample. This allows 32 barcodes on each adapter (32 × 32 pairs) to uniquely identify 1,024 samples.
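
The dual-barcode arithmetic above can be sketched in a few lines of Python. This is only an illustration, assuming the 1,024 figure comes from 32 barcodes on each of the two adapters; the barcode labels and sample names are hypothetical placeholders, not the sequences used by dbcAmplicons.

```python
from itertools import product


def barcode_pairs(forward_barcodes, reverse_barcodes):
    """Return every (forward, reverse) barcode combination."""
    return list(product(forward_barcodes, reverse_barcodes))


# Hypothetical barcode labels; real barcodes are short DNA sequences.
forward = [f"F{i:02d}" for i in range(1, 33)]  # 32 barcodes on one adapter
reverse = [f"R{i:02d}" for i in range(1, 33)]  # 32 barcodes on the other adapter

pairs = barcode_pairs(forward, reverse)

# Assign one hypothetical sample name to each unique barcode pair.
sample_sheet = {pair: f"sample_{n:04d}" for n, pair in enumerate(pairs, start=1)}

print(len(pairs))                    # 1024 unique pairs from 64 barcoded primers
print(sample_sheet[("F01", "R02")])  # sample_0002
```

Only 64 barcoded primers are needed to label 1,024 samples, which is where the cost saving of dual barcoding comes from.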

amplicons_figure3

Multiplex multiple amplicon targets

amplicons_figure4

Primer Design

amplicons_figure5 Prokaryote 16S Gene

PCR1 Template specific primer design

Each primer pair contains the following parts

Illumina primers (Illumina/Nextera sequences).
- Provides the sequence necessary for priming of PCR-2, also serves as the sequencing primer site.
Phase-shifting bases [see below]
- Generates diversity in the sequencing reaction
Linker sequence
- Buffers the target-specific primer sequence from the rest of the primer, preventing some taxa from amplifying more efficiently than others through a longer priming match.
Template specific primer sequence
- Target specific primer sequence
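
As a rough sketch of how the four parts fit together, the Python below concatenates them in the order listed above, which is assumed here to be 5' to 3'. Every sequence is a made-up placeholder (including the stand-in for the universal CS1 sequence), and `build_pcr1_primer` is a hypothetical helper, not part of dbcAmplicons.

```python
IUPAC_BASES = set("ACGTRYSWKMBDHVN")


def build_pcr1_primer(universal, phase_shift, linker, target_specific):
    """Concatenate the four primer parts in the listed (assumed 5'->3') order."""
    parts = [universal, phase_shift, linker, target_specific]
    for part in parts:
        # Reject anything that is not an IUPAC nucleotide code.
        if set(part.upper()) - IUPAC_BASES:
            raise ValueError(f"unexpected character in primer part: {part!r}")
    return "".join(parts).upper()


# Placeholder sequences for illustration only (not real primers).
UNIVERSAL = "ACGTACGTAC"          # stands in for the universal (CS1) sequence
LINKER = "GT"                     # stands in for the linker
TARGET = "TTGACCGTAGGCATCA"       # stands in for the template-specific primer
PHASE_SHIFTS = ["", "A", "CT", "GAC"]  # 0 to 3 phase-shifting bases

primers = [build_pcr1_primer(UNIVERSAL, shift, LINKER, TARGET)
           for shift in PHASE_SHIFTS]

for primer in primers:
    print(len(primer), primer)
```

Each primer in the set differs only in the phase-shifting bases, so the set shares one target-specific sequence while starting the template read at four different offsets.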

amplicons_figure6

Example PCR1 Template Specific Primers: 16S V1-V3 (27F and 534R)

amplicons_figure7

PCR2 Barcoded Illumina Adapter Primers

P5, or P7 sequence
- Sequences that bind the Illumina flow cell. Sequence on the P5 strand typically constitutes R1; sequence on the P7 strand typically constitutes R2.
Barcode sequence
- Uniquely identifies sample
Illumina primers (Illumina/Nextera sequences).
- Necessary for extending PCR1

Example PCR2 Barcoded Illumina Adapter Primers

amplicons_figure8

Final product

amplicons_figure8b

QA/QC What is a “good” library?

amplicons_figure9

Benefits

Maximum Flexibility, fewer target specific primers needed.
Dual barcoding, allowing massive multiplexing of samples.
Pool multiple targets per run
Software for demultiplexing

Drawbacks

Two-step PCR reaction
Sequence the target specific primer

Nucleotide diversity

Critically important for imaging clusters, and data quality!

amplicons_figure10

Once a sample library is converted to clusters on a flow cell, “nucleotide diversity” refers to the distribution of nucleotides across the flow cell at any given cycle. From the viewpoint of the instrument software, a high diversity library translates into analyzing images containing an even distribution of spots from 4 different color channels corresponding to the 4 nucleotide bases A, T, C & G. In contrast, an unbalanced nucleotide distribution or “low diversity library” means that for any given image, one or two bases are present at a high percentage.
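
To make the definition concrete, the sketch below computes the fraction of each base at every cycle (read position) for two toy read sets. It is a simplified illustration in Python; the reads are invented and the function name `base_fractions_per_cycle` is hypothetical, not an instrument or dbcAmplicons API.

```python
from collections import Counter


def base_fractions_per_cycle(reads):
    """Return, for each cycle, a dict with the fraction of A, C, G and T."""
    n_cycles = min(len(read) for read in reads)
    fractions = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads)
        total = sum(counts.values())
        fractions.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return fractions


# Toy data: identical reads (low diversity) vs. reads that differ at every cycle.
low_diversity = ["ACGTAC"] * 8
high_diversity = ["ACGTAC", "CGTACG", "GTACGT", "TACGTA"] * 2

low = base_fractions_per_cycle(low_diversity)
high = base_fractions_per_cycle(high_diversity)

print(low[0])   # a single base accounts for every cluster at cycle 1
print(high[0])  # all four bases are evenly represented at cycle 1
```

In the low diversity set one base dominates every cycle, while in the high diversity set each base makes up a quarter of the clusters at every cycle.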

LOW Diversity Library amplicons_figure11

HIGH Diversity Library amplicons_figure12

Ways to Ensure Nucleotide Diversity

Appropriate nucleotide diversity and cluster density are important for high quality data. Low nucleotide diversity in combination with high cluster density will most likely lead to poor data quality and/or low data yield.

Sequence the sample at a 30-40% lower density.
Spike in a nucleotide-balanced library at 5-50% (such as PhiX, or better, a shotgun library of a sample of interest).
Multiplex a high number of amplicon regions (12 or greater).
Build phase-shifted primers.
Build “flipped” primers.

Note: Experience has shown that a 15% shotgun spike-in, plus phase-shifted primers and/or multiple target regions, typically yields good results.
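
The effect of phase-shifted primers can be illustrated with a small Python sketch. It assumes the worst case in which every cluster begins with the same amplicon sequence, and compares that with the same sequence preceded by 0 to 3 phase-shifting bases. The amplicon start and the shift bases are invented placeholders, and real libraries will not be this uniform.

```python
from collections import Counter


def max_base_fraction_per_cycle(reads):
    """For each cycle, return the fraction of reads sharing the most common base."""
    n_cycles = min(len(read) for read in reads)
    result = []
    for cycle in range(n_cycles):
        counts = Counter(read[cycle] for read in reads)
        result.append(max(counts.values()) / len(reads))
    return result


AMPLICON_START = "AGAGTTTGATCC"        # placeholder for the start of the amplicon
PHASE_SHIFTS = ["", "T", "CA", "GTC"]  # hypothetical 0 to 3 base phase shifts

unshifted = [AMPLICON_START] * 4
shifted = [shift + AMPLICON_START for shift in PHASE_SHIFTS]

unshifted_max = max_base_fraction_per_cycle(unshifted)
shifted_max = max_base_fraction_per_cycle(shifted)

print(unshifted_max)  # one base dominates completely at every cycle
print(shifted_max)    # no single base accounts for all reads at any cycle
```

Shifting the read start by a few bases staggers the amplicon across cycles, so even a single conserved target no longer presents the same base in every cluster at the same cycle.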

Sept. 2019 Microbial Community Analysis Workshop
