HTSMultiQC-cleaning-report
A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
Report generated on 2021-07-12, 11:24, based on data in: /share/workshop/gwas_workshop/jli/01-HTS_Preproc
HTStream
HTStream quality control and processing pipeline for High Throughput Sequencing data.
Processing Overview
General statistics from the HTStream pipeline.
Preprocessing Statistics
Fragment Reduction
Provides scaled statistics collected throughout the preprocessing pipeline, highlighting statistics that vary across the experiment.
Basepair Reduction
Provides scaled statistics collected throughout the preprocessing pipeline, highlighting statistics that vary across the experiment.
hts_Stats
Generates a JSON formatted file containing a set of statistical measures about the input read data.
Sample Name | % PE | % R1 Q30 | % R2 Q30 | GC Content | N Content | Notes |
---|---|---|---|---|---|---|
SL378587_htsStats | 100.00% | 91.37% | 88.00% | 38.36% | 0.0009% | RawReads stats |
SL378588_htsStats | 100.00% | 91.29% | 88.64% | 38.98% | 0.0005% | RawReads stats |
SL378589_htsStats | 100.00% | 91.22% | 88.42% | 38.97% | 0.0004% | RawReads stats |
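As a rough illustration of the kind of measures summarized in this table, the sketch below computes Q30, GC, and N percentages for a small set of reads and serializes them as JSON. The function name `summarize` and the JSON keys are hypothetical and do not reflect the actual hts_Stats output schema. It also pools all reads together, whereas the table reports R1 and R2 separately.

```python
import json


def summarize(reads):
    """Summarize a list of (sequence, quality_scores) tuples.

    Key names in the returned dict are hypothetical, chosen for illustration.
    """
    total = sum(len(seq) for seq, _ in reads)
    if total == 0:
        raise ValueError("no bases supplied")
    gc = sum(seq.count("G") + seq.count("C") for seq, _ in reads)
    n = sum(seq.count("N") for seq, _ in reads)
    # Count bases whose Phred score is at least 30.
    q30 = sum(q >= 30 for _, quals in reads for q in quals)
    return {
        "q30_pct": round(100 * q30 / total, 4),
        "gc_content_pct": round(100 * gc / total, 4),
        "n_content_pct": round(100 * n / total, 4),
    }


example = [("ACGN", [35, 35, 10, 2]), ("GGCC", [30, 30, 30, 29])]
print(json.dumps(summarize(example), indent=2))
```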
Read Lengths: Paired End
Distribution of read lengths for each sample.
Base by Cycle: Paired End
Measures how uniform the base composition is at each position in the read. The higher the average is at a certain position, the more unequal the base pair composition. N's are excluded from this calculation.
Quality by Cycle: Paired End
Mean quality score for each position along the read. A sample is colored red if fewer than 60% of base pairs have a mean score of at least Q30, orange if between 60% and 80%, and green otherwise.
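The coloring rule above can be sketched as a small classifier. This is an illustration only: the function name `q30_status` is hypothetical, and the handling of the exact 60% and 80% boundaries (60% counts as orange, 80% as green) is an assumption, since the report text does not say which side they fall on.

```python
def q30_status(mean_quals, threshold=30, red_below=0.60, orange_below=0.80):
    """Classify a sample from its per-position mean quality scores."""
    if not mean_quals:
        raise ValueError("no positions supplied")
    # Fraction of positions whose mean quality reaches the threshold.
    frac = sum(q >= threshold for q in mean_quals) / len(mean_quals)
    if frac < red_below:
        return "red"
    if frac < orange_below:  # boundary handling is an assumption
        return "orange"
    return "green"


print(q30_status([35] * 7 + [20] * 3))  # 70% of positions at Q30 or above
```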
hts_SeqScreener
A simple sequence screening tool which uses a kmer lookup approach to identify reads from an unwanted source.
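A kmer lookup screen can be sketched as follows: index every kmer of the unwanted reference in a set, then flag a read when enough of its kmers are found there. The function names, the value of `k`, and the `min_fraction` cutoff are all hypothetical, and this sketch ignores reverse complements. It is not the hts_SeqScreener implementation.

```python
def build_kmer_set(reference, k):
    """Collect every kmer of length k that occurs in the reference."""
    return {reference[i:i + k] for i in range(len(reference) - k + 1)}


def hit_fraction(read, kmers, k):
    """Fraction of the read's kmers that are present in the lookup set."""
    total = len(read) - k + 1
    if total <= 0:
        return 0.0  # read is shorter than k, so it has no kmers
    hits = sum(read[i:i + k] in kmers for i in range(total))
    return hits / total


def is_unwanted(read, kmers, k, min_fraction=0.25):
    """Flag a read as coming from the screened source (cutoff is assumed)."""
    return hit_fraction(read, kmers, k) >= min_fraction


lookup = build_kmer_set("ACGTACGTTAGC", 4)
print(is_unwanted("ACGTACGT", lookup, 4), is_unwanted("GGGGGGGG", lookup, 4))
```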
hts_SuperDeduper
A reference free duplicate read removal tool.
Sample Name | % Duplicates | % Ignored | Notes |
---|---|---|---|
SL378587_htsStats | 5.90% | 0.01% | remove PCR duplicates |
SL378588_htsStats | 6.70% | 0.01% | remove PCR duplicates |
SL378589_htsStats | 6.25% | 0.01% | remove PCR duplicates |
SuperDeduper: Duplicate Saturation
Plots the number of duplicates against the number of unique reads per sample.
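One way to remove duplicates without a reference is to build a key from a fixed window of each read pair and keep only the first pair seen for each key. The sketch below assumes that approach. The `start` and `length` values are hypothetical, and treating pairs too short to yield a key as "ignored" (and passing them through) is one plausible reading of the % Ignored column, not a documented behavior.

```python
def dedupe_pairs(pairs, start=10, length=10):
    """Return (kept_pairs, duplicate_count, ignored_count)."""
    seen = set()
    kept = []
    duplicates = 0
    ignored = 0
    for r1, r2 in pairs:
        if len(r1) < start + length or len(r2) < start + length:
            # Too short to build a key: counted as ignored and passed
            # through unchanged (an assumption of this sketch).
            ignored += 1
            kept.append((r1, r2))
            continue
        key = (r1[start:start + length], r2[start:start + length])
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
            kept.append((r1, r2))
    return kept, duplicates, ignored
```

Plotting the running duplicate count against the running unique count while iterating would give a curve comparable in spirit to the saturation plot described above.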
hts_AdapterTrimmer
Trims adapters which are sequenced when the fragment insert length is shorter than the read length.
Sample Name | % Bp Lost | % Adapters | Avg. Bps Trimmed | Notes |
---|---|---|---|---|
SL378587_htsStats | 3.94% | 29.71% | 39.98 | trim adapters |
SL378588_htsStats | 2.58% | 19.28% | 40.37 | trim adapters |
SL378589_htsStats | 2.73% | 20.72% | 39.85 | trim adapters |
AdapterTrimmer: Trimmed Basepairs Composition
Composition of basepairs trimmed from the ends of paired end and single end reads.
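When the insert is shorter than the read length, read 1 and the reverse complement of read 2 agree over the insert, and whatever follows is adapter. The sketch below detects that case with exact matching only. The function names and `min_insert` are hypothetical, and a real trimmer would need to tolerate sequencing errors, so this should be read as an illustration of the idea rather than the hts_AdapterTrimmer algorithm.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_adapters(r1, r2, min_insert=8):
    """Trim both reads to the insert if a short insert is detected."""
    max_len = min(len(r1), len(r2))
    # Try candidate insert lengths shorter than the read, longest first.
    for insert in range(max_len - 1, min_insert - 1, -1):
        if r1[:insert] == revcomp(r2[:insert]):
            return r1[:insert], r2[:insert]
    return r1, r2  # no short insert found, reads are left untouched


insert = "ACGTTGCAAGGCTA"
r1 = insert + "AGATCG"           # made-up adapter tail
r2 = revcomp(insert) + "TTTGGG"  # made-up adapter tail
print(trim_adapters(r1, r2))
```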
hts_QWindowTrim
Uses a sliding window approach to remove the low quality ends of reads.
Sample Name | % Bp Lost | % R1 of Bp Lost | % R2 of Bp Lost | Avg. Bps Trimmed | Notes |
---|---|---|---|---|---|
SL378587_htsStats | 0.15% | 37.45% | 62.55% | 0.43 | trim low quality bases from ends of reads |
SL378588_htsStats | 0.19% | 36.72% | 63.28% | 0.55 | trim low quality bases from ends of reads |
SL378589_htsStats | 0.17% | 37.20% | 62.80% | 0.49 | trim low quality bases from ends of reads |
QWindowTrim: Trimmed Basepairs Composition
Plots the number of low quality basepairs trimmed from ends of paired end and single end reads.
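A sliding window trim can be sketched as: advance from each end of the read until the mean quality inside the window reaches a threshold, and keep what lies between. The `window` and `min_avg` defaults below are hypothetical and are not taken from this report or from hts_QWindowTrim.

```python
def _trim_left(quals, window, min_avg):
    """Number of bases to cut from the left end."""
    start = 0
    while start < len(quals):
        win = quals[start:start + window]  # shrinks near the end of the read
        if sum(win) / len(win) >= min_avg:
            break
        start += 1
    return start


def window_trim(quals, window=10, min_avg=20):
    """Return (start, end) so that quals[start:end] is the retained region."""
    start = _trim_left(quals, window, min_avg)
    # Trim the right end by running the same scan on the reversed remainder.
    cut_right = _trim_left(quals[start:][::-1], window, min_avg)
    return start, len(quals) - cut_right


print(window_trim([2, 2, 30, 30, 30, 30, 30, 30, 2, 2], window=2))
```

Because the decision uses a window mean, a single low quality base next to high quality neighbors can survive at the edge of the retained region; that is a property of this simple sketch.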
hts_NTrimmer
Trims reads to the longest subsequence that contains no N's.
Sample Name | Total Bp Lost | % R1 of Bp Lost | % R2 of Bp Lost | % Discarded | Notes |
---|---|---|---|---|---|
SL378587_htsStats | 390 | 81.54% | 18.46% | 0.00% | remove any remaining N characters |
SL378588_htsStats | 746 | 86.33% | 13.67% | 0.00% | remove any remaining N characters |
SL378589_htsStats | 319 | 62.38% | 37.62% | 0.00% | remove any remaining N characters |
NTrimmer: Trimmed Basepairs Composition
Plots the number of N bases trimmed from ends of paired end and single end reads.
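Finding the longest subsequence without N's amounts to a single scan that tracks the longest run between N characters. A minimal sketch follows; the function name is hypothetical, and keeping the first run when two runs tie in length is an assumption.

```python
def longest_n_free(seq):
    """Return (start, end) of the longest run of seq that contains no N."""
    best_start, best_len = 0, 0
    run_start = 0
    # A sentinel N appended to the sequence closes the final run.
    for i, base in enumerate(seq + "N"):
        if base in "Nn":
            run_len = i - run_start
            if run_len > best_len:  # strict comparison keeps the first tie
                best_start, best_len = run_start, run_len
            run_start = i + 1
    return best_start, best_start + best_len


read = "ACNGTTANC"
start, end = longest_n_free(read)
print(read[start:end])
```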
hts_LengthFilter
Discards reads below a minimum length threshold.
Sample Name | % PE Lost | Notes |
---|---|---|
SL378587_htsStats | 0.52% | remove reads < 50bp |
SL378588_htsStats | 0.62% | remove reads < 50bp |
SL378589_htsStats | 0.59% | remove reads < 50bp |
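The "% PE Lost" figure can be illustrated with a simple filter. The sketch below assumes a pair is discarded when either mate falls below the minimum length; the report does not state how pairs with one short mate are handled, so that rule and the function name are assumptions. The 50 bp default follows the "remove reads < 50bp" note in the table.

```python
def length_filter(pairs, min_length=50):
    """Return (kept_pairs, percent_of_pairs_lost)."""
    kept = [
        (r1, r2)
        for r1, r2 in pairs
        # Assumption: both mates must meet the minimum length.
        if len(r1) >= min_length and len(r2) >= min_length
    ]
    lost_pct = 100.0 * (len(pairs) - len(kept)) / len(pairs) if pairs else 0.0
    return kept, lost_pct
```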
hts_Stats 2
Generates a JSON formatted file containing a set of statistical measures about the input read data.
Sample Name | % PE | % R1 Q30 | % R2 Q30 | GC Content | N Content | Notes |
---|---|---|---|---|---|---|
SL378587_htsStats | 100.00% | 93.70% | 90.59% | 37.67% | 0.0000% | final stats |
SL378588_htsStats | 100.00% | 93.36% | 90.74% | 38.55% | 0.0000% | final stats |
SL378589_htsStats | 100.00% | 93.42% | 90.54% | 38.51% | 0.0000% | final stats |
Read Lengths: Paired End
Distribution of read lengths for each sample.
Base by Cycle: Paired End
Measures how uniform the base composition is at each position in the read. The higher the average is at a certain position, the more unequal the base pair composition. N's are excluded from this calculation.
Quality by Cycle: Paired End
Mean quality score for each position along the read. A sample is colored red if fewer than 60% of base pairs have a mean score of at least Q30, orange if between 60% and 80%, and green otherwise.