      Genome Assembly Workshop 2020

Create a new RStudio project

Open RStudio and create a new project. For more info, see [Using-Projects](https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects).

Learn more about [renv](https://rstudio.github.io/renv/articles/renv.html)

Learn more about [packrat](https://rstudio.github.io/packrat/)

Install Needed Packages

Set some options, make sure the packages ‘knitr’, ‘tidyverse’, ‘reshape2’, and ‘gridExtra’ are installed (install any that are missing), and then load them.

In the R console run the following commands:

if (!any(rownames(installed.packages()) == "knitr")){
  install.packages("knitr")
}
library(knitr)

if (!any(rownames(installed.packages()) == "tidyverse")){
  install.packages("tidyverse")
}
library(tidyverse)

if (!any(rownames(installed.packages()) == "reshape2")){
  install.packages("reshape2")
}
library(reshape2)

if (!any(rownames(installed.packages()) == "gridExtra")){
  install.packages("gridExtra")
}
library(gridExtra)
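
The four blocks above repeat the same check-install-load pattern. As a minimal sketch, the pattern could be wrapped in helper functions. The names `missing_packages` and `install_and_load` are made up for this illustration and are not part of the workshop materials.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install whatever is missing, then attach every package.
install_and_load <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # character.only = TRUE lets library() accept the name as a string
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# Usage (uncomment to run; this may download packages):
# install_and_load(c("knitr", "tidyverse", "reshape2", "gridExtra"))
```

This behaves like the explicit version above, but adding another package only requires extending the character vector.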

Learn more about the tidyverse.

Open a new R Notebook

An R notebook is an R Markdown document with chunks that can be executed independently and interactively, with output visible immediately beneath the input. This is a form of literate programming, where the ‘code’, the description of the ‘code’, and the output are all together in one document.

R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. See this page for more details on using R Markdown.

When you click the Preview or Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document. You can embed R code and plots in chunks like this:

```{r chunk_name}
print('hello world!')
```

Review the R Markdown page and R Markdown cheat sheets.

Try ‘knitting’ to html, pdf, and doc as well as previewing the notebook. Open the resulting documents.
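
Knitting can also be started from the R console instead of the Knit button. The sketch below assumes the `rmarkdown` package is installed. The helper `render_all` and the file name `my_notebook.Rmd` are placeholders made up for this illustration. PDF output also needs a LaTeX installation, so a format that fails is reported as `NA` rather than stopping the loop.

```r
# Map the short names used in the text to R Markdown output formats.
knit_formats <- c(html = "html_document",
                  pdf  = "pdf_document",
                  doc  = "word_document")

# Hypothetical helper: render `input` once per format and return the output paths.
render_all <- function(input, formats = knit_formats) {
  # Do nothing if rmarkdown is unavailable or the input file does not exist.
  if (!requireNamespace("rmarkdown", quietly = TRUE) || !file.exists(input)) {
    return(invisible(character(0)))
  }
  vapply(formats, function(fmt) {
    tryCatch(
      rmarkdown::render(input, output_format = fmt, quiet = TRUE),
      error = function(e) NA_character_  # e.g. no LaTeX available for PDF
    )
  }, character(1))
}

# Usage (placeholder file name):
# render_all("my_notebook.Rmd")
```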

Try executing the code chunks in the R Notebook.

Download the data file for the workshop document and preview/open it

This is the stats file generated by running samtools stats on a BAM file produced by BWA MEM.

In the R console run the following command.

download.file("https://raw.githubusercontent.com/ucdavis-bioinformatics-training/2020-mRNA_Seq_Workshop/master/prerequisites/intro2R/Data_in_R_files/bwa_mem_Stats.log", "bwa_mem_Stats.log")
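
To preview the file from R, one option is to pull out the summary-number section. In samtools stats output these lines start with `SN`, followed by a tab-separated name and value. The sketch below is one possible way to do this, not the workshop's own code. The helper name `parse_sn` is hypothetical, and the `toy` lines are invented only to show the layout.

```r
# Hypothetical helper: extract the "SN" (summary numbers) lines into a data frame.
parse_sn <- function(lines) {
  sn <- grep("^SN\t", lines, value = TRUE)
  fields <- strsplit(sn, "\t", fixed = TRUE)
  data.frame(
    # Field 2 is the statistic name with a trailing colon; field 3 is its value.
    stat  = sub(":$", "", vapply(fields, `[`, character(1), 2)),
    value = as.numeric(vapply(fields, `[`, character(1), 3)),
    stringsAsFactors = FALSE
  )
}

# Invented toy input (not real results), used only to illustrate the format.
toy <- c("# comment line",
         "SN\traw total sequences:\t100",
         "SN\treads mapped:\t95",
         "IS\t0\t0\t0\t0\t0")
toy_summary <- parse_sn(toy)

# On the downloaded file, if it is present in the working directory:
if (file.exists("bwa_mem_Stats.log")) {
  print(head(parse_sn(readLines("bwa_mem_Stats.log"))))
}
```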

Download the template Markdown workshop document and open it

In the R console run the following command.

download.file("https://raw.githubusercontent.com/ucdavis-bioinformatics-training/2020-mRNA_Seq_Workshop/master/prerequisites/intro2R/data_in_R.Rmd", "data_in_R.Rmd")

Edit the file YAML portion

The top YAML (YAML ain’t markup language) portion of the doc tells RStudio how to parse the document.

---
title: "Data_in_R"
author: your_name
date: current_date
output:
    html_notebook: default
    html_document: default
---

What are we going to do?

We will recreate some of the plots generated with plot-bamstats on the same file.

You can view the output of plot-bamstats in bwa_mem_stats.html.