Create a new RStudio project
Open RStudio and create a new project. For more information, see https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects
- File > New Project > New Directory > New Project (name the new directory, e.g. Data_in_R), and check “use packrat with this project” if the option is present.
Packrat is a dependency management tool that makes R code more isolated, portable, and reproducible by giving the project its own privately managed package library. To learn more about packrat, see https://rstudio.github.io/packrat/
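To see which package library the current session is using, the library search path can be inspected from the console. The following is a minimal sketch, not part of the original workshop; the check for a directory named "packrat" is a heuristic assumption based on packrat keeping its private library under the project's packrat folder.

```r
# List the library directories R searches, in order.
# The first entry is where install.packages() installs by default.
lib_paths <- .libPaths()
print(lib_paths)

# Heuristic (an assumption): a packrat-managed project puts a path
# containing "packrat" at the front of the search path.
uses_packrat <- grepl("packrat", lib_paths[1], fixed = TRUE)
print(uses_packrat)
```

If the packrat option was not checked when the project was created, the first path is usually the user or system library and `uses_packrat` is FALSE.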
Make sure the packages ‘knitr’, ‘tidyverse’, ‘reshape2’, and ‘gridExtra’ are installed (install any that are missing), and then load them.
In the R console, run the following commands:
if (!requireNamespace("knitr")){
  install.packages("knitr")
}
library(knitr)
if (!requireNamespace("tidyverse")){
  install.packages("tidyverse")
}
library(tidyverse)
if (!requireNamespace("reshape2")){
  install.packages("reshape2")
}
library(reshape2)
if (!requireNamespace("gridExtra")){
  install.packages("gridExtra")
}
library(gridExtra)
To learn more about the tidyverse, see https://www.tidyverse.org.
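The install-if-missing pattern above repeats the same three lines for each package. One possible way to shorten it is sketched below; `missing_packages` and `workshop_pkgs` are hypothetical names introduced here for illustration, and the install and load steps are left commented out so the snippet does not change anything until you choose to run them.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

workshop_pkgs <- c("knitr", "tidyverse", "reshape2", "gridExtra")
to_install <- missing_packages(workshop_pkgs)
print(to_install)

# Uncomment to install whatever is missing and then load every package:
# if (length(to_install) > 0) install.packages(to_install)
# invisible(lapply(workshop_pkgs, library, character.only = TRUE))
```

Passing `quietly = TRUE` to `requireNamespace` suppresses the loading messages that the longer version prints.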
Open a new R Notebook
An R Notebook is an R Markdown document with chunks that can be executed independently and interactively, with output visible immediately beneath the input. For more information, see https://rmarkdown.rstudio.com/r_notebooks.html
- File > New File > R Notebook
- Save the notebook (e.g. as test)
R Markdown
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Preview or Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed R code and plots in chunks like this:
```{r chunk_name}
print('hello world!')
```
Review the R Markdown page and R Markdown cheat sheets.
Try ‘knitting’ to HTML, PDF, and Word, as well as previewing the notebook. Open the resulting documents.
Try executing the code chunks in the R Notebook.
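Knitting can also be driven from the console instead of the Knit button. The sketch below assumes the notebook was saved as test.Rmd (the file name is an assumption based on the example name above) and wraps `rmarkdown::render` in a hypothetical helper, `knit_all`, that skips quietly when the file or the rmarkdown package is missing. PDF output is left out of the defaults because it also needs a LaTeX installation.

```r
# Hypothetical wrapper: knit `input` to each requested output format.
# Returns the paths of the generated files, or character(0) if skipped.
knit_all <- function(input, formats = c("html_document", "word_document")) {
  if (!file.exists(input) || !requireNamespace("rmarkdown", quietly = TRUE)) {
    message("Skipping: need an existing file and the rmarkdown package.")
    return(character(0))
  }
  vapply(formats, function(fmt) {
    rmarkdown::render(input, output_format = fmt, quiet = TRUE)
  }, character(1))
}

outputs <- knit_all("test.Rmd")
print(outputs)
```

Adding "pdf_document" to `formats` should produce a PDF as well, provided LaTeX is available on the machine.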
Download the data file for the workshop document and preview/open it
This is the stats file produced by running samtools stats on a BAM file generated by BWA MEM.
In the R console, run the following command:
download.file("https://raw.githubusercontent.com/ucdavis-bioinformatics-training/2019-March-Bioinformatics-Prerequisites/master/wednesday/Data_in_R/bwa.samtools.stats", "bwa.samtools.stats")
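Once the download finishes, the file can be previewed from the console as well as opened in the RStudio editor. This is a minimal sketch; `preview_file` is a hypothetical helper, and it assumes the file was saved in the current working directory under the name used in the command above.

```r
# Hypothetical helper: return the first `n` lines of a text file,
# or character(0) with a message if the file does not exist.
preview_file <- function(path, n = 10) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(character(0))
  }
  readLines(path, n = n)
}

# Print the first lines of the downloaded stats file.
cat(preview_file("bwa.samtools.stats"), sep = "\n")
```

Reading only the first few lines keeps the preview quick even when a file is large.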
Download the template Markdown workshop document and open it
In the R console, run the following command:
download.file("https://raw.githubusercontent.com/ucdavis-bioinformatics-training/2019-March-Bioinformatics-Prerequisites/master/wednesday/Data_in_R/data_in_R.Rmd", "data_in_R.Rmd")
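Re-running the download commands overwrites any local copy, including edits made to the template. One way to guard against that is sketched below; `fetch_if_missing` is a hypothetical helper, not part of the workshop material, and the usage lines are commented out because they need network access.

```r
# Hypothetical helper: download `url` to `destfile` unless it already exists.
# Returns TRUE if a download was attempted, FALSE if it was skipped.
fetch_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    message("Already present, skipping: ", destfile)
    return(FALSE)
  }
  download.file(url, destfile)
  TRUE
}

# Usage with the workshop files (uncomment to run):
# base_url <- "https://raw.githubusercontent.com/ucdavis-bioinformatics-training/2019-March-Bioinformatics-Prerequisites/master/wednesday/Data_in_R"
# fetch_if_missing(file.path(base_url, "bwa.samtools.stats"), "bwa.samtools.stats")
# fetch_if_missing(file.path(base_url, "data_in_R.Rmd"), "data_in_R.Rmd")
```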
Edit the YAML portion of the file
The YAML (YAML Ain’t Markup Language) header at the top of the document tells RStudio how to render the document.
---
title: "Data_in_R"
author: your_name
date: current_date
output:
    html_notebook: default
    html_document: default
---
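After editing, the header can be checked from the console. The sketch below assumes the template was saved as data_in_R.Rmd in the working directory and uses `rmarkdown::yaml_front_matter` to read the header back as a list; `read_header` is a hypothetical helper that returns NULL when the file or the rmarkdown package is unavailable. It also prints today's date in a form that could be typed in place of current_date.

```r
# Hypothetical helper: return the YAML header of an R Markdown file as a list,
# or NULL when the file or the rmarkdown package is unavailable.
read_header <- function(path) {
  if (!file.exists(path) || !requireNamespace("rmarkdown", quietly = TRUE)) {
    return(NULL)
  }
  rmarkdown::yaml_front_matter(path)
}

header <- read_header("data_in_R.Rmd")
str(header)

# A date string that could replace `current_date` in the header.
today <- format(Sys.Date(), "%Y-%m-%d")
print(today)
```

If the header is malformed, for example because the indentation under output was lost, reading it back this way is a quick way to notice.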