UCD Bioinformatics Core Workshop

Home
Preliminaries
Before the course
File downloads
Data Analysis
Differential Expression
Enrichment Analysis
R Code for Quizzes
Upcoming
Upcoming Bioinformatics Core Courses

R code for selected quiz answers

Note: There are multiple ways of doing just about anything in R, the code shown below is one of many ways.

DE

Quiz 1

How many genes are in the counts table?: nrow(counts)
How many samples are in the counts table?: ncol(counts)
What is the total count across all genes for sample SRR8245064?: colSums(counts)["SRR8245064"]
How many genes have a count of 0 in every sample?: sum(rowSums(counts) == 0)

Quiz 2

Which sample has the largest normalization factor?: rownames(d0$samples)[which.max(d0$samples$norm.factors)]
Is the sample with the largest normalization factor the sample with the smallest total counts?: rownames(d0$samples)[which.min(d0$samples$lib.size)] == rownames(d0$samples)[which.max(d0$samples$norm.factors)]
Make an MDS plot of the unfiltered data.: plotMDS(d0, col = as.numeric(factor(metadata$simplified_cell_type)))

Quiz 3

Based on the above model, how many genes are significantly differentially expressed between naive-like and memory-like?: length(which(top.table$adj.P.Val < 0.05))
Based on the above model, and without taking significance into account, how many genes have higher expression in naive-like than in memory-like?: length(which(top.table$logFC > 0))
How many genes have an unadjusted p-value less than 0.05 for the comparison of naive to memory-like in the above model?: length(which(top.table$P.Value < 0.05))
What is the adjusted p-value for the last gene with unadjusted P < 0.05?: top.table$adj.P.Val[max(which(top.table$P.Value < 0.05))]

Quiz 4

For the model ~0 + simplified_cell_type, how many genes are differentially expressed between naive-like and effector-like?:

mm.1 <- model.matrix(~0 + simplified_cell_type, data = metadata)
y.1 <- voom(d, mm.1); fit.1 <- lmFit(y.1, mm.1)
tmp.1 <- contrasts.fit(fit.1, contrasts = makeContrasts(simplified_cell_typenaive_like - simplified_cell_typeeffector_like, levels = colnames(coef(fit.1))))
tmp.1 <- eBayes(tmp.1)
length(which(topTable(tmp.1, n = Inf, sort.by = "P")$adj.P.Val < 0.05))

Enrichment

Quiz 1

Rerun the KS test analysis using the molecular function (MF) ontology. What is the top GO term listed?:

GOdata.1 <- new("topGOdata", ontology = "MF", allGenes = geneList, geneSelectionFun = function(x)x, annot = annFUN.org , mapping = "org.Mm.eg.db")
resultKS.1 <- runTest(GOdata.1, algorithm = "weight01", statistic = "ks")
tab.1 <- GenTable(GOdata.1, raw.p.value = resultKS.1, topNodes = length(resultKS.1@score), numChar = 120)
tab.1[1, "Term"]

How many genes from the top table are annotated with the term ‘protein serine/threonine kinase activity’?: tab.1[which(tab.1[,'Term'] == 'protein serine/threonine kinase activity'),'Annotated']

Quiz 2

How many pathways have a p-value less than 0.05?: length(which(outdat$p.value < 0.05))
Which pathway has the most genes annotated to it (excluding genes not in the top table)?: outdat$pathway.name[which.max(outdat$Annotated)]
Make a pathview diagram for mmu04740: ``` foldChangeList <- tmp$logFC xx <- as.list(org.Mm.egENSEMBL2EG) names(foldChangeList) <- xx[tmp$Gene.stable.ID] head(foldChangeList)

mmu04740 <- pathview(gene.data = foldChangeList, pathway.id = “mmu04740”, species = “mmu”, limit = list(gene=max(abs(foldChangeList)), cpd=1)) ```

☰ Menu

RNA Seq Differential Expression and Enrichment Analysis

R code for selected quiz answers

DE

Quiz 1

Quiz 2

Quiz 3

Quiz 4

Enrichment

Quiz 1

Quiz 2