You don't need to watch all these videos urgently. See the week by week guide in the VLE for a gentle video-watching schedule. New concepts take time to learn, so be patient with yourself. :-)
We include video guides to the workshops. Use these after the live workshop if you need some extra guidance.
Lecture 1 video (Part A) | 35 minutes Introduction to big data biology concepts. Introduction to high-throughput methods.
Lecture 1 video (Part B) | 18 minutes Outline of the module. How we teach, and what you need to do.
Download as PDF | Powerpoint
Installing R and RStudio | 9 minutes Website to download RStudio Keywords: RStudio, packages, tidyverse, install.packages(), library()
Loading data and saving data in R | 10 minutes Keywords: read.table, read.delim, save.image, load
Walkthrough of ggplot2 | 26 minutes See also data-to-viz.com to find out what plot may suit your data. Keywords: tidyverse, ggplot2, geom_histogram, geom_density, geom_violin, geom_boxplot, stat_compare_means
Making plots in R | 17 minutes See the R commands. Keywords: boxplot, scatterplot, barplot, box and whiskers plot, histogram, hist, stripchart, summary, head
Data types in R | 13 minutes Explains data types in R (strings, characters, numeric, Booleans). Keywords: variables, case-sensitive, TRUE/FALSE, strings, characters, numeric, Boolean (TRUE/FALSE), substring, paste, sub, max, min, abs, and (&), or (|), factors, as.character, as.numeric
Data structures in R | 13 minutes Explains vectors, data frames, matrices (matrix), lists and how to use them. Keywords: table, list, read.table, data.frame, column names, matrix, matrices, vector, dim, names
What is a function? | 7 minutes Keywords: apply, sapply, lapply
Apply and Sapply | 15 minutes Keywords: apply, sapply, applying a function to rows or columns of a matrix
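If you'd like something to type along with, here is a minimal sketch of the data structures and apply-family functions covered in the videos above. The gene names and all the values are made up for illustration; they are not taken from the module datasets.

```r
# Hypothetical example data (not from the module datasets)

# A vector holds values of a single type
copies <- c(12.5, 3.2, 48.0)

# A data frame holds columns that can differ in type
genes <- data.frame(
  name      = c("geneA", "geneB", "geneC"),
  copies    = copies,
  essential = c(TRUE, FALSE, TRUE)
)
dim(genes)     # number of rows, then number of columns
names(genes)   # column names

# A matrix holds one type in rows and columns (filled column by column)
m <- matrix(1:12, nrow = 3, ncol = 4)

# apply(): MARGIN = 1 runs the function on each row, MARGIN = 2 on each column
row_means <- apply(m, 1, mean)
col_max   <- apply(m, 2, max)

# sapply(): runs the function on each element and simplifies the result
squares <- sapply(1:5, function(x) x^2)
```

Remember that R is case-sensitive, so the functions are written `apply` and `sapply`, all in lower case.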
Feel free to use and/or modify any R code on this website.
R tools to summarise large data sets | 30 minutes Keywords: summary, mean, median, nrow, ncol, dim, subset, hist, head, tail. R code is here
Multiple test correction Keywords: Bonferroni
Exploring data with plots and summaries | 12 minutes See the R commands. Keywords: correlation, mean, median, summary, hist, plot, log scale, log10
Correlation is not causation | 7 minutes Keywords: pirates, ice cream, sharks, correlation, causation, cor.test, crocodilians
Linear models | 9 minutes Keywords: gradient, intercept, lm, glucosinolates, F-test, Brassica
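To try out the ideas from the videos above, here is a small sketch of a correlation test, a linear model and a Bonferroni correction. The `x`, `y` and `p_raw` values are invented for illustration and are not from any of the module datasets.

```r
# Hypothetical measurements: y rises roughly two units per unit of x
x <- 1:10
y <- c(3.1, 4.9, 7.2, 9.0, 10.8, 13.1, 15.0, 16.9, 19.2, 20.8)

# Test whether x and y are correlated (Pearson correlation by default)
ct <- cor.test(x, y)
ct$estimate   # correlation coefficient
ct$p.value

# Fit a linear model: y = intercept + gradient * x
fit <- lm(y ~ x)
coef(fit)      # intercept first, then the gradient for x
summary(fit)   # includes the F-test for the model

# Bonferroni correction: each p-value is multiplied by the
# number of tests, and capped at 1
p_raw  <- c(0.001, 0.01, 0.04, 0.5)
p_bonf <- p.adjust(p_raw, method = "bonferroni")
```

A strong correlation here only tells you that `x` and `y` change together; it does not tell you that one causes the other.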
Please note: not all these workshop videos will be available at the start of term. We'll upload them before you need them though.
Introduction to the Fungal Ecology Dataset | 11 minutes Keywords: Daphne Ezer, ecology, soil
Workshop 1 (part 1) | 9 minutes Keywords: fungi, ecology, metagenomics, operational taxonomic units (OTUs), read.csv, dim, class
Workshop 1 (part 2) | 14 minutes Keywords: fungi, ecology, metagenomics, rownames, colnames, hist, histogram, which, colSums, barplot
Workshop 1 (part 3, optional) | 12 minutes Keywords: grep, logical functions, which, TRUE, FALSE, sort, table
Workshop 2 (part 1) | 15 minutes Keywords: which, grep, load, sapply
Workshop 2 (part 2) | 16 minutes Keywords: sapply, unique, missing data, library(seqinr), write.fasta, save.image
Workshop 3 | 20 minutes Keywords: dim, sum, barplot, plot, technical artifacts, rainbow colour palette, length, legend, order, Chi-squared test, chisq.test, library(MASS)
Workshop 4 | 20 minutes Keywords: fungal diversity, load, as.numeric, unique, pie chart, library(Rgraphviz), Simpson's Index of Diversity, ANOVA test
Introduction to the Yeast Dataset | 7 minutes Keywords: Fission yeast, Schizosaccharomyces pombe, Daniel Jeffares, essential genes, Pombase, Angeli, gene expression, mRNA half life, protein copies/cell
Workshop 1 | 25 minutes Keywords: setwd, rm, hist, load, subset, nrow, ncol, summary, log10, pdf
Workshop 2 | 24 minutes Keywords: box and whiskers plot, boxplot, wilcox.test, log10, mRNA copies per cell, essential genes
Workshop 3 | 38 minutes Keywords: ggplot2, transposon, merge
Workshop 4 | 34 minutes Keywords: conservation, phyloP, bar plot, matrix, chisq.test, figure legend
Introduction to the Brassica Dataset | 11 minutes Keywords: Brassica, Andrea Harper, Oilseed rape, RPKM, RNAseq, glucosinolate
Workshop 1 | 22 minutes Keywords: read.delim, dim, class, Brassica, rownames, OSR101_RPKM2[,-1], using square brackets, [], is.numeric, sapply, hist, row.names, rowMeans, subset, RPKM, summary, write.table, barplot, save.image
Workshop 2 | 24 minutes Keywords: read.table, read.delim, rownames, "for" loops, OSR_merge[1:5,1:5], paste, linear model, lm, abline, line of best fit, anova, coefficients, summary, as.data.frame, ncol
Workshop 3 | 26 minutes Keywords: qqnorm, qqline, read.delim, library("car"), qqPlot, results$P.value, merge (to merge data frames), order, library(ggplot2), geom_point, theme_classic, -log10, Bonferroni multiple test correction, gl, false discovery rate, FDR, colnames
Workshop 4 | 16 minutes Keywords: read.delim, NCBI Blast, stringsAsFactors, rownames
Introduction to the Crocodile Macroevolution Dataset Keywords: Katie Davis, evolution, phylogeny
Workshop 1 | 27 minutes Keywords: macroevolution, pseudosuchia, global climate change, read.csv, dim, class, plot, xlim, sea level, save.image
Workshop 2 | 28 minutes Keywords: macroevolution, pseudosuchia, global climate change, read.csv, dim, class, plot, xlim, sea level, save.image
Workshop 3 | 17 minutes Keywords: macroevolution, pseudosuchia, loading libraries, BAMMtools, Bayesian, phylogeny, phylogenetic trees, speciation, plotRateThroughTime
Workshop 4 | 26 minutes Keywords: ggplot2, Detrended Cross Correlation Analysis