Introduction

Aims

In this session you will learn how to create fully reproducible outputs in a variety of formats using R Markdown.

Learning outcomes

The successful student will be able to:

  • Include objects created within text using inline coding.
  • Include special characters in text.
  • Use a greater variety of YAML options and be aware of the variety of possible outputs.
  • Use a greater variety of code chunk options including figure legends.
  • Include nicely formatted tables.
  • Cite and list references using a .bib file.

Start with an R Markdown document.

  • File | New File | R Markdown.
  • Add a title.
  • Add your name.
  • Delete everything except:
    • the YAML header between the --- lines
    • the first code chunk which begins: ```{r setup, include=FALSE}

Formatting

  • *italic* and **bold** give you italic and bold.
  • > at the start of a line creates a block quote.

  • [My module page](http://www-users.york.ac.uk/~er13/58M_BDS_2019/) gives you My module page

Inline code

You can run code inline to access variables in a piece of text by putting it between `r and `. For example, by writing in the Rmd:

The square root of 2 is `r sqrt(2)` you will get

The square root of 2 is 1.4142136 in the knitted output.

Special characters

You can include special characters in a markdown document using LaTeX markup. This has $ signs on the outside and uses backslashes and curly braces to indicate that what follows should be interpreted as a special character with special formatting. For example, to get \(\bar{x} \pm s.e.\) you write $\bar{x} \pm s.e.$
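
A few other pieces of markup that are often useful when reporting results are sketched below. These are standard LaTeX math commands rather than anything specific to this workshop, and each goes between $ signs in the Rmd in the same way:

```latex
$\mu$, $\sigma$, $\alpha$   % Greek letters
$\chi^2$                    % superscript, here chi-squared
$H_0$                       % subscript, here a null hypothesis
$p \leq 0.05$               % less than or equal to
```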

Example

We will read in ecoli.txt and summarise it for reporting in text. These data are from an investigation of the growth of three E. coli strains on four different media. The data are measures of optical density (in arbitrary units), which give an indication of the number of cells in the medium.

Add a chunk to read the data in:

file <- here::here("data", "ecoli.txt")
ecoli <- read.table(file, header = TRUE)

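Add a chunk to summarise it:

# dplyr provides %>%, group_by() and summarise()
library(dplyr)

# one row per strain and medium combination
ecolisum <- ecoli %>% 
  group_by(Strain, medium) %>% 
  summarise(mean_od = mean(dens),
            n_od  = length(dens),
            sd_od = sd(dens),
            se_od = sd(dens)/sqrt(n_od))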

To report this you could write:

The optical density of strain `r ecolisum$Strain[1]` on `r ecolisum$medium[1]` is `r ecolisum$mean_od[1]` $\pm$ `r ecolisum$sd_od[1]`.

Which would give you:

The optical density of strain 1 on Circle is 9.3125 \(\pm\) 3.9686045.

You would probably want to include code for rounding appropriately.
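
As a minimal sketch of one way to do that, the base R functions round() and format() can be wrapped around the value inside the inline code, for example `r round(ecolisum$mean_od[1], 2)`. The block below uses the two numbers from the knitted sentence above so that it runs on its own; the names mean_rounded, sd_rounded and se_text are only illustrative.

```r
# Values copied from the knitted sentence above; in the Rmd they would
# come from ecolisum$mean_od[1] and ecolisum$sd_od[1]
mean_od <- 9.3125
sd_od <- 3.9686045

# round() returns a number rounded to the given number of decimal places
mean_rounded <- round(mean_od, 2)  # 9.31
sd_rounded <- round(sd_od, 2)      # 3.97

# A rounded number loses trailing zeros when printed (1.4, not 1.40);
# format() with nsmall keeps them, but returns a character string
se_text <- format(round(1.4, 2), nsmall = 2)  # "1.40"
```

Whether to report a number or a formatted string is a judgment call: round() is enough for most inline reporting, while format() helps when every value should show the same number of decimal places.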

Tables

There are several options for formatting tables. I tend to use knitr::kable() with the kableExtra package (Zhu 2019) or the flextable package (Gohel 2019). These both take a dataframe as an input.

library(kableExtra)
ecolisum %>% 
  knitr::kable(caption = "Summary statistics for experiment",
               col.names = c("Strain", 
                             "Medium", 
                             "Mean", 
                             "N",
                             "Stdev",
                             "Stder"),
               digits = 2) %>%
  kable_styling(bootstrap_options = c("striped", "condensed"),
                font_size = 11) %>% 
  add_header_above(c(" " = 2, "Optical Density" = 4))

Summary statistics for experiment

                      Optical Density
Strain  Medium      Mean   N  Stdev  Stder
1       Circle      9.31   8   3.97   1.40
1       Colibroth  14.69   8   4.84   1.71
1       Eplus      17.31   8   3.27   1.16
1       GoCo        9.11   8   3.75   1.33
2       Circle     10.60   8   7.30   2.58
2       Colibroth  15.31   8   4.89   1.73
2       Eplus      15.72   8   6.79   2.40
2       GoCo       18.12   8   4.28   1.51
3       Circle     16.62   8   5.70   2.01
3       Colibroth  20.39   8   3.86   1.37
3       Eplus      14.74   8   5.67   2.00
3       GoCo        9.41   8   4.91   1.74
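
The flextable package mentioned above can produce a similar table. The following is a rough sketch rather than a recommended recipe: od_summary is a hypothetical stand-in for ecolisum, holding the first two rows of the table typed in by hand, and it assumes a reasonably recent version of flextable is installed.

```r
library(flextable)

# Hypothetical stand-in for ecolisum: the first two rows of the table,
# typed in by hand (already rounded) so the example runs on its own.
# Strain and n_od are integers so they are not shown with decimals.
od_summary <- data.frame(
  Strain  = c(1L, 1L),
  medium  = c("Circle", "Colibroth"),
  mean_od = c(9.31, 14.69),
  n_od    = c(8L, 8L),
  sd_od   = c(3.97, 4.84),
  se_od   = c(1.40, 1.71)
)

# Build the table, then relabel the columns for display
od_table <- flextable(od_summary)
od_table <- set_header_labels(od_table,
                              medium  = "Medium",
                              mean_od = "Mean",
                              n_od    = "N",
                              sd_od   = "Stdev",
                              se_od   = "Stder")

# Show double (non-integer) columns to two decimal places
od_table <- colformat_double(od_table, digits = 2)
od_table <- set_caption(od_table, caption = "Summary statistics for experiment")
od_table <- autofit(od_table)
od_table
```

flextable is often chosen when the output is Word or PowerPoint, where kableExtra styling does not carry over.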

References

  • A reference list can be added by creating a .bib file containing references in BibTeX format and adding another line to the YAML header.
    • citation("package") in the console will give a package's reference in BibTeX format.
    • BibTeX format is also available through most referencing software (e.g., Paperpile).
    • the YAML line is bibliography: mybibfile.bib where the file can be specified using inline code: '`r here::here("refs", "mybibfile.bib")`'
  • Citations are added using:
    • statement [@Codd1990-th] for statement (Codd 1990).
    • Codd [-@Codd1990-th] said for Codd (1990) said.
  • Every citation used results in the reference being added to a list at the bottom of the output.
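
As a small sketch of getting a package reference into a .bib file, the base R function toBibtex() converts a citation to BibTeX lines that can be written to a file. Here "stats" is only a stand-in for whichever package you want to cite, and tempfile() is used so the sketch does not write into your project; in practice the path might be here::here("refs", "mybibfile.bib"). Entries produced this way may lack a citation key, so check the first line and add a key (like Codd1990-th) before citing.

```r
# Write the BibTeX entry for a package to a .bib file.
# tempfile() keeps this sketch from touching your project folder.
bib_file <- tempfile(fileext = ".bib")

# "stats" is only a stand-in; swap in the package you want to cite
entry <- toBibtex(citation("stats"))
writeLines(entry, bib_file)

# Inspect the first line, which holds the entry type and citation key
readLines(bib_file)[1]
```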

Rticles and other packages

Demo

Exercise

Either:

  1. Continue working with the example in Workshop 2: Tidying data and the tidyverse and Workshop 4: Reproducibility and an introduction to R Markdown to develop a report generated through R Markdown. The data are in Y101_Y102_Y201_Y202_Y101-5.csv.

  2. Start working with your own assessment.

Do:

  • Continue to follow the good practice in Reproducibility and an introduction to R Markdown to organise and document your analysis.
  • Report summary information about the dataset in the text reproducibly using inline coding, tables and figures.
  • Use references and a .bib file.
  • Try an R Markdown template.

Good references for R Markdown

R Markdown: The Definitive Guide (Xie, Allaire, and Grolemund 2018)

RStudio’s Guide

The Rmd file

Rmd file

Codd, E F. 1990. The Relational Model for Database Management: Version 2. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc.

Gohel, David. 2019. Flextable: Functions for Tabular Reporting. https://CRAN.R-project.org/package=flextable.

Xie, Yihui, J.J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown.

Zhu, Hao. 2019. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.