All data are from unpublished research by Katie E. Davis & Alex Payne.

Introduction

The biological question

Our biological question is to explore how the diversification of Pseudosuchia was shaped by environmental changes in the geological past.

You will do this by testing for significant correlations between the speciation rate of Pseudosuchia and global environmental change through geological time.

Over the course of the four workshops you will be guided towards testing this hypothesis. We will show you:

  • How to explore the different data types.
  • How to partition the phylogenetic data by habitat so that you can test whether ecology affected biotic responses to environmental change.
  • How to plot phylogenetic trees.
  • How to plot time series from environmental & diversification data.
  • How to carry out correlation analyses between these time series.
  • How to plot the output as a histogram.
  • How to test for statistical significance.

Aim of this workshop

The purpose of this workshop is to familiarise yourself with the environmental data you will be using throughout the series of workshops. You have been provided with two sets of data.

  • Global temperature through geological time.
  • Global sea level through geological time.

Learning outcomes

  • Loading & exploring environmental time series data.
  • Manipulating data to create better plots.
  • Saving data.

Getting started

In RStudio, go to the menu bar at the top and choose the following: File -> New File -> R Script. Remember to save your script regularly!

I recommend setting up your directories (you might be more familiar with the term “folders”; it means the same thing) so that all your data and your script are in the same place.

You can find all the data needed for these workshops here:

https://www-users.york.ac.uk/~kd856/WorkshopData/

Loading & plotting data - temperature

Let’s start by exploring the environmental data. You’ve been provided with two time series of data, global temperature and global sea level. First, load in the temperature data.

temperature <- read.csv("temperatureTimeSeries.csv", header=FALSE)

Check to see if it’s loaded in correctly.

head(temperature)
##          V1        V2
## 1 0.3010611 -3.968815
## 2 0.3010611 -3.903490
## 3 0.3010611 -3.772841
## 4 0.3010611 -3.707516
## 5 0.3010611 -3.054268
## 6 0.3010611 -3.838166
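
Beyond head(), a few other base R functions give a quick feel for a data set. The sketch below rebuilds the six rows printed above as a stand-in data frame, so it runs without the CSV file. As an optional step that is not used in the rest of this workshop, it also gives the columns descriptive names. The names tempDemo, Time and Proxy are illustrative choices, not part of the original data.

```r
# Stand-in for the first six rows of temperatureTimeSeries.csv
# (values copied from the head() output above)
tempDemo <- data.frame(
  V1 = rep(0.3010611, 6),
  V2 = c(-3.968815, -3.903490, -3.772841, -3.707516, -3.054268, -3.838166)
)

str(tempDemo)      # column types and number of rows
summary(tempDemo)  # minimum, quartiles, mean and maximum of each column
dim(tempDemo)      # number of rows and columns

# Optional: descriptive column names (hypothetical; the workshop code keeps V1 and V2)
names(tempDemo) <- c("Time", "Proxy")
range(tempDemo$Proxy)  # smallest and largest proxy value
```

Running the same functions on the full temperature data frame should give you an idea of how much time the series covers and how variable the proxy is.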

Let’s try plotting it.

plot(temperature)

See how noisy it is? Let’s smooth it before we go any further.

finaltemp <- smooth(smooth(temperature$V2))

And replot.

plot(finaltemp)

See the difference?
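
If you want to see what smooth() is doing, it can help to try it on a made-up series where you know the underlying pattern. smooth() applies Tukey's running median smoothers, which replace each value using the medians of small windows of neighbouring values. The sketch below uses simulated data (a sine wave plus random noise), not the workshop data, and all the variable names are illustrative.

```r
set.seed(1)                               # make the random noise reproducible
x      <- seq(0, 10, length.out = 200)
signal <- sin(x)                          # the "true" underlying pattern
noisy  <- signal + rnorm(200, sd = 0.3)   # the pattern with noise added

once  <- smooth(noisy)   # one pass of the running median smoother
twice <- smooth(once)    # a second pass, as in the workshop code

# Grey points: noisy data. Red line: smoothed series.
plot(x, noisy, col = "grey", pch = 16, ylab = "Value")
lines(x, as.numeric(twice), col = "red", lwd = 2)
```

The smoothed line should follow the sine wave much more closely than the raw points do, while having exactly the same number of values as the input.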

Do you notice anything wrong with the axes? The x-axis needs converting to time. At the moment it’s just shown as an index. Column 1 (V1) in the csv file is time (measured in millions of years - MYR) so try this. We’ll also relabel the x-axis with something more sensible using xlab. You can relabel the y-axis in the same way with ylab if you like.

plot(temperature$V1, finaltemp, xlab='Time')

Now you’ll notice one more thing we need to fix. The x-axis is backwards. We need to reverse it so that time 0 (i.e. the present day) is at the right hand end of the axis.

plot(temperature$V1,finaltemp,  xlab='Time', xlim=c(220,0))

Let’s make it look a bit nicer.

plot(temperature$V1,finaltemp, xlab='Time', xlim=c(220,0), type = 'l')

You can save this to a PDF if you like.
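
One way to do this is with the pdf() graphics device: open the device, draw the plot, then close the device with dev.off(). The sketch below is only an illustration. It uses a simulated series so that it runs on its own, and the file name and the variables age, proxy and pdfFile are hypothetical.

```r
# Simulated stand-in series so the example runs without the workshop files
age   <- seq(220, 0, by = -1)
proxy <- sin(age / 20)

# Hypothetical output file in a temporary folder; use a plain name such as
# "temperaturePlot.pdf" instead to save the file next to your script
pdfFile <- file.path(tempdir(), "temperaturePlot.pdf")

pdf(pdfFile, width = 8, height = 5)   # open a PDF device (size in inches)
plot(age, proxy, xlab = "Time", xlim = c(220, 0), type = "l")
dev.off()                             # close the device so the file is written
```

Nothing appears in the RStudio plot window while the PDF device is open, because the plot is drawn straight into the file.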

Warning! The y-axis is NOT measured in degrees C/F. It’s a temperature proxy and just tells us about relative temperatures through time. You might find it interesting/useful to read more about temperature proxies before the next workshop.

Loading & plotting data - sea level

Now let’s take a look at the sea level data.

seaLevel <- read.csv("seaLevelTimeSeries.csv",header=TRUE)

Let’s check it looks right.

head(seaLevel)
##      Age     SL
## 1 244.80 -50.67
## 2 244.77 -49.73
## 3 244.74 -48.79
## 4 244.71 -45.01
## 5 244.68 -43.12
## 6 244.65 -40.29

Now let’s try plotting it.

plot(seaLevel)

The data aren’t as noisy as the temperature time series, so we’re ok to leave them unsmoothed, but we do need to reverse the time scale again. Because the file has column headers (Age and SL), plot() already puts time (measured in MYR) on the x-axis and uses the headers as axis labels, so we don’t need to change that this time.

plot(seaLevel, xlim=c(250,0))

Let’s make it look a bit nicer.

plot(seaLevel, xlim=c(250,0), type = 'l')
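
Rather than typing the axis limits by hand, you can work them out from the data. The sketch below is one possible approach, not something the later workshops rely on. It rebuilds the six rows printed by head(seaLevel) as a stand-in data frame so it runs without the CSV file; seaDemo and timeLimits are illustrative names.

```r
# Stand-in for the first six rows of seaLevelTimeSeries.csv
# (values copied from the head() output above)
seaDemo <- data.frame(
  Age = c(244.80, 244.77, 244.74, 244.71, 244.68, 244.65),
  SL  = c(-50.67, -49.73, -48.79, -45.01, -43.12, -40.29)
)

# range() gives c(youngest, oldest); rev() flips it to c(oldest, youngest),
# so the oldest time ends up on the left of the plot
timeLimits <- rev(range(seaDemo$Age))

plot(seaDemo$Age, seaDemo$SL, xlim = timeLimits, type = "l",
     xlab = "Age", ylab = "SL")
```

With the full data set this fits the axis exactly to the data, whereas c(250,0) rounds the limits to tidier numbers. Either is fine.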

Saving data

It’s good practice to save your data regularly; it also saves you having to rerun code when you return to a project. This is how to save everything in your current R workspace.

save.image("Crocs_Workshop1.RData")
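
When you come back to the project, load("Crocs_Workshop1.RData") restores everything that save.image() stored. The sketch below shows the same round trip on a small scale, using save() to store a single object instead of the whole workspace. The object demoSeries and the file name are hypothetical and exist only for this illustration.

```r
demoSeries <- c(1.5, 2.5, 3.5)                     # hypothetical object to keep
rdataFile  <- file.path(tempdir(), "demo.RData")   # hypothetical file name

save(demoSeries, file = rdataFile)   # write just this one object to disk
rm(demoSeries)                       # remove it, as if starting a new session
load(rdataFile)                      # restore it under its original name

demoSeries
```

Saving individual objects like this can be handy when your workspace gets large and you only need one or two things from it.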

For next time

For next time do some background reading on how we measure temperature and sea level change through geological time and what significant events have happened in the past, e.g., greenhouse/icehouse climates. If you like you could then try adding markers to your plots to show when these events happened. As you progress through the workshops you might want to think about what events might be of particular relevance to Pseudosuchia, particularly with respect to the global changes in temperature & sea level.

Hint: You could try using abline() to add vertical line markers to your plots. There are also lots of other ways to make fancier plots if you want to explore some more, e.g., the ggplot2 package.
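
As a starting point, here is a minimal sketch of the abline() approach. It uses a simulated series rather than the workshop data. The two marked events (the end-Triassic mass extinction at roughly 201 MYR and the end-Cretaceous mass extinction at roughly 66 MYR) are just examples with approximate ages; your background reading should tell you which events are worth marking. All variable names are illustrative.

```r
# Simulated stand-in series so the example runs without the workshop files
age   <- seq(250, 0, by = -1)
value <- cos(age / 30)

# Example events with approximate ages in MYR (illustrative choices)
events      <- c(201, 66)
eventLabels <- c("end-Triassic", "end-Cretaceous")

plot(age, value, type = "l", xlim = c(250, 0), xlab = "Time", ylab = "Value")
abline(v = events, lty = 2, col = "red")   # dashed vertical line at each event
text(events, max(value), labels = eventLabels, pos = 4, cex = 0.8)  # label each line
```

To use this with your own plots, replace the simulated series with temperature$V1 and finaltemp (or the sea level data) and keep the abline() and text() calls.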

I also recommend you familiarise yourself with the concepts behind palaeo-environmental time series. This will help your understanding of how we obtain these data, how we use them, and how they can help us to gain an understanding of past environmental change.

Resources

Try these websites for a basic overview of environmental proxy data:

https://en.wikipedia.org/wiki/Paleoclimatology

http://www.real-project.eu/palaeoenvironmental-sciences-lexicon/

This paper is also a good start:

https://link.springer.com/article/10.1007/s13253-019-00374-2

This paper talks more about how we can apply palaeoenvironmental data to ecological questions:

https://www.sciencedirect.com/science/article/pii/S0169534711002692

Very basic overview of Pseudosuchia (Wikipedia is great as a jumping off point into a new topic but please do not cite it in your reports!):

https://en.wikipedia.org/wiki/Pseudosuchia

This very relevant paper is about Pseudosuchia & climate change. Note that although they test similar hypotheses they do not use a phylogenetic approach:

https://www.nature.com/articles/ncomms9438

And this is a very easy reading news article about this paper:

https://www.cbsnews.com/news/could-global-warming-lead-to-bigger-badder-crocs/