Introduction

Aims

In the second of three related workshops we will learn to apply the glm() function to Poisson (count) response data and to interpret the results.

Objectives

By actively following the first lecture, working through workbook examples during the workshop and completing any follow-up independent study the successful student will be able to:

  • Explain the link between the general linear model and the generalised linear model
  • Recognise where a generalised linear model for a Poisson-distributed response would be appropriate and apply glm()
  • Determine which effects are significant using summary() and anova()
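
One way to see the link named in the first objective: a general linear model fitted with lm() is the same model as a generalised linear model with a gaussian family and an identity link. The sketch below uses simulated data (the names x, y, count and dat are invented for illustration and are not from the workbook) to show that the two fits give the same estimates, and that for a count response only the family changes.

```r
# Simulated data for illustration only; not one of the workbook datasets
set.seed(1)
dat <- data.frame(x = runif(50, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(50)

# General linear model
mod_lm <- lm(y ~ x, data = dat)

# The same model written as a generalised linear model
mod_glm <- glm(y ~ x, data = dat, family = gaussian(link = "identity"))

# Compare the estimates from the two fits
all.equal(coef(mod_lm), coef(mod_glm))

# For a count response only the family changes
# (family = poisson uses a log link by default)
dat$count <- rpois(50, lambda = exp(0.2 + 0.15 * dat$x))
mod_pois <- glm(count ~ x, data = dat, family = poisson)
```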

You can optionally stretch yourself by asking for more in-depth explanations of the meaning of the estimates and the direction and magnitude of the effects, or by creating figures to go with your analyses.
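
As a starting point for thinking about what the estimates mean: glm() with family = poisson uses a log link by default, so the estimates are on the log scale. The sketch below uses simulated data (the names dose, count, dat and newdat are invented here) to show that exponentiating a slope gives the multiplicative change in the expected count for a one-unit increase in the explanatory variable, and that predict() can return fitted values on the scale of the counts for a figure.

```r
# Simulated data for illustration only; not one of the workbook datasets
set.seed(2)
dat <- data.frame(dose = rep(1:10, each = 5))
dat$count <- rpois(nrow(dat), lambda = exp(0.5 + 0.1 * dat$dose))

mod <- glm(count ~ dose, data = dat, family = poisson)

# Estimates on the log scale
coef(mod)

# Back-transformed: the intercept is the expected count at dose 0,
# the slope is the multiplicative change in expected count per unit of dose
exp(coef(mod))

# Fitted values on the scale of the response, e.g. to draw a fitted line
newdat <- data.frame(dose = seq(1, 10, length.out = 50))
newdat$fit <- predict(mod, newdata = newdat, type = "response")
```

A back-transformed slope above 1 means the expected count increases with the explanatory variable; a value below 1 means it decreases.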

Workbook Instructions

The workbook for this session is divided into two sections.

You are not expected to do all of the workbook examples.

Choose one from each section that best matches your biological interests. For each example you choose, you should:

  • write comments in your scripts!
  • read in the data file
  • check you understand the structure of the data
  • identify the response and explanatory variables
  • build a model with glm()
  • examine the model result using summary() and anova()
  • identify the model estimates
  • interpret the results
  • use plot(mod, which = 1) and plot(mod, which = 2) to examine the assumptions
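
The steps above might look something like the sketch below. It does not use any of the workbook files: it simulates a small dataset and writes it to a temporary file so the example runs on its own. The names sim, datafile, x, count and dat are invented, and the header = TRUE argument assumes the file has a header row, which you should check for the real data files.

```r
# Simulated stand-in for a workbook data file; all names here are invented
set.seed(3)
sim <- data.frame(x = rep(1:8, each = 4))
sim$count <- rpois(nrow(sim), lambda = exp(1.5 - 0.1 * sim$x))
datafile <- tempfile(fileext = ".txt")
write.table(sim, datafile, row.names = FALSE, quote = FALSE)

# Read in the data file (assumes a header row and whitespace-separated columns)
dat <- read.table(datafile, header = TRUE)

# Check the structure of the data
str(dat)

# Response: count; explanatory variable: x
mod <- glm(count ~ x, data = dat, family = poisson)

# Examine the model result
summary(mod)                # estimates, standard errors and z tests
anova(mod, test = "Chisq")  # analysis of deviance table

# Examine the assumptions
plot(mod, which = 1)        # residuals against fitted values
plot(mod, which = 2)        # normal Q-Q plot of the residuals
```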

Optional Extension: Practice your plotting skills.

Workbook

Section 1

Choose one of the following examples

Parkinson’s disease

This example examines the progression of Parkinson’s disease in flies with a Parkinson’s disease-associated mutation. The effect of age on the locomotor ability of these flies was determined using a climbing assay. Ten flies of each tested age were placed in 3 replicate vials and the number able to climb to a set height within a set time was recorded. The data are in park.txt and the ages are given in days. Each row is a vial. Can you predict the number of mutant flies able to climb from their age?


Protein kinase

This example concerns the effect of a Mitogen-activated protein (MAP) kinase inhibitor on the number of nuclei in neurons. The importance of Mitogen-activated protein kinases (MAPKs) in regulating cell division led researchers to hypothesise that MAPK inhibition might affect cytokinesis following mitosis. They treated samples of neurons with PD089059, a MAPK inhibitor, at various concentrations (1 to 15 in arbitrary units) and recorded the number of nuclei per cell. The data are in kinase.txt. Each row is a cell. Can you predict the number of nuclei from the PD089059 concentration?


Section 2

Choose one of the following examples

Number of mutations

This example is about how the number of mutations in a hypermutable tetranucleotide marker is affected by a person’s age and whether or not they have cancer. The number of mutations in the hypermutable tetranucleotide marker D7S1482 was analysed in buccal specimens from 30 head and neck carcinoma cases and 43 controls. Also recorded was the subject’s age (in years). The goal of analysis was to determine whether age and cancer status could predict the number of mutations. The data are in mutation.txt and comprise the following variables:

  • mut : the number of mutations
  • age : a continuous measure of the subject’s age to the nearest 0.1 of a year
  • cat : a factor with two levels, “control” and “tumour”

Birds catching insects

This example considers the effects of age and group size on the ability of birds to catch insects. The number of insect prey that individuals of a particular bird species manage to collect varies. In an effort to understand this variation, researchers recorded the number of prey an individual caught, its age and how it spent the majority of its time (as a single individual, in a pair or in a group of many). The data are in prey.txt and comprise the following variables:

  • prey : the number of insect prey items caught
  • age : the individual’s age in years (to one tenth of a year)
  • group : how the individual spent the majority of its time; a factor with three levels “many”, “pair”, “single”

The goal of analysis was to determine if the number of prey items caught could be explained by age and group.
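
Both Section 2 examples have one continuous and one categorical explanatory variable. As a rough sketch of that kind of model, the code below uses simulated data rather than either workbook file. The names age, group, count, eff and dat, and the factor levels "a", "b" and "c", are made up for illustration. Whether an interaction is needed is a modelling decision for you to make; the sketch only shows one way to compare models with and without it.

```r
# Simulated data for illustration only; not one of the workbook datasets
set.seed(4)
dat <- data.frame(
  age   = round(runif(60, 1, 6), 1),
  group = factor(rep(c("a", "b", "c"), each = 20))
)
eff <- c(a = 0, b = 0.3, c = 0.6)  # invented group effects on the log scale
dat$count <- rpois(nrow(dat),
                   lambda = exp(0.5 + 0.2 * dat$age + eff[as.character(dat$group)]))

# Main effects of a continuous variable and a factor
mod <- glm(count ~ age + group, data = dat, family = poisson)

# Factor estimates are differences from the first level ("a" here), on the log scale
summary(mod)

# Terms are tested sequentially, in the order they appear in the formula
anova(mod, test = "Chisq")

# Compare against a model that also includes the interaction
mod_int <- glm(count ~ age * group, data = dat, family = poisson)
anova(mod, mod_int, test = "Chisq")
```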

The Rmd file

Suggested analyses and interpretation for Workbook examples are marked:

#============== WORKBOOK EXAMPLE ==============#

Rmd file