Introduction

Aims

To explain how to analyse count data that fall into mutually exclusive categories using two types of chi-squared test, and to explain the difference between the two tests.

Learning Outcomes

By actively following the lecture and practical and carrying out the independent study the successful student will be able to:

  • recognise when to use chi-squared Goodness of Fit and Contingency tests (MLO 2)
  • carry out, interpret and scientifically report both types of test in R (MLO 3 and 4)

Philosophy

Workshops are not a test. It is expected that you will often not know how to start, will make a lot of mistakes and will need help. Do not be put off, and don’t let what you cannot do interfere with what you can do. You will benefit from collaborating with others and/or discussing your results.

The lectures and the workshops are closely integrated and it is expected that you are familiar with the lecture content before the workshop. You need not understand every detail, as the workshop should build and consolidate your understanding. You may wish to refer to the slides as you work through the workshop schedule.

Slides

Goodness of Fit and Contingency chi-squared tests: pdf (recommended) / pptx

Getting started

W Start RStudio from the Start menu.

R In RStudio, set your working directory to the folder you created last week for your 17C Data Analysis work.

R Make a new script file called practical3.R to carry out the rest of the work.

Exercises

Inductions

In a local maternity hospital, the total numbers of births induced on each day of the week over a six-week period were recorded:

Day          No. of inductions
Monday       43
Tuesday      36
Wednesday    35
Thursday     38
Friday       48
Saturday     26
Sunday       24
Total        250

Inductions - coding the \(\chi^2\) test

We can use a chi-squared test to ascertain whether there is a pattern in these data that might suggest that surgeons are more reluctant to perform inductions on some days than on others.

R Make a vector obs that holds the number of inductions on each day.

R Examine the ‘structure’ of the obs object using str()
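One possible way to carry out the two steps above is sketched here. It assumes the counts are entered in the order they appear in the table (Monday to Sunday); it is an illustration, not the only acceptable answer.

```r
# Number of inductions on each day, Monday to Sunday, typed in from the table
obs <- c(43, 36, 35, 38, 48, 26, 24)

# Examine the structure of obs: its type, its length and its first few values
str(obs)
```

The output of str() should show a numeric vector with seven elements, one for each day of the week.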

Q What is your null hypothesis and what type of test is required?

We can carry out a Goodness of Fit chi-squared test on these data by coding how the test works ourselves. If the null hypothesis is true, 1/7th of inductions would be expected to occur on each day, i.e., 1/7th of 250.

R Assign the total number of inductions to a variable called totalinductions
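A minimal sketch of this step follows. It recreates obs so that it runs on its own, and it also works out the expected number of inductions per day under the null hypothesis. The name expected is an assumption made for illustration, not a name the workshop prescribes.

```r
# Observed inductions, Monday to Sunday (recreated so this sketch is self-contained)
obs <- c(43, 36, 35, 38, 48, 26, 24)

# Total number of inductions over the six-week period
totalinductions <- sum(obs)

# Expected inductions per day if the null hypothesis is true: 1/7th of the total
# ('expected' is a hypothetical name chosen for this sketch)
expected <- totalinductions / length(obs)

totalinductions
expected
```

Using sum(obs) rather than typing 250 directly means the total stays correct if the counts are later changed. Here the total is 250, so the expected count is 250/7, or about 35.71 inductions on each day.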