To explain how to analyse count data that fall into mutually exclusive categories using two types of chi-squared test, and to explain the difference between these two tests.

By actively following the lecture and practical, and carrying out the independent study, the successful student will be able to:

- recognise when to use chi-squared Goodness of Fit and Contingency tests (MLO 2)
- carry out, interpret and scientifically report both types of test in R (MLO 3 and 4)

Workshops are not a test. It is expected that you will often not know how to start, will make a lot of mistakes and will need help. Do not be put off and don’t let what you cannot do interfere with what you can do. You will benefit from collaborating with others and/or discussing your results.

The lectures and the workshops are closely integrated and it is expected that you are familiar with the lecture content before the workshop. You need not understand every detail as the workshop should build and consolidate your understanding. You may wish to refer to the slides as you work through the workshop schedule.

Start RStudio from the Start menu.

In RStudio, set your working directory to the folder you created last week for your 17C Data Analysis work.

Make a new script file called practical3.R to carry out the rest of the work.

In a local maternity hospital, the total numbers of births induced on each day of the week over a six-week period were recorded as follows:

Day | No. inductions |
---|---|
Monday | 43 |
Tuesday | 36 |
Wednesday | 35 |
Thursday | 38 |
Friday | 48 |
Saturday | 26 |
Sunday | 24 |
Total | 250 |

We can use a chi-squared test to ascertain whether there is a pattern in these data that might suggest that surgeons are more reluctant to perform inductions on some days than on others.

Make a vector `obs` that holds the number of inductions on each day.
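One possible way to build the vector is sketched below. It assumes the counts are entered by hand in Monday-to-Sunday order, exactly as they appear in the table; you may prefer a different approach.

```r
# Observed number of inductions, Monday to Sunday (values from the table)
obs <- c(43, 36, 35, 38, 48, 26, 24)

# Print the vector to check that the values were typed correctly
obs
```

Checking that the values sum to the table total of 250 is a quick way to catch typing errors.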

Examine the ‘structure’ of the `obs` object using `str()`.
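As a minimal sketch, assuming `obs` was created with `c()` from the table values, the call looks like this:

```r
# Recreate the vector so this snippet runs on its own
obs <- c(43, 36, 35, 38, 48, 26, 24)

# str() gives a compact summary: the type of the object, its length
# and its first few values
str(obs)
```

For a vector made this way, `str()` should report a numeric vector with seven elements.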

What is your null hypothesis and what type of test is required?

We can carry out a Goodness of Fit chi-squared test on these data by coding how the test works ourselves. If the null hypothesis is true, 1/7th of inductions would be expected to occur each day, *i.e.,* 1/7th of 250.

Assign the total number of inductions to a variable called `totalinductions`.
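A sketch of this step is given below. It assumes the total is calculated from `obs` with `sum()` rather than typed in as 250. The name `expected` is a hypothetical choice used here only to illustrate the "1/7th of 250" calculation described above.

```r
# Recreate the observed counts so this snippet runs on its own
obs <- c(43, 36, 35, 38, 48, 26, 24)

# Total number of inductions over the six weeks
totalinductions <- sum(obs)

# Count expected on each day if inductions were equally likely on all
# seven days (hypothetical variable name)
expected <- totalinductions / 7
expected
```

Calculating the total from `obs` means the expected value updates automatically if the observed counts are corrected.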