Suggested answer to exercise: Analyses for qualitative data, 2

Question 2: What conditions do the data have to meet for the test to be valid?

Suggested answer

The chi-squared test is a large sample test. The usual rule for a
2 by 2 table is that the large sample approximation holds if all
expected frequencies are greater than 5.
Although one observed frequency is 5, the rule concerns expected rather than
observed frequencies, and no expected value will be as small as 5.
This is because, if the null hypothesis were true, the overall probability
of being positive for *P. alcalifaciens* would be 28/627 = 0.045,
and this proportion would apply both to those who have and to those who have
not travelled abroad.
Thus the expected numbers positive for *P. alcalifaciens* would be
254 × 28/627 = 11.3 for those who have travelled abroad and
373 × 28/627 = 16.7 among those who have not travelled abroad.
The other expected values can be calculated in a similar way
(254 − 11.3 = 242.7 and 373 − 16.7 = 356.3) and are bound to be large,
because the expected values must add to the marginal totals
for each row and column.
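
The arithmetic above can be checked with a short sketch in Python. It uses only the marginal totals quoted in this answer (254 and 373 for travel, 28 positive out of 627). The function name `expected_frequencies` and the dictionary labels are illustrative choices and are not part of the exercise.

```python
# Marginal totals quoted in the answer above.
row_totals = {"travelled abroad": 254, "not travelled abroad": 373}
n_total = sum(row_totals.values())          # 627
n_positive = 28
col_totals = {"positive": n_positive, "negative": n_total - n_positive}


def expected_frequencies(rows, cols):
    """Expected count for each cell under the null hypothesis:
    row total x column total / grand total."""
    n = sum(rows.values())
    return {
        (r, c): r_total * c_total / n
        for r, r_total in rows.items()
        for c, c_total in cols.items()
    }


expected = expected_frequencies(row_totals, col_totals)

# The usual rule for a 2 by 2 table: all expected frequencies greater than 5.
smallest = min(expected.values())
rule_met = smallest > 5

for (row, col), value in expected.items():
    print(f"{row:22s} {col:9s} {value:6.1f}")
print("Smallest expected frequency greater than 5:", rule_met)
```

The smallest expected frequency is the one for those who travelled abroad and were positive, which is why only that cell needs to be checked by hand.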

