- C-1 Case-control studies
- C-2 Assessment bias
- C-3 Recall bias
- C-4 Sample survey: selecting a representative sample
- C-5 Generalisability and extrapolation of results
- C-6 Maximising response rates to questionnaire surveys

The main purpose of matching is to control for confounding (see A-1.6). However, confounding factors can be controlled for in other ways (see E-5), and these become increasingly appealing when we consider some of the problems associated with matching:

1) It is not possible to examine the effects of the matching variables upon the status of the disease/disorder (either present or absent). Thus although the disease or condition of interest will be related to the matching variables, the matching variables should not be of interest in themselves.

2) If we match, we should take the matching into account in the statistical analysis. This makes the analysis quite complicated (see E-6.1, Breslow & Day 1980).

3) In a 1-1 matched case-control study, matched pairs are analysed together, so missing information on a control means that its case is also treated as missing in the statistical analysis. Similarly, missing information on a case leads to the loss of information on its matched control(s).

4) Bias can arise if we match on a variable that turns out to form part of the causal pathway between the risk factor under study and disease. This bias is said to be due to overmatching.
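Points 2 and 3 can be illustrated with a minimal sketch for a 1-1 matched design with a binary exposure. It assumes made-up data and a hypothetical function name, `matched_pair_summary`; neither comes from the study guide. It uses the standard result that, in 1-1 matching, the odds ratio is estimated from the discordant pairs only, with McNemar's chi-squared statistic (here without continuity correction) as the test. Any pair with a missing member is dropped whole.

```python
from typing import List, Optional, Tuple

# One matched pair: (case_exposed, control_exposed); None marks missing data.
Pair = Tuple[Optional[bool], Optional[bool]]


def matched_pair_summary(pairs: List[Pair]) -> dict:
    """Summarise a 1-1 matched case-control study with a binary exposure."""
    # Point 3: a pair is usable only if both the case and its control have data.
    complete = [(c, k) for c, k in pairs if c is not None and k is not None]

    # Only discordant pairs inform the matched (conditional) odds ratio.
    case_only = sum(1 for c, k in complete if c and not k)
    control_only = sum(1 for c, k in complete if k and not c)

    odds_ratio = case_only / control_only if control_only else None

    # McNemar's chi-squared statistic on 1 degree of freedom, uncorrected.
    n_discordant = case_only + control_only
    chi2 = (case_only - control_only) ** 2 / n_discordant if n_discordant else 0.0

    return {
        "pairs_used": len(complete),
        "pairs_dropped": len(pairs) - len(complete),
        "odds_ratio": odds_ratio,
        "mcnemar_chi2": chi2,
    }


# Illustrative (invented) data: 18 complete pairs and 2 pairs with a missing member.
example_pairs: List[Pair] = (
    [(True, False)] * 6      # case exposed, control not
    + [(False, True)] * 3    # control exposed, case not
    + [(True, True)] * 4     # concordant, both exposed
    + [(False, False)] * 5   # concordant, neither exposed
    + [(None, True), (True, None)]  # one member missing: whole pair is lost
)

result = matched_pair_summary(example_pairs)
print(result)
```

The nine concordant pairs contribute nothing to the odds ratio, and the two incomplete pairs are lost entirely, which is why missing data is more costly in a matched design than in an unmatched one.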

See Bland & Altman (1994c) and Breslow & Day (1980) for further discussion of matching.

Sometimes, by chance, a random sample is not as representative as we would like. For example, in our cross-sectional survey to investigate associations between unemployment and current health, it may be particularly important to ensure that we have an adequate representation of all postal areas in the borough, thereby reflecting the socioeconomic deprivation that exists. One way of doing this is to undertake stratified random sampling. Stratified random sampling is a means of using our knowledge of the population to ensure the representative nature of the sample and increase the precision of population estimates. Post-code area would be known as the stratification factor. Usually we undertake proportional stratified sampling. The total sample size is allocated between the strata in proportion to each stratum's share of the total population. For example, if 10% of the borough live in one postal code area, then we randomly select 10% of the sample from this stratum.
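Proportional allocation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a survey tool: the function name `proportional_stratified_sample`, the post-code labels and the population sizes are all invented for the example. Each stratum's allocation is rounded to the nearest whole person, so in general the allocations may not sum exactly to the requested sample size.

```python
import random
from typing import Callable, Dict, Hashable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def proportional_stratified_sample(
    population: Sequence[T],
    stratum_of: Callable[[T], Hashable],
    sample_size: int,
    seed: Optional[int] = None,
) -> Dict[Hashable, List[T]]:
    """Draw a simple random sample within each stratum, sized in proportion
    to that stratum's share of the population."""
    rng = random.Random(seed)

    # Divide the listed population into strata before any selection takes place.
    strata: Dict[Hashable, List[T]] = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)

    total = len(population)
    sample: Dict[Hashable, List[T]] = {}
    for name, members in strata.items():
        # Proportional allocation, rounded to a whole number of people.
        n = round(sample_size * len(members) / total)
        # Random sampling without replacement within the stratum.
        sample[name] = rng.sample(members, n)
    return sample


# Invented borough: 1000 residents in three hypothetical post-code areas.
borough = (
    [("A", i) for i in range(100)]    # 10% of the borough
    + [("B", i) for i in range(400)]  # 40%
    + [("C", i) for i in range(500)]  # 50%
)

chosen = proportional_stratified_sample(
    borough, stratum_of=lambda person: person[0], sample_size=50, seed=1
)
sizes = {area: len(people) for area, people in chosen.items()}
print(sizes)
```

With these invented figures, area A holds 10% of the borough and so supplies 10% of the sample of 50, that is, 5 people.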

Stratification does not depart from the principle of random sampling. All it means is that, before any selection takes place, the population is divided into strata and we randomly sample within each stratum. It is possible to have more than one stratification factor. For example, in addition to stratifying by post-code area, we may stratify by age group within the post-code area. Nonetheless, we have to be careful not to stratify by too many factors. Stratified random sampling requires that we have a large population, for which all of the members and their stratification factors are listed. Obviously, as the number of stratification factors increases, so too does the time and expense involved. In return, we can be more confident of the representative nature of the sample and thereby the generalisability of the results.

Altman DG, Bland JM. (1998)
Generalisation and extrapolation.
*British Medical Journal* **317**, 409-410.

Bland JM, Altman DG. (1994c)
Matching.
*British Medical Journal* **309**, 1128.

Breslow NE, Day NE. (1980)
*Statistical Methods in Cancer Research: Volume 1 - The analysis of case-control studies.*
IARC Scientific Publications No. 32, Lyon.

Edwards P, Roberts I, Clarke M, DiGuiseppi C, Pratap S, Wentz R, Kwan I. (2002)
Increasing response rates to postal questionnaires: systematic review.
*British Medical Journal* **324**, 1183-1185.

This page is maintained by Martin Bland.

Last updated: 10 September, 2009.