You will often see the words ‘adjusted for’ in reports of studies. This almost always means that some sort of regression analysis has been done and, if we are talking about the difference between two means, this will be multiple linear regression.

In clinical trials, regression is often used to adjust for
prognostic variables and baseline measurements. For example,
Levy *et al.* (2000) carried out a trial of education
by a specialist asthma nurse for patients who had been
taken to an accident and emergency department due
to acute asthma. Patients were randomized to have two
1-hour training sessions with the nurse or to usual care.
The measurements were 1-week peak expiratory flow (PEF)
and symptom diaries made before treatment and after
3 and 6 months. We summarized the 21 PEF measurements
(three daily) to give the outcome variables mean
and standard deviation of PEF over the week. We also
analysed mean symptom score.

The primary outcome variable was mean PEF, shown in Figure 15.3.

**Figure 15.3** Mean of 1-week diary peak expiratory flow
6 months after training by an asthma specialist nurse or usual
care (data from Levy *et al.* 2000).

There is no obvious difference between the two groups: the mean PEF was 342 litre/min in the nurse intervention group and 338 litre/min in the control group. The 95% CI for the difference, intervention minus control, was −48 to 63 litre/min, P = 0.8, by the two-sample t method. However, although this was the primary outcome variable, it was not the primary analysis. We have the mean diary PEF measured at baseline, before the intervention, and the two mean PEFs are strongly related. We can use this to reduce the variability by carrying out multiple regression with PEF at 6 months as the outcome variable and treatment group and baseline PEF as predictors. If we control for the baseline PEF in this way, we might get a better estimate of the treatment effect because we will remove a lot of variation between people. We get:

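$$
\text{PEF@6m} = 18.3 + 0.99 \times \text{PEF@base} + 20.1 \times \text{intervention}
$$

| Term | Coefficient | 95% CI | P |
|---|---|---|---|
| Constant | 18.3 | −10.5 to 47.2 | |
| PEF@base | 0.99 | 0.91 to 1.06 | <0.001 |
| intervention | 20.1 | 0.4 to 39.7 | 0.046 |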

Figure 15.4 shows the regression equation as two parallel lines, one for each treatment group.

**Figure 15.4** Mean PEF after 6 months against baseline PEF
for intervention and control asthmatic patients, with fitted
analysis of covariance lines (data from Levy *et al.* 2000).
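
Substituting the two values of the treatment indicator into the fitted equation makes the two parallel lines explicit. This assumes, as is conventional but not stated in the text, that intervention is coded 1 for the nurse group and 0 for the control group:

$$
\begin{aligned}
\text{control:}\quad \text{PEF@6m} &= 18.3 + 0.99 \times \text{PEF@base} \\
\text{intervention:}\quad \text{PEF@6m} &= (18.3 + 20.1) + 0.99 \times \text{PEF@base} = 38.4 + 0.99 \times \text{PEF@base}
\end{aligned}
$$

Both lines have the same slope, 0.99, and differ only in their intercepts.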

Multiple regression
in which qualitative and quantitative predictor
variables are both used is also known as
**analysis of covariance**.
The vertical distance between the lines is the
coefficient for the intervention, 20.1 litre/min. By including
the baseline PEF we have reduced the variability and
enabled the treatment difference to become apparent.
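
The effect of including the baseline can be sketched with simulated data. The example below is only an illustration: the numbers are invented, not the Levy *et al.* data, and the names `simulate_trial` and `ols` are hypothetical helpers written for this sketch. It fits the outcome on treatment alone (equivalent to the two-sample t method) and then on baseline plus treatment (analysis of covariance), and compares the standard errors of the treatment difference.

```python
import numpy as np

def simulate_trial(n_per_group=100, effect=20.0, seed=0):
    """Simulate baseline and follow-up PEF for two randomized groups.

    All values are made up for illustration; they are not trial data.
    """
    rng = np.random.default_rng(seed)
    n = 2 * n_per_group
    treat = np.repeat([0.0, 1.0], n_per_group)   # 0 = control, 1 = intervention
    base = rng.normal(340.0, 100.0, size=n)      # baseline PEF, litre/min
    noise = rng.normal(0.0, 30.0, size=n)        # within-person variation
    follow = 18.0 + 1.0 * base + effect * treat + noise
    return base, treat, follow

def ols(X, y):
    """Ordinary least squares: coefficients and their standard errors."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = X.shape[0] - X.shape[1]                # residual degrees of freedom
    sigma2 = resid @ resid / dof                 # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef, np.sqrt(np.diag(cov))

base, treat, follow = simulate_trial()
ones = np.ones_like(base)

# Unadjusted: outcome on treatment only.
coef_u, se_u = ols(np.column_stack([ones, treat]), follow)
# Adjusted (analysis of covariance): outcome on baseline and treatment.
coef_a, se_a = ols(np.column_stack([ones, base, treat]), follow)

print(f"unadjusted difference: {coef_u[1]:.1f} (SE {se_u[1]:.1f})")
print(f"adjusted difference:   {coef_a[2]:.1f} (SE {se_a[2]:.1f})")
```

Because the simulated baseline accounts for most of the variation between people, the adjusted standard error should come out much smaller than the unadjusted one, which is the pattern described for the asthma trial.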

There are clear advantages to using adjustment. In clinical trials, multiple regression including baseline measurements reduces the variability between subjects and so increases the power of the study. It makes it much easier to detect real effects and produces narrower confidence intervals. It also removes any effects of chance imbalances in the predictor variables.
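
The gain in precision in the asthma nurse trial can be read off the two confidence intervals quoted earlier. As a rough piece of arithmetic on the published limits (in litre/min):

$$
\text{unadjusted width} = 63 - (-48) = 111, \qquad
\text{adjusted width} = 39.7 - 0.4 = 39.3, \qquad
\frac{111}{39.3} \approx 2.8
$$

Since the width of a confidence interval shrinks roughly in proportion to $1/\sqrt{n}$, an unadjusted analysis would need on the order of $2.8^2 \approx 8$ times as many patients to match this precision. This is an approximate illustration, not a figure from the trial report.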

Is adjustment cheating? If we cannot demonstrate an effect without adjustment (as in the asthma nurse trial), is it valid to show one after adjustment? Adjustment can be cheating if we keep adjusting by more and more variables until we have a significant difference. This is not the right way to proceed. We should be able to say in advance which variables we might want to adjust for because they are strong predictors of our outcome variable. Baseline measurements almost always come into this category, as should any stratification or minimization variables used in the design. If they were not related to the outcome variable, there would be no need to stratify for them. Another variable which we might expect to adjust for is centre in multi-centre trials, because there may be quite a lot of variation between centres in their patient populations and in their clinical practices. We might also want to adjust for known important predictors. If we had no baseline measurements of PEF, we would want to adjust for height and age, two known good predictors of PEF. We should state before we collect the data what we wish to adjust for and stick to it.

In the PEF analysis, we could have used the differences between the baseline and 6-month measurements rather than analysis of covariance. This is not as good because there is often measurement error in both our baseline and our outcome measurements. When we calculate the difference between them, we get two lots of error. If we do regression, we only have the error in the outcome variable. If the baseline variable has a lot of measurement error or there is only a small correlation between the baseline and outcome variables, using the difference can actually make things worse than just using the outcome variable. With analysis of covariance, if the correlation is small, the baseline variable simply has little effect rather than being detrimental.
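
The comparison between the three approaches can be made concrete under a simplifying assumption that is not in the text: baseline and outcome have the same variance `sigma2` and correlation `rho`. Under that assumption, standard results give the variance of the outcome alone as `sigma2`, of the change score as `2 * sigma2 * (1 - rho)`, and of the outcome about the regression line as approximately `sigma2 * (1 - rho**2)`. The function name `error_variances` is a hypothetical helper for this sketch.

```python
def error_variances(sigma2, rho):
    """Variance per subject under three analyses.

    Assumes baseline and outcome share variance sigma2 and have
    correlation rho; the analysis of covariance figure ignores the
    small cost of estimating the slope.
    """
    return {
        "outcome_only": sigma2,
        "change_score": 2.0 * sigma2 * (1.0 - rho),
        "ancova": sigma2 * (1.0 - rho ** 2),
    }

# Compare the three analyses at a low, moderate, and high correlation.
for rho in (0.2, 0.5, 0.9):
    variances = error_variances(1.0, rho)
    print(rho, {name: round(value, 2) for name, value in variances.items()})
```

On these assumptions the change score is worse than the outcome alone whenever the correlation is below 0.5, whereas the analysis of covariance variance never exceeds either of the other two.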

Levy, M.L., Robb, M., Allen, J., Doherty, C., Bland, J.M.,
and Winter, R.J.D. (2000). A randomized controlled
evaluation of specialist nurse education following accident
and emergency department attendance for acute
asthma. *Respiratory Medicine*, **94**, 900–8.

Adapted from pages 227–228 of
*An Introduction to Medical Statistics* by Martin Bland, 2015,
reproduced by permission of
Oxford University Press.

This page maintained by Martin Bland

Last updated: 7 August, 2015