# Using multiple regression for adjustment

This is a section from Martin Bland’s textbook An Introduction to Medical Statistics, Fourth Edition. I hope that the topic will be useful in its own right, as well as giving a flavour of the book. Section references are to the book.

## 15.3 Using multiple regression for adjustment

You will often see the words ‘adjusted for’ in reports of studies. This almost always means that some sort of regression analysis has been done, and if we are talking about the difference between two means this will be multiple linear regression.

In clinical trials, regression is often used to adjust for prognostic variables and baseline measurements. For example, Levy et al. (2000) carried out a trial of education by a specialist asthma nurse for patients who had been taken to an accident and emergency department due to acute asthma. Patients were randomized to have two 1-hour training sessions with the nurse or to usual care. The measurements were 1-week diaries of peak expiratory flow (PEF) and symptoms, made before treatment and after 3 and 6 months. We summarized the 21 PEF measurements (three daily) to give the outcome variables mean and standard deviation of PEF over the week. We also analysed mean symptom score.

The primary outcome variable was mean PEF, shown in Figure 15.3.

Figure 15.3 Mean of 1-week diary peak expiratory flow 6 months after training by an asthma specialist nurse or usual care (data from Levy et al. 2000).

There is no obvious difference between the two groups. The mean PEF was 342 litre/min in the nurse intervention group and 338 litre/min in the control group. The 95% CI for the difference, intervention minus control, was −48 to 63 litre/min, P = 0.8, by the two-sample t method. However, although this was the primary outcome variable, it was not the primary analysis. We have the mean diary PEF measured at baseline, before the intervention, and the two mean PEFs are strongly related. We can use this to reduce the variability by carrying out multiple regression with PEF at 6 months as the outcome variable and treatment group and baseline PEF as predictors. If we control for the baseline PEF in this way, we might get a better estimate of the treatment effect because we will remove a lot of variation between people. We get:

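$$
\text{PEF}_{6\text{m}} = 18.3 + 0.99 \times \text{PEF}_{\text{base}} + 20.1 \times \text{intervention}
$$

| Term | Coefficient | 95% CI | P |
|---|---|---|---|
| Intercept | 18.3 | −10.5 to 47.2 | |
| PEF at baseline | 0.99 | 0.91 to 1.06 | <0.001 |
| Intervention | 20.1 | 0.4 to 39.7 | 0.046 |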
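To make the mechanics concrete, here is a minimal sketch of how such an adjusted estimate can be computed. It does not use the trial data: the data are simulated, and the sample size, the spread of baseline PEF, and the residual standard deviation are all assumptions chosen for illustration. The function name `ancova_two_groups` is hypothetical. With one 0/1 group indicator and one baseline covariate, the least-squares slope is the pooled within-group slope, and the group coefficient is the difference in outcome means corrected for any difference in baseline means.

```python
import random
from statistics import mean


def ancova_two_groups(baseline, outcome, group):
    """Least-squares fit of outcome = a + b*baseline + c*group (group coded 0/1).

    Returns (a, b, c). b is the pooled within-group slope; c is the
    difference in outcome means adjusted for the difference in baseline means.
    """
    sxx = sxy = 0.0
    group_means = {}
    for g in (0, 1):
        xs = [x for x, k in zip(baseline, group) if k == g]
        ys = [y for y, k in zip(outcome, group) if k == g]
        mx, my = mean(xs), mean(ys)
        group_means[g] = (mx, my)
        sxx += sum((x - mx) ** 2 for x in xs)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    (mx0, my0), (mx1, my1) = group_means[0], group_means[1]
    c = (my1 - my0) - b * (mx1 - mx0)
    a = my0 - b * mx0
    return a, b, c


# Simulated data only (assumed values, not the Levy et al. measurements).
rng = random.Random(1)
n = 200
group = [i % 2 for i in range(n)]                    # 0 = control, 1 = intervention
baseline = [rng.gauss(330, 100) for _ in range(n)]   # assumed baseline PEF spread
outcome = [18.3 + 0.99 * x + 20.1 * g + rng.gauss(0, 50)  # assumed residual SD
           for x, g in zip(baseline, group)]

a, b, c = ancova_two_groups(baseline, outcome, group)
unadjusted = (mean(y for y, g in zip(outcome, group) if g == 1)
              - mean(y for y, g in zip(outcome, group) if g == 0))
print(f"unadjusted difference: {unadjusted:.1f} litre/min")
print(f"adjusted difference:   {c:.1f} litre/min (slope {b:.2f}, intercept {a:.1f})")
```

In practice a statistical package would be used, since it also reports the confidence intervals and P values; the sketch only shows where the adjusted difference comes from.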

Figure 15.4 shows the regression equation as two parallel lines, one for each treatment group.

Figure 15.4 Mean PEF after 6 months against baseline PEF for intervention and control asthmatic patients, with fitted analysis of covariance lines (data from Levy et al. 2000).

Multiple regression in which qualitative and quantitative predictor variables are both used is also known as analysis of covariance. The vertical distance between the lines is the coefficient for the intervention, 20.1 litre/min. By including the baseline PEF we have reduced the variability and enabled the treatment difference to become apparent.
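As a small illustration of the parallel lines, the sketch below evaluates the fitted equation reported above for a few baseline values. The function name `predicted_pef_6m` and the chosen baseline values are assumptions for the example; only the three coefficients come from the text.

```python
def predicted_pef_6m(baseline_pef, intervention):
    """Fitted mean PEF at 6 months (litre/min) from the reported equation.

    intervention is 1 for the nurse intervention group and 0 for control.
    """
    return 18.3 + 0.99 * baseline_pef + 20.1 * intervention


# Illustrative baseline values (litre/min), not taken from the trial.
for base in (200, 300, 400):
    control = predicted_pef_6m(base, 0)
    nurse = predicted_pef_6m(base, 1)
    print(base, round(control, 1), round(nurse, 1), round(nurse - control, 1))
```

Whatever baseline value is used, the two predictions differ by the intervention coefficient, which is why the lines in the figure are parallel.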

There are clear advantages to using adjustment. In clinical trials, multiple regression including baseline measurements reduces the variability between subjects and so increases the power of the study. It makes it much easier to detect real effects and produces narrower confidence intervals. It also removes any effects of chance imbalances in the predictor variables.

In the PEF analysis, we could have used the differences between the baseline and 6-month measurements rather than analysis of covariance. This is not as good because there is often measurement error in both our baseline and our outcome measurements. When we calculate the difference between them, we get two lots of error. If we do regression, we only have the error in the outcome variable. If the baseline variable has a lot of measurement error or there is only a small correlation between the baseline and outcome variables, using the difference can actually make things worse than just using the outcome variable. Using analysis of covariance, if the correlation is small the baseline variable has little effect rather than being detrimental.
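The comparison can be made explicit with a short calculation under a simplifying assumption that is not in the original text: suppose the baseline measurement $X$ and the outcome $Y$ have the same variance $\sigma^2$ (measurement error included) and correlation $\rho$. The variability that each approach leaves for the treatment comparison is then:

$$
\begin{aligned}
\operatorname{Var}(Y) &= \sigma^2 && \text{(outcome only)}\\
\operatorname{Var}(Y - X) &= \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1-\rho) && \text{(difference from baseline)}\\
\operatorname{Var}(Y \mid X) &= \sigma^2(1-\rho^2) && \text{(regression on baseline)}
\end{aligned}
$$

Under this assumption the difference has more variability than the outcome alone whenever $\rho < 0.5$, because then $2(1-\rho) > 1$. The regression residual variance is never larger than either of the others, since $1-\rho^2 \le 1$ and $1-\rho^2 \le 2(1-\rho)$ for any $\rho$. The only cost of the regression is the one degree of freedom used to estimate the slope.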

### Reference

Levy, M.L., Robb, M., Allen, J., Doherty, C., Bland, J.M., and Winter, R.J.D. (2000). A randomized controlled evaluation of specialist nurse education following accident and emergency department attendance for acute asthma. Respiratory Medicine, 94, 900–8.

Adapted from pages 227–228 of An Introduction to Medical Statistics by Martin Bland, 2015, reproduced by permission of Oxford University Press.