The sample comprised colleagues and family of J.M.B. chosen to give a wide range of PEFR but in no way representative of any defined population. Two measurements were made with a Wright peak flow meter and two with a mini Wright meter, in random order. All measurements were taken by J.M.B., using the same two instruments. (These data were collected to demonstrate the statistical method and provide no evidence on the comparability of these two instruments.) We did not repeat suspect readings and took a single reading as our measurement of PEFR. Only the first measurement by each method is used to illustrate the comparison of methods, the second measurement being used in the study of repeatability.

Subject | Wright: first PEFR (l/min) | Wright: second PEFR (l/min) | Mini Wright: first PEFR (l/min) | Mini Wright: second PEFR (l/min) |
---|---|---|---|---|
1 | 494 | 490 | 512 | 525 |
2 | 395 | 397 | 430 | 415 |
3 | 516 | 512 | 520 | 508 |
4 | 434 | 401 | 428 | 444 |
5 | 476 | 470 | 500 | 500 |
6 | 557 | 611 | 600 | 625 |
7 | 413 | 415 | 364 | 460 |
8 | 442 | 431 | 380 | 390 |
9 | 650 | 638 | 658 | 642 |
10 | 433 | 429 | 445 | 432 |
11 | 417 | 420 | 432 | 420 |
12 | 656 | 633 | 626 | 605 |
13 | 267 | 275 | 260 | 227 |
14 | 478 | 492 | 477 | 467 |
15 | 178 | 165 | 259 | 268 |
16 | 423 | 372 | 350 | 370 |
17 | 427 | 421 | 451 | 443 |
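As a minimal sketch of the comparison based on first measurements only, the following Python snippet computes the mean difference (bias) and the standard deviation of the differences from the first PEFR columns of the table. The variable names are illustrative, and the sign convention (Wright minus mini Wright) is an assumption made for this example.

```python
from statistics import mean, stdev

# First PEFR readings (l/min) for subjects 1-17, copied from the table.
wright_first = [494, 395, 516, 434, 476, 557, 413, 442, 650,
                433, 417, 656, 267, 478, 178, 423, 427]
mini_first = [512, 430, 520, 428, 500, 600, 364, 380, 658,
              445, 432, 626, 260, 477, 259, 350, 451]

# Assumed sign convention: Wright minus mini Wright.
differences = [w - m for w, m in zip(wright_first, mini_first)]

bias = mean(differences)        # mean difference between the two meters
sd_single = stdev(differences)  # sample SD (n - 1 denominator)

print(f"bias = {bias:.1f} l/min, SD of differences = {sd_single:.1f} l/min")
```

The standard deviation computed this way should come out at about 38.8 l/min, the single-measurement estimate quoted in the text.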

If we have repeated measurements by each of the two methods on the same subjects we can calculate the mean for each method on each subject and use these pairs of means to compare the two methods using the analysis for assessing agreement described above. The estimate of bias will be unaffected, but the estimate of the standard deviation of the differences will be too small, because some of the effect of repeated measurement error has been removed. We can correct for this.

Suppose we have two measurements obtained by each method, as in the table. We find the standard deviation of differences between repeated measurements for each method separately, *s*_{1} and *s*_{2}, and the standard deviation of the differences between the means for each method, *s*_{D}. The corrected standard deviation of differences, *s*_{c}, is √(*s*_{D}^{2} + 1/2 *s*_{1}^{2} + 1/2 *s*_{2}^{2}). This is approximately √(2*s*_{D}^{2}), but if there are differences between the two methods not explicable by repeatability errors alone (i.e. interaction between subject and measurement method) this approximation may produce an overestimate.

For the PEFR, we have *s*_{D} = 33.2, *s*_{1} = 21.6, *s*_{2} = 28.2 l/min. *s*_{c} is thus √(33.2^{2} + 1/2 × 21.6^{2} + 1/2 × 28.2^{2}) or 41.6 l/min. Compare this with the estimate 38.8 l/min which was obtained using a single measurement. On the other hand, the approximation √(2*s*_{D}^{2}) gives an overestimate (47.0 l/min).
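The arithmetic can be checked with a short Python sketch. It recomputes *s*_{D} from the per-subject means in the table, takes *s*_{1} and *s*_{2} as the values reported in the text (they are not recomputed here), and applies the corrected formula. The function name `corrected_sd` and the other variable names are hypothetical, chosen for this illustration.

```python
from math import sqrt
from statistics import stdev

# (Wright 1st, Wright 2nd, mini 1st, mini 2nd) in l/min, from the table.
pefr = [
    (494, 490, 512, 525), (395, 397, 430, 415), (516, 512, 520, 508),
    (434, 401, 428, 444), (476, 470, 500, 500), (557, 611, 600, 625),
    (413, 415, 364, 460), (442, 431, 380, 390), (650, 638, 658, 642),
    (433, 429, 445, 432), (417, 420, 432, 420), (656, 633, 626, 605),
    (267, 275, 260, 227), (478, 492, 477, 467), (178, 165, 259, 268),
    (423, 372, 350, 370), (427, 421, 451, 443),
]

# Difference between the two method means for each subject.
mean_diffs = [(w1 + w2) / 2 - (m1 + m2) / 2 for w1, w2, m1, m2 in pefr]
s_d = stdev(mean_diffs)  # SD of differences between method means

# Values reported in the text, used as given.
s_1, s_2 = 21.6, 28.2


def corrected_sd(s_d, s_1, s_2):
    """Corrected SD of differences for two replicates per method."""
    return sqrt(s_d**2 + 0.5 * s_1**2 + 0.5 * s_2**2)


s_c = corrected_sd(s_d, s_1, s_2)
approx = sqrt(2 * s_d**2)  # the cruder approximation

print(f"s_D = {s_d:.1f}, s_c = {s_c:.1f}, approximation = {approx:.1f} l/min")
```

Run against the table, this should reproduce the quoted figures to one decimal place: 33.2, 41.6 and 47.0 l/min.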

In the *Lancet* paper we gave the formula, in error, as *s*_{c} = √(*s*_{D}^{2} + 1/4 *s*_{1}^{2} + 1/4 *s*_{2}^{2}).

This formula was given correctly in
Bland JM, Altman DG. (1999) Measuring agreement in method comparison
studies. *Statistical Methods in Medical Research* **8**, 135-160.

I think that I was to blame for this mistake, though I have no idea how it came about. It was a long time ago. Sorry about that. However, people who never make mistakes seldom make anything. If you notice that I have made a mistake, please let me know. Sooner or later, I will try to correct it.

This is an extract from Bland and Altman (1999), Section 5.1, "Equal numbers of replicates".

When we make repeated measurements of the same subject by each of two methods,
the measurements by each method will be distributed about the expected measurement
by that method for that subject.
These means will not necessarily be the same for the two methods.
The difference between method means may vary from subject to subject.
This variability constitutes method times subject interaction.
Denote the measurements on the two methods by *X* and *Y*.
We are interested in the variance of the difference between
single measurements by each method, *D* = *X* – *Y*.
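If we partition the variance for each method we get

$$\operatorname{Var}(X) = \sigma_t^2 + \sigma_{xI}^2 + \sigma_{xw}^2$$

$$\operatorname{Var}(Y) = \sigma_t^2 + \sigma_{yI}^2 + \sigma_{yw}^2$$

where $\sigma_t^2$ is the variance of the true values, $\sigma_{xI}^2$ and $\sigma_{yI}^2$ are method times subject interaction terms, and $\sigma_{xw}^2$ and $\sigma_{yw}^2$ are the within-subject variances from measurements by the same method, for $X$ and $Y$ respectively. It follows that the variance of the between-subject differences for single measurements by each method is

$$\operatorname{Var}(X - Y) = \sigma_D^2 = \sigma_{xI}^2 + \sigma_{yI}^2 + \sigma_{xw}^2 + \sigma_{yw}^2 \qquad (1)$$

We wish to estimate this variance from an analysis of the means of the measurement for each subject, $\bar{X}$ and $\bar{Y}$, that is from $\operatorname{Var}(\bar{X} - \bar{Y})$. With this model, the use of the mean of replicates will reduce the within-subject variance but it will not affect the interaction terms, which represent patient-specific differences. We thus have

$$\operatorname{Var}(\bar{X}) = \sigma_t^2 + \sigma_{xI}^2 + \frac{\sigma_{xw}^2}{m_x}$$

where $m_x$ is the number of observations on each subject by method $X$, because only the within-subject within-method error is being averaged. Similarly

$$\operatorname{Var}(\bar{Y}) = \sigma_t^2 + \sigma_{yI}^2 + \frac{\sigma_{yw}^2}{m_y}$$

$$\operatorname{Var}(\bar{X} - \bar{Y}) = \sigma_{\bar{D}}^2 = \sigma_{xI}^2 + \sigma_{yI}^2 + \frac{\sigma_{xw}^2}{m_x} + \frac{\sigma_{yw}^2}{m_y}$$

The distribution of $\bar{D}$ depends only on the errors and interactions, because the true value is included in both $X$ and $Y$, which are differenced. It follows from equation (1) that

$$\operatorname{Var}(X - Y) = \sigma_{\bar{D}}^2 + \left(1 - \frac{1}{m_x}\right)\sigma_{xw}^2 + \left(1 - \frac{1}{m_y}\right)\sigma_{yw}^2$$

If $s_{\bar{D}}^2$ is the observed variance of the differences between the within-subject means, $\operatorname{Var}(X - Y) = \sigma_D^2$ is estimated by

$$\hat{\sigma}_D^2 = s_{\bar{D}}^2 + \left(1 - \frac{1}{m_x}\right)s_{xw}^2 + \left(1 - \frac{1}{m_y}\right)s_{yw}^2$$

In the common case with two replicates of each method we have

$$\hat{\sigma}_D^2 = s_{\bar{D}}^2 + \frac{1}{2}s_{xw}^2 + \frac{1}{2}s_{yw}^2$$

as given in the correction above.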
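The general estimate for unequal numbers of replicates adds (1 − 1/*m*) times each within-subject variance to the variance of the differences between within-subject means. A minimal Python sketch of that calculation follows. The function name and argument names are hypothetical, and the inputs are the PEFR values quoted in the correction above, used here purely as illustrative numbers.

```python
from math import sqrt


def corrected_sd_general(s_dbar, s_xw, s_yw, m_x, m_y):
    """Estimated SD of differences between single measurements.

    s_dbar     : SD of differences between within-subject means
    s_xw, s_yw : within-subject SD terms for methods X and Y
    m_x, m_y   : number of replicates per subject for each method
    """
    if m_x < 1 or m_y < 1:
        raise ValueError("each method needs at least one replicate")
    variance = (s_dbar**2
                + (1 - 1 / m_x) * s_xw**2
                + (1 - 1 / m_y) * s_yw**2)
    return sqrt(variance)


# Two replicates per method: the coefficients reduce to 1/2 each.
two_replicates = corrected_sd_general(33.2, 21.6, 28.2, m_x=2, m_y=2)

# One replicate per method: no averaging, so no correction is added.
one_replicate = corrected_sd_general(33.2, 21.6, 28.2, m_x=1, m_y=1)

print(f"m = 2: {two_replicates:.1f} l/min, m = 1: {one_replicate:.1f} l/min")
```

With two replicates the function reduces to the two-replicate formula, and with a single replicate per method the correction terms vanish, which is a quick sanity check on the coefficients.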


This page maintained by Martin Bland.

Last updated: 3 July, 2009.