Measurement in Health and Disease: Measurement error

Accuracy and precision
Sources of variation
The within-subject standard deviation, s_w
Analysis of variance
Reporting the measurement error
Assumptions in the calculation of the within-subject standard deviation
Data which go off the scale
Repeatability dependent on the magnitude of the variable
Correlation coefficients in the study of repeatability
The intra-class correlation coefficient (ICC)
Reference

Accuracy and precision

In this lecture we shall consider the problem of the precision and repeatability of measurements of numerical variables, such as blood pressure and forced expiratory volume (FEV). We shall look at how good a measurement is from the clinical point of view, for giving us information about the individual subject or patient. We also look at the repeatability of measurement methods from the point of view of the researcher, that is, how good a method is at telling us something about the population.

We shall have a lot to say about ‘error’, a word which comes from a Latin root meaning ‘to wander’. In statistics we use the term error to mean the variation of observations or estimates about some central value. If we make several measurements of FEV on a subject, they will not all be the same, because the subject cannot blow in exactly the same way each time. This variation is called error. It is not the same as a mistake, and does not imply any fault on the part of the observer. A measurement mistake might be if we transpose digits in recording the FEV, writing 9.4 litres instead of 4.9.

We will first distinguish precision and accuracy. A measurement is precise if repeated observations of the same quantity are close together. It is accurate if observations are close to the true value of the quantity. Thus a measurement can be precise without being accurate, but cannot be accurate without being precise. In this lecture I shall be concerned with precision.

Sources of variation

First we consider different sources of variation. Figure 1 shows three histograms of Peak Expiratory Flow Rate (PEFR) in male medical students.

Figure 1. Distribution of PEFR for 54 male medical students, with 20 repeated measurements for two students

The upper histogram shows a sample of single measurements of PEFR obtained from 54 different students, whereas the lower histograms each show 20 repeated measurements of PEFR on a single student (Table 1). The variability between students shown in the upper histogram is much greater than the variability within a single student shown in the lower histograms. There are two different kinds of variation here: variation within individuals because repeated measurements are not all the same, and variation between individuals because some people can blow harder than others.

We measure PEFR for several reasons: for example, to compare a patient’s PEFR to a reference interval for diagnostic purposes, to monitor changes in lung function over time, or to compare two groups of subjects as in a clinical trial or epidemiological study. In each case, we want to be sure that the variation between measurements, the within-subject variation, does not swamp the difference for which we are looking. Because PEFR is known to have high variation between measurements, it is customary to make several observations to achieve this, and use their mean or maximum. The latter is used because of the special nature of this measurement, the maximum rate of flow which the subject can achieve.

Table 1. Repeated PEFR (litre/min) measurements for two male medical students
Student A					Student B
685	695	660	660	690	530	535	530	535	525
690	665	665	685	680	530	520	530	525	520
675	660	660	670	690	525	535	520	535	535
685	645	660	690	680	530	525	530	540	530

If we suppose that a subject has a true PEFR, which would be the mean of all possible measurements, then the difference between an individual measurement and the true value is its error. Many factors could influence this error. We would expect that a series of PEFR measurements made on a subject by different observers at different times spread over six months would vary more than a series made over one morning by one observer. We might be interested in different types of variability for different purposes. Monitoring short-term changes in blood pressure in a single patient involves one type of error, interpreting a random blood pressure measurement in a screening clinic another. In the first case, we are detecting shifts in mean blood pressure over a short period of time; in the second, we are determining from one or two measurements whether the subject’s mean blood pressure is above some cut-off point such as 90 mm Hg diastolic.

We need to define what we mean by measurement error rather carefully. The British Standards Institution (1979) considered this question for laboratory measurements, and made the distinction between repeatability, incorporating variability between measurements made by the same operator in the same laboratory, and reproducibility, incorporating variability between measurements made by different operators working in different laboratories. The same considerations arise when we have complex measurements such as assays, where we might have the error estimated separately for different stages in the measurement, giving an intra-assay or within-assay error and an inter-assay or between-assay error. For the first we would take repeated readings from the same assay and estimate their error, and for the second we would take repeated assays on the same subject.

Sometimes we are able to separate the effects of the different sources of variation and sometimes not. In this lecture we describe techniques for estimating the variability between methods which work whether the measurements are all made by one observer on the same occasion, or made by different observers on different occasions, or made repeatedly by the subjects themselves. In the next lecture we discuss studies where the same group of observers is used to measure several subjects.

We first consider the problem of estimating the variation between repeated measurements for the same subject. Essentially, we want to know how far from the true value a single measurement is likely to be. This estimation will be simplest if we assume that the error is the same for everybody, irrespective of the value of the quantity being measured. This will not always be the case, and the error may depend on the magnitude of the quantity, for example being proportional to it.

The within-subject standard deviation, s_w

We start with the case where the measurement error is assumed to be the same for everyone. This is a simple model, and it may be that some subjects will show more individual variation than others. If the measurement error varies from subject to subject, independently of magnitude so that it cannot be predicted, then we have to estimate its average value. We estimate the within-subject variability as if it were the same for all subjects.

Consider the data of Table 1. Calculating the standard deviations in the usual way, we get s₁ = 14.3178 and s₂ = 5.6835 for the two students. We can get a combined estimate averaged over the two students. We actually average the variances, the squares of the standard deviations, weighting each by its degrees of freedom to allow for possibly different sample sizes. It is the same method as used in a two-sample t test. With 20 measurements on each student, we get s_w² = (19 × 14.3178² + 19 × 5.6835²)/(19 + 19) = 118.65, so that s_w = 10.89 litre/min.
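
As a check on this pooling step, the following Python sketch recomputes the two standard deviations from the Table 1 data and combines their variances, weighting each by its degrees of freedom. It is only an illustration under stated assumptions: the function name `pooled_within_sd` and the variable names are invented here and are not part of the lecture. Only the Python standard library is used.

```python
import math
import statistics

# Table 1: 20 repeated PEFR measurements (litre/min) for each student.
student_a = [685, 695, 660, 660, 690, 690, 665, 665, 685, 680,
             675, 660, 660, 670, 690, 685, 645, 660, 690, 680]
student_b = [530, 535, 530, 535, 525, 530, 520, 530, 525, 520,
             525, 535, 520, 535, 535, 530, 525, 530, 540, 530]


def pooled_within_sd(groups):
    """Pool the sample variances of several subjects, weighting each
    by its degrees of freedom (n - 1), as in a two-sample t test."""
    numerator = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return math.sqrt(numerator / denominator)


s1 = statistics.stdev(student_a)
s2 = statistics.stdev(student_b)
s_w = pooled_within_sd([student_a, student_b])

print(f"s1 = {s1:.4f}, s2 = {s2:.4f}")   # s1 = 14.3178, s2 = 5.6835
print(f"s_w = {s_w:.2f} litre/min")      # s_w = 10.89 litre/min
```

Because both students have the same number of measurements, the weighted average here reduces to the simple mean of the two variances. The weighting matters only when the numbers of measurements differ.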

Child	PEFR (litre/min)	mean	s.d.
1	190 220 200 200	202.50	12.58
2	220 200 240 230	222.50	17.08
3	240 230 215 210	223.75	13.77
4	260 260 240 280	260.00	16.33
5	210 300 280 265	263.75	38.60
6	260 260 280 270	267.50	9.57
7	270 265 280 270	271.25	6.29
8	275 270 275 275	273.75	2.50
9	280 280 270 275	276.25	4.79
10	260 280 280 300	280.00	16.33
11	245 290 290 295	280.00	23.45
12	275 275 275 305	282.50	15.00
13	280 290 300 290	290.00	8.16
14	320 290 300 290	300.00	14.14
15	300 300 310 300	302.50	5.00
16	270 250 330 370	305.00	55.08
17	300 310 310 305	306.25	4.79
18	300 300 340 315	313.75	18.87
19	315 325 330 295	316.25	15.48
20	320 330 330 330	327.50	5.00
21	335 320 335 375	341.25	23.58
22	350 320 340 365	343.75	18.87
23	360 320 350 345	343.75	17.02
24	330 340 380 390	360.00	29.44
25	335 385 360 370	362.50	21.02
26	400 400 420 395	403.75	11.09
27	400 420 425 420	416.25	11.09
28	430 460 480 470	460.00	21.60

Source	Sum of squares	Degrees of freedom	Mean square	F ratio	P
Subject	365604.24	27	13540.90	35.14	<0.0001
Residual	32368.75	84	385.34
Total	397972.99	111	3585.342
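
The analysis of variance table above appears to summarise the table of four PEFR readings on each of 28 children that precedes it. This is an assumption, based on the matching degrees of freedom (27 between subjects, 84 residual). Under that assumption, a minimal Python sketch of a one-way analysis of variance is given below. The residual mean square estimates the within-subject variance, and its square root is the within-subject standard deviation. The function name `one_way_anova` and the variable names are illustrative only.

```python
import math

# PEFR (litre/min), four readings per child, copied from the table above.
pefr = [
    [190, 220, 200, 200], [220, 200, 240, 230], [240, 230, 215, 210],
    [260, 260, 240, 280], [210, 300, 280, 265], [260, 260, 280, 270],
    [270, 265, 280, 270], [275, 270, 275, 275], [280, 280, 270, 275],
    [260, 280, 280, 300], [245, 290, 290, 295], [275, 275, 275, 305],
    [280, 290, 300, 290], [320, 290, 300, 290], [300, 300, 310, 300],
    [270, 250, 330, 370], [300, 310, 310, 305], [300, 300, 340, 315],
    [315, 325, 330, 295], [320, 330, 330, 330], [335, 320, 335, 375],
    [350, 320, 340, 365], [360, 320, 350, 345], [330, 340, 380, 390],
    [335, 385, 360, 370], [400, 400, 420, 395], [400, 420, 425, 420],
    [430, 460, 480, 470],
]


def one_way_anova(groups):
    """Return (between SS, between df, residual SS, residual df)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    between_ss = 0.0
    residual_ss = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        # Variation of the subject means about the grand mean.
        between_ss += len(g) * (mean - grand_mean) ** 2
        # Variation of the readings about their own subject mean.
        residual_ss += sum((x - mean) ** 2 for x in g)
    return between_ss, len(groups) - 1, residual_ss, n_total - len(groups)


between_ss, between_df, residual_ss, residual_df = one_way_anova(pefr)
residual_ms = residual_ss / residual_df   # within-subject variance
s_w = math.sqrt(residual_ms)              # within-subject standard deviation
f_ratio = (between_ss / between_df) / residual_ms

print(f"Subject SS = {between_ss:.2f} on {between_df} df")     # 365604.24 on 27 df
print(f"Residual SS = {residual_ss:.2f} on {residual_df} df")  # 32368.75 on 84 df
print(f"Residual mean square = {residual_ms:.2f}")             # 385.34
print(f"s_w = {s_w:.2f} litre/min")                            # 19.63
print(f"F ratio = {f_ratio:.2f}")                              # 35.14
```

The residual sum of squares is the sum of each child's squared deviations about that child's own mean, so the residual mean square is the average of the 28 individual variances obtained from the s.d. column.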

1st 2nd	1st 2nd	1st 2nd	1st 2nd	1st 2nd
0.92 0.94	1.37 1.39	1.49 1.51	1.60 1.63	1.75 1.87
1.04 1.72	1.37 1.52	1.49 1.60	1.60 1.66	1.76 1.62
1.05 1.18	1.38 1.16	1.50 1.45	1.60 1.68	1.76 1.82
1.08 1.28	1.38 1.29	1.50 1.47	1.60 1.75	1.77 1.78
1.10 1.11	1.38 1.37	1.50 1.58	1.61 1.44	1.77 1.85
1.17 1.24	1.38 1.39	1.51 1.51	1.61 1.53	1.78 1.72
1.19 1.25	1.38 1.40	1.51 1.54	1.61 1.55	1.78 1.76
1.19 1.26	1.38 1.43	1.51 1.73	1.61 1.61	1.80 1.72
1.19 1.37	1.39 1.44	1.52 1.53	1.61 1.61	1.80 1.76
1.20 1.24	1.40 1.38	1.53 1.46	1.62 1.57	1.80 1.79
1.21 1.19	1.40 1.42	1.53 1.48	1.62 1.68	1.80 1.82
1.22 1.26	1.40 1.57	1.53 1.48	1.63 1.70	1.80 1.82
1.22 1.38	1.42 1.45	1.53 1.51	1.64 1.61	1.82 1.88
1.23 1.28	1.42 1.46	1.53 1.56	1.64 1.72	1.85 1.73
1.23 1.54	1.42 1.83	1.53 2.01	1.65 1.43	1.85 1.81
1.27 1.31	1.43 1.38	1.54 1.56	1.65 1.60	1.85 1.89
1.28 1.27	1.43 1.38	1.54 1.57	1.65 2.05	1.86 1.90
1.28 1.29	1.43 1.41	1.55 0.69	1.66 1.64	1.87 1.88
1.28 1.38	1.43 1.51	1.55 1.56	1.67 1.50	1.88 1.82
1.29 1.23	1.43 1.54	1.55 1.60	1.67 1.63	1.89 1.90
1.29 1.28	1.43 1.65	1.56 1.60	1.69 1.67	1.89 2.00
1.32 1.37	1.45 1.29	1.57 1.57	1.69 1.69	1.92 2.00
1.33 1.32	1.45 1.42	1.57 1.60	1.69 1.79	1.92 2.10
1.33 1.35	1.45 1.48	1.58 1.36	1.70 1.82	1.94 1.43
1.33 1.42	1.46 1.47	1.58 1.49	1.72 1.69	1.94 2.10
1.34 1.39	1.46 1.49	1.58 1.60	1.72 1.73	1.95 2.27
1.34 1.44	1.47 1.19	1.58 1.60	1.72 1.74	1.97 2.03
1.35 1.40	1.47 1.44	1.58 1.65	1.73 1.73	2.10 2.20
1.35 1.40	1.47 1.53	1.58 1.67	1.74 1.71	2.10 2.21
1.35 1.40	1.47 1.65	1.59 1.41	1.74 1.79	2.11 2.13
1.35 1.59	1.48 1.35	1.59 1.60	1.74 1.80	2.15 2.07
1.36 1.25	1.48 1.48	1.59 1.71	1.75 1.61	2.21 2.02
1.36 1.32	1.49 1.47	1.60 1.58	1.75 1.84

1st 2nd	1st 2nd	1st 2nd	1st 2nd	1st 2nd
ND ND	0.2 0.6	0.4 0.3	0.9 0.2	2.7 2.4
ND ND	0.3 ND	0.4 0.4	0.9 0.3	2.7 4.0
ND ND	0.3 ND	0.4 0.4	0.9 0.7	2.8 2.2
ND ND	0.3 ND	0.4 0.4	0.9 0.7	2.8 3.9
ND 0.1	0.3 ND	0.4 1.1	0.9 3.3	2.8 6.8
ND 0.1	0.3 ND	0.4 1.4	1.0 0.2	3.1 1.6
ND 0.1	0.3 ND	0.5 0.1	1.0 1.6	3.2 2.9
ND 0.2	0.3 0.1	0.5 0.3	1.1 0.4	3.2 3.0
ND 0.2	0.3 0.1	0.5 0.3	1.1 0.9	3.2 4.5
ND 0.2	0.3 0.2	0.5 0.3	1.2 0.8	3.5 3.4
ND 0.6	0.3 0.2	0.5 0.4	1.2 0.9	3.5 4.9
0.1 ND	0.3 0.3	0.5 1.0	1.2 1.5	3.6 0.2
0.1 0.1	0.3 0.3	0.6 ND	1.2 1.8	3.7 2.6
0.1 0.1	0.3 0.3	0.6 0.3	1.3 0.3	3.8 3.6
0.1 0.2	0.3 0.4	0.6 0.5	1.4 0.7	3.9 5.5
0.1 0.2	0.3 0.4	0.6 0.6	1.5 0.6	4.0 3.1
0.1 0.4	0.3 0.4	0.6 0.8	1.6 0.8	4.1 3.4
0.1 0.5	0.3 0.4	0.6 0.8	1.6 1.3	4.1 3.7
0.2 ND	0.3 0.5	0.6 1.0	1.7 4.7	4.1 5.0
0.2 ND	0.3 0.6	0.7 0.1	1.8 0.9	4.4 1.7
0.2 ND	0.4 ND	0.7 0.2	1.8 1.9	4.7 4.5
0.2 0.1	0.4 ND	0.7 0.3	1.8 2.1	4.8 4.3
0.2 0.1	0.4 0.1	0.7 0.3	1.8 2.3	4.9 1.4
0.2 0.1	0.4 0.1	0.7 0.8	1.9 1.2	4.9 3.9
0.2 0.1	0.4 0.1	0.7 0.9	1.9 1.5	6.5 5.4
0.2 0.1	0.4 0.1	0.7 1.4	1.9 2.8	7.0 4.0
0.2 0.2	0.4 0.2	0.8 0.4	2.0 1.4	7.6 4.7
0.2 0.2	0.4 0.2	0.8 0.5	2.0 3.1	7.8 3.6
0.2 0.3	0.4 0.3	0.8 0.8	2.0 3.4	9.3 5.4
0.2 0.3	0.4 0.3	0.8 0.9	2.1 2.9	9.9 7.2
0.2 0.3	0.4 0.3	0.8 1.8	2.3 4.1
0.2 0.5	0.4 0.3	0.9 0.2	2.7 1.4
