The question came from Paul Wicks.

This is the scenario:

We are trying to describe a population in terms of a marker in the blood. We
have 24 blood samples. The 24 measurements have standard deviation SD1.

1) In the first case we work out a mean and a standard error of the mean. The
standard error is SD1/sqrt(24). Let us call it SE1.

2) In the second case we mix up the blood samples into 3 'pooled' samples with
8 samples in each. The mean of the pooled samples is the same as the mean of
the unpooled samples. The 3 pooled samples have standard deviation SD2. The
standard error of our three samples is SD2/sqrt(3). Let us call it SE2.

Now I would think that SE1=SE2. I am wrong: in our experiment, SE2 is
approximately SE1*sqrt(8), which implies SD1=SD2. I had theorized that
SD2=SD1/sqrt(8), as I felt that the pooled samples would show less variation.

Where have I gone wrong? Why is it SD1=SD2 and not SE1=SE2?

## My answer

The problem is that in measurement there are two sources of variation:
variation between subjects (the "true" variation) and variation within the
subject (the measurement error). Pooling samples reduces the first component
but not the second.

Let us call the between-subjects standard deviation $s_b$ and
the within-subject standard deviation $s_w$.

First we measure 24 subjects. The variance is

$$\mathrm{SD1}^2 = s_b^2 + s_w^2$$

and the standard error of the mean is

$$\mathrm{SE1} = \sqrt{\frac{s_b^2 + s_w^2}{24}} = \sqrt{\frac{s_b^2}{24} + \frac{s_w^2}{24}}.$$

Now we pool 8 subjects' blood, presumably chosen at random. The variance of
measurements of such pools is

$$\mathrm{SD2}^2 = \frac{s_b^2}{8} + s_w^2.$$

The between-pools component is smaller than the between-subjects component for
single samples, because the "true" value for a pool is the average of the
"true" values for its 8 subjects. The measurement error is the same as for a
single sample, because it comes from the measurement process, not the
subjects. There are 3 such pools, so the standard error of the mean is

$$\mathrm{SE2} = \sqrt{\frac{s_b^2/8 + s_w^2}{3}} = \sqrt{\frac{s_b^2}{24} + \frac{s_w^2}{3}}.$$
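The two variance formulas can be checked by simulation. What follows is only a minimal sketch, not part of the original analysis. It assumes Normally distributed "true" values and measurement errors, equal contributions from each subject to a pool, and a single measurement per pool. The function name `simulate_variances` and the values 2 and 1 for the between-subjects and within-subject standard deviations are illustrative choices, not taken from Paul's data.

```python
import random
import statistics


def simulate_variances(s_b, s_w, pool_size=8, n_pools=20000, seed=1):
    """Simulate single and pooled measurements; return their variances.

    Assumptions (not from the original question): Normal distributions,
    equal-volume pooling, one measurement per pool.
    """
    rng = random.Random(seed)
    single = []
    pooled = []
    for _ in range(n_pools):
        # "True" marker values for the subjects making up one pool.
        true_values = [rng.gauss(0.0, s_b) for _ in range(pool_size)]
        # Each subject measured individually: true value + measurement error.
        single.extend(t + rng.gauss(0.0, s_w) for t in true_values)
        # The pool's true value is the average of its subjects' true values;
        # the pool is measured once, with the same measurement error.
        pooled.append(statistics.fmean(true_values) + rng.gauss(0.0, s_w))
    return statistics.variance(single), statistics.variance(pooled)


var_single, var_pooled = simulate_variances(s_b=2.0, s_w=1.0)
print("variance of single measurements:", round(var_single, 2))  # theory: 4 + 1
print("variance of pooled measurements:", round(var_pooled, 2))  # theory: 4/8 + 1
```

With these illustrative values the simulated variances should come out close to 5 and 1.5, which are the values given by the formulas for SD1 squared and SD2 squared.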

So SD1 should be greater than SD2, and SE1 should be less than SE2. How much
greater or less depends on the relative sizes of $s_b$ and $s_w$.
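To see how the size of the difference depends on the two components, the four formulas can be evaluated directly. This is a small sketch. The function name `pooling_summary` and the pairs of standard deviations tried are hypothetical, chosen only to show a large, a small, and a zero between-subjects component.

```python
import math


def pooling_summary(s_b, s_w, n_subjects=24, pool_size=8):
    """Return SD1, SE1, SD2 and SE2 from the formulas in the text."""
    n_pools = n_subjects // pool_size
    sd1 = math.sqrt(s_b**2 + s_w**2)              # single measurements
    sd2 = math.sqrt(s_b**2 / pool_size + s_w**2)  # pooled measurements
    se1 = sd1 / math.sqrt(n_subjects)
    se2 = sd2 / math.sqrt(n_pools)
    return {"SD1": sd1, "SE1": se1, "SD2": sd2, "SE2": se2}


# Illustrative values only: (between-subjects SD, within-subject SD).
for s_b, s_w in [(10.0, 1.0), (1.0, 10.0), (0.0, 1.0)]:
    r = pooling_summary(s_b, s_w)
    print(
        f"s_b={s_b:5.1f} s_w={s_w:5.1f}  "
        f"SD1/SD2={r['SD1'] / r['SD2']:.2f}  SE2/SE1={r['SE2'] / r['SE1']:.2f}"
    )
```

The ratio SD1/SD2 can range from 1 up to the square root of 8, and so can the ratio SE2/SE1. As one ratio approaches the square root of 8, the other approaches 1.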

If $s_b$ is much greater than $s_w$, the standard
errors will be similar.

If $s_b$ is much less than $s_w$, the standard
deviations will be similar.

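If $s_b = 0$, then SD1 = SD2 and $\mathrm{SE1} = \mathrm{SE2}/\sqrt{8}$, which
is approximately what Paul found. I would conclude that the measurement error
is large compared with the variation between subjects.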
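The two variance formulas can also be turned round to give rough estimates of the two components from an observed SD1 and SD2. Subtracting the formula for SD2 squared from that for SD1 squared leaves 7/8 of the between-subjects variance. The sketch below is an assumption-laden illustration rather than a recommended analysis: the function name `variance_components` is hypothetical, negative estimates are simply truncated at zero, and with only 3 pools SD2 is estimated very imprecisely, so the results would be a rough guide at best.

```python
import math


def variance_components(sd1, sd2, pool_size=8):
    """Rough estimates of (s_b, s_w) from observed SD1 and SD2.

    Uses SD1^2 - SD2^2 = s_b^2 * (1 - 1/pool_size), from the two
    variance formulas in the text.
    """
    var_total = sd1**2
    var_b = (sd1**2 - sd2**2) * pool_size / (pool_size - 1)
    # Sampling variation can push the estimate outside its possible range,
    # so keep it between zero and the total variance.
    var_b = min(max(var_b, 0.0), var_total)
    var_w = var_total - var_b
    return math.sqrt(var_b), math.sqrt(var_w)


# Illustrative check: SD1^2 = 5 and SD2^2 = 1.5 correspond to s_b = 2, s_w = 1.
print(variance_components(math.sqrt(5.0), math.sqrt(1.5)))
# When SD1 = SD2, all the variation is attributed to measurement error.
print(variance_components(1.0, 1.0))
```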

This page maintained by Martin Bland.

Last updated: 13 January, 2004.
