2. How could they have done a potentially valid two sample test?

Suggested answer

They could average all the measurements on a single mouse to get one observation for that mouse. They would then have two groups, each of three observations, and could do a two sample t test, provided the assumptions of Normal distribution and uniform variance were satisfied. (I think that they would be, because this is a simple anatomical dimension and such things usually follow an approximately Normal distribution.)

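All the data would still be used, because every measurement contributes to the average for its mouse, and that average would be a very accurate figure for that mouse. We are not throwing information away. The important variability, however, is going to be between mice, not within a single liver, so it is the mouse, not the individual measurement, that should count as one observation.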
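
The approach described above can be sketched in a few lines of Python. This is only an illustration, not the analysis from the study: the numbers are made up, the group names `control` and `treated` are hypothetical, and the helper functions `mouse_means` and `pooled_t` are names introduced here. The sketch averages the measurements within each mouse, then computes the usual two sample t statistic with a pooled variance from the three means in each group.

```python
from math import sqrt
from statistics import mean, variance


def mouse_means(measurements_by_mouse):
    """Collapse the repeated measurements to one mean per mouse."""
    return [mean(m) for m in measurements_by_mouse]


def pooled_t(group_a, group_b):
    """Two sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (the uniform variance assumption).
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se, df


# Made-up numbers for illustration only: three mice per group,
# each mouse with several measurements.
control = [[10.1, 9.8, 10.3, 10.0], [11.2, 11.0, 11.4], [9.5, 9.7, 9.6, 9.4, 9.8]]
treated = [[12.0, 12.3, 11.9], [13.1, 12.8, 13.0, 13.3], [11.5, 11.7, 11.6]]

# One observation per mouse, so each group contributes three observations.
t_stat, df = pooled_t(mouse_means(control), mouse_means(treated))
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

With three mice in each group the test has only 3 + 3 - 2 = 4 degrees of freedom, however many measurements were made on each mouse. The t statistic would be compared with the t distribution on 4 degrees of freedom, for which the two sided 5% point is 2.776.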

