This website is for students following the M.Sc. programme in the Department of Health Sciences at the University of York.
This page contains questions from students and my answers. If you email me any question about the course, I will add it to this page, anonymously, with my answer. In this way anyone else who has the same question will see the answer. I suggest that you check this page at least once a week.
Questions are posted in chronological order, with the most recent question first.
Am I correct in thinking that the table in Paper 3 is comparing each specialist with the clinical record? Also, can we ignore the Wilcoxon numbers?
That is my interpretation.
The Wilcoxon tests are testing the null hypothesis that the carers do not tend to give higher or lower scores than the clinical record. I won't be asking you about things which are not in the course.
Looking at the plots in Paper 1, we can see that for both 2D and 3D the amount of variation increases as the number of follicles increases. Is that right?
I agree that the amount of variation increases as the number of follicles increases.
I think the mammography exercise has helped me clarify my point for Paper 1. The authors should stick to reporting and comparing ICCs if they are interested in interobserver reliability, right? By adding in the plots they have confused matters, because the plots are designed for comparing methods, which is not what the authors use them for. So we can directly compare the ICCs, which support the use of 3D even though it takes longer?
You are correct about the inter-observer reliability. They can use the plots as they have, treating each observer as if it were a different method. They can learn something about the nature of the variability and error from them. Of course, they are looking at the agreement between the observers, not the methods of measurement.
I know that the two plots in Paper 1 are not comparing 2D to 3D, but I thought the point of using a Bland-Altman plot is to compare two different methods. So by having the data plotted separately they are not actually comparing the 2D to 3D directly, but they are comparing interobserver reliability for 2D and interobserver reliability for 3D and then making indirect comparisons between the two methods. So is their conclusion that 3D has better reliability actually valid since they didn't graph the two methods against each other (only independently with comparisons)?
I think their conclusion that 3D has better reliability is correct, because they show that the observers are closer together using the three-dimensional method than they are using the two-dimensional method.
You are correct, of course, that they could and should compare the two methods directly, as they are supposed to be measuring the same thing.
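If you would like to see how such a plot is put together, here is a minimal sketch in Python. The counts are made up by me purely for illustration and are not from the paper; only the construction matters.

    import numpy as np
    import matplotlib.pyplot as plt

    # Made-up counts by two observers on the same subjects (illustration only)
    obs1 = np.array([4, 7, 10, 12, 15, 18, 22])
    obs2 = np.array([5, 6, 11, 14, 14, 20, 25])

    mean = (obs1 + obs2) / 2    # average of the two observers
    diff = obs1 - obs2          # difference between the two observers
    bias = diff.mean()          # mean difference (systematic bias)
    sd = diff.std(ddof=1)       # standard deviation of the differences

    plt.scatter(mean, diff)
    plt.axhline(bias)                              # mean difference
    plt.axhline(bias + 1.96 * sd, linestyle="--")  # upper 95% limit of agreement
    plt.axhline(bias - 1.96 * sd, linestyle="--")  # lower 95% limit of agreement
    plt.xlabel("Mean of the two observers")
    plt.ylabel("Difference between observers")
    plt.show()

Exactly the same construction serves whether the pair plotted are two methods or, as in Figure 2, two observers using the same method.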
I don't know if you can answer this question, so just let me know if it would be giving something away. In the first paper by Jayaprakasan, the Bland-Altman plot is actually given as two plots. If I am reading this right, the authors plot the mean of 2D counts against the difference in both the 2D and 3D counts, then do the same in a separate plot for the 3D counts. Should there be only one plot of the combined mean for both tests versus the difference in means for both tests?
I am happy to answer any questions about the papers.
You are not reading Figure 2 correctly. Part (a) shows a plot for the agreement between the two observers using the two-dimensional counts. Part (b) shows a plot for the agreement between the two observers using the three-dimensional counts. The plots do not show the agreement between the two-dimensional counts and the three-dimensional counts.
From Week 1, where you talk about the coefficient of variation, I understand that it is the ratio sw/mean. But I want to make sure I understand the interpretation of that. It says that error is proportional to the magnitude of measurement, which I take to mean that as the thing we are measuring gets bigger the likelihood that there is error increases. Then when you write this out it should read that sw is X% of the true value? Lower percentage is better because you are closer to the true value?
It means that as the thing we are measuring gets bigger, the errors tend to get bigger. I don't think we should talk about the likelihood of error, because there is always error. It is not a question of whether there is a measurement error, but how big that measurement error might be. Otherwise, you are correct.
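To spell the arithmetic out, here is a minimal sketch in Python, with made-up numbers of my own rather than figures from the course:

    # Made-up numbers, purely for illustration
    sw = 2.5        # within-subject standard deviation
    mean = 12.5     # mean of the measurements across subjects
    cv = sw / mean  # coefficient of variation
    print(f"sw is {cv:.0%} of the mean")  # prints "sw is 20% of the mean"

As you say, a lower percentage is better, because the measurement error is then smaller relative to the size of the thing being measured.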
In Week 7, I don't understand why a value of Cronbach's alpha between 0.7 and 0.8 is acceptable for research and not for clinical application.
For clinical application, an alpha value of 0.9 to 0.95 would still mean that the questions are very similar or overlap. So why would you want to ask patients repetitious questions?
The argument is that when we want to compare two groups, a fairly large measurement error is acceptable because if we take enough subjects we can still demonstrate differences. This is not the case if we are going to take a clinical action on an individual, such as give them a drug or cut something off, as the consequences of a mistake for the individual may be severe. We need something more reliable.
I think you ask highly related questions because you want to be sure. If you look at the GHQ depression scale, some of the items are quite similar.
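For anyone curious about the calculation: Cronbach's alpha is k/(k - 1) times (1 minus the sum of the variances of the individual questions divided by the variance of the total score), where k is the number of questions. A minimal sketch in Python, with made-up answers of my own:

    import numpy as np

    def cronbach_alpha(items):
        # items: one row per subject, one column per question
        items = np.asarray(items, dtype=float)
        k = items.shape[1]                         # number of questions
        item_vars = items.var(axis=0, ddof=1)      # variance of each question
        total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
        return k / (k - 1) * (1 - item_vars.sum() / total_var)

    # Made-up answers from 5 subjects to 4 similar questions (illustration only)
    scores = [[3, 4, 3, 4],
              [2, 2, 3, 2],
              [4, 4, 4, 5],
              [1, 2, 1, 2],
              [3, 3, 4, 3]]
    print(cronbach_alpha(scores))  # about 0.94: the questions hang together closely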
At the end of the lecture in Week 3, you talk about comparing repeatability coefficients. Is this the same as a correlation coefficient?
No, the repeatability coefficient (Week 1) is the upper 95% limit for differences between a pair of measurements on the same person. The repeatability coefficient is 2 root 2 times the within-subject standard deviation, sw, or 2 times the standard deviation of differences between pairs of observations on the same subject. It is "reliability" which is used as a synonym for the correlation coefficient between pairs of observations on the same person.
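The two versions of the formula are the same thing, because the standard deviation of differences between pairs is root 2 times sw. A minimal sketch in Python, with made-up measurements of my own:

    import numpy as np

    # Made-up pairs of repeated measurements on the same subjects
    first = np.array([500.0, 420.0, 380.0, 610.0, 550.0])
    second = np.array([490.0, 435.0, 370.0, 600.0, 565.0])

    diff = first - second
    sd_diff = diff.std(ddof=1)   # standard deviation of the differences
    sw = sd_diff / np.sqrt(2)    # within-subject SD, since sd_diff = root 2 times sw
    print(2 * sd_diff)           # repeatability coefficient
    print(2 * np.sqrt(2) * sw)   # the same number, via 2 root 2 times sw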
I thought that correlation was not the way to compare agreement. Is that right?
Not between two different methods of measurement, because it ignores the systematic bias which might be present.
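A toy illustration of why, with made-up numbers of my own: if one method always reads exactly 10 units higher than the other, the correlation is a perfect 1, yet the two methods never give the same answer.

    import numpy as np

    method_a = np.array([100.0, 120.0, 140.0, 160.0, 180.0])
    method_b = method_a + 10.0   # constant systematic bias of 10 units

    print(np.corrcoef(method_a, method_b)[0, 1])  # 1.0: perfect correlation
    print((method_b - method_a).mean())           # 10.0: constant disagreement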
In the last paragraph of the notes for Week 3, when you compare the limits of agreement and repeatability what are you actually trying to say?
The limits of agreement are the 95% limits for differences between two different measurement methods. The repeatability is the 95% limit for differences between two measurements by the same method. Hence they can be compared and can tell us whether a measurement agrees with another method as well as it agrees with itself.
I understand that from the repeatability coefficients you can determine whether the variation in measurements is due to error or to actual variation in your population, which you say for the example is some measurement error and something else. I don't understand what that has to do with the limits of agreement.
We can determine how much of our variation might be due to measurement error, which is not quite the same thing.
In the lecture I say:
"For the mini meter, the coefficient of repeatability is twice the standard deviation of the differences, or 56.4 l/min. For the large meter the coefficient is 43.2 l/min.
"Compare these repeatability coefficients to the limits of agreement, –80 l/min to +76 l/min. We estimate that the mini meter will be within 56 l/min of another measurement by itself, but only within 80 l/min of a measurement by the Wright peak flow meter. We can conclude that not all the variation between the two instruments is because of their measurement error, but there is some other source of variation."
What this means is that if we measure PEFR twice using a mini meter, repeated measurements might be up to 56 l/min apart. If we measure PEFR using a mini meter and a large meter, the two measurements might be up to 80 l/min apart. If only measurement error were causing the differences between the two meters, we would expect these two numbers to be about the same. They are not, so there is some other cause of the differences between the two meters.
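To make the arithmetic explicit, here is a small sketch in Python. The standard deviations are back-calculated by me from the coefficients quoted above, so treat this as illustration rather than as the original analysis:

    # Back-calculated from the figures quoted in the lecture
    sd_diff_mini = 56.4 / 2    # = 28.2 l/min, SD of differences, mini meter with itself
    sd_diff_large = 43.2 / 2   # = 21.6 l/min, SD of differences, large meter with itself

    # The limits of agreement between the two meters were -80 to +76 l/min
    bias = (76 - 80) / 2       # = -2 l/min, mean difference between the meters
    sd_between = (76 + 80) / 4 # = 39 l/min, SD of the between-meter differences
    print(sd_between, sd_diff_mini)  # 39 is well above 28.2: variation beyond error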
If I am away the week before the exam, how can I get the papers which will be used?
The papers will be given out one week before the exam. They will also be available on my website from that day and I will email them to anyone who has not signed the register.