Exercise: observer agreement about sex

This website is for students following the M.Sc. in Evidence Based Practice at the University of York.

183 students were each observed twice, by two different student observers. The observers measured height (mm), arm circumference (mm), head circumference, and pulse (beats/min), and recorded sex and eye colour. They entered the data into a computer file, with eye colour and sex entered as numerical codes.

The following table shows sex as recorded by two observers:

Sex recorded by first observer	Sex recorded by second observer		Total
	female	male	
female	118	1	119
male	1	63	64
Total	119	64	183

This is the output from SPSS 16, where kappa is a statistic available from crosstabs:

Symmetric measures
		Value	Asymp. Std. Error^a	Approx. T^b	Approx. Sig.
Measure of agreement	Kappa	0.976	0.017	13.203	.000
N of Valid Cases		183
a: Not assuming the null hypothesis.
b: Using the asymptotic standard error assuming the null hypothesis.

This is the output from a Stata command for Cohen's kappa:

. kap sex1 sex2

             Expected
Agreement   Agreement     Kappa   Std. Err.         Z      Prob>Z
-----------------------------------------------------------------
  98.91%      54.52%     0.9760     0.0739      13.20      0.0000
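
As a rough check on where these figures come from, the following Python sketch recomputes the first three columns directly from the table of counts. It is not part of the original exercise: the variable names are illustrative, and it assumes the usual definitions (observed agreement is the proportion of students on the diagonal, expected agreement is the sum of the products of the marginal proportions).

```python
# Counts from the table above.
# Rows: sex recorded by first observer; columns: second observer.
# Category order in both directions: female, male.
table = [
    [118, 1],
    [1, 63],
]

n = sum(sum(row) for row in table)               # 183 students
row_totals = [sum(row) for row in table]         # first observer's totals
col_totals = [sum(col) for col in zip(*table)]   # second observer's totals

# Observed agreement: proportion of students given the same sex twice.
observed = sum(table[i][i] for i in range(len(table))) / n

# Expected agreement: what the marginal totals alone would produce
# if the two observers' recordings were unrelated.
expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2

# Kappa: agreement beyond chance, as a fraction of the maximum possible.
kappa = (observed - expected) / (1 - expected)

print(f"Agreement          {observed:.2%}")
print(f"Expected agreement {expected:.2%}")
print(f"Kappa              {kappa:.4f}")
```

Under these assumptions the printed values should match the Stata line above to the precision shown.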

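Note that SPSS calculates its T statistic using the standard error that assumes the null hypothesis, which is the one Stata shows (0.0739), not the standard error that SPSS prints (0.017): 0.976/0.0739 is approximately 13.2.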
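
The point about the two standard errors can be checked numerically. The sketch below is not from the original exercise. It assumes that the standard error under the null hypothesis is given by the Fleiss, Cohen and Everitt large-sample formula, which reproduces the Stata figure for these data, and it takes SPSS's printed standard error (0.017) as given rather than recomputing it. Variable names are illustrative.

```python
import math

# Counts from the table: rows = first observer, columns = second observer.
table = [[118, 1], [1, 63]]
n = sum(sum(row) for row in table)

# Marginal proportions for each observer (female, male).
p_row = [sum(row) / n for row in table]
p_col = [sum(col) / n for col in zip(*table)]

observed = sum(table[i][i] for i in range(len(table))) / n
expected = sum(r * c for r, c in zip(p_row, p_col))
kappa = (observed - expected) / (1 - expected)

# Standard error of kappa assuming the null hypothesis
# (assumption: Fleiss, Cohen and Everitt formula).
correction = sum(r * c * (r + c) for r, c in zip(p_row, p_col))
se_null = math.sqrt(expected + expected**2 - correction) / (
    (1 - expected) * math.sqrt(n)
)

# Test statistic using the null-hypothesis standard error.
z_null = kappa / se_null

# For comparison: kappa divided by the standard error SPSS prints.
ratio_using_printed_se = kappa / 0.017

print(f"SE assuming null hypothesis: {se_null:.4f}")
print(f"kappa / SE (null):           {z_null:.2f}")
print(f"kappa / 0.017:               {ratio_using_printed_se:.2f}")
```

Only the first ratio comes out near 13.2, in line with the note above; dividing kappa by the printed 0.017 gives a much larger number.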

Question 1:

What is meant by “Agreement” and “Expected agreement”?

Check suggested answer 1.

Question 2:

What does kappa mean and what can we conclude?

Check suggested answer 2.

Question 3:

What is “Z” in the Stata output and “T” in the SPSS output?

Check suggested answer 3.

Question 4:

“Prob>Z” in Stata and “Approx. Sig.” in SPSS are both the P value. What is it testing?