The Bayesian approach to computer-aided diagnosis

This is a section from Martin Bland’s textbook An Introduction to Medical Statistics, Fourth Edition. I hope that the topic will be useful in its own right, as well as giving a flavour of the book. Section references are to the book.

22.3 An example: the Bayesian approach to computer-aided diagnosis

Bayes’ theorem may be stated in terms of the probability of Diagnosis A having observed Data B, as:

PROB(Diagnosis A | Data B) is proportional to PROB(Data B | Diagnosis A) × PROB(Diagnosis A)

If we have a large dataset of known diagnoses and their associated symptoms and signs, we can estimate PROB(Diagnosis A) easily. It is simply the proportion of times A has been diagnosed. For a patient, the data B are the particular combination of signs and symptoms with which the patient presents. The problem of finding the probability of a particular combination of symptoms and signs for each diagnosis is more difficult. We can say that the probability of a given symptom for a given diagnosis is the proportion of times the symptom occurs in patients with that diagnosis. If the symptoms are all independent, the probability of any combination of symptoms can then be found by multiplying their individual probabilities together (Section 6.2). In practice, the assumption that signs and symptoms are independent is most unlikely to be met, and a more complicated analysis would be required to deal with this. However, some systems of computer-aided diagnosis have been found to work quite well with the simple approach.
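
The estimation steps in this paragraph can be sketched in Python. This is only an illustration under the independence assumption, not the method of any particular diagnostic system: the records, the diagnosis and symptom names, and the function names are all invented for the example. It also assumes that each symptom is simply recorded as present or absent, so that an absent symptom contributes one minus its probability.

```python
from collections import Counter

# Hypothetical records, invented for illustration:
# each is (diagnosis, set of symptoms present in that patient).
records = [
    ("appendicitis", {"pain", "fever"}),
    ("appendicitis", {"pain"}),
    ("gastroenteritis", {"fever", "vomiting"}),
    ("gastroenteritis", {"vomiting"}),
    ("gastroenteritis", {"pain", "vomiting"}),
]
SYMPTOMS = ["pain", "fever", "vomiting"]


def estimate_priors(records):
    """PROB(Diagnosis): the proportion of records with each diagnosis."""
    counts = Counter(diagnosis for diagnosis, _ in records)
    return {d: c / len(records) for d, c in counts.items()}


def estimate_symptom_probs(records, symptoms):
    """PROB(symptom | Diagnosis): the proportion of patients with the
    diagnosis in whom the symptom occurs."""
    probs = {}
    for d in {diagnosis for diagnosis, _ in records}:
        cases = [present for diagnosis, present in records if diagnosis == d]
        probs[d] = {
            s: sum(s in present for present in cases) / len(cases)
            for s in symptoms
        }
    return probs


def combination_probability(symptom_probs, present):
    """PROB(Data | Diagnosis), assuming the symptoms are independent:
    multiply p for each symptom present and (1 - p) for each one absent."""
    result = 1.0
    for symptom, p in symptom_probs.items():
        result *= p if symptom in present else 1.0 - p
    return result


priors = estimate_priors(records)
symptom_probs = estimate_symptom_probs(records, SYMPTOMS)
data = {"pain", "fever"}  # a new patient with pain and fever, no vomiting
for d in priors:
    print(d, priors[d], combination_probability(symptom_probs[d], data))
```

With so few records some estimated proportions are exactly 0 or 1, which forces the probability of some combinations to zero; this is one reason a large dataset is needed.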

We thus have the probability of each diagnosis and the probability of each combination of symptoms and signs for each diagnosis. When a new patient presents, we obtain the data and compute

PROB(Data|Diagnosis) × PROB(Diagnosis)

for each diagnosis and sum these products over all the diagnoses. We then divide the product for each diagnosis by this sum, which gives us the probability of each diagnosis given the signs and symptoms.
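
As a rough sketch of this calculation, the Python below multiplies PROB(Data|Diagnosis) by PROB(Diagnosis) for each diagnosis and divides by the sum of the products. The three diagnoses and all the numbers are made up for illustration, and the function name is an assumption of the example rather than part of any existing system.

```python
def posterior_probabilities(priors, likelihoods):
    """Return PROB(Diagnosis | Data) for every diagnosis.

    priors:      PROB(Diagnosis) for each diagnosis
    likelihoods: PROB(Data | Diagnosis) for the observed data
    """
    # PROB(Data | Diagnosis) x PROB(Diagnosis) for each diagnosis.
    products = {d: likelihoods[d] * priors[d] for d in priors}
    total = sum(products.values())
    if total == 0:
        raise ValueError("the observed data have zero probability under every diagnosis")
    # Dividing by the sum makes the probabilities add up to 1.
    return {d: product / total for d, product in products.items()}


# Made-up numbers for three hypothetical diagnoses.
priors = {"A": 0.7, "B": 0.2, "C": 0.1}
likelihoods = {"A": 0.05, "B": 0.30, "C": 0.50}

posteriors = posterior_probabilities(priors, likelihoods)
for diagnosis, p in posteriors.items():
    print(diagnosis, round(p, 3))
# A 0.241
# B 0.414
# C 0.345
```

In this invented example Diagnosis A is the most common before any data are seen, but the observed data are much more probable under B and C, so B ends up with the highest probability.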

PROB(Diagnosis A) is called the prior probability of Diagnosis A, because it is the probability of Diagnosis A before the data are observed. PROB(Diagnosis A|Data B) is called the posterior probability of Diagnosis A given Data B, the probability of the diagnosis for someone with the observed signs and symptoms denoted by B. PROB(Data B|Diagnosis A) is called the likelihood of Diagnosis A for Data B, the probability of the observed signs and symptoms for someone with the diagnosis.

Adapted from pages 357–358 of An Introduction to Medical Statistics by Martin Bland, 2015, reproduced by permission of Oxford University Press.

This page maintained by Martin Bland
Last updated: 7 August, 2015