# An Introduction to Medical Statistics

## Overview of An Introduction to Medical Statistics

An Introduction to Medical Statistics, now in its fourth edition, is a book for medical students, doctors, medical researchers, and all who want an introduction to statistics in a medical or health context. The book has 445 pages with 236 figures.

The approach is firmly embedded in medical research, all the methods described being illustrated with the use of real data, either from my own research or from the medical literature. Equations and formulae are given where appropriate and manual calculation is described, but the use of computers for calculations is emphasised, together with the graphical methods which computers make easy. For those who do not like to take things on trust, mathematical appendices are included which explain the derivation of the various statistical formulae, and graphical simulations are used to illustrate some of the more surprising statistical principles.

The book begins with the design of clinical and epidemiological studies, then describes methods for summarising and presenting the data collected. Next, probability is introduced, together with the ways in which it can be used to interpret data. The most commonly used statistical methods are then described, along with their assumptions and how these can be checked, their interpretation, and how to choose the appropriate method for the analysis of different types of data in various circumstances.

The book goes on to describe the analysis of data where there is more than one predictor or explanatory variable, for continuous measurements, for yes or no variables, for counts, and for time to event data. Meta-analysis is described for different types of data, including checking for possible publication bias and network meta-analysis for studies which include several treatments. This is followed by the estimation of required sample size for different kinds of study.

A chapter on measurement in medicine discusses measurement error and observer variation, for quantitative measurements and for categories. The limits of agreement method for agreement between different methods of measuring the same quantity is described. The chapter goes on to describe the analysis of diagnostic methods, the estimation of reference ranges and centile charts, and the creation and validation of composite questionnaire scales.

The last two chapters cover large-scale mortality statistics, expectation of life, and population structure and we finish with a new chapter on the Bayesian approach to statistics. This may have been rather rash on my part, but we see these methods more often now in the medical literature, so I thought I should try an introduction.

Two types of exercises are included: 122 multiple choice questions of the five branch True/False type, and long exercises involving interpretation of analyses and critical reading of research. No exercises requiring calculations are included in this edition. Full solutions are given for all exercises.

## Fourth edition

The fourth edition was published in July 2015. Due to the adoption of a larger page format, the page count of the fourth edition is similar to that of the third edition, but several new topics have been added and others extended.

New topics in the fourth edition include:

• Assumptions and approximations
• Minimization
• Using different colours in presentation graphs
• Bootstrap or resampling methods
• Seasonal effects in regression on time
• Dealing with counts: Poisson regression and negative binomial regression
• Regression for data where observations are not independent
• Assessing agreement using Cohens kappa
• Weighted kappa
• Combining variables using principal components analysis
• Composite scales and sub-scales
• Internal consistency of scales and Cronbach’s alpha
• Presenting composite scales
• Complete new chapter: Missing data
• Complete new chapter: The Bayesian Approach
• New chapter, containing extended material: Time to event data
• New chapter, containing greatly extended material: Meta-analysis — data from several studies
• Many new exercises and multiple choice questions

## Contents of An Introduction to Medical Statistics

Highlighted sections can be read on the Web. Sections which are new in the fourth edition are indicated.

1. Introduction

2. The design of experiments
• Comparing treatments
• Random allocation
• Methods of allocation without random numbers
• Volunteer bias
• Intention to treat
• Cross-over designs
• Selection of subjects for clinical trials
• Response bias and placebos
• Assessment bias and double blind studies
• Laboratory experiments
• Experimental units and cluster randomized trials
• Consent in clinical trials
• Minimization — NEW
• Multiple choice questions: Clinical trials
• Exercise: The ‘Know Your Midwife’ trial

3. Sampling and observational studies
• Observational Studies
• Censuses
• Sampling
• Random sampling
• Sampling in clinical and epidemiological studies
• Cross-sectional studies
• Cohort studies
• Case-control studies
• Questionnaire bias in observational studies
• Ecological studies
• Multiple choice questions: Observational studies
• Exercise: Campylobacter jejuni infection

4. Summarizing data
• Types of data
• Frequency distributions
• Histograms and other frequency graphs
• Shapes of frequency distribution
• Medians and quantiles
• The mean
• Variance, range and interquartile range
• Standard deviation
• Multiple choice questions: Summarizing data
• Exercise: Mean and standard deviation — NEW
• Appendix 4A: The divisor for the variance
• Appendix 4B: Formulae for the sum of squares

5. Presenting data
• Rates and proportions
• Significant figures
• Presenting tables
• Pie charts
• Bar charts
• Scatter diagrams
• Line graphs and time series
• Misleading graphs
• Using different colours — NEW
• Logarithmic scales
• Multiple choice questions: Data presentation
• Exercise: Creating presentation graphs
• Appendix 5A: Logarithms

6. Probability
• Probability
• Properties of probability
• Probability distributions and random variables
• The Binomial Distribution
• Mean and variance
• Properties of means and variances
• The Poisson Distribution
• Conditional probability
• Multiple choice questions: Probability
• Exercise: Probability in court — NEW
• Appendix 6A: Permutations and combinations
• Appendix 6B: Expected value of a sum of squares

7. The Normal distribution
• Probability distributions for continuous variables
• The Normal distribution
• Properties of the Normal distribution
• Variables which follow a Normal distribution
• The Normal plot
• Multiple choice questions: The Normal distribution
• Exercise: Distribution of some measurements obtained by students — NEW
• Appendix: The Chi-squared, t, and F distributions

8. Estimation
• Sampling distributions
• Standard error of a sample mean
• Confidence intervals
• Standard error and confidence interval for a proportion
• The difference between two means
• Comparison of two proportions
• Number needed to treat
• Standard error of a sample standard deviation
• Confidence interval for a proportion when numbers are small
• Confidence interval for a median and other quantiles
• Bootstrap or resampling methods — NEW
• What is the correct confidence interval?
• Multiple choice questions: Confidence intervals
• Exercise: Confidence intervals in two acupuncture studies — NEW

9. Significance tests
• Testing a hypothesis
• An example: the sign test
• Principles of significance tests
• Significance levels and types of error
• One and two sided tests of significance
• Significant, real and important
• Comparing the means of large samples
• Comparison of two proportions
• The power of a test
• Multiple significance tests
• Repeated significance tests and sequential analysis
• Multiple choice questions: Significance tests
• Exercise: Crohn’s disease and cornflakes

10. Comparing the means of small samples
• The t distribution
• The one sample t method
• The means of two independent samples
• The use of transformations
• Deviations from the assumptions of t methods
• What is a large sample?
• Serial data
• Comparing two variances by the F test
• Comparing several means using analysis of variance
• Assumptions of the analysis of variance
• Comparison of means after analysis of variance
• Random effects in analysis of variance
• Units of analysis and cluster-randomized trials
• Multiple choice questions: Comparisons of means
• Exercise: Some analyses comparing means — NEW
• Appendix: The ratio mean/standard error

11. Regression and correlation
• Scatter diagrams
• Regression
• The method of least squares
• The regression of X on Y
• The standard error of the regression coefficient
• Using the regression line for prediction
• Analysis of residuals
• Deviations from assumptions in regression
• Correlation
• Significance test and confidence interval for r
• Uses of the correlation coefficient
• Using repeated observations
• Intraclass correlation
• Multiple choice questions: Regression and correlation
• Exercise: Serum potassium and ambient temperature — NEW
• Appendix: The least squares estimates
• Appendix: The variance about the regression line
• Appendix: The standard error of b

12. Methods based on rank order
• Non-parametric methods
• The Mann Whitney U Test
• The Wilcoxon matched pairs test
• Spearman’s rank correlation coefficient, ρ
• Kendall’s rank correlation coefficient, τ
• Continuity corrections
• Parametric or non-parametric methods?
• Multiple choice questions: Rank-based methods
• Exercise: Some applications of rank-based methods — NEW

13. The analysis of cross-tabulations
• The chi-squared test for association
• Tests for 2 by 2 tables
• The chi-squared test for small samples
• Fisher’s exact test
• Yates’ continuity correction for the 2 by 2 table
• The validity of Fisher’s and Yates’ methods
• Odds and odds ratios
• The chi-squared test for trend
• Methods for matched samples
• The chi-squared goodness of fit test
• Multiple choice questions: Categorical data
• Exercise: Some analyses of categorical data
• Appendix: Why the chi-squared test works
• Appendix: The formula for Fisher’s exact test
• Appendix: Standard error for the odds ratio

14. Choosing the statistical method
• Method oriented and problem oriented teaching
• Types of data
• Comparing two groups
• One sample and paired samples
• Relationship between two variables
• Multiple choice questions: Choice of statistical method
• Exercise: Choosing a statistical method

15. Multifactorial methods
• Multiple Regression
• Significance tests and estimation in multiple regression
• Using multiple regression for adjustment — NEW
• Transformations in multiple regression — NEW
• Interaction in multiple regression
• Polynomial regression
• Assumptions of multiple regression
• Qualitative predictor variables
• Multi-way analysis of variance
• Logistic regression
• Stepwise regression
• Seasonal effects — NEW
• Dealing with counts: Poisson regression and negative binomial regression — NEW
• Other regression methods
• Data where observations are not independent — NEW
• Multiple choice questions: Multifactorial methods
• Exercise: A multiple regression analysis

16. Time to event data — NEW CHAPTER — includes expanded material
• Time to event data
• KaplanMeier survival curves
• The logrank test
• The hazard ratio
• Cox regression
• Multiple choice questions: Time to event data
• Exercise: Survival after retirement

17. Meta-analysis — NEW CHAPTER — includes greatly expanded material
• What is a meta-analysis?
• The forest plot
• Getting a pooled estimate
• Heterogeneity
• Measuring heterogeneity
• Investigating sources of heterogeneity
• Random effects models
• Continuous outcome variables
• Dichotomous outcome variables
• Time to event outcome variables
• Individual participant data meta-analysis
• Publication bias
• Network meta-analysis
• Multiple choice questions: Meta-analysis
• Exercise: Dietary sugars and body weight

18. Determination of sample size
• Estimation of a population mean
• Estimation of a population proportion
• Sample size for significance tests
• Comparison of two means
• Comparison of two proportions
• Detecting a correlation
• Accuracy of the estimated sample size
• Trials randomized in clusters
• Multiple choice questions: Sample size
• Exercise: Estimation of sample sizes

19. Missing data — NEW CHAPTER — all new material
• The problem of missing data
• Types of missing data
• Using the sample mean
• Last observation carried forward
• Simple imputation
• Multiple imputation
• Why we should not ignore missing data
• Multiple choice questions: Missing data
• Exercise: Last observation carried forward

20. Clinical measurement
• Making measurements
• Repeatability and measurement error
• Assessing agreement using Cohen’s kappa — NEW
• Weighted kappa — NEW
• Comparing two methods of measurement
• Sensitivity and specificity
• Normal range or reference interval
• Centile charts — NEW
• Combining variables using principal components analysis — NEW
• Composite scales and subscales — NEW
• Internal consistency of scales and Cronbach’s alpha — NEW
• Presenting composite scales — NEW
• Multiple choice questions: Measurement
• Exercise: Two measurement studies — NEW

21. Mortality statistics and population structure
• Mortality rates
• Age standardization using the direct method
• Age standardization by the indirect method
• Demographic life tables
• Vital statistics
• The population pyramid
• Multiple choice questions: Population and mortality
• Exercise: Mortality and type 1 diabetes — NEW

22. The Bayesian approach — NEW CHAPTER — all new material
• Bayesians and Frequentists
• Bayes’ theorem
• An example: the Bayesian approach to computer-aided diagnosis
• The Bayesian and frequency views of probability
• An example of Bayesian estimation
• Prior distributions
• Maximum likelihood
• Markov Chain Monte Carlo methods
• Bayesian or Frequentist?
• Multiple choice questions: Bayesian methods
• Exercise: A Bayesian network meta-analysis

Appendix 1: Suggested answers to multiple choice questions and exercises

References

Index

## Reviews

### Extracts from reviews of the first edition

The first edition was well reviewed, e.g.:
At last I have a book on medical statistics that I can safely recommend to my students. --- Journal of the Royal Statistical Society.
It is a book which I think anyone teaching an introductory course in medical statistics should seriously consider as the main text. --- Statistics in Medicine.
If you want to understand some of the statistical ideas important to medicine but fear being overwhelmed by mathematics you will welcome “An Introduction to Medical Statistics” by M. Bland. --- British Medical Journal.

### Reviews of the second edition

European Journal of Orthodontics

Martin Bland’s textbook is one of those most commonly recommended by academic medical statisticians in the UK for students and professionals in health-related disciplines. According to the British Medical Journal reviewer of the first edition, ‘If you want to understand some of the statistical ideas important to medicine but fear being overwhelmed by mathematics you will welcome this book’. And it is certainly sufficiently explicit and prescriptive for those at the research stage of their careers. The second edition is rather longer than the first, in particular sections on multifactorial methods and determination of sample size have been greatly expanded to form additional chapters. Each chapter includes several traditional multiple choice questions, and a longer question: a section at the back of the book gives full solutions to both. As in most other biostatistics texts, the clinical and epidemiological examples used are medical rather than dental, but do not presuppose specialized medical knowledge: the issues in dental specialties are fundamentally similar, and a dental reader should find the medical orientation no obstacle. The second edition is still good value at 14.95 pounds.

R. G. Newcombe. (1996) European Journal of Orthodontics 18(3), 308.

N.B. The price is now higher, but still good value! -- MB.

Title: An Introduction to Medical Statistics
Author: Martin Bland
Publisher: Oxford Medical Publications
Price: \$27.95.
Comment: This paperback makes aspects of statistics and design of experiments, sampling and observational studies, data presentation, probability and other painful aspects of statistics relatively painless although it does have a lot of math.

### Reviews of the third edition

The third edition was reviewed by Les Huson (The Statistician, 50, 548). The review ends:
The coverage may not be very different from that of other introductory texts, but in my view the style and content are, and they alone make this text one of the best of its kind. The approach is very data driven, and the use of real data makes this even more appealing. The concern throughout is with statistical practice -- i.e. with extracting meaningful information from real data -- and not statistical theory, although the necessary theoretical ideas are explained in a non-mathematical way. The writing style -- first person throughout -- is also attractive and makes the text easy to read and digest, although it should also be said that this book contains a large amount of material and to work through it thoroughly takes time! Using the companion volume also [Statistical Questions in Evidence-based Medicine], and working through the exercises, would mean a very thorough course of study indeed.

All in all, this is an excellent book -- it has been on my bookshelf since the first edition, and in my view it should be the first choice for any student wanting a serious introduction to the practice of medical statistics.

## Availability

An Introduction to Medical Statistics is published world-wide by Oxford University Press. See OUP web site for details.

## Corrections

### Corrections to the second edition

Despite the combined proof-reading efforts of Doug Altman, Janet Peacock, and myself, a few errors remain. An up-to-date list of corrections to the second edition is maintained on this site.

### Corrections to the third edition

An up-to-date list of corrections to the third edition is maintained on this site.

## Copies of data sets used in the book

Most of the datasets from An Introduction to Medical Statistics can be found for downloading on my download page.

This page maintained by Martin Bland.
Last updated: 17 August, 2015.