Allocation by minimisation in a cluster randomised trial

J. Martin Bland
Professor of Health Statistics
Department of Health Sciences
University of York

Talk presented to the Third Annual Conference on Randomised Controlled Trials in the Social Sciences: Methods and Synthesis, September / October 2008.

Introduction

In this talk I shall describe how we used minimisation to allocate schools to treatment in a cluster randomised trial. The trial is the Together4All trial, taking place in Northern Ireland. I have concealed the names of schools and geographical areas involved.

The starting point was a question about how 14 schools could be randomised into two groups for a cluster randomised trial. One suggestion had been to use matched pairs, but there was concern that matching reduces power if there are fewer than 10 allocation units (schools) per group or if the correlation between the outcome variable in paired units is small. Could we use some form of stratified randomisation? For example, could the schools be divided into two blocks by size (e.g., 8 largest vs. 6 smallest), with half of each block then randomly allocated to the intervention and half to the control? Would this be better than matching, or was there an alternative suggestion?

I replied that matching has several problems:

I suggested that, rather than stratify, we should use minimisation. Minimisation is a method for allocating a small number of subjects into groups so that the groups are balanced on several variables (Taves 1974). Exact balance on every variable may be impossible, so the groups may not be perfectly balanced on all of them.

We then received a request. We were told that the principals of the participating schools would like to know as soon as possible if they were in the programme or control group. The researchers thought that they would like this to be done by an independent person in front of the school principals and an observer from the funder. They would then like a report drawn up describing the random allocation process, which could be shared with the evaluation team appointed to lead the study. We decided that I would be able to do this on a visit which we were planning to make to Northern Ireland to do some teaching on cluster randomised trials.

The next question concerned data on the schools' Key Stage performance. Data were available on English and Maths for KS1 and KS2. The percentage of children falling below expected levels might be the best way to present this. Which would be the more important, Maths or English? I replied that we could minimise on both variables if we wished.

We were then told that one of the 14 schools was dropping out entirely. They did not want to be in either the experimental or control group, and no data could be collected from them. Would it be better to have seven experimental and six control schools, vice-versa, or should it be entirely random? My reply encapsulated the statistician’s credo: ‘Random is good.’

Planned allocation method

The final plan for minimisation for the Together4All trial was to allocate 13 schools into two groups. These groups were to be balanced on the following variables:

  1. total school size (number of pupils),

  2. religious denomination,

  3. geographical area,

  4. percentage of pupils receiving free school meals,

  5. percentage of pupils falling below expected level in KS1 English,

  6. percentage of pupils falling below expected level in KS1 Maths,

  7. percentage of pupils falling below expected level in KS2 English,

  8. percentage of pupils falling below expected level in KS2 Maths.

To carry out the allocation by minimisation, I first classified each of the variables into two, or possibly more, categories, as follows:

1. Total school size (number of pupils)

The observed school sizes were:
83, 101, 104, 120, 126, 164, 194, 210, 235, 344, 451, 666, and 683.

The median size was 194. To split the sample near the middle, I therefore proposed to group them as small (less than 200 pupils) and large (greater than or equal to 200 pupils).

2. Religious denomination

Religious denomination is a more difficult issue. This is a particularly important variable for schools in Northern Ireland. The 13 schools in this trial were classified as:

      Catholic 9
      Protestant       3
      Integrated 1

This is problematic, because whichever intervention group the integrated school joins, it cannot be balanced by a similar school in the other group. I proposed that we should include this school with the Protestant schools and group the schools as Catholic or not Catholic.

3. Geographical area

There were schools in two separate urban areas in the study and also some rural schools. For this talk I have labelled them, with no originality, areas A, B, and C. The distribution was:

      Area A       4
      Area B 6
      Area C 3

I proposed that we should retain these three categories for the allocation.

4. Percentage of pupils receiving free school meals

The percentages of pupils receiving free school meals were:
8, 15, 15, 19, 19, 20, 20, 25, 26, 26, 26, 30, and 49.

The median percentage was 20. For the allocation, I proposed to group them as Low (20% or less) and High (greater than 20%).

5. Percentage of pupils falling below expected level in KS1 English

The percentages of pupils falling below the expected level in Key Stage 1 English were:
0, 0, 0, 0, 0, 1, 2, 5, 5, 6, 8, 13, and 27.

The median was 2. As there were 5 schools with zero, I decided to group the schools as None or Any.

6. Percentage of pupils falling below expected level in KS1 Maths

The percentages of pupils falling below the expected Key Stage 1 Maths level were:
0, 0, 0, 0, 1, 2, 3, 6, 7, 8, 13, 14, and 17.

The median was 3. I proposed that these schools should be grouped as Low (less than 5) and High (5 or more).

7. Percentage of pupils falling below expected level in KS2 English

The percentages of pupils falling below the expected level in Key Stage 2 English were:
0, 0, 0, 0, 2, 6, 16, 17, 18, 28, 28, 32, and 41.

The median was 16. I proposed grouping them as Low (less than 10) and High (greater than or equal to 10).

8. Percentage of pupils falling below expected level in KS2 Maths

The percentages of pupils falling below expected level in Key Stage 2 Maths were:
0, 0, 0, 0, 2, 6, 13, 16, 18, 22, 30, 32, and 40.

The median was 13. I proposed that we should group them as Low (less than 10) and High (greater than or equal to 10).
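
A minimal sketch, assuming Python and the values listed above, of how these cut-points divide the 13 schools. The names `split_counts` and `SPLITS` are illustrative only; the cut-points themselves are the ones proposed in the text.

```python
from statistics import median

# Observed values for the 13 schools, as listed in the text.
sizes = [83, 101, 104, 120, 126, 164, 194, 210, 235, 344, 451, 666, 683]
meals = [8, 15, 15, 19, 19, 20, 20, 25, 26, 26, 26, 30, 49]
ks1_english = [0, 0, 0, 0, 0, 1, 2, 5, 5, 6, 8, 13, 27]
ks1_maths = [0, 0, 0, 0, 1, 2, 3, 6, 7, 8, 13, 14, 17]
ks2_english = [0, 0, 0, 0, 2, 6, 16, 17, 18, 28, 28, 32, 41]
ks2_maths = [0, 0, 0, 0, 2, 6, 13, 16, 18, 22, 30, 32, 40]


def split_counts(values, is_upper):
    """Return (lower, upper) group sizes for a two-category split."""
    upper = sum(1 for v in values if is_upper(v))
    return len(values) - upper, upper


# Each entry: (data, rule that puts a school in the upper category).
SPLITS = {
    "School size (>= 200)": (sizes, lambda v: v >= 200),
    "Free school meals (> 20%)": (meals, lambda v: v > 20),
    "KS1 English (any)": (ks1_english, lambda v: v > 0),
    "KS1 Maths (>= 5)": (ks1_maths, lambda v: v >= 5),
    "KS2 English (>= 10)": (ks2_english, lambda v: v >= 10),
    "KS2 Maths (>= 10)": (ks2_maths, lambda v: v >= 10),
}

for name, (values, rule) in SPLITS.items():
    lower, upper = split_counts(values, rule)
    print(f"{name}: median {median(values)}, lower {lower}, upper {upper}")
```

Every split gives groups of 7 and 6 schools, except KS1 English, where the five schools with zero form the lower group.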

Summary of the grouped variables

When schools were grouped in this way, we got the following table:


                                    % free  % falling below expected level in
          School                    school  KS1      KS1    KS2      KS2
School    size    Catholic  Area    meals   English  Maths  English  Maths
School A  Low     No        Area C  Low     Any      High   Low      Low
School B  High    No        Area B  High    Any      Low    High     High
School C  Low     No        Area B  High    Any      Low    Low      Low
School D  Low     No        Area A  High    Any      Low    High     High
School E  Low     Yes       Area B  Low     Any      High   High     High
School F  Low     Yes       Area A  High    None     High   Low      Low
School G  High    Yes       Area A  High    None     Low    Low      Low
School H  High    Yes       Area A  High    None     Low    High     High
School I  High    Yes       Area B  Low     Any      High   High     High
School J  Low     Yes       Area C  Low     None     High   Low      Low
School K  Low     Yes       Area C  Low     Any      High   High     High
School L  High    Yes       Area B  Low     None     Low    Low      Low
School M  High    Yes       Area B  Low     Any      Low    High     High

The columns for KS2 English and KS2 Maths are identical, so I proposed that we should use a single KS2 variable. Hence the proposed final table for use in the minimisation allocation was:



                                    % free  % falling below expected level in
          School                    school  KS1      KS1
School    size    Catholic  Area    meals   English  Maths  KS2
School A  Low     No        Area C  Low     Any      High   Low
School B  High    No        Area B  High    Any      Low    High
School C  Low     No        Area B  High    Any      Low    Low
School D  Low     No        Area A  High    Any      Low    High
School E  Low     Yes       Area B  Low     Any      High   High
School F  Low     Yes       Area A  High    None     High   Low
School G  High    Yes       Area A  High    None     Low    Low
School H  High    Yes       Area A  High    None     Low    High
School I  High    Yes       Area B  Low     Any      High   High
School J  Low     Yes       Area C  Low     None     High   Low
School K  Low     Yes       Area C  Low     Any      High   High
School L  High    Yes       Area B  Low     None     Low    Low
School M  High    Yes       Area B  Low     Any      Low    High
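
The observation that the two KS2 columns coincide can be checked mechanically. The following Python sketch transcribes the eight-variable table and looks for pairs of columns that give the same grouping for every school. The names `ROWS`, `COLUMNS`, and `identical_pairs` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations

COLUMNS = ["School size", "Catholic", "Area", "Free school meals",
           "KS1 English", "KS1 Maths", "KS2 English", "KS2 Maths"]

# One row per school, transcribed from the eight-variable summary table.
ROWS = {
    "A": ("Low", "No", "Area C", "Low", "Any", "High", "Low", "Low"),
    "B": ("High", "No", "Area B", "High", "Any", "Low", "High", "High"),
    "C": ("Low", "No", "Area B", "High", "Any", "Low", "Low", "Low"),
    "D": ("Low", "No", "Area A", "High", "Any", "Low", "High", "High"),
    "E": ("Low", "Yes", "Area B", "Low", "Any", "High", "High", "High"),
    "F": ("Low", "Yes", "Area A", "High", "None", "High", "Low", "Low"),
    "G": ("High", "Yes", "Area A", "High", "None", "Low", "Low", "Low"),
    "H": ("High", "Yes", "Area A", "High", "None", "Low", "High", "High"),
    "I": ("High", "Yes", "Area B", "Low", "Any", "High", "High", "High"),
    "J": ("Low", "Yes", "Area C", "Low", "None", "High", "Low", "Low"),
    "K": ("Low", "Yes", "Area C", "Low", "Any", "High", "High", "High"),
    "L": ("High", "Yes", "Area B", "Low", "None", "Low", "Low", "Low"),
    "M": ("High", "Yes", "Area B", "Low", "Any", "Low", "High", "High"),
}


def identical_pairs(rows, columns):
    """Return pairs of column names whose categories agree for every school."""
    pairs = []
    for i, j in combinations(range(len(columns)), 2):
        if all(row[i] == row[j] for row in rows.values()):
            pairs.append((columns[i], columns[j]))
    return pairs


print(identical_pairs(ROWS, COLUMNS))
```

Only the KS2 English and KS2 Maths columns are reported as identical, so minimising on both would simply count the same variable twice.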

Allocation by Minimisation

On 28 April 2008, I presented these proposals to a meeting of the principals of the 13 schools, the Together4All researchers, and a representative of the funding body, in a hall in Northern Ireland. I explained the principles of minimisation and then, as no suggestions for changes were made, carried out the allocation using the categories suggested above. A video recording of the event was made.

I carried out the minimisation using the free DOS program MINIM.EXE, written by Stephen Evans, Simon Day, and Patrick Royston. At each stage, the algorithm asks ‘to which group would allocation of the next school make the two groups more balanced?’. If it would not make any difference to the balance which group was chosen, the school would be allocated randomly. Otherwise, it was allocated to the group which would make the two groups better balanced.

In the live presentation of this talk, I then ran MINIM to allocate the schools. (I have carried out another allocation for this written version.) The first task, which I had done in advance, was to set up a file with the variable names and categories used in the minimisation. MINIM asks for the name of this file. It then asks whether I have a patient to allocate. (The program was written with clinical trials in mind.) It then asks, for each of the minimising variables, into which category the ‘patient’ falls. For the first school, School A, it calculated that the lack of balance would be the same whichever group the school was allocated to. The school was therefore allocated randomly and in this demonstration went into the Control group.

It is possible to list the allocation so far:

              Intervention   Control
Total:
  Low              0            1
  High             0            0
Catholic:
  No               0            1
  Yes              0            0
Area:
  Area A           0            0
  Area B           0            0
  Area C           0            1
Meals:
  Low              0            1
  High             0            0
KS1 Eng.:
  None             0            0
  Any              0            1
KS1 Maths.:
  Low              0            0
  High             0            1
KS2:
  Low              0            1
  High             0            0
Totals :           0            1
Grand Total :      1

I then proceeded to enter the data for School B. To balance the groups, this school would be allocated to Intervention if it had any categories in common with the first school, which was in Control. It had two, so it went to Intervention.

              Intervention   Control
Total:
  Low              0            1
  High             1            0
Catholic:
  No               1            1
  Yes              0            0
Area:
  Area A           0            0
  Area B           1            0
  Area C           0            1
Meals:
  Low              0            1
  High             1            0
KS1 Eng.:
  None             0            0
  Any              1            1
KS1 Maths.:
  Low              1            0
  High             0            1
KS2:
  Low              0            1
  High             1            0
Totals :           1            1
Grand Total :      2

If School B had gone to Control, both schools with Catholic = No and both with KS1 Eng. = Any would have been in that group, with none in Intervention. The groups would have been more imbalanced.

The algorithm is to take the categories which the new school falls into and, for each group, sum the frequencies in those categories. We then put the new school into the group with the lower sum. In this case, we have Total = High, Catholic = No, Area = B, Meals = High, KS1 Eng. = Any, KS1 Maths. = Low, and KS2 = High. For Intervention, this sum is 0 + 0 + 0 + 0 + 0 + 0 + 0 = 0; for Control it is 0 + 1 + 0 + 0 + 1 + 0 + 0 = 2. Hence we allocate to Intervention.

The new sums for this combination of categories are Intervention = 7, Control = 2. If we had allocated the second school to Control, the new sums would be Intervention = 0, Control = 9. Allocation to Intervention makes the groups more balanced.
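
The arithmetic for School B can be sketched in a few lines of Python. This is only an illustration of the sums described above, not MINIM's own code; the function names `category_sum` and `sums_after` are hypothetical.

```python
# Categories in the order: Total, Catholic, Area, Meals,
# KS1 Eng., KS1 Maths., KS2 (taken from the final table).
school_a = ("Low", "No", "Area C", "Low", "Any", "High", "Low")
school_b = ("High", "No", "Area B", "High", "Any", "Low", "High")

# State after the first allocation: School A is in Control.
groups = {"Intervention": [], "Control": [school_a]}


def category_sum(members, new_school):
    """Sum, over the new school's categories, the number of schools
    already in the group that fall into the same category."""
    return sum(
        1
        for member in members
        for have, want in zip(member, new_school)
        if have == want
    )


def sums_after(groups, new_school, chosen):
    """Category sums for each group if new_school joined `chosen`."""
    trial = {
        name: (members + [new_school]) if name == chosen else members
        for name, members in groups.items()
    }
    return {name: category_sum(m, new_school) for name, m in trial.items()}


before = {name: category_sum(m, school_b) for name, m in groups.items()}
print(before)                                      # sums before allocation
print(sums_after(groups, school_b, "Intervention"))
print(sums_after(groups, school_b, "Control"))
```

The three printed dictionaries correspond to the sums 0 and 2 before allocation, 7 and 2 after allocation to Intervention, and 0 and 9 after a hypothetical allocation to Control.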

For the allocation of School C, the categories are Total = Low, Catholic = No, Area = B, Meals = High, KS1 Eng. = Any, KS1 Maths. = Low, and KS2 = Low. For Intervention, the sum is 0 + 1 + 1 + 1 + 1 + 1 + 0 = 5; for Control it is 1 + 1 + 0 + 0 + 1 + 0 + 1 = 4. Hence we allocate to Control.

              Intervention   Control
Total:
  Low              0            2
  High             1            0
Catholic:
  No               1            2
  Yes              0            0
Area:
  Area A           0            0
  Area B           1            1
  Area C           0            1
Meals:
  Low              0            1
  High             1            1
KS1 Eng.:
  None             0            0
  Any              1            2
KS1 Maths.:
  Low              1            1
  High             0            1
KS2:
  Low              0            2
  High             1            0
Totals :           1            2
Grand Total :      3

For School D the categories are Total = Low, Catholic = No, Area = A, Meals = High, KS1 Eng. = Any, KS1 Maths. = Low, and KS2 = High. For Intervention, the sum is 0 + 1 + 0 + 1 + 1 + 1 + 1 = 5; for Control it is 2 + 2 + 0 + 1 + 2 + 1 + 0 = 8. Hence we allocate to Intervention.

The objective is always to allocate to the group which will make the totals more similar. When it would not make any difference, we allocate randomly. The allocation proceeds until all the schools have been allocated.
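
The whole procedure can be sketched as a short Python loop. This is a simplified illustration of the unweighted sum rule described above, with schools entered in the order A to M; it is not MINIM's code, and any weighting or other options that MINIM offers are not reproduced. The names `minimise` and `break_tie` and the seed are assumptions made for the example. Because ties are broken at random, a run will not in general reproduce the actual trial allocation.

```python
import random

# Categories in the order: Total, Catholic, Area, Meals,
# KS1 Eng., KS1 Maths., KS2 (transcribed from the final table).
SCHOOLS = {
    "A": ("Low", "No", "Area C", "Low", "Any", "High", "Low"),
    "B": ("High", "No", "Area B", "High", "Any", "Low", "High"),
    "C": ("Low", "No", "Area B", "High", "Any", "Low", "Low"),
    "D": ("Low", "No", "Area A", "High", "Any", "Low", "High"),
    "E": ("Low", "Yes", "Area B", "Low", "Any", "High", "High"),
    "F": ("Low", "Yes", "Area A", "High", "None", "High", "Low"),
    "G": ("High", "Yes", "Area A", "High", "None", "Low", "Low"),
    "H": ("High", "Yes", "Area A", "High", "None", "Low", "High"),
    "I": ("High", "Yes", "Area B", "Low", "Any", "High", "High"),
    "J": ("Low", "Yes", "Area C", "Low", "None", "High", "Low"),
    "K": ("Low", "Yes", "Area C", "Low", "Any", "High", "High"),
    "L": ("High", "Yes", "Area B", "Low", "None", "Low", "Low"),
    "M": ("High", "Yes", "Area B", "Low", "Any", "Low", "High"),
}
GROUPS = ("Intervention", "Control")


def category_sum(members, new_school):
    """Number of category matches between new_school and group members."""
    return sum(a == b for m in members for a, b in zip(m, new_school))


def minimise(schools, break_tie):
    """Allocate schools in order; break_tie() picks a group on equal sums."""
    groups = {g: [] for g in GROUPS}
    allocation = {}
    for name, cats in schools.items():
        sums = {g: category_sum(groups[g], cats) for g in GROUPS}
        if sums["Intervention"] == sums["Control"]:
            chosen = break_tie()              # no difference: allocate randomly
        else:
            chosen = min(sums, key=sums.get)  # lower sum improves balance
        groups[chosen].append(cats)
        allocation[name] = chosen
    return allocation


rng = random.Random(2008)  # arbitrary seed, only to make the run repeatable
allocation = minimise(SCHOOLS, lambda: rng.choice(GROUPS))
for name, group in allocation.items():
    print(f"School {name}: {group}")
```

Passing `lambda: "Control"` as `break_tie` forces the first school into Control, as in the demonstration, and the next three allocations then follow the worked steps for Schools B, C, and D.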

Results of the minimisation for the T4A trial

This table shows the results of the actual allocation for the Together4All trial.

              Intervention   Control
Total:
  Low              3            4
  High             3            3
Catholic:
  No               2            2
  Yes              4            5
Area:
  Area A           2            2
  Area B           3            3
  Area C           1            2
Meals:
  Low              3            4
  High             3            3
KS1 Eng.:
  None             2            3
  Any              4            4
KS1 Maths.:
  Low              4            3
  High             2            4
KS2:
  Low              2            4
  High             4            3
Totals :           6            7
Grand Total :      13

The groups are quite well balanced. For total number of pupils, religion, geographical area, free school meals, and attainment in Key Stage 1 English, the balance was as good as it would be possible to achieve. For example, of the seven schools with low numbers of children, three were in Intervention and four in Control, and of the six schools with high numbers of children, three were in Intervention and three in Control.

There were only two variables where the balance was not as good as it might be. In Key Stage 1 Maths, the Control group had more schools with a higher percentage of children below the expected level. In Key Stage 2 English/Maths, the Intervention group had more schools with a higher percentage of children below the expected level. Hence where there was a slight imbalance, this was itself balanced across the two variables.
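
As a rough check on this summary, the results table can be transcribed into Python and the largest Intervention/Control difference found for each variable. The criterion used here, a difference of more than one school in some category, is my own simple yardstick for the sketch: with 6 and 7 schools in the groups, a difference of one somewhere is unavoidable. The names `RESULTS`, `worst_gap`, and `imbalanced` are illustrative.

```python
# (Intervention, Control) counts from the table of actual results.
RESULTS = {
    "Total": {"Low": (3, 4), "High": (3, 3)},
    "Catholic": {"No": (2, 2), "Yes": (4, 5)},
    "Area": {"Area A": (2, 2), "Area B": (3, 3), "Area C": (1, 2)},
    "Meals": {"Low": (3, 4), "High": (3, 3)},
    "KS1 Eng.": {"None": (2, 3), "Any": (4, 4)},
    "KS1 Maths.": {"Low": (4, 3), "High": (2, 4)},
    "KS2": {"Low": (2, 4), "High": (4, 3)},
}

# Largest absolute difference between the groups within each variable.
worst_gap = {
    variable: max(abs(i - c) for i, c in categories.values())
    for variable, categories in RESULTS.items()
}

# Variables where some category differs by more than one school.
imbalanced = [v for v, gap in worst_gap.items() if gap > 1]

for variable, gap in worst_gap.items():
    print(f"{variable}: largest difference {gap}")
print("More than one school apart:", imbalanced)
```

Only KS1 Maths. and KS2 are flagged, and the two differences run in opposite directions: Control has more High schools for KS1 Maths, Intervention more for KS2.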

Thanks go to David Torgerson, Tim Hobbs, Nuala Magee, Michael Little, and to The Atlantic Philanthropies for funding Together4All.

Reference

Taves DR. Minimization: a new method of assigning patients to treatment and control groups. Clin. Pharmacol. Ther. 1974; 15: 443-445.

This page is maintained by Martin Bland.
Last updated: 8 December, 2008.
