Talk presented to the Third Annual Conference on Randomised Controlled Trials in the Social Sciences: Methods and Synthesis, September / October 2008.
In this talk I shall describe how we used minimisation to allocate schools to treatment in a cluster randomised trial. The trial is the Together4All trial, taking place in Northern Ireland. I have concealed the names of schools and geographical areas involved.
The starting point was a question about how 14 schools could be randomised into two groups for a cluster randomised trial. One suggestion had been to use matched pairs, but there was concern that matching reduces power if there are fewer than 10 allocation units (schools) per group or if the correlation between the outcome variable in paired units is small. Could we use some form of stratified randomisation? For example, the schools could be divided into two blocks by size (e.g., 8 largest vs. 6 smallest), and within each block half could be randomly allocated to the intervention and half to the control. Would this be better than matching or was there an alternative suggestion?
I replied that matching has several problems:
I suggested that, rather than stratify, we should use minimisation. Minimisation is a method for allocating a small number of subjects into groups so that the groups are balanced on several variables (Taves 1974). The groups may not be exactly balanced on every variable, as exact balance may be impossible.
We then received a request. We were told that the principals of the participating schools would like to know as soon as possible if they were in the programme or control group. The researchers thought that they would like this to be done by an independent person in front of the school principals and an observer from the funder. They would then like a report drawn up describing the random allocation process that could be shared with the evaluation team appointed to lead the study. We decided that I would be able to do this on a visit which we were planning to make to Northern Ireland to do some teaching on cluster randomised trials.
The next question concerned data on the schools' Key Stage performance. Data were available on English and Maths for KS1 and KS2. The percentage of children falling below expected levels might be the best way to present this. Which would be the more important, Maths or English? I replied that we could minimise on both variables if we wished.
We were then told that one of the 14 schools was dropping out entirely. They did not want to be in either the experimental or control group, and no data could be collected from them. Would it be better to have seven experimental and six control schools, vice-versa, or should it be entirely random? My reply encapsulated the statistician’s credo: ‘Random is good.’
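The final plan for minimisation for the Together4All trial was to allocate 13 schools into two groups. These groups were to be balanced on the following variables: school size (total number of pupils), religious denomination, geographical area, percentage of pupils receiving free school meals, and the percentages of pupils falling below the expected level in Key Stage 1 English, Key Stage 1 Maths, Key Stage 2 English, and Key Stage 2 Maths.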
To carry out the allocation by minimisation, I first classified each of the variables into two, or possibly more, categories, as follows:
The observed school sizes were:
83, 101, 104, 120, 126, 164, 194, 210, 235, 344, 451, 666, and 683.
The median size was 194. To split the sample near the middle, I therefore proposed to group them as small (less than 200 pupils) and large (greater than or equal to 200 pupils).
Religious denomination is a more difficult issue. This is a particularly important variable for schools in Northern Ireland. The 13 schools in this trial were classified as:
Catholic | 9 |
Protestant | 3 |
Integrated | 1 |
This is problematic, because whichever intervention group the integrated school joins, it cannot be balanced by a similar school in the other group. I proposed that we should include this school with the Protestant schools and group schools as Catholic or not Catholic.
There were schools in two separate urban areas in the study and also some rural schools. For this talk I have labelled them, with no originality, areas A, B, and C. The distribution was:
Area A | 4 |
Area B | 6 |
Area C | 3 |
I proposed that we should retain these three categories for the allocation.
The percentages of pupils receiving free school meals were:
8, 15, 15, 19, 19, 20, 20, 25, 26, 26, 26, 30, and 49.
The median percentage was 20. For the allocation, I proposed to group them as Low (20% or less) and High (greater than 20%).
The percentages of pupils falling below the expected level in Key Stage 1 English were:
0, 0, 0, 0, 0, 1, 2, 5, 5, 6, 8, 13, and 27.
The median was 2. As there were 5 schools with zero, I decided to group the schools as None or Any.
The percentages of pupils falling below the expected Key Stage 1 Maths level were:
0, 0, 0, 0, 1, 2, 3, 6, 7, 8, 13, 14, and 17.
The median was 3. I proposed that schools should be grouped as Low (less than 5) and High (5 or more).
The percentages of pupils falling below the expected level in Key Stage 2 English were:
0, 0, 0, 0, 2, 6, 16, 17, 18, 28, 28, 32, and 41.
The median was 16. I proposed grouping them as Low (less than 10) and High (greater than or equal to 10).
The percentages of pupils falling below expected level in Key Stage 2 Maths were:
0, 0, 0, 0, 2, 6, 13, 16, 18, 22, 30, 32, and 40.
The median was 13. I proposed that we should group them as Low (less than 10) and High (greater than or equal to 10).
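The medians and cut-points proposed above can be checked with a short script. This is only a sketch: the values are copied from the lists above, while the dictionary names and the `summarise` helper are my own labels for illustration and not part of any program used in the trial.

```python
from statistics import median

# Observed values for the 13 schools, copied from the lists above.
data = {
    "School size": [83, 101, 104, 120, 126, 164, 194, 210, 235, 344, 451, 666, 683],
    "Free school meals (%)": [8, 15, 15, 19, 19, 20, 20, 25, 26, 26, 26, 30, 49],
    "KS1 English (%)": [0, 0, 0, 0, 0, 1, 2, 5, 5, 6, 8, 13, 27],
    "KS1 Maths (%)": [0, 0, 0, 0, 1, 2, 3, 6, 7, 8, 13, 14, 17],
    "KS2 English (%)": [0, 0, 0, 0, 2, 6, 16, 17, 18, 28, 28, 32, 41],
    "KS2 Maths (%)": [0, 0, 0, 0, 2, 6, 13, 16, 18, 22, 30, 32, 40],
}

# Rule for falling in the upper group of each variable, restating the proposals above.
in_upper_group = {
    "School size": lambda x: x >= 200,        # large vs. small
    "Free school meals (%)": lambda x: x > 20,
    "KS1 English (%)": lambda x: x > 0,       # Any vs. None
    "KS1 Maths (%)": lambda x: x >= 5,
    "KS2 English (%)": lambda x: x >= 10,
    "KS2 Maths (%)": lambda x: x >= 10,
}

def summarise(name):
    """Return (median, number in lower group, number in upper group)."""
    values = data[name]
    n_upper = sum(1 for v in values if in_upper_group[name](v))
    return median(values), len(values) - n_upper, n_upper

for name in data:
    med, n_lower, n_upper = summarise(name)
    print(f"{name}: median {med}, lower group {n_lower}, upper group {n_upper}")
```

Each variable splits the 13 schools as nearly in half as the data allow, apart from Key Stage 1 English, where the five schools with zero form the lower group.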
When schools were grouped in this way, we got the following table:
School | School size | Catholic | Area | % free school meals | % below expected level, KS1 English | % below expected level, KS1 Maths | % below expected level, KS2 English | % below expected level, KS2 Maths |
---|---|---|---|---|---|---|---|---|
School A | Low | No | Area C | Low | Any | High | Low | Low |
School B | High | No | Area B | High | Any | Low | High | High |
School C | Low | No | Area B | High | Any | Low | Low | Low |
School D | Low | No | Area A | High | Any | Low | High | High |
School E | Low | Yes | Area B | Low | Any | High | High | High |
School F | Low | Yes | Area A | High | None | High | Low | Low |
School G | High | Yes | Area A | High | None | Low | Low | Low |
School H | High | Yes | Area A | High | None | Low | High | High |
School I | High | Yes | Area B | Low | Any | High | High | High |
School J | Low | Yes | Area C | Low | None | High | Low | Low |
School K | Low | Yes | Area C | Low | Any | High | High | High |
School L | High | Yes | Area B | Low | None | Low | Low | Low |
School M | High | Yes | Area B | Low | Any | Low | High | High |
The columns for KS2 English and KS2 Maths are identical, so I proposed that we should use a single KS2 variable. Hence the proposed final table for use in the minimisation allocation was:
School | School size | Catholic | Area | % free school meals | % below expected level, KS1 English | % below expected level, KS1 Maths | % below expected level, KS2 |
---|---|---|---|---|---|---|---|
School A | Low | No | Area C | Low | Any | High | Low |
School B | High | No | Area B | High | Any | Low | High |
School C | Low | No | Area B | High | Any | Low | Low |
School D | Low | No | Area A | High | Any | Low | High |
School E | Low | Yes | Area B | Low | Any | High | High |
School F | Low | Yes | Area A | High | None | High | Low |
School G | High | Yes | Area A | High | None | Low | Low |
School H | High | Yes | Area A | High | None | Low | High |
School I | High | Yes | Area B | Low | Any | High | High |
School J | Low | Yes | Area C | Low | None | High | Low |
School K | Low | Yes | Area C | Low | Any | High | High |
School L | High | Yes | Area B | Low | None | Low | Low |
School M | High | Yes | Area B | Low | Any | Low | High |
On 28 April 2008, I presented these proposals to a meeting of the principals of the 13 schools, the Together4All researchers, and a representative of the funding body, in a hall in Northern Ireland. I explained the principles of minimisation and then, as no suggestions for changes were made, carried out the allocation using the categories suggested above. A video recording of the event was made.
I carried out the minimisation using the free DOS program MINIM.EXE, by Stephen Evans, Simon Day, and Patrick Royston. At each stage, the algorithm asks ‘to which group would allocation of the next school make the two groups more balanced?’. If the choice of group would make no difference to the balance, the school would be allocated randomly. Otherwise, it would be allocated to the group which would make the two groups better balanced.
In the live presentation of this talk, I then ran MINIM to allocate the schools. (I have carried out another allocation for this written version.) The first task, which I had done in advance, was to set up a file with the variable names and categories used in the minimisation. MINIM asked for the name of this file. It then asked whether I had a patient to allocate. (The program was written with clinical trials in mind.) It then asked me, for each of the minimising variables, into which category the ‘patient’ fell. Given the data for the first school, School A, it calculated that the lack of balance would be the same whichever group the school was allocated to. The school was therefore allocated randomly and in this demonstration went into the Control group.
It is possible to list the allocation so far:
Category | Intervention | Control |
---|---|---|
Total: | | |
Low | 0 | 1 |
High | 0 | 0 |
Catholic: | | |
No | 0 | 1 |
Yes | 0 | 0 |
Area: | | |
Area A | 0 | 0 |
Area B | 0 | 0 |
Area C | 0 | 1 |
Meals: | | |
Low | 0 | 1 |
High | 0 | 0 |
KS1 Eng.: | | |
None | 0 | 0 |
Any | 0 | 1 |
KS1 Maths.: | | |
Low | 0 | 0 |
High | 0 | 1 |
KS2: | | |
Low | 0 | 1 |
High | 0 | 0 |
Totals: | 0 | 1 |
Grand Total: | 1 | |
I then proceeded to enter the data for School B. As the first school was in Control, balancing the groups meant that this school would be allocated to Intervention if it had any categories in common with the first. It did (Catholic = No and KS1 Eng. = Any), and it went to Intervention.
Category | Intervention | Control |
---|---|---|
Total: | | |
Low | 0 | 1 |
High | 1 | 0 |
Catholic: | | |
No | 1 | 1 |
Yes | 0 | 0 |
Area: | | |
Area A | 0 | 0 |
Area B | 1 | 0 |
Area C | 0 | 1 |
Meals: | | |
Low | 0 | 1 |
High | 1 | 0 |
KS1 Eng.: | | |
None | 0 | 0 |
Any | 1 | 1 |
KS1 Maths.: | | |
Low | 1 | 0 |
High | 0 | 1 |
KS2: | | |
Low | 0 | 1 |
High | 1 | 0 |
Totals: | 1 | 1 |
Grand Total: | 2 | |
If School B had gone to Control, there would have been two Control schools, and no Intervention schools, in Catholic = No and in Key Stage 1 English percentage = Any. The groups would have been more imbalanced.
The algorithm is to take the categories which the new school falls into and sum the frequencies already in those categories for each group. We then put the new school into the group with the lower sum. In this case, we have Total = High, Catholic = No, Area = B, Meals = High, KS1 Eng. = Any, KS1 Maths. = Low, and KS2 = High. For Intervention, this sum is 0 + 0 + 0 + 0 + 0 + 0 + 0 = 0; for Control it is 0 + 1 + 0 + 0 + 1 + 0 + 0 = 2. Hence we allocate to Intervention.
The new sums for this combination of categories are Intervention = 7, Control = 2. If we had allocated the second school to Control, the new sums would be Intervention = 0, Control = 9. Allocation to Intervention makes the groups more balanced.
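The arithmetic for School B can be reproduced in a few lines. This is a minimal sketch of the sum-of-frequencies rule as described here, not the MINIM source code; the names `counts`, `category_sum`, and `sums_if_allocated` are hypothetical, chosen for this illustration.

```python
from collections import Counter

# Categories of the two schools, taken from the final table above.
school_a = {"Total": "Low", "Catholic": "No", "Area": "Area C", "Meals": "Low",
            "KS1 Eng.": "Any", "KS1 Maths.": "High", "KS2": "Low"}
school_b = {"Total": "High", "Catholic": "No", "Area": "Area B", "Meals": "High",
            "KS1 Eng.": "Any", "KS1 Maths.": "Low", "KS2": "High"}

# counts[group][(variable, category)] = number of schools allocated so far.
counts = {"Intervention": Counter(), "Control": Counter()}
counts["Control"].update(school_a.items())  # School A went to Control in this demonstration

def category_sum(group, school):
    """Sum the existing frequencies in the categories the new school falls into."""
    return sum(counts[group][item] for item in school.items())

def sums_if_allocated(group, school):
    """Sums for the school's categories after hypothetically adding it to `group`."""
    trial = {g: c.copy() for g, c in counts.items()}
    trial[group].update(school.items())
    return {g: sum(trial[g][item] for item in school.items()) for g in trial}

sums = {g: category_sum(g, school_b) for g in counts}
chosen = min(sums, key=sums.get)  # no tie here, so no random choice is needed

print(sums, "->", chosen)
print(sums_if_allocated("Intervention", school_b))
print(sums_if_allocated("Control", school_b))
```

The two hypothetical allocations give the pairs of sums quoted above, 7 and 2 against 0 and 9.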
For the allocation of School C, the categories are Total = Low, Catholic = No, Area = B, Meals = High, KS1 Eng. = Any, KS1 Maths. = Low, and KS2 = Low. For intervention, the sum is 0 + 1 + 1 + 1 + 1 + 1 + 0 = 5, for Control it is 1 + 1 + 0 + 0 + 1 + 0 + 1 = 4. Hence we allocate to Control.
Category | Intervention | Control |
---|---|---|
Total: | | |
Low | 0 | 2 |
High | 1 | 0 |
Catholic: | | |
No | 1 | 2 |
Yes | 0 | 0 |
Area: | | |
Area A | 0 | 0 |
Area B | 1 | 1 |
Area C | 0 | 1 |
Meals: | | |
Low | 0 | 1 |
High | 1 | 1 |
KS1 Eng.: | | |
None | 0 | 0 |
Any | 1 | 2 |
KS1 Maths.: | | |
Low | 1 | 1 |
High | 0 | 1 |
KS2: | | |
Low | 0 | 2 |
High | 1 | 0 |
Totals: | 1 | 2 |
Grand Total: | 3 | |
For School D the categories are Total = Low, Catholic = No, Area = A, Meals = High, KS1 Eng. = Any, KS1 Maths. = Low, and KS2 = High. For Intervention, the sum is 0 + 1 + 0 + 1 + 1 + 1 + 1 = 5; for Control it is 2 + 2 + 0 + 1 + 2 + 1 + 0 = 8. Hence we allocate to Intervention.
The objective is always to allocate to the group which will make the totals more similar. When it would not make any difference, we allocate randomly. The allocation proceeds until all the schools have been allocated.
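The whole procedure can be sketched as a loop over the schools in the order they are entered. The code below assumes the simple rule described in this talk (lower sum wins, ties broken at random); MINIM itself may offer other options, and the function name `minimise` and its `seed` parameter are my own for illustration.

```python
import random
from collections import Counter

VARIABLES = ("Total", "Catholic", "Area", "Meals", "KS1 Eng.", "KS1 Maths.", "KS2")
GROUPS = ("Intervention", "Control")

# Rows of the final table above, in the same column order as VARIABLES.
SCHOOLS = {
    "School A": ("Low", "No", "Area C", "Low", "Any", "High", "Low"),
    "School B": ("High", "No", "Area B", "High", "Any", "Low", "High"),
    "School C": ("Low", "No", "Area B", "High", "Any", "Low", "Low"),
    "School D": ("Low", "No", "Area A", "High", "Any", "Low", "High"),
    "School E": ("Low", "Yes", "Area B", "Low", "Any", "High", "High"),
    "School F": ("Low", "Yes", "Area A", "High", "None", "High", "Low"),
    "School G": ("High", "Yes", "Area A", "High", "None", "Low", "Low"),
    "School H": ("High", "Yes", "Area A", "High", "None", "Low", "High"),
    "School I": ("High", "Yes", "Area B", "Low", "Any", "High", "High"),
    "School J": ("Low", "Yes", "Area C", "Low", "None", "High", "Low"),
    "School K": ("Low", "Yes", "Area C", "Low", "Any", "High", "High"),
    "School L": ("High", "Yes", "Area B", "Low", "None", "Low", "Low"),
    "School M": ("High", "Yes", "Area B", "Low", "Any", "Low", "High"),
}

def minimise(schools, seed=None):
    """Allocate schools one at a time to the group with the lower category sum."""
    rng = random.Random(seed)
    counts = {g: Counter() for g in GROUPS}
    allocation = {}
    for name, categories in schools.items():
        keys = list(zip(VARIABLES, categories))
        sums = {g: sum(counts[g][k] for k in keys) for g in GROUPS}
        if sums["Intervention"] == sums["Control"]:
            group = rng.choice(GROUPS)          # no difference: allocate randomly
        else:
            group = min(GROUPS, key=sums.get)   # otherwise: the lower sum wins
        counts[group].update(keys)
        allocation[name] = group
    return allocation

for school, group in minimise(SCHOOLS, seed=1).items():
    print(school, "->", group)
```

Because ties are broken at random, a run of this sketch will not necessarily reproduce any particular allocation. Whatever the random choice for School A, however, Schools B and D end up in the opposite group and School C in the same group, as in the worked steps above.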
This table shows the results of the actual allocation for the Together4All trial.
Category | Intervention | Control |
---|---|---|
Total: | | |
Low | 3 | 4 |
High | 3 | 3 |
Catholic: | | |
No | 2 | 2 |
Yes | 4 | 5 |
Area: | | |
Area A | 2 | 2 |
Area B | 3 | 3 |
Area C | 1 | 2 |
Meals: | | |
Low | 3 | 4 |
High | 3 | 3 |
KS1 Eng.: | | |
None | 2 | 3 |
Any | 4 | 4 |
KS1 Maths.: | | |
Low | 4 | 3 |
High | 2 | 4 |
KS2: | | |
Low | 2 | 4 |
High | 4 | 3 |
Totals: | 6 | 7 |
Grand Total: | 13 | |
The groups are quite well balanced. For total number of pupils, religion, geographical area, free school meals, and attainment in Key Stage 1 English, the balance was as good as it would be possible to achieve. For example, of the seven schools with low numbers of children, three were in Intervention and four in Control, and of the six schools with high numbers of children, three were in Intervention and three in Control.
There were only two variables where the balance was not as good as it might be. In Key Stage 1 maths, the Control group had more schools with a higher percentage of children below the standard level. In Key Stage 2 English/maths the intervention group had more schools with a higher percentage of children below the standard level. Hence where there was a slight imbalance, this was itself balanced across the two variables.
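The balance of the actual allocation can be summarised by the difference between the groups in each category. In the sketch below the counts are copied from the table above; the criterion for "as good as possible" (a difference of 0 for a category with an even number of schools, 1 for an odd number) is my own reading of the text, and the names are hypothetical.

```python
# (Intervention, Control) counts copied from the table of the actual allocation.
final_counts = {
    "Total": {"Low": (3, 4), "High": (3, 3)},
    "Catholic": {"No": (2, 2), "Yes": (4, 5)},
    "Area": {"Area A": (2, 2), "Area B": (3, 3), "Area C": (1, 2)},
    "Meals": {"Low": (3, 4), "High": (3, 3)},
    "KS1 Eng.": {"None": (2, 3), "Any": (4, 4)},
    "KS1 Maths.": {"Low": (4, 3), "High": (2, 4)},
    "KS2": {"Low": (2, 4), "High": (4, 3)},
}

def could_be_better(categories):
    """True if any category differs by more than the best achievable gap."""
    return any(abs(i - c) > (i + c) % 2 for i, c in categories.values())

# Every variable should account for all 6 Intervention and 7 Control schools.
group_totals = {
    variable: tuple(sum(pair[k] for pair in categories.values()) for k in (0, 1))
    for variable, categories in final_counts.items()
}

imbalanced = [v for v, cats in final_counts.items() if could_be_better(cats)]
print(imbalanced)
```

Only the two attainment variables named above are flagged, and they lean in opposite directions.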
Thanks go to David Torgerson, Tim Hobbs, Nuala Magee, Michael Little, and to The Atlantic Philanthropies for funding Together4All.
This page is maintained by Martin Bland.
Last updated: 8 December, 2008.