In many trials, it is desirable to balance the treatment arms within important prognostic factors (subject characteristics that are known to be correlated with the outcome).
Randomisation should ensure this in the long run, but it is also worth ensuring balance throughout a large trial (so that temporal effects are not correlated with treatment) and in smaller trials, where chance imbalances are more likely. Stratification is one way of achieving balance; minimisation is an alternative.
Say you want to make sure the treatment groups A and B are balanced with respect to a biomarker which is known to be highly prognostic of the outcome in your trial. You use random permuted blocks separately in those who are positive for the biomarker and those who are negative:
Biomarker– | A A B B A B B A B A B A A B |
Biomarker+ | A B B A B B A A |
Stratification results in perfect balance:
Biomarker status | A | B | Total |
Biomarker– | 7 | 7 | 14 |
Biomarker+ | 4 | 4 | 8 |
Total | 11 | 11 | 22 |
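As a rough sketch of how such lists might be generated, the Python below builds a separate sequence of random permuted blocks for each stratum. The function names, the block sizes of 4 and 6, and the fixed seed are assumptions made for illustration; a real trial would normally use validated randomisation software.

```python
import random

def permuted_block(block_size, rng):
    """Return one randomly ordered block with equal numbers of A and B."""
    if block_size % 2:
        raise ValueError("block_size must be even for 1:1 allocation")
    block = ["A", "B"] * (block_size // 2)
    rng.shuffle(block)
    return block

def stratified_blocks(strata_sizes, block_sizes=(4, 6), seed=0):
    """Build an independent list of permuted blocks for each stratum.

    strata_sizes maps a stratum label to the minimum number of
    allocations to prepare; whole blocks are always generated, so a
    list may run slightly longer than requested.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    schedule = {}
    for stratum, n in strata_sizes.items():
        blocks = []
        while sum(len(b) for b in blocks) < n:
            # Block size is picked at random (an assumption of this sketch).
            blocks.append(permuted_block(rng.choice(block_sizes), rng))
        schedule[stratum] = blocks
    return schedule

schedule = stratified_blocks({"Biomarker-": 14, "Biomarker+": 8})
for stratum, blocks in schedule.items():
    print(stratum, "|", " | ".join(" ".join(b) for b in blocks), "|")
```

Because each stratum draws from its own list, the arms are balanced within a stratum whenever a block is completed; a stratum that stops recruiting mid-block can still be slightly unbalanced.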
Stratifying ensures that the treatment groups are balanced 1:1 at the end of each block within both biomarker– and biomarker+ subjects, so we don't end up by chance with, for instance, more of the biomarker+ subjects getting treatment A. This can happen if we don't use stratification, even with blocks:
All subjects | A A B B | A B B A B A | B A A B | A B B A | B B A A |
Biomarker | + + – – | + – – – – – | + – + – | + + – – | – – + – |
Without stratification, more biomarker– subjects are allocated to B by chance, and more biomarker+ subjects to A:
Biomarker status | A | B | Total |
Biomarker– | 5 | 9 | 14 |
Biomarker+ | 6 | 2 | 8 |
Total | 11 | 11 | 22 |
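The counts in this table can be checked by pairing the two sequences from the unstratified example position by position. The short Python sketch below does that; the variable names are illustrative, and an ASCII hyphen stands in for the minus sign.

```python
from collections import Counter

# Allocation and biomarker sequences copied from the unstratified example.
# Spaces mark the block boundaries and are removed before pairing.
treatment = "AABB ABBABA BAAB ABBA BBAA".replace(" ", "")
biomarker = "++-- +----- +-+- ++-- --+-".replace(" ", "")

# Count each (biomarker status, treatment arm) combination.
counts = Counter(zip(biomarker, treatment))

for status in "-+":
    a, b = counts[(status, "A")], counts[(status, "B")]
    print(f"Biomarker{status} | {a} | {b} | {a + b} |")
```

Every block is balanced overall (11 A and 11 B in total), yet the split within each biomarker group is not.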
An imbalance like this could skew the unadjusted trial results, making treatment A look better than it is, since biomarker+ subjects have a better outcome than biomarker– subjects.
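To give a feel for the size of the distortion, the sketch below works through the unstratified counts under purely hypothetical response rates: 60% in biomarker+ subjects and 30% in biomarker– subjects, with the treatment assumed to have no effect at all. These rates are invented for illustration and do not come from any trial.

```python
# Allocation counts from the unstratified table: counts[arm][biomarker status].
counts = {"A": {"+": 6, "-": 5}, "B": {"+": 2, "-": 9}}

# HYPOTHETICAL response rates, chosen only for illustration. The treatment
# is assumed to have no effect, so the rate depends on biomarker status alone.
response_rate = {"+": 0.6, "-": 0.3}

def expected_response(arm_counts):
    """Expected proportion of responders in one arm, given its biomarker mix."""
    total = sum(arm_counts.values())
    responders = sum(n * response_rate[s] for s, n in arm_counts.items())
    return responders / total

expected = {arm: expected_response(c) for arm, c in counts.items()}
for arm, rate in expected.items():
    print(f"Arm {arm}: expected response rate {rate:.3f}")
# Arm A: expected response rate 0.464
# Arm B: expected response rate 0.355
```

Under these assumed rates, arm A appears about 11 percentage points better than arm B in an unadjusted comparison, purely because of the biomarker imbalance.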
The FDA has some good advice on what they call some basic tenets of stratification [pdf]: