Sealed Envelope can carry out simulations of the randomisation system using an automated testing programme. The randomisations generated by this approach are available for download on the specification page.
A data specification document is provided to the automated testing programme. This defines the data to be submitted to the randomisation form. The testing programme submits this data to the randomisation form to simulate a randomisation taking place. This process is repeated a set number of times (known as replications or reps) to produce the simulated dataset.
Here is an example of a data specification:
{
  "sample_size": 400,
  "fields": {
    "siteId": {
      "min": 1,
      "max": 10,
      "type": "int"
    },
    "dob": {
      "format": "d/m/Y",
      "min": "1 Jan 2000",
      "max": "31 Dec 2010",
      "type": "date"
    },
    "initials": {
      "type": "string",
      "length": 2
    },
    "eligible": {
      "value": ["Yes"],
      "type": "enum"
    },
    "gender": {
      "weight": [2, 1],
      "value": ["Male", "Female"],
      "type": "enum"
    },
    "consent": {
      "value": ["Yes"],
      "type": "enum"
    },
    "severity": {
      "weight": [1, 2],
      "value": ["Low", "High"],
      "type": "enum"
    }
  },
  "stubName": "mytrial"
}
It is possible to alter the data submitted to the form so that it more closely reflects the expected distributions of individual variables in your trial, by changing the weight parameter on categorical variables. The weights follow the order of the value list (here Male, then Female). For example, if you expect twice as many women to be recruited as men, the weighting on gender would be set to [1, 2].
You can ask Sealed Envelope to make these changes and re-run the simulation.
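As a rough illustration of how a data specification like the one above could drive random form submissions, here is a minimal Python sketch. It is not Sealed Envelope's testing programme: the helper names `draw` and `simulate` are hypothetical, the specification is abbreviated, the date type is left out for brevity, and it assumes that integer bounds are inclusive and that enum values without a weight are equally likely.

```python
import json
import random
import string

# Abbreviated copy of the data specification above (date field omitted).
SPEC = json.loads("""
{
  "sample_size": 5,
  "fields": {
    "siteId":   {"min": 1, "max": 10, "type": "int"},
    "initials": {"type": "string", "length": 2},
    "gender":   {"weight": [2, 1], "value": ["Male", "Female"], "type": "enum"},
    "consent":  {"value": ["Yes"], "type": "enum"}
  }
}
""")

def draw(field, rng):
    """Draw one random value for a field (hypothetical helper)."""
    kind = field["type"]
    if kind == "int":
        # Assumption: min and max are both inclusive.
        return rng.randint(field["min"], field["max"])
    if kind == "string":
        return "".join(rng.choice(string.ascii_uppercase)
                       for _ in range(field["length"]))
    if kind == "enum":
        # Assumption: equal weights when "weight" is omitted.
        weights = field.get("weight", [1] * len(field["value"]))
        return rng.choices(field["value"], weights=weights, k=1)[0]
    raise ValueError(f"unsupported field type: {kind}")

def simulate(spec, seed=0):
    """Build one record per replication, ready to submit to a form."""
    rng = random.Random(seed)
    return [{name: draw(field, rng) for name, field in spec["fields"].items()}
            for _ in range(spec["sample_size"])]

records = simulate(SPEC)
```

Changing the `weight` list on `gender` to `[1, 2]` in this sketch would make "Female" twice as likely as "Male" on each draw.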
You can download the simulated data and import it into a spreadsheet or statistics package for analysis. You can check, for instance, that the randomisation protocol is balancing the treatment groups within strata. If you want to make changes to the randomisation protocol or carry out more simulations, you should contact Sealed Envelope.
In this example a simulation has been carried out using the data specification above. The randomisation protocol was minimisation on gender, severity and age-group with a 25% chance that a purely random allocation will be made (equivalent to using a biased coin with an 87.5% chance of choosing the treatment that reduces imbalance). The analysis was carried out using Stata.
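The 87.5% figure follows from the random element: with probability 0.75 the arm that reduces imbalance is chosen outright, and with probability 0.25 it is chosen half the time. The Python sketch below shows that arithmetic alongside a simplified two-arm allocation rule. The function `allocate` and its arguments are illustrative assumptions, not the randomisation system's actual implementation.

```python
import random

P_RANDOM = 0.25  # chance of a purely random allocation, from the protocol above

def allocate(active_total, control_total, rng, p_random=P_RANDOM):
    """Return 'Active' or 'Control' for one participant (illustrative sketch)."""
    # Random element, or tied marginal totals: choose either arm with equal chance.
    if rng.random() < p_random or active_total == control_total:
        return rng.choice(["Active", "Control"])
    # Otherwise choose the arm with the lower marginal total.
    return "Active" if active_total < control_total else "Control"

# Probability of choosing the arm that reduces imbalance.
p_favoured = (1 - P_RANDOM) + P_RANDOM * 0.5

# Repeated allocations where Active has the lower total, so Active is favoured.
rng = random.Random(1)
draws = [allocate(3, 5, rng) for _ in range(20000)]
share_active = draws.count("Active") / len(draws)
```

Here `p_favoured` evaluates to 0.875, and `share_active` should land close to that value.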
First we import the simulated dataset.
insheet using mytrialRandom.2012-10-31.150000.tsv
Now let's start exploring the dataset.
. tab gender
     gender |      Freq.     Percent        Cum.
------------+-----------------------------------
     Female |        124       31.00       31.00
       Male |        276       69.00      100.00
------------+-----------------------------------
      Total |        400      100.00
We can see that gender has been allocated according to the weightings in the data specification (2:1 Male:Female).
. li initials gender severity dob agegroup in 1/5
     +---------------------------------------------------------------+
     | initials   gender   severity          dob            agegroup |
     |---------------------------------------------------------------|
  1. |       QO     Male       High   08/08/2001   6.5 years or over |
  2. |       MT     Male        Low   29/09/2002   6.5 years or over |
  3. |       YZ     Male       High   06/12/2003   6.5 years or over |
  4. |       PK     Male        Low   15/11/2009          <6.5 years |
  5. |       MH   Female       High   29/09/2003   6.5 years or over |
     +---------------------------------------------------------------+
Initials and date of birth (dob) have been generated with random strings and dates. The agegroup variable was calculated by the randomisation system from the date of birth so did not need to be included in the data specification.
. tab gender group
           |         group
    gender |    Active    Control |     Total
-----------+----------------------+----------
    Female |        62         62 |       124
      Male |       138        138 |       276
-----------+----------------------+----------
     Total |       200        200 |       400
. tab severity group
           |         group
  severity |    Active    Control |     Total
-----------+----------------------+----------
      High |       138        139 |       277
       Low |        62         61 |       123
-----------+----------------------+----------
     Total |       200        200 |       400
. tab agegroup group
                  |         group
         agegroup |    Active    Control |     Total
------------------+----------------------+----------
6.5 years or over |        94         96 |       190
       <6.5 years |       106        104 |       210
------------------+----------------------+----------
            Total |       200        200 |       400
The minimisation has kept the treatment groups closely balanced on all three minimisation factors. By contrast, the balance within sites, which is not controlled by minimisation, varies quite widely:
. tab siteid group
           |         group
    siteId |    Active    Control |     Total
-----------+----------------------+----------
         1 |        20         22 |        42
         2 |        21         23 |        44
         3 |        22         23 |        45
         4 |        14         17 |        31
         5 |        16          6 |        22
         6 |        18         22 |        40
         7 |        18         26 |        44
         8 |        26         27 |        53
         9 |        25         18 |        43
        10 |        20         16 |        36
-----------+----------------------+----------
     Total |       200        200 |       400
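To put a number on the contrast between the minimisation factors and the sites, the following Python sketch computes the absolute Active versus Control difference for each level, using the counts copied from the tables above. The helper name `imbalance` is an illustrative choice, not part of any Sealed Envelope tool.

```python
# (Active, Control) counts copied from the tables above.
factors = {
    "gender":   {"Female": (62, 62), "Male": (138, 138)},
    "severity": {"High": (138, 139), "Low": (62, 61)},
    "agegroup": {"6.5 years or over": (94, 96), "<6.5 years": (106, 104)},
}
sites = {
    1: (20, 22), 2: (21, 23), 3: (22, 23), 4: (14, 17), 5: (16, 6),
    6: (18, 22), 7: (18, 26), 8: (26, 27), 9: (25, 18), 10: (20, 16),
}

def imbalance(counts):
    """Absolute difference between Active and Control counts for each level."""
    return {level: abs(active - control)
            for level, (active, control) in counts.items()}

# Largest difference seen within any level of a minimisation factor.
factor_max = max(max(imbalance(levels).values()) for levels in factors.values())

# Largest difference seen within any site.
site_imbalance = imbalance(sites)
worst_site = max(site_imbalance, key=site_imbalance.get)
```

With these counts the largest difference within a minimisation factor level is 2 participants, whereas site 5 differs by 10.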
We can check the minimisation algorithm by calculating the marginal scores at each observation:
* Running marginal totals for each arm, based on earlier allocations only
gen Active=0
gen Control=0
forvalues i=2/400 {
    foreach group of varlist Active Control {
        local total 0
        foreach factor of varlist gender severity agegroup {
            * Earlier participants in this arm who share this factor level
            qui count if `factor'==`factor'[`i'] & group=="`group'" & _n<`i'
            local total = `total' + r(N)
        }
        qui replace `group'=`total' in `i'
    }
}
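For readers who do not use Stata, the same marginal-total calculation can be sketched in Python. The function name `marginal_totals` and the three-participant dataset are hypothetical and only illustrate the logic of the loop above: for each participant, count the earlier participants in each group who share each factor level.

```python
def marginal_totals(rows, factors=("gender", "severity", "agegroup"),
                    groups=("Active", "Control")):
    """For each row, sum over factors the earlier rows sharing that level, per group."""
    totals = []
    for i, current in enumerate(rows):
        score = {g: 0 for g in groups}
        for earlier in rows[:i]:          # only participants randomised before row i
            for factor in factors:
                if earlier[factor] == current[factor]:
                    score[earlier["group"]] += 1
        totals.append(score)
    return totals

# Hypothetical participants, not taken from the simulated dataset.
rows = [
    {"gender": "Male", "severity": "High", "agegroup": "<6.5 years", "group": "Active"},
    {"gender": "Male", "severity": "Low",  "agegroup": "<6.5 years", "group": "Control"},
    {"gender": "Male", "severity": "High", "agegroup": "<6.5 years", "group": "Active"},
]
totals = marginal_totals(rows)
```

For the third participant the Active total is 3 and the Control total is 2, so minimisation would have favoured Control at that point.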
Control should be preferred by minimisation when its marginal total is lower than that for the Active group:
. tab group if Control < Active
      group |      Freq.     Percent        Cum.
------------+-----------------------------------
     Active |         20       11.70       11.70
    Control |        151       88.30      100.00
------------+-----------------------------------
      Total |        171      100.00
The proportion allocated to Control in this situation is very close to the expected value of 0.875. We can test this:
. cii 171 151
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |        171    .8830409     .0245759        .825158    .9270753
The 95% confidence interval is consistent with 0.875. The same analysis for the Active group is:
. tab group if Active < Control
      group |      Freq.     Percent        Cum.
------------+-----------------------------------
     Active |        137       87.82       87.82
    Control |         19       12.18      100.00
------------+-----------------------------------
      Total |        156      100.00
. cii 156 137
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |        156    .8782051     .0261849       .8163508    .9250541
So again the confidence interval includes the expected proportion 0.875.
Finally, where the scores are tied, the group should be chosen at random:
. tab group if Active == Control
      group |      Freq.     Percent        Cum.
------------+-----------------------------------
     Active |         43       58.90       58.90
    Control |         30       41.10      100.00
------------+-----------------------------------
      Total |         73      100.00
. cii 73 43
                                                         -- Binomial Exact --
    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
             |         73    .5890411     .0575852       .4676846    .7029424
The confidence interval includes the expected value of 0.5.
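The exact binomial intervals above can also be checked outside Stata. The Python sketch below assumes that the "Binomial Exact" interval is the Clopper-Pearson interval and finds its limits by bisection on the binomial tail probabilities. The function names are illustrative.

```python
from math import comb

def binom_pmf(i, n, p):
    """Probability of exactly i successes in n trials with success probability p."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def exact_ci(successes, n, level=0.95):
    """Clopper-Pearson interval, solved by bisection on the binomial tail sums."""
    alpha = 1 - level

    def upper_tail(p):  # P(X >= successes), increasing in p
        return sum(binom_pmf(i, n, p) for i in range(successes, n + 1))

    def lower_tail(p):  # P(X <= successes), decreasing in p
        return sum(binom_pmf(i, n, p) for i in range(0, successes + 1))

    def solve(tail, increasing):
        lo, hi = 0.0, 1.0
        for _ in range(60):  # 60 halvings is ample for double precision
            mid = (lo + hi) / 2
            if (tail(mid) < alpha / 2) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if successes == 0 else solve(upper_tail, increasing=True)
    upper = 1.0 if successes == n else solve(lower_tail, increasing=False)
    return lower, upper

# Control allocations among the 171 cases where Control had the lower total.
lower, upper = exact_ci(151, 171)
contains_expected = lower < 0.875 < upper
```

Calling `exact_ci(43, 73)` in the same way checks the interval for the tied scores against the expected value of 0.5.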