By Gareth Ambler
The Hosmer-Lemeshow goodness of fit test can be used to test whether observed binary responses, Y, conditional on a vector of p covariates (risk factors and confounding variables) x, are consistent with predictions, π. In other words, it is a test of the hypothesis
H0: Pr(Y=1|x) = π
The predictions, π, often come from a recently fitted logistic regression model, so that:
logit(π) = β0 + β1x1 + β2x2 + ... + βpxp
where βj are the regression parameters. See Lemeshow and Hosmer's American Journal of Epidemiology article for more details.
Although the Hosmer-Lemeshow test is currently implemented in Stata (see lfit), hl can be used to assess predictions not just from the last regression model, but also from an external source (such as a published risk score). In addition, by using the plot option you can easily see how the observed and expected proportions compare within the groups formed by the Hosmer-Lemeshow test. This is commonly referred to as a calibration plot.
hl allows calculation of both the usual C statistic (based on equally sized groups) and the H statistic (based on fixed cut-points on the predictions). To calculate H use the q() option with your own grouping variable.
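For reference, the textbook form of the Hosmer-Lemeshow statistic is sketched below. It is assumed, not confirmed here, that hl follows this standard definition. The symbols g, n_k, O_k and E_k are notation introduced for this sketch: g is the number of groups, n_k the number of observations in group k, O_k the observed number of responses with Y = 1 in group k, and E_k the sum of the predictions π in group k. C and H share the same formula and differ only in how the groups are formed.

```latex
\[
  \chi^2_{HL} \;=\; \sum_{k=1}^{g}
  \frac{(O_k - E_k)^2}{E_k \left(1 - E_k / n_k\right)},
  \qquad
  O_k = \sum_{i \in \text{group } k} y_i,
  \quad
  E_k = \sum_{i \in \text{group } k} \pi_i .
\]
```

When the predictions come from a logistic model fitted to the same data, the statistic is usually compared with a chi-squared distribution on g - 2 degrees of freedom. The appropriate reference distribution for predictions from an external source may differ.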
Predictions have already been obtained and are stored in the variable phat. The binary response variable is y. To calculate C we type:
hl y phat
This uses the default of ten equally sized groups (decile groups) to construct the test statistic, C. To calculate H using the risk groups 0 - 0.1, 0.1 - 0.2, ..., 0.9 - 1, we type:
egen dec=cut(phat), at(0(0.1)1)
hl y phat, q(dec) plot
The calibration plot produced by this command is shown below. The larger circles indicate that these points are based on more data. There are fewer than 10 groups because there were no predictions below 0.10 or above 0.72.
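To make the arithmetic behind the two grouping schemes concrete, here is a minimal sketch in Python. It is an illustration only, not the hl source code: the function names hl_statistic, quantile_groups and fixed_groups are hypothetical, and the statistic is assumed to be the usual sum over groups of (observed - expected)^2 / (expected x (1 - mean prediction)). It returns the statistic and the number of non-empty groups, and leaves the p-value to whichever chi-squared routine you prefer.

```python
from bisect import bisect_right
from collections import defaultdict


def hl_statistic(y, p, group):
    """Return (statistic, number of non-empty groups).

    y: 0/1 responses; p: predicted probabilities; group: one label per observation.
    Assumes no group has a mean prediction of exactly 0 or 1 (division by zero).
    """
    n = defaultdict(int)          # observations per group
    observed = defaultdict(float)  # observed events per group
    expected = defaultdict(float)  # expected events per group (sum of predictions)
    for yi, pi, gi in zip(y, p, group):
        n[gi] += 1
        observed[gi] += yi
        expected[gi] += pi
    stat = 0.0
    for g in n:
        mean_p = expected[g] / n[g]
        stat += (observed[g] - expected[g]) ** 2 / (expected[g] * (1.0 - mean_p))
    return stat, len(n)


def quantile_groups(p, g=10):
    """Roughly equal-sized groups by rank of the prediction (the C grouping)."""
    order = sorted(range(len(p)), key=lambda i: p[i])
    group = [0] * len(p)
    for rank, i in enumerate(order):
        group[i] = rank * g // len(p)
    return group


def fixed_groups(p, g=10):
    """Groups from fixed cut-points 1/g, 2/g, ... (the H grouping).

    Assumption of this sketch: a prediction of exactly 1 goes in the top group.
    """
    cuts = [k / g for k in range(1, g)]
    return [bisect_right(cuts, pi) for pi in p]


# Tiny made-up example: four observations in two fixed-width groups.
y = [0, 0, 1, 1]
phat = [0.2, 0.2, 0.8, 0.8]
h_stat, h_groups = hl_statistic(y, phat, fixed_groups(phat, g=2))
```

Only the groups that actually contain predictions contribute to the statistic, which is why the example above yields fewer than ten points when no predictions fall in some of the risk bands.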
To obtain hl type the following into Stata:
net from https://www.sealedenvelope.com/
and follow the instructions on screen. This will ensure the files are installed in the right place and you can easily uninstall the command later if you wish.