Rescale categorical variables using numbers found in value labels

By Tony Brady

rescale recodes categorical variables to the numbers found in their value labels, using a regular expression to pick out the number in each label.

rescale is useful when you have categorical variables whose value labels include the intended scale values. For instance, you may have the variable eligible:

Tabulation of eligible

But the underlying categories are coded 1, 2 and 3 rather than the codes given in the value label (0, 1 and 999):

Tabulation of eligible without labels

It's easy to recode the variable and keep the value labels intact by using rescale:

Tabulation of eligible after rescale
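
The situation described above can be reproduced with a small toy dataset. The sketch below is an illustration only: the data, the label wording and the label name eligible_lbl are made up for the example, and the final recode is the by-hand route using built-in Stata commands, not rescale's own implementation.

```stata
* Toy data: eligible is stored as 1, 2 and 3 (hypothetical example values)
clear
input int eligible
1
2
3
2
1
end

* Value labels whose text begins with the intended codes 0, 1 and 999
* (label name and wording are assumptions for this example)
label define eligible_lbl 1 "0 Not eligible" 2 "1 Eligible" 3 "999 Unknown"
label values eligible eligible_lbl

* With labels: shows the label text
tabulate eligible

* Without labels: shows the stored codes 1, 2 and 3
tabulate eligible, nolabel

* Recoding by hand: change the codes, then redefine the label to match
recode eligible (1 = 0) (2 = 1) (3 = 999)
label define eligible_lbl 0 "0 Not eligible" 1 "1 Eligible" 999 "999 Unknown", replace

* The stored codes are now 0, 1 and 999
tabulate eligible, nolabel
```

The by-hand version needs the recode rules and the label text typed out again for every variable, which is the repetition rescale is meant to save.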

Note that rescale works only on numeric variables with value labels. If you have a categorical variable stored as a string, first use encode to convert it to a labelled numeric variable, then run rescale.
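
As a minimal sketch of the conversion step, the example below uses Stata's built-in encode command on a made-up string variable. The variable name eligible_str and its contents are assumptions for illustration.

```stata
* Toy data: the categories are stored as text (hypothetical example values)
clear
input str20 eligible_str
"0 Not eligible"
"1 Eligible"
"999 Unknown"
"1 Eligible"
end

* encode creates a new numeric variable, numbering the distinct strings
* 1, 2, 3, ... in sorted order and attaching the text as value labels
encode eligible_str, generate(eligible)

* The stored codes are 1, 2 and 3, while the labels still carry 0, 1 and 999
tabulate eligible
tabulate eligible, nolabel
```

After this step eligible is a numeric variable with value labels, which is the form rescale expects.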

By default, rescale uses the first integer in each value label as the value to recode to. If that is not the number you want, you can specify your own pattern to match. For example:

rescale, regex("\(([0-9]+)\)")

would match numbers in brackets in the value labels, such as 12 month survival (0) and 12 month survival (1). Here the first integer in each label is 12, so the default would pick up the wrong number.
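
Before recoding, it can help to check what a pattern picks out of a sample label. The sketch below uses Stata's built-in regexm() and regexs() functions on one label string. The pattern "([0-9]+)" is only an assumed stand-in for "first integer in the label"; the pattern rescale actually uses by default is not stated here.

```stata
* Sample label text taken from the example above
local lbl "12 month survival (1)"

* First run of digits anywhere in the label (assumed stand-in for the default)
if regexm("`lbl'", "([0-9]+)") display "first integer: " regexs(1)

* Digits wrapped in literal brackets, as in the regex() option shown above
if regexm("`lbl'", "\(([0-9]+)\)") display "number in brackets: " regexs(1)
```

Run as a do-file, the first check should pick up the 12 and the second the 1, which is why the bracketed pattern is the one to pass to regex() for labels of this shape.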


To obtain rescale, type the following into Stata:

net from

and follow the instructions on screen. This will ensure the files are installed in the right place and you can easily uninstall the command later if you wish.