Codelists and NHS England data

Last modified: 20 Jun 2025

When requesting access to certain NHS England datasets, researchers must submit a codelist to ensure data minimisation.

Data minimisation in this context is referred to as ‘subsetting’, where only health records that relate directly to the research question are made available to researchers. Subsetting entails matching a list of clinical codes (a ‘codelist’) against the linked health datasets. Researchers are given access only to records whose clinical codes appear on the codelist. This reduces disclosure risk and keeps the project within the scope approved through UK LLC’s application process.
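The matching step described above can be illustrated with a minimal Python sketch. This is not UK LLC's actual implementation: the `Record` fields, the `subset_records` function and the exact-match rule on coding system and code are assumptions made for illustration, and the three ICD-10 codes are example values only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified health record (field names are assumptions)."""
    participant_id: str
    coding_system: str
    code: str


def subset_records(records, codelist):
    """Keep only records whose (coding_system, code) pair is on the codelist.

    Exact matching is assumed here; the real matching rules may differ.
    """
    allowed = set(codelist)
    return [r for r in records if (r.coding_system, r.code) in allowed]


# Illustrative codelist for an asthma-related research question.
codelist = [("ICD-10", "J45"), ("ICD-10", "J46")]

records = [
    Record("P001", "ICD-10", "J45"),  # on the codelist: kept
    Record("P002", "ICD-10", "I10"),  # not on the codelist: withheld
    Record("P003", "ICD-10", "J46"),  # on the codelist: kept
]

subset = subset_records(records, codelist)
for record in subset:
    print(record.participant_id, record.coding_system, record.code)
```

In this sketch the record for `P002` is withheld because its code is not on the codelist, which is the sense in which subsetting limits access to records relevant to the approved research question.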

Which NHS England datasets require a codelist?

Six NHS England datasets in the UK LLC TRE require a codelist:

How do I create a codelist?

We recommend that researchers draw on pre-defined codelists and existing resources rather than define their own. There are online repositories of codelists generated by researchers who have used electronic health records. The following resources may be of particular use:

UK LLC is committed to supporting reproducible and transparent research practices. As such, we maintain a library of all codelists that researchers provide, which can be shared with other researchers on request.

What should a codelist look like?

UK LLC provides a codelist template as a downloadable MS Excel file. The template contains information on which coding systems should be used for each of the NHS England datasets. The coding systems used by each dataset are outlined on the Coded variables page.

What happens if I need to update my codelist during my project?

If you need to add more clinical codes without changing the scope of your project, you can email a new codelist to support@ukllc.ac.uk, explaining why the codelist has changed. If you need to add clinical codes that are outside the original project scope, you should submit an amendment via UK LLC Apply.