Working with LPS data

Working with LPS data#

Last modified: 20 Nov 2024

Why are there duplicate study IDs in my LPS dataset?

In most cases LPS data is one row per person. However, there are a few exceptions. Please check the relevant LPS documentation associated with the dataset you are working on.

Are there quirks in some datasets?

This section is work-in-progress and will be updated as further quirks are brought to our attention – if you identify quirks, please notify the UK LLC Data Team support@ukllc.ac.uk

Quirk 1: ncds58_ncds5_mother_child_v0001#

Datasets which are >1000 variables wide are split on loading to the UKSERPUKLLC database due to SQL field limit of 1024. Where this is the case the table name will contain a 1, 2 etc nested between the version and date in the table name. In most cases the 2 (or more) parts can be merged/joined on LLC_XXXX_stud_id. This can be done when the table has a 1-row per participant structure.

The following table(s) is/are an exception to this:

  • ncds58_ncds5_mother_child_vXXXX_1_YYYYMMDD

  • ncds58_ncds5_mother_child_vXXXX_2_YYYYMMDD.

These data require a join on 2 fields, LLC_XXXX_stud_id and person, because this table is at the child-level whereas the key ID LLC_XXXX_stud_id is at the parent-level.

What is the relationship between participants in NIHRBIO_COPING and GLAD?

As an overview, consider the diagram below. GLAD in the UK LLC TRE contains participants in the Green AND Orange. NIHRBIO_COPING in the UK LLC TRE only contains those in the RED but NOT the orange:

docs/FAQ/images/user_guide/COPING_GLAD.png

Work is ongoing to create a ‘true’ individual-level ID in the UK LLC TRE. This is known as Anonymous Linking Field (ALF2), and used in conjunction with llc_XXXX_stud_id, it will be possible to unpick these relationships.

Do LPS have weighting variables in the TRE?

LPS name

Weighting variables in the TRE?

Further information

AIRWAVE

TBC

TBC

ALSPAC

No

Published paper with missing data: The Avon Longitudinal Study of Parents and Children - a resource for COVID-19 research: questionnaire data capture July 2021 to December 2021, with a focus on long COVID. You can also find other papers on the Welcome Open Research site (search for ALSPAC and COVID) that explain how to deal with missing data.

BCS70

Yes

Search for weighting variables (e.g. ‘design weight’) using the Variables search in Explore and use the Advanced Options to filter on BCS70.

BIB

No

The BIB cohort recruited people during pregnancy who attended a 28-week antenatal appointment at the hospital. The aim was to invite all attendees to participate in the BIB cohort. BIB didn’t use any sample frame or weighting during recruitment and the population is broadly representative of people having babies in Bradford during this time. Compared to other LPS in UK LLC, the Bradford cohort falls into the highest deprivation groups and is more ethnically diverse (c. 50% South Asian).

ELSA

Yes

Search for weighting variables (e.g. ‘cross-sectional weight’) using the Variables search in Explore and use the Advanced Options to filter on ELSA.

EPICN

No

Eligible participants were recruited by post. Individuals were requested to provide detailed dietary, biological and other health data, and to be followed up over a few years, and so the response rate was c. 45%. Therefore participants were not a random population sample, but they were closely similar to UK population samples with respect to many characteristics, including anthropometry, blood pressure, and lipids, although with a lower proportion of smokers.

EXCEED

No

Published paper with missing data: Extended Cohort for E-health, Environment and DNA (EXCEED) COVID-19 focus

FENLAND

No

TBC

GENSCOT

No

Published paper with missing data: Generation Scotland: an update on Scotland’s longitudinal family health study

GLAD

No

Published paper with missing data: Comparison of depression and anxiety symptom networks in reporters and non-reporters of lifetime trauma in two samples of differing severity

MCS

Yes

See the MCS User Guide to understand how the weighting variables are named. Search for weighting variables (e.g. ‘weight1’) using the Variables search in Explore and use the Advanced Options to filter on MCS.

NCDS58

Yes

Search for weighting variables (e.g. ‘design weight’) using the Variables search in Explore and use the Advanced Options to filter on NCDS58.

NEXTSTEP

Yes

Search for weighting variables (e.g. ‘design weight’) using the Variables search in Explore and use the Advanced Options to filter on NEXTSTEP.

NICOLA

No

Weighting is explained in Early key findings from a study of older people in Northern Ireland

NIHRBIO_COPING

No

Published paper with missing data: Risk and protective factors for new onset binge eating, low weight, and self-harm symptoms in over 25,000 individuals in the UK during the COVID-19 pandemic

NSHD46

Yes

Search for weighting variables (e.g. ‘design weight’) using the Variables search in Explore and use the Advanced Options to filter on NSHD46.

SABRE

No

Published paper with further information: Ethnic differences in associations between fat deposition and incident diabetes and underlying mechanisms: The SABRE study

TEDS

No

TBC

TRACKC19

No

TRACKC19 has not calculated sampling weights.

TWINSUK

No

Most of the data in the TRE is derived from the CoPE questionnaires. For more details on how to deal with missing data visit: Wellcome Open Research Gateways.

UKHLS

Yes

See UKHLS’s guidance on selecting the correct weight for your analysis. Search for weighting variables (e.g. ‘xw’) using the Variables search in Explore and use the Advanced Options to filter on UKHLS.

How can I request additional LPS data for my project?

Requests for new data should be submitted via an amendment to UK LLC. You may apply for additional data from already approved LPS, data from additional LPS, and/or additional linked data. N.B. each type of data amendment requires a different level of review before being approved.