{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "beb97ca4",
   "metadata": {},
   "source": [
    "# LPS Harmonised Highest Educational Qualification Dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "e3b8790c",
   "metadata": {
    "tags": [
     "remove-input"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       ">Last modified: 27 Oct 2025"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import sys\n",
    "import os\n",
    "sys.path.append(os.path.abspath('../../../../scripts/'))\n",
    "from data_doc_helper import UKLLCDataSet as DS, last_modified\n",
    "API_KEY = os.environ['FASTAPI_KEY']\n",
    "ds = DS(\"rtn_lps_education_harmonised\")\n",
    "last_modified()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5e5e67c9",
   "metadata": {},
   "source": [
    "<div style=\"background-color: rgba(0, 178, 169, 0.3); padding: 5px; border-radius: 5px;\"><strong>UK LLC has created a harmonised dataset of participants' highest educational qualifications.</strong></div>  "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "56141020",
   "metadata": {},
   "source": [
    "<div style=\"background-color: rgb(229, 106, 84, 0.3); padding: 5px; border-radius: 5px;\"><strong>More information about this dataset is available <a href=\"LPS_derived.html#harmonisation-methodology\" target=\"_blank\">here.</a></strong></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fe073008",
   "metadata": {},
   "source": [
    "## 1. Summary"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c938a7b1",
   "metadata": {},
   "source": [
    "The LPS harmonised education dataset contains a variable for LPS participants' self-reported **highest educational qualification**, plus those of their parent(s) where that information is available.  Particpants' education has been harmonised into four groupings - depending on the level of information provided. Parent(s)' qualifications have been harmonised into two groupings. Further details are on the harmonisation methodology are available [here](../LPS_derived/LPS_derived.md#harmonisation-methodology)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "fb0becc7",
   "metadata": {
    "tags": [
     "remove-input"
    ]
   },
   "outputs": [],
   "source": [
    "# below part is failing due to no metadata table in MDDB yet\n",
    "# md.get_num_vars(\n",
    "#             ds.df_ds.iloc[0][\"source\"],\n",
    "#             ds.df_ds.iloc[0][\"table\"]\n",
    "#             )\n",
    "# ds.info_table()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1de1c408",
   "metadata": {},
   "source": [
    "Summary info table TBC "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "799ad0f1",
   "metadata": {},
   "source": [
    "## 2. Variables"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8746bb10",
   "metadata": {},
   "source": [
    "| Variable name | Variable description |  \n",
    "|---|---|\n",
    "| LLC_xxxx_stud_id | Individual identifier (unique to each project in the TRE) |\n",
    "| cohort | LPS name |\n",
    "| source | LPS dataset holding the original education-level or qualification variable(s) for each participant (e.g. ALSPAC_wave1y) |\n",
    "| object | Label indicating which of the harmonised variables is represented by the value (e.g. llc_educ_1, llc_educ_M1) |\n",
    "| value | Numeric value for each of the objects |\n",
    "| label | Description of what each of the values represents |  \n",
    "| llc_timestamp | Date (month and year) on which the information was provided by the participant to the LPS |  "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8a469c49",
   "metadata": {},
   "source": [
    "## 3. Version History"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "41fd9910",
   "metadata": {
    "tags": [
     "remove-input"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_75170 th {\n",
       "  text-align: left;\n",
       "}\n",
       "#T_75170_row0_col0, #T_75170_row0_col1, #T_75170_row1_col0, #T_75170_row1_col1, #T_75170_row2_col0, #T_75170_row2_col1, #T_75170_row3_col0, #T_75170_row3_col1, #T_75170_row4_col0, #T_75170_row4_col1 {\n",
       "  text-align: left;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_75170\" style=\"font-size: 14px\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th id=\"T_75170_level0_col0\" class=\"col_heading level0 col0\" >Version</th>\n",
       "      <th id=\"T_75170_level0_col1\" class=\"col_heading level0 col1\" >1</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <td id=\"T_75170_row0_col0\" class=\"data row0 col0\" >Version Date</td>\n",
       "      <td id=\"T_75170_row0_col1\" class=\"data row0 col1\" >11 Aug 2025</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_75170_row1_col0\" class=\"data row1 col0\" >Number of Variables</td>\n",
       "      <td id=\"T_75170_row1_col1\" class=\"data row1 col1\" >8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_75170_row2_col0\" class=\"data row2 col0\" >Number of Observations</td>\n",
       "      <td id=\"T_75170_row2_col1\" class=\"data row2 col1\" >643299</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_75170_row3_col0\" class=\"data row3 col0\" >DOI</td>\n",
       "      <td id=\"T_75170_row3_col1\" class=\"data row3 col1\" > <a href=\"https://doi.org/10.71760/ukllc-dataset-00443-01\" rel=\"noopener noreferrer\" target=\"_blank\">10.71760/ukllc-dataset-00443-01</a></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <td id=\"T_75170_row4_col0\" class=\"data row4 col0\" >Change Log</td>\n",
       "      <td id=\"T_75170_row4_col1\" class=\"data row4 col1\" > <a href=\"https://api.datacite.org/dois/10.71760/ukllc-dataset-00443-01/activities\" rel=\"noopener noreferrer\" target=\"_blank\">10.71760/ukllc-dataset-00443-01/activities</a></td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x2161fc4acf0>"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ds.version_history()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bedf5551",
   "metadata": {},
   "source": [
    "## 4. Useful Syntax"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "29e9d5bb",
   "metadata": {
    "tags": [
     "remove-input"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/markdown": [
       "Below we will include syntax that may be helpful to other researchers in the UK LLC TRE. For longer scripts, we will include a snippet of the code plus a link to Git where you can find the full scripts."
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "ds.useful_syntax()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "jupbook",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}