Data and cohort

The current data available includes responses to our baseline health questionnaire and genotyping array data.

Our Future Health data dictionary


The Our Future Health cohort

The Our Future Health programme is open to all adults (18 years and older) living in the UK. The data we’ve gathered so far covers 704,088 participants who have consented to take part and completed our baseline health questionnaire, and an initial subset of 66,524 participants who have been successfully genotyped.

In July 2022, we started recruiting participants in England and will continue to expand across the rest of the UK. For more information on how we are recruiting participants, download the Programme design and recruitment documentation (PDF file).

Our aim is to build a data set that reflects the UK population. The following tables show the current composition of our cohort.

Current cohort ages

Participant age      Cohort percentage
18 to 39             22.8%
40 to 59             37.1%
60 to 79             37.8%
80+                  2.2%

Current cohort self-reported sex

Participant sex registered at birth      Cohort percentage
Female                                   56.3%
Male                                     43.7%
Other                                    <0.1%

Current cohort ethnicities

Participant ethnicity      Cohort percentage
Asian                      5.6%
Black                      1.5%
White                      89.7%
Mixed                      1.8%
Other                      1.2%
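
The percentages in the tables above can be turned into approximate participant counts. The following is a minimal Python sketch, assuming the age-band percentages apply to the 704,088 participants quoted earlier (the exact denominator behind each table is not stated here, so the results are estimates only). The function name `approximate_counts` is illustrative and not part of any Our Future Health tooling.

```python
# Cohort size and age-band percentages copied from this page.
# Assumption: the percentages are based on all 704,088 participants.
TOTAL_PARTICIPANTS = 704_088

AGE_BAND_PERCENT = {
    "18 to 39": 22.8,
    "40 to 59": 37.1,
    "60 to 79": 37.8,
    "80+": 2.2,
}


def approximate_counts(total, percentages):
    """Convert rounded percentages into approximate head counts."""
    return {band: round(total * pct / 100) for band, pct in percentages.items()}


counts = approximate_counts(TOTAL_PARTICIPANTS, AGE_BAND_PERCENT)
for band, count in counts.items():
    print(f"{band}: about {count:,} participants")
```

Because the published percentages are rounded to one decimal place, they do not sum to exactly 100%, and the estimated counts will not sum to the cohort total.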

For more information on cohort characteristics, download the Characteristics of the Our Future Health participants documentation (PDF file).



The available data

The current data available includes:

  • participant data – which contains registration, consent and baseline demographic information collected across all consented participants
  • questionnaire data – which contains self-reported health information and details about participants' household, socioeconomic status, work and education history, and family history
  • genotype array data – which contains single nucleotide polymorphism (SNP) data extracted from blood and made available in two different file formats

Data are stored and accessed in the Our Future Health Trusted Research Environment.

Participant data

The participant data set includes self-reported demographic information about:

  • ethnicity
  • gender and sex
  • month and year of birth

It also includes information relating to registration and consent:

  • month and year of registration with Our Future Health
  • month and year of consent to take part in the Our Future Health research programme

This data set is gathered at various times, such as during participant registration and as part of the baseline health questionnaire.

Baseline health questionnaire data

There are 2 versions of the baseline health questionnaire. Version 1 of the questionnaire contains 202 questions, and the current version (version 2) contains 286 questions. Version 2 went live in November 2022. Not all participants see every question. Some questions are presented selectively, depending on participant responses.

The questions are grouped into 5 sections:

  • about you and your household – for example, age, sex, height, weight, ethnicity and living situation
  • work and education – for example, income, employment history and highest educational attainment
  • lifestyle – for example, socialising, screen use and alcohol intake
  • family health history – for example, siblings' and parents' health
  • personal health history – for example, health check-ups and screenings, diagnoses, medications and any current symptoms

For more information on how we categorise the participant and questionnaire data, download the Participant and questionnaire data release documentation (PDF).

To learn more about the baseline health questionnaire, download our Baseline health questionnaire documentation (PDF).

To view a human-readable example of the baseline health questionnaire, download version 2 of the questionnaire (PDF).

Genotype data

The first release of our genotype array data was made available in the Our Future Health Trusted Research Environment on 12 December 2023. This initial release consists of 700,138 genetic variants across 66,524 participants who have also completed the baseline health questionnaire (31,100 male, 35,424 female).

Data are available in two common file formats and are accompanied by sample level metadata to aid quality control (QC).

Genotype array data:

  • Variant Call Format (VCF) files and associated files
  • Binary GEN (BGEN) format files and associated files

Metadata:

  • Sample level QC file

For more information on the first genotype release, download our Genotype array data release documentation (PDF file).
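
As an illustration of what VCF-formatted genotype data looks like in practice, here is a minimal Python sketch that reads genotype calls from a small, uncompressed VCF text. It is not the layout of the Our Future Health release: the sample IDs, variant IDs and positions in `EXAMPLE_VCF` are invented, and the helper `read_genotypes` is hypothetical. Real releases are typically compressed and far larger, so a dedicated genomics library would usually be a better choice than hand-written parsing.

```python
import io

# Hypothetical two-variant VCF snippet. Sample and variant IDs are invented
# for illustration and do not come from the Our Future Health release.
EXAMPLE_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE_A\tSAMPLE_B
1\t10177\texample_snp_1\tA\tC\t.\tPASS\t.\tGT\t0/1\t0/0
1\t10352\texample_snp_2\tT\tG\t.\tPASS\t.\tGT\t1/1\t./.
"""


def read_genotypes(handle):
    """Return one dict per variant, with the GT call for each sample."""
    samples = []
    records = []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample IDs follow the 9 fixed columns
            continue
        # The FORMAT column lists the per-sample keys; locate the genotype (GT).
        gt_index = fields[8].split(":").index("GT")
        calls = {
            sample: value.split(":")[gt_index]
            for sample, value in zip(samples, fields[9:])
        }
        records.append({
            "chrom": fields[0],
            "pos": int(fields[1]),
            "id": fields[2],
            "ref": fields[3],
            "alt": fields[4],
            "genotypes": calls,
        })
    return records


records = read_genotypes(io.StringIO(EXAMPLE_VCF))
for record in records:
    print(record["id"], record["genotypes"])
```

In the genotype strings, `0` refers to the reference allele, `1` to the first alternate allele, and `.` marks a missing call.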



Making more data available

We release new data into our Trusted Research Environment on a quarterly basis, as our cohort grows. We hope to make more data types available in 2024, including further genotype file types and consented access to health records.

Stay up to date

Would you like to stay up to date with our work? Sign up for updates and tell us what you'd like to know.



Protecting the data

We de-identify all participant data before it’s available for use. All researchers will need to become registered researchers at Our Future Health and have an approved research study before they're given access to the data for research purposes.

As a registered researcher at Our Future Health:

  • you must access the data for your research study in accordance with an approved study application
  • you must have completed information governance training that covers UK GDPR within the past 12 months
  • your organisation must sign our resource terms and conditions

Find out how to apply to access the data.


Updated: 18 December 2023