This site uses cookies, so that our service can work better. I understand

Project 7: Building a brain-ageing biomarker using machine learning

Authors: James H Cole, PhD / Sebastian Popescu, MSc (Computational, Cognitive & Clinical Neuroimaging Laboratory (C3NL), Imperial College London)



As humans age, changes to the structure and function of the brain occur, so-called ‘brain ageing’. Brain ageing is associated with cognitive decline, decreased function capacity and a higher risk of neurodegenerative disease and dementia. A biomarker of the brain ageing process could have great utility in identifying people at risk of experiencing the adverse effects of brain ageing, before any symptoms manifest. Brain ageing biomarkers could also be useful for mapping individualised brain-ageing trajectories, assessing potential influences on brain ageing and in aiding the design of clinical trials.

Our work has used T1-MRI to design such a biomarker (brain-predicted age), taking voxelwise brain volume images and using a machine-learning regression to accurately predict chronological age in N=2001 healthy people aged 18-90. This follows the experimental design from biogerontology research that looks for measures of underlying ‘biological age’, and assesses the appropriateness of a measure based on the accuracy of age prediction. Our leading model has used voxelwise grey matter in a 3D convolutional neural network (CNN) approach, resulting in a mean absolute error (MAE) of 4.16 years, Pearson’s r = 0.96, R2 = 0.92. There is still considerable room for improvement in the model, to reduce the MAE towards minimal values. This is necessary if brain-predicted age is ever to have clinical impact, as currently the error levels mean that individualised predictions may be misleading.

The Hackathon project will encourage participants to develop their own brain-age prediction pipeline. Dataset #1 (N=2001) will be supplied, along with an independent validation set (dataset #2, N=650). These data will be in available in three formats: i) raw images, ii) FreeSurfer cortical thickness and subcortical volumes, iii) voxelwise grey matter and white matter volume images (derived from SPM). The participants can either use their own pre-processing pipeline or utilise the supplied processed datasets, then run any type of regression model of their choosing. This may or may not involve dimension reduction (e.g., PCA, clustering), feature selection (theory- or data-driven), use of kernels, regularisation and deep learning architectures.

The main dataset will be randomly partitioned into an 80-10-10% split, with separate samples for training, validation and testing. The final model will then also be assessed in dataset #2. The pipeline that results in the best test scores (i.e., MAE) for both dataset #1 and #2 will be declared the winner.

Finally, participants will be asked to consider ways of interpreting the feature importance, to help better understand the neuroanatomical features involved in the brain-age prediction.

A list of 1-5 key papers summarising the subject:

[1] Cole JH, Poudel RPK, Tsagkrasoulis D, et al. Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker. NeuroImage 2017. doi: 10.1016/j.neuroimage.2017.07.059

[2] Cole JH, Ritchie SJ, Bastin ME, et al. Brain age predicts mortality. Molecular psychiatry 2017. doi: 10.1038/mp.2017.62

[3] Franke K, Ziegler G, Klöppel S, Gaser C. Estimating the age of healthy subjects from T1-weighted MRI scans using kernel methods: Exploring the influence of various parameters. NeuroImage 2010; 50(3): 883-92.

[4] Konukoglu E, Glocker B, Zikic D, Criminisi A. Neighbourhood approximation using randomized forests. Medical Image Analysis 2013; 17(7): 790-804.

[5] Valizadeh SA, Hänggi J, Mérillat S, Jäncke L. Age prediction on the basis of brain anatomical measures. Human Brain Mapping 2017; 38(2): 997-1008.

A list of requirements for taking part in the project (education level / English level / programming language required):

Maximum number of participants:


What can the participant gain from the project?

Participants will gain an understanding of a key neuroscientific application of machine learning approaches, as well as an appreciation of the wider benefits of applying machine learning to biomedical problems. Particularly, participants will be encouraged to appreciate the importance of developing the whole analytic pipeline, rather than merely focusing on choice of machine learning algorithm. This includes considerations on pre-processing methods, feature selection and nested-cross-validation

Is there a plan for extending this work to a peer-reviewed paper in case the results are promising?

If a single prediction algorithm that generates a mean absolute error (MAE) less than the current leading application to these data (CNNs using voxelwise grey matter volume MAE = 4.16 years), then publication is warranted. The results of the different pipelines and algorithms used in the Hackathon will be then summarised and written up for submission to a peer-reviewed journal (e.g., NeuroImage, Human Brain Mapping, Frontiers in Aging Neuroscience), with all the Hackathon project participants included as co-authors, alongside the project team.