Random-walk based AI for integrative multi-layer analysis of spatial and bulk omics data in human brain tissue
Code: BBSRC-DFA_2026_15
Primary Supervisor: Vincenzo Nicosia
Email: v.nicosia@qmul.ac.uk
Institute: School of Mathematical Sciences
Secondary Supervisor: Thomas Millner
Email: t.millner@qmul.ac.uk
Institute: Barts Cancer Institute
Abstract:
Recent advances in tissue imaging now generate high-dimensional, spatially resolved data that
capture biological organisation at unprecedented resolution. These platforms quantify the precise
spatial distribution of thousands of living cells within intact tissues while simultaneously measuring
intracellular protein expression levels and molecular concentrations. For example, spatial
transcriptomic profiling has substantially refined our understanding of protein function in tissue
homeostasis and has clarified molecular mechanisms underlying pathological states, including
tumorigenesis.
A central challenge in systems biology is the principled integration of heterogeneous data
modalities to achieve a unified and mechanistically informative representation of tissue function.
This project addresses that challenge by integrating transcriptomic, proteomic, and histological data
derived from human brain tissue. We will model these data as multilayer complex networks with
labelled nodes, where each layer encodes a distinct molecular or structural modality.
Methodologically, the project employs multiple variants of random walk processes on multilayer
networks to detect spatial correlations across scales and to quantify deviations from statistically
grounded null models. By integrating multimodal data across spatial and temporal dimensions, this
framework aims to produce a high-resolution, systems-level model of human brain tissue
organisation and function.
Lay Summary:
Current tissue imaging technologies can produce several different descriptions of the same tissue sample, giving a multi-faceted view of how different factors jointly determine the observed behaviour, or misbehaviour, of a biological system.
Although multi-dimensional imaging is a blessing for biologists and clinicians alike, it raises a number of practical issues, chief among them how to integrate these data sets into a single, biologically meaningful picture.
This project will use methods from graph theory, applied mathematics, and computer science to analyse data from spatial transcriptomics, other molecular assays, and histology of brain tissue. The focus will be on integrating these different sources of information to reconstruct a more complete picture of brain tissue, with the explicit aim of identifying structural and molecular correlations in normal and pathological tissue.
Aims and Objectives:
Aim 1: Establish a multiplex network representation of human brain tissue
Identify meaningful network layers, each derived from a different data source
Infer meaningful links among sites, e.g., based on spatial closeness, functional similarity, or behavioural adjacency
Construct minimal network models by selecting the most meaningful and statistically significant connections, e.g., by looking at persistence across subjects or across different tissues
Assign meaningful labels/classes to the nodes/entities, e.g., based on functionality or expression levels
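As an illustration only, the steps of Aim 1 could be sketched as follows for a single layer. The sketch assumes that sites are given as 2D coordinates, that links are inferred from spatial closeness with a k-nearest-neighbour rule, and that node labels come from thresholding one expression value per site. The function names `knn_layer` and `label_by_expression`, the toy coordinates, and the threshold are hypothetical and are not part of the project description.

```python
import math

def knn_layer(coords, k):
    """Build an undirected layer linking each site to its k nearest sites.

    Assumption: spatial closeness is measured by Euclidean distance.
    """
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Rank all other sites by their distance from site i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(coords[i], coords[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrise so the layer is undirected
    return adj

def label_by_expression(values, threshold):
    """Assign a class to each site from a single expression value (toy rule)."""
    return ["high" if v >= threshold else "low" for v in values]

# Toy data: five sites forming two spatially separated groups.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
layer = knn_layer(coords, k=1)
labels = label_by_expression([0.2, 0.9, 0.1, 0.8, 0.7], threshold=0.5)
```

In a multiplex setting, one such adjacency structure would be built per data source over the same set of sites, with the link rule (spatial, functional, or otherwise) chosen per layer.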
Aim 2: Develop random-walk and diffusion-based integration algorithms
Employ unbiased and biased random walks to construct meaningful time-series of inter-class relations
Determine inter-class mean first passage times, class coverage times, autocorrelation functions, block entropies, and higher-order statistics of random walk trajectories in different network layers
Use both agent-based simulations and matrix analysis to infer steady-state visit probabilities and hitting times
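The two routes mentioned above, matrix analysis and agent-based simulation, can be compared on a toy layer. The sketch below is a minimal example under stated assumptions: the walk is unbiased, the hitting-time equations t_i = 1 + (1/k_i) * sum_j t_j (with t_i = 0 on the target class) are solved by simple fixed-point iteration rather than a dedicated linear solver, and the inter-class value averages uniformly over the nodes of the source class. All function names are hypothetical.

```python
import random

def mfpt_to_class(adj, labels, target, tol=1e-12, max_iter=100_000):
    """Mean number of steps for an unbiased walker to first reach class `target`.

    Solves t_i = 1 + mean_j t_j over neighbours j, with t_i = 0 on target nodes,
    by fixed-point iteration (assumes the target is reachable from every node).
    """
    t = {i: 0.0 for i in adj}
    for _ in range(max_iter):
        new_t = {}
        for i in adj:
            if labels[i] == target:
                new_t[i] = 0.0
            else:
                new_t[i] = 1.0 + sum(t[j] for j in adj[i]) / len(adj[i])
        delta = max(abs(new_t[i] - t[i]) for i in adj)
        t = new_t
        if delta < tol:
            break
    return t

def inter_class_mfpt(adj, labels, source, target):
    """Average hitting time of `target`, starting uniformly from class `source`."""
    t = mfpt_to_class(adj, labels, target)
    starts = [i for i in adj if labels[i] == source]
    return sum(t[i] for i in starts) / len(starts)

def simulate_mfpt(adj, labels, start, target, n_walks=20_000, seed=0):
    """Agent-based estimate of the same quantity from a single start node."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_walks):
        node, steps = start, 0
        while labels[node] != target:
            node = rng.choice(adj[node])  # unbiased step to a random neighbour
            steps += 1
        total += steps
    return total / n_walks

# Toy layer: a path 0-1-2-3 with class A at one end and class B at the other.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
labels = ["A", "C", "C", "B"]
exact = inter_class_mfpt(adj, labels, "A", "B")
estimate = simulate_mfpt(adj, labels, start=0, target="B")
```

On this path the hitting-time equations can be solved by hand and give 9 steps from node 0 to node 3, so `exact` should agree with that value up to the iteration tolerance, while `estimate` fluctuates around it. A biased walk would replace the uniform neighbour choice with weighted transition probabilities.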
Aim 3: Derive integrated molecular-spatial signatures of brain organisation
Use the inter-class hitting times and class coverage times as a fingerprint of tissue organisation at each layer
Assess the significance of random walk trajectory statistics with respect to meaningful null models, including random reassignment of classes and labels and locally correlated models
Renormalise the results as deviations from the corresponding null model
Use the matrices of inter-class passage times and coverage times of each tissue and condition as representative of the corresponding condition/task
Wherever possible, perform inter-subject and intra-subject (inter-task) variability analysis
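The renormalisation step can be illustrated with the simplest of the null models listed above, random reassignment of class labels. The sketch below is an assumption-laden toy: it uses an inter-class mean first passage time as the trajectory statistic, keeps the network fixed while shuffling labels, and reports the deviation as a z-score. The project may well use other statistics or other normalisations, and the names `class_mfpt` and `z_score_vs_shuffled_labels` are hypothetical.

```python
import random
import statistics

def class_mfpt(adj, labels, source, target, sweeps=500):
    """Inter-class mean first passage time of an unbiased walk.

    Uses a fixed number of fixed-point sweeps, which is enough for this
    small toy network; larger networks would need a convergence check.
    """
    t = {i: 0.0 for i in adj}
    for _ in range(sweeps):
        t = {i: 0.0 if labels[i] == target
             else 1.0 + sum(t[j] for j in adj[i]) / len(adj[i])
             for i in adj}
    starts = [i for i in adj if labels[i] == source]
    return sum(t[i] for i in starts) / len(starts)

def z_score_vs_shuffled_labels(adj, labels, source, target, n_null=100, seed=0):
    """Compare the observed statistic with a label-shuffling null model."""
    rng = random.Random(seed)
    nodes = list(adj)
    observed = class_mfpt(adj, labels, source, target)
    null = []
    for _ in range(n_null):
        values = [labels[i] for i in nodes]
        rng.shuffle(values)  # keeps class sizes, destroys spatial arrangement
        shuffled = dict(zip(nodes, values))
        null.append(class_mfpt(adj, shuffled, source, target))
    mu = statistics.mean(null)
    sigma = statistics.stdev(null)
    return observed, mu, sigma, (observed - mu) / sigma

# Toy layer: a ring of 12 sites where classes A and B are spatially segregated.
n = 12
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
labels = ["A"] * 6 + ["B"] * 6
observed, mu, sigma, z = z_score_vs_shuffled_labels(adj, labels, "A", "B")
```

Because the two classes occupy contiguous halves of the ring, a walker starting in A takes longer to reach B than under shuffled labels, so the z-score is positive. Repeating this for every ordered pair of classes would give a matrix of deviations that could serve as the layer's fingerprint.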
Aim 4: Deliver generalisable, reproducible AI software
Create classifiers based on the spatial fingerprints of brain tissues derived from random walk time series
Devise simple protocols to cluster spatial fingerprints by subjects and by tasks
Propose quantitative predictors of physiological and behavioural anomalies
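To make the idea of a fingerprint-based classifier concrete, the following is a deliberately simple sketch using a nearest-centroid rule. It assumes each sample has already been reduced to a fixed-length vector (for example, a flattened matrix of inter-class passage times); the vectors, the condition names, and the functions `fit_centroids` and `predict` are invented for illustration and do not reflect real data or the classifiers the project will actually develop.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def fit_centroids(fingerprints, conditions):
    """Group fingerprints by condition and compute one centroid per condition."""
    groups = {}
    for vec, cond in zip(fingerprints, conditions):
        groups.setdefault(cond, []).append(vec)
    return {cond: centroid(vecs) for cond, vecs in groups.items()}

def predict(centroids, vec):
    """Assign a fingerprint to the condition with the closest centroid."""
    return min(centroids, key=lambda c: math.dist(centroids[c], vec))

# Hypothetical two-dimensional fingerprints for two placeholder conditions.
train = [[9.3, 8.7], [9.0, 9.1], [2.1, 2.4], [1.8, 2.0]]
conds = ["condition_1", "condition_1", "condition_2", "condition_2"]
model = fit_centroids(train, conds)
pred = predict(model, [8.5, 9.0])
```

The same distance computation could feed a clustering protocol, grouping fingerprints by subject or by task without using the condition labels.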