The William Harvey Research Institute - Faculty of Medicine and Dentistry

Conditional Representation Learning for Integrative Analysis of Spatial Lung Tissue Data

Code: BBSRC-DFA_2026_22 (CASE)

Primary Supervisor:
Venet Osmani 
Email: v.osmani@qmul.ac.uk
Institute: Digital Environment Research Institute 

Secondary Supervisor:
Vito Mennella  
Email: v.mennella@qmul.ac.uk  
Institute: School of Biological and Behavioural Sciences

CASE Partner: AstraZeneca

Abstract:

Recent advances in spatial imaging technologies, including high-resolution digital immunohistochemistry (IHC) and spatial transcriptomics, enable molecular profiling within intact tissue architecture while preserving information on cellular neighbourhood organisation. However, computational frameworks for integrating multimodal spatial data that connect tissue morphology with variation in gene expression remain underdeveloped. Most existing approaches treat pathology stages as discrete categories, limiting the ability to model subtle, gradual and non-linear changes in heterogeneous tissue organisation.

This project will develop advanced machine learning methods for conditional representation learning in spatial lung tissue data spanning control samples and multiple pathology severity stages and gradings. By integrating high-resolution whole-slide IHC imaging data from preserved human lung explants with spatial transcriptomics data from the same tissue blocks, the student will design models that encode continuous pathology severity directly within latent representations. Rather than performing discrete stage classification, these models will learn structured embedding features that capture spatial organisation patterns across grades.

Graph neural networks and attention-based architectures will be developed to model changes in cellular neighbourhood organisation across severity states. Cross-modal alignment strategies will integrate IHC morphological and biomarker data with spatial transcriptomic profiles, producing unified embeddings that support multimodal segmentation, biomarker discovery, and pseudo-temporal modelling of spatial variation. 

The resulting framework will provide a generalisable approach for modelling structured variation in spatial biology datasets. By advancing AI methodologies for multimodal integration, representation learning, and spatial modelling, the project directly aligns with the objectives of the Doctoral Focal Award in AI for spatial imaging technologies.

Lay Summary:

New imaging technologies allow scientists to measure proteins and genes directly within intact tissue samples while preserving their natural structure. This means that we can not only see which molecules are present, but also where they are located and how different cells are arranged relative to one another. However, analysing these complex datasets remains a major computational challenge. 

This project will develop new artificial intelligence (AI) methods to better understand how lung tissue structure varies during normal and pathological progression. The research will combine high-resolution immunohistochemistry (IHC) images with spatial transcriptomics pseudo-images from preserved human lung samples; the latter provide detailed information about gene activity within the same tissue architecture.

Rather than simply assigning tissue samples to categories, the project will develop AI models that learn patterns of gradual and structured variation across samples. These models will analyse how groups of cells are organised in space, how these spatial patterns differ between regions and samples, and how molecular and structural information can be combined into unified representations. 

The outcome will be new computational tools for analysing multimodal spatial imaging data. While the project focuses on lung tissue as a case study, the methods developed will be broadly applicable to many types of spatial biological data, helping researchers better interpret complex tissue organisation using advanced AI techniques. 

Aims and Objectives:

Aim 1: Develop conditional representation learning architectures for multimodal spatial data 

Objectives: 

  • Design deep learning models that encode molecular and histopathological severity information and regional morphological appearance directly within latent representations rather than treating stages as discrete classes. 
  • Integrate IHC-derived morphological and key biomarker features with spatial transcriptomic profiles within unified multimodal embeddings.
  • Benchmark conditional models against standard classification and clustering baselines. 
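
As an illustration of what encoding continuous severity within a representation could look like, the sketch below applies feature-wise linear modulation (FiLM), a generic conditioning technique chosen here for illustration and not necessarily the approach the project will adopt. The function name `film_condition`, the weight vectors, and the toy data are all hypothetical; in practice the weights would be learned rather than fixed.

```python
import numpy as np

def film_condition(features, severity, w_gamma, w_beta):
    """Scale and shift features using a continuous severity score.

    features: (n, d) array of per-region embeddings (hypothetical input)
    severity: (n,) array of continuous severity scores
    w_gamma, w_beta: (d,) arrays; learned in practice, fixed here
    """
    s = severity[:, None]          # (n, 1) so it broadcasts over features
    gamma = 1.0 + s * w_gamma      # per-feature scale, identity at severity 0
    beta = s * w_beta              # per-feature shift, zero at severity 0
    return gamma * features + beta

# Toy data: 4 regions, 3 features, severity on a continuous 0..1 scale.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))
severity = np.array([0.0, 0.3, 0.7, 1.0])
w_gamma = rng.normal(size=3)
w_beta = rng.normal(size=3)

conditioned = film_condition(features, severity, w_gamma, w_beta)
```

Because severity enters as a real number rather than a class label, intermediate grades produce intermediate modulations, which is the property a discrete stage classifier lacks.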

Aim 2: Model spatial neighbourhood reorganisation across severity spectra 

Objectives: 

  • Represent tissue as cell-level or region-level graphs capturing spatial proximity, continuity of morphological context and molecular similarity.
  • Develop graph neural network architectures to learn embeddings that capture changes in neighbourhood organisation across stages. 
  • Quantify monotonic, threshold-like, and non-linear spatial variation patterns. 
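
The graph representation in the objectives above can be sketched in a few lines. The example below builds a k-nearest-neighbour graph from cell centroids and performs one round of mean aggregation over neighbours, which is the basic operation underlying many graph neural network layers. The helper names `knn_neighbours` and `mean_aggregate`, the choice of k, and the toy coordinates are assumptions made for illustration only.

```python
import numpy as np

def knn_neighbours(coords, k):
    """Return an (n, k) array of the k nearest neighbours of each cell."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # (n, n) pairwise distances
    np.fill_diagonal(dist, np.inf)             # a cell is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]

def mean_aggregate(features, neighbours):
    """Average neighbour features: one parameter-free message-passing step."""
    return features[neighbours].mean(axis=1)   # (n, k, d) -> (n, d)

# Toy data: 4 cell centroids and a cell-type label per cell.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0]])
cell_types = np.array([0, 0, 1, 1])
one_hot = np.eye(2)[cell_types]                # (4, 2) one-hot cell types

neighbours = knn_neighbours(coords, k=2)
composition = mean_aggregate(one_hot, neighbours)  # neighbourhood composition
```

With one-hot cell types as features, each row of `composition` gives the fraction of each cell type among a cell's neighbours, a simple descriptor of neighbourhood organisation that could then be compared across severity grades.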

Aim 3: Construct pseudo-temporal spatial modelling frameworks 

Objectives: 

  • Leverage continuous pathology grading to model structured tissue morphology and spatial variation along a progression axis. 
  • Explore latent trajectory learning approaches to capture ordered variation.
  • Derive interpretable metrics describing spatial reorganisation across severity states. 
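
One simple way to think about a progression axis is to order samples along the dominant direction of variation in their embeddings and check how well that ordering agrees with pathology grade. The sketch below does this with a first principal component and a rank correlation. It is a deliberately minimal baseline, not the trajectory-learning method the project will develop; the function names, the synthetic embeddings, and the grade values are hypothetical, and the rank correlation shown does not handle tied grades.

```python
import numpy as np

def pseudotime_from_embeddings(z):
    """Project embeddings onto their first principal component."""
    centred = z - z.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[0]                     # (n,) position along the axis

def spearman(a, b):
    """Spearman rank correlation (assumes no tied values)."""
    rank_a = np.argsort(np.argsort(a)).astype(float)
    rank_b = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

# Synthetic embeddings that vary along one direction with grade, plus noise.
rng = np.random.default_rng(0)
grades = np.array([0.0, 0.2, 0.5, 0.8, 1.0])
direction = np.array([1.0, 2.0, -1.0])
embeddings = np.outer(grades, direction) + rng.normal(scale=0.01, size=(5, 3))

pseudotime = pseudotime_from_embeddings(embeddings)
# The sign of a principal component is arbitrary, so compare magnitudes.
agreement = abs(spearman(pseudotime, grades))
```

A linear axis like this can only capture monotonic variation along one direction, which is one motivation for the latent trajectory approaches named in the objectives.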

Aim 4: Deliver reproducible and generalisable multimodal analysis tools 

Objectives: 

  • Implement modular, version-controlled pipelines for multimodal spatial modelling. 
  • Ensure reproducibility through containerisation and adherence to FAIR principles. 
  • Release open-source software applicable to other spatial imaging datasets characterised by structured biological variation. 