Internal project selection#

Important

This page was for the 2024/25 academic year and is archived for reference. The deadlines and procedures described here have passed and may change in future academic years.

All students must select 8 internal projects by filling out the following form:

https://forms.office.com/e/emuHYSjKHY

  • Enter 8 project codes (e.g. abcd-001 or xyzw-122) in order of preference, with choice 1 being your most preferred project. You can only select projects advertised for your MSc course.

  • You cannot select more than 3 projects with the same main or second supervisor (e.g. Albert Einstein cannot appear more than 3 times on the supervisory team among your selections).

Warning

The deadline to submit your internal projects selection is 9 April 2025, 10:00 BST.

If you have already submitted the selection form and would like to make changes, fill it out again. We will consider only the most recent submission received before the deadline. Changes to your selection after the deadline will not be accepted. Submitting the form earlier will not increase your chances of getting one of your top choices, so take your time and make an informed decision. If you want to learn more about a project, contact the main supervisor directly.

Warning

Although we do our best to give every student one of their top choices, this is not always possible. We therefore strongly encourage all students to find a main supervisor themselves and propose a project (see archive/project-proposal).

Allocation procedure#

  1. Supervisors submit project offers (deadline usually early February). If the main supervisor is affiliated with Imperial College London, the project is internal.

  2. Projects are advertised to students (mid-February 2025).

  3. Students submit project selections - each student submits 8 choices of internal projects in the order of preference (deadline 9 April 2025, 10:00 BST).

  4. The IRP Team matches students with projects using the allocation algorithm.

Warning

The IRP Team reserves the right to adapt or revise the outlined procedures as necessary.

Students who have already been allocated to their proposed project or to an external project will not be allocated to an internal project. No student will be allocated to more than one project or asked to choose between multiple allocations.

Available projects#

Ensemble generation for the 3D ocean state for a numerical model of the California Current circulation#

  • Project code: roar-019

  • Main supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Andrew Moore, ammoore@ucsc.edu, University of California Santa Cruz, USA

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Ensemble methods are used for a variety of applications in the geosciences (e.g. data assimilation, forecasting, uncertainty quantification). Typically, each member of an ensemble is computed by direct simulation using the numerical model. Because these calculations are computationally demanding, the size of the ensemble that can be generated is limited (O(10^2)) and often several orders of magnitude smaller than the dimension of the problem (O(10^6)). Using an existing ensemble, machine learning can be used to greatly increase the ensemble size without the need to perform additional direct simulations. The focus of the project proposed here is using ensemble methods to build preconditioners for optimization problems, in this case ocean data assimilation. The ocean state comprises temperature, salinity, horizontal velocity and sea surface height. The project will start by generating training data using an existing tangent linear model from an operational ocean analysis and forecasting system. The aim is to use these data to train, via deep learning, a model emulator that can be used as an ensemble generator.

This project is co-supervised by Prof. Andrew Moore from University of California Santa Cruz (US).

Argo floats mapping for 3D Global Ocean Modelling#

  • Project code: roar-020

  • Main supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Andrew Moore, ammoore@ucsc.edu, University of California Santa Cruz, USA

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Our knowledge of the global ocean circulation has been revolutionized in recent decades by the Argo float program (https://argo.ucsd.edu/). An Argo float is a free-drifting platform that can measure ocean properties (e.g. temperature and salinity) in the upper 1000 m of the water column. At present there are ~4000 Argo floats deployed in the world ocean, providing an unprecedented view of ocean conditions in close to real time. However, the inhomogeneous geographical distribution of Argo floats presents challenges for visualizing and analysing the observations. This project will use machine learning techniques to map from the unstructured grid of the observations to a structured grid that will provide more utility to oceanographers and climate scientists. The variables will be temperature and salinity. All data are freely available and used operationally for ocean forecasting. More recently, a new class of Argo float has been developed that includes sensors to measure ocean biogeochemical (BGC) properties, such as dissolved oxygen and pH. These so-called BGC Argo floats are much more costly than conventional floats and, for this reason, the number of BGC floats currently deployed is limited. A second project will use machine learning to infer BGC properties from temperature and salinity observations made by conventional floats, thus potentially increasing our knowledge of ocean health.

Sub-project 1: Build a 3D global map of temperature and salinity by implementing a state-of-the-art Vision Transformer (or similar) [ref].

Sub-project 2: Biogeochemical 3D global map. Only a few floats measure biogeochemical variables. This project consists of mapping biogeochemical variables onto the floats that do not measure them, by implementing a state-of-the-art Vision Transformer (or similar) [ref].

[ref] ViTAE-SL: a vision transformer-based autoencoder and spatial interpolation learner for field reconstruction
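
As a point of reference for what the unstructured-to-structured mapping task involves, the sketch below interpolates synthetic, scattered float-like observations onto a regular grid with a classical method (scipy.interpolate.griddata); the coordinates and values are invented for illustration, and a learned mapping such as the ViT-based approach referenced above would be benchmarked against this kind of baseline.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)

# Synthetic stand-in for Argo profiles: ~400 scattered (lon, lat) points with a
# temperature-like value. The real project would read actual float data.
lon = rng.uniform(-60.0, 0.0, 400)
lat = rng.uniform(10.0, 60.0, 400)
temp = 25.0 - 0.3 * (lat - 10.0) + rng.normal(0.0, 0.5, 400)

# Classical baseline: interpolate the scattered values onto a regular grid.
grid_lon, grid_lat = np.meshgrid(np.linspace(-60, 0, 121), np.linspace(10, 60, 101))
gridded = griddata((lon, lat), temp, (grid_lon, grid_lat), method="linear")
print(gridded.shape)   # (101, 121); NaN outside the convex hull of the floats
```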

Data learning to downscale Air Quality Forecasting Systems#

  • Project code: roar-043

  • Main supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Ilaria D’Elia, ilaria.delia@enea.it, Italian National Agency for New Technologies, Energy and Sustainable Economic Development

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Air pollution represents one of the most significant environmental challenges of modern society. Numerous studies have already demonstrated the adverse effects on health, environment, climate, and economy.

Several regulatory efforts and different actions have been taken in the last decades by authorities. Instruments, models, and tools that can accurately predict air pollution are of the utmost importance and crucial in tackling this issue.

ENEA has developed a state-of-the-art air quality forecasting system, FORAIR-IT, that produces 3-day forecasts of air pollution concentrations on an hourly basis with a resolution of 4 km over Italy (Adani et al., 2022, 2020; D’Elia et al., 2021; D’Isidoro et al., 2022; Mircea et al., 2014; https://www.afs.enea.it/project/ha_forecast/forair-it/it.html). A multi-year air quality database, starting from the year 2019, is available for different applications. The database contains hourly concentrations of the following major pollutants, as well as other pollutants and meteorological parameters (temperature, humidity, pressure, wind speed):

  • NO2 (nitrogen dioxide);

  • PM2.5 (particulate matter with a diameter < 2.5 µm);

  • PM10 (particulate matter with a diameter < 10 µm);

  • O3 (ozone).

The aim of the thesis will be to apply super-resolution models to improve the current model resolution, downscaling air pollutant fields to 1 km at the national level and to finer resolutions over specific Italian hot spots (such as the Po Valley or the major metropolitan areas).
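
As a rough illustration of what a super-resolution (downscaling) network can look like, the PyTorch sketch below upsamples a coarse multi-pollutant field by a factor of 4 (e.g. from a 4 km to a 1 km grid); the architecture, channel count, and tile size are placeholders rather than the project's actual design, and a real model would be trained on FORAIR-IT fields.

```python
import torch
import torch.nn as nn

class Downscaler(nn.Module):
    """Minimal super-resolution network: 4x spatial upsampling of pollutant fields."""
    def __init__(self, channels=4):        # e.g. NO2, PM2.5, PM10, O3
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels * 16, kernel_size=3, padding=1),
            nn.PixelShuffle(4),             # rearranges channels into a 4x finer grid
        )

    def forward(self, coarse):
        return self.net(coarse)

coarse = torch.randn(1, 4, 32, 32)          # one coarse 32x32 tile, 4 pollutants
fine = Downscaler()(coarse)
print(fine.shape)                           # torch.Size([1, 4, 128, 128])
```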

Data learning to estimate aerosol and/or clouds radiative forcing from data collected at the Lampedusa Station for Climate Observation#

  • Project code: roar-042

  • Main supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Ilaria D’Elia, ilaria.delia@enea.it, Italian National Agency for New Technologies, Energy and Sustainable Economic Development

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Since the first IPCC report, published in 1990, the attention of climate scientists and then of the entire international community has increasingly focused on the study of climate variability and how it is affected by human activities.

The study of climate variability requires a complex and interdisciplinary approach involving phenomena from all areas of the Earth (atmosphere, ocean, biosphere, cryosphere and geosphere) and their interactions. Several non-linear feedback mechanisms play an important and still incompletely understood role, altering, among other things, the Earth’s energy balance on different time scales. In this context, it is particularly important to accurately estimate the effect on the Earth’s energy balance of those atmospheric components whose influence is still uncertain.

Since 1996, ENEA has operated the Lampedusa Station for Climate Observation, where solar and infrared irradiance, as well as the characteristics of aerosols and clouds, are continuously measured. To determine their radiative forcing, it is of the utmost importance to estimate, with the highest degree of precision, the value that solar or infrared irradiance would have in the absence of aerosols or clouds.

For instance, to assess the infrared cloud forcing, knowing some of its characteristics such as water content, it is necessary to estimate the value that the infrared irradiance would have in the absence of clouds. For this purpose, empirical or semi-empirical formulae have been developed over the years that, based on measurements of other parameters such as meteorological ones, provide an estimate of the irradiance.

This project evaluates how data learning techniques can address these issues by proposing different solutions and comparing the results with traditional semi-empirical parameterizations.

Predictive Modeling of Shelf Life for FMCG Snack Products#

  • Project code: joav-168

  • Main supervisor: Jorge Avalos-Patino, jea4117@ic.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Catalina Aguilar-Rivera, caguilar@noel.com.co, Grupo Nutresa S.A, Colombia

  • Available to: ACSE

  • This project does not accept multiple students.

This project aims to develop a computational model for predicting the shelf life of Fast-Moving Consumer Goods (FMCG) snack products, focusing on key deterioration mechanisms. The model will employ a deterministic approach, integrating established kinetic models to simulate the complex interplay of factors affecting product quality over time.

The project will leverage existing experimental data on snack product deterioration, encompassing mechanisms such as rancidity (lipid oxidation), texture changes (fracture, crispness loss), moisture fluctuations (sorption isotherms, water activity), staling (starch retrogradation), and microbial spoilage. Crucially, the model will incorporate the influence of packaging material properties (e.g., oxygen permeability, water vapor transmission rate) on these deterioration processes.

The computational model will be designed to predict the evolution of quality attributes (e.g., sensory scores, chemical markers) under various storage conditions (temperature, humidity, light exposure). Students will explore different mathematical formulations for describing reaction kinetics (e.g., zero-, first-, and higher-order reactions, the Arrhenius equation) and transport phenomena (e.g., diffusion, mass balance). Model validation will be performed by comparing predictions with available experimental data. The project will deliver a computational tool capable of simulating shelf life under diverse scenarios, enabling informed decisions in product development, packaging design, and storage optimisation. This tool will significantly reduce the reliance on time-consuming and expensive experimental shelf-life studies.
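
To give a flavour of the deterministic kinetic modelling involved, the sketch below combines first-order quality loss with an Arrhenius temperature dependence to estimate shelf life at different storage temperatures; all parameter values are illustrative placeholders, not data from the project.

```python
import numpy as np

# Hypothetical parameters for illustration only; real values would be fitted to
# the experimental deterioration data described above.
Q0 = 100.0       # initial quality score (e.g. sensory score)
Q_limit = 60.0   # quality level at which the product is considered expired
k_ref = 0.015    # first-order rate constant at T_ref (1/day)
T_ref = 298.15   # reference temperature (K)
Ea = 80e3        # activation energy (J/mol)
R = 8.314        # gas constant (J/(mol K))

def rate_constant(T):
    """Arrhenius temperature dependence of the first-order rate constant."""
    return k_ref * np.exp(-Ea / R * (1.0 / T - 1.0 / T_ref))

def shelf_life_days(T):
    """Time for quality to fall from Q0 to Q_limit under first-order kinetics."""
    return np.log(Q0 / Q_limit) / rate_constant(T)

for T_celsius in (20.0, 30.0, 40.0):
    T = T_celsius + 273.15
    print(f"{T_celsius:4.0f} C -> shelf life ~ {shelf_life_days(T):.0f} days")
```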

Bloch point exploration using the mean-field model of interacting non-collinear spins#

  • Project code: mabe-114

  • Main supervisor: Marijan Beg, m.beg@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Martin Lang, martin.lang@mpsd.mpg.de, Max Planck Institute for the Structure and Dynamics of Matter, Germany

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Complex magnetic materials hosting topologically non-trivial particle-like objects such as skyrmions, hopfions, and Bloch points are under intensive research. They could fundamentally change how we store and process data. One crucial material class is helimagnetic materials with the Dzyaloshinskii-Moriya interaction, which emerges in systems lacking inversion symmetry. It was recently demonstrated that nanodisks consisting of two layers with opposite chirality can host a stable Bloch point. So far, simulations have been performed using micromagnetic models at zero temperature. However, since the magnetisation norm vanishes at the Bloch point, exploring the stability and structure of the Bloch point using different models is necessary. In this project, we will implement a mean-field model of interacting non-collinear spins and explore whether Bloch points are stable in nanodisks or merely an artefact of micromagnetic simulations. In addition, the mean-field model will allow us to examine the room-temperature stability of the Bloch point, which is necessary for applications in future data storage and information processing devices. No prior physics knowledge is needed - all required physics concepts will be covered in the introductory learning sessions with the supervisor during the first few weeks. Supervision consists of introductory group learning sessions as well as group and individual supervision meetings. Please get in touch with the main supervisor, Marijan Beg, if you would like to discuss this project in more detail.

Real-Time LLM-Powered Feedback for Asynchronous Python Learning#

  • Project code: mabe-117

  • Main supervisor: Marijan Beg, m.beg@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Large Language Models (LLMs) are transforming higher education, offering new ways to support student learning. Today, much of learning happens asynchronously, meaning that students engage with course materials, complete exercises, and seek help at different times rather than in a live classroom setting. While this flexibility is beneficial, one key challenge remains: students need immediate, high-quality feedback to learn and improve effectively. Traditional feedback often comes with delays, leaving students uncertain about their mistakes and hesitant to experiment. Automated feedback can remove this barrier, allowing students to make mistakes freely and request feedback as many times as needed—without fear of judgment. This fosters a more interactive and iterative learning experience. In this project, we will develop an AI-powered feedback tool for learning basic Python programming, which will be embedded directly into a Jupyter Notebook. The system will compare student solutions against hidden reference solutions provided by the instructor. The LLM-based system will generate formative feedback that identifies errors, helps students understand their mistakes, and improves their approach over time. This ensures that students receive real-time guidance tailored to their specific needs. Supervision consists of group and individual supervision meetings. Please get in touch with the main supervisor, Marijan Beg, if you would like to discuss this project in more detail.

Enhancing Feedback in Higher Education with LLMs#

  • Project code: mabe-116

  • Main supervisor: Marijan Beg, m.beg@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Large Language Models (LLMs) are transforming higher education, offering innovative ways to support both students and educators. However, while student numbers continue to rise, the number of teaching staff remains limited. This creates a major challenge: high-quality, personalised feedback is essential for learning, yet the time available for staff to provide it is shrinking. Delegating feedback entirely to LLMs might seem like a solution, but this could make students feel they are not receiving “value for money” and have no contact with teaching staff. Instead, this project aims to enhance human feedback with AI support rather than replace it. We will develop an AI-assisted feedback expansion tool, where markers provide concise feedback points through a Graphical User Interface (GUI). The system will then use an LLM to expand this feedback by incorporating explanations, concrete examples, and tailored exercises. Importantly, the LLM will be based on lecture notes and the provided reading list, ensuring all generated feedback aligns with the course content. This approach ensures that feedback is detailed, pedagogically relevant, and contextualised. Supervision consists of group and individual supervision meetings. Please get in touch with the main supervisor, Marijan Beg, if you would like to discuss this project in more detail.

Eigenmode analysis of topologically stable quasi-particles#

  • Project code: mabe-115

  • Main supervisor: Marijan Beg, m.beg@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Martin Lang, martin.lang@mpsd.mpg.de, Max Planck Institute for the Structure and Dynamics of Matter, Germany

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Topologically stable quasi-particles could fundamentally change how we store and process data. However, for their applications in future data storage and information processing devices, it is necessary to develop the detection (reading) method to determine whether they encode bit 0 or 1. One possible method is using ferromagnetic resonance, where the magnetic sample is excited using an external magnetic field, and the presence of a topologically stable quasi-particle is determined from the ferromagnetic resonance response. In this project, we will develop a computational and data analysis tool in Python to analyse magnetisation dynamics data. This tool will allow us to analyse resonance frequencies and spatially resolved eigenmodes and propose a method for reading in data storage devices. No prior physics knowledge is needed - all required physics concepts will be covered in the introductory learning sessions with the supervisor during the first few weeks. Supervision consists of introductory group learning sessions as well as group and individual supervision meetings. Please get in touch with the main supervisor, Marijan Beg, if you want to discuss this project.
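
As a minimal illustration of the kind of analysis such a tool would perform, the sketch below extracts a resonance frequency from a synthetic magnetisation time series using a fast Fourier transform; in the project the input would be spatially resolved magnetisation dynamics data from simulations, and the frequency assumed here is arbitrary.

```python
import numpy as np

# Synthetic magnetisation time series, standing in for the spatially averaged
# response of a sample after excitation by an external field pulse.
dt = 1e-12                        # sampling interval (s)
t = np.arange(0, 5e-9, dt)        # 5 ns of dynamics
f0 = 9e9                          # assumed resonance frequency (Hz)
m_y = np.exp(-t / 2e-9) * np.sin(2 * np.pi * f0 * t)

# Power spectrum of the magnetisation component; peaks correspond to
# resonance (eigenmode) frequencies.
spectrum = np.abs(np.fft.rfft(m_y)) ** 2
freqs = np.fft.rfftfreq(len(m_y), d=dt)

print(f"Dominant resonance ~ {freqs[np.argmax(spectrum)] / 1e9:.2f} GHz")
```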

Field manipulation of emergent magnetic monopoles#

  • Project code: mabe-113

  • Main supervisor: Marijan Beg, m.beg@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Martin Lang, martin.lang@mpsd.mpg.de, Max Planck Institute for the Structure and Dynamics of Matter, Germany

  • Available to: ACSE EDSML

  • This project may accept multiple students.

The prediction of topologically stable magnetic skyrmions being used to change how we store and process data has led to materials with the Dzyaloshinskii-Moriya interaction becoming the focus of intensive research. It was recently demonstrated that nanodisks consisting of two layers with opposite chirality could host a stable emergent magnetic monopole - Bloch point (a three-dimensional singularity of the magnetisation field). For applications of Bloch points in future data storage and information processing devices, it is necessary to explore the manipulation of Bloch points using different driving methods. In this project, we will implement a Python-based micromagnetic simulation tool to investigate whether it is possible to manipulate the Bloch point in helimagnetic nanostrips and under what conditions. More precisely, we will simulate the translational motion of Bloch points in planar nanostructures. The results of this work will allow us to determine whether a Bloch point can be an information carrier in racetrack-like data storage devices. No prior physics knowledge is necessary - all required physics concepts will be covered in the introductory learning sessions in the first few weeks. Supervision consists of introductory group learning sessions as well as group and individual supervision meetings. Please get in touch with the main supervisor, Marijan Beg, if you would like to discuss this project in more detail.

Python-based domain-specific language for atomistic Hamiltonians#

  • Project code: mabe-112

  • Main supervisor: Marijan Beg, m.beg@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Martin Lang, martin.lang@mpsd.mpg.de, Max Planck Institute for the Structure and Dynamics of Matter, Germany

  • Available to: ACSE EDSML

  • This project may accept multiple students.

In specific scenarios, computational studies emerge as the only practical approach to tackle complex research challenges and efficiently design various products and systems. In nanomagnetism, simulations have become a prominent tool and often the only method to investigate diverse magnetic phenomena. Researchers employ several models in computational magnetism, each with advantages and disadvantages. Compared to continuous models, atomistic simulations enable us to simulate the magnetic moments of individual atoms and their interactions with other magnetic moments and the environment. However, the capabilities of simulation tools often limit scientists and engineers. For instance, they may want to include a new energy term in the energy equation that the developers did not implement. Extending the capabilities of simulation tools is non-trivial - it requires dedicated resources and expert programming knowledge. Therefore, in this project, we will design and implement a domain-specific language in Python that will allow scientists and engineers to define custom atomistic energy terms. Our work would enable scientists and engineers to simulate any energy term without diving into the computational backend. No prior physics knowledge is necessary - all required physics concepts will be covered in the introductory learning sessions with the supervisor during the first few weeks. Supervision consists of introductory group learning sessions as well as group and individual supervision meetings. Please get in touch with the main supervisor, Marijan Beg, if you would like to discuss this project in more detail.
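
A minimal sketch of what such a domain-specific language could look like is given below, with energy terms as composable Python objects evaluated on a toy 1D spin chain; the class names, interfaces, and energy expressions are illustrative assumptions, not the project's actual design.

```python
import numpy as np

class EnergyTerm:
    """Base class: energy terms can be added together to build a custom Hamiltonian."""
    def __add__(self, other):
        return Hamiltonian([self, other])
    def energy(self, spins):
        raise NotImplementedError

class Hamiltonian(EnergyTerm):
    def __init__(self, terms):
        self.terms = list(terms)
    def __add__(self, other):
        return Hamiltonian(self.terms + [other])
    def energy(self, spins):
        return sum(term.energy(spins) for term in self.terms)

class Exchange(EnergyTerm):
    """Nearest-neighbour Heisenberg exchange on a 1D spin chain."""
    def __init__(self, J):
        self.J = J
    def energy(self, spins):
        return -self.J * np.sum(spins[:-1] * spins[1:])

class Zeeman(EnergyTerm):
    """Coupling of each spin to an external field B applied along z."""
    def __init__(self, B):
        self.B = B
    def energy(self, spins):
        return -self.B * np.sum(spins[:, 2])

# Usage: 10 unit spins, all pointing along +z, with a user-composed Hamiltonian.
spins = np.tile([0.0, 0.0, 1.0], (10, 1))
H = Exchange(J=1.0) + Zeeman(B=0.1)
print(H.energy(spins))   # -10.0 for this configuration
```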

Landau-Lifshitz-Gilbert-based micromagnetic computational backend#

  • Project code: mabe-111

  • Main supervisor: Marijan Beg, m.beg@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Martin Lang, martin.lang@mpsd.mpg.de, Max Planck Institute for the Structure and Dynamics of Matter, Germany

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Computational micromagnetics has become an essential tool in academia and industry to support fundamental physics research and the design and development of devices used in data storage, information processing, sensing, and medicine. We have designed and developed a human-centred research environment called Ubermag. With Ubermag, scientists can control different micromagnetic computational backends. The complete simulation workflow, including definition, execution, and data analysis of simulation runs, can be performed within a single Python session. Furthermore, numerical libraries, co-developed by the computational and data science community, can immediately be used for micromagnetic data analysis within Ubermag. In this project, we will develop a new micromagnetic computational backend in Python. Although this model describes magnetic nanosystems at zero temperature, it is still necessary to simulate the system’s time evolution. Using finite differences, we will extend Ubermag to compute different energy terms and simulate the system’s time evolution by integrating the Landau-Lifshitz-Gilbert equation. The main deliverable will be a well-tested, reliable, and documented extension allowing scientists and engineers to explore magnetic static and dynamic phenomena. No prior physics knowledge is necessary - all required physics concepts will be covered in the introductory learning sessions with the supervisor during the first few weeks. Supervision consists of introductory group learning sessions as well as group and individual supervision meetings. Please get in touch with the main supervisor, Marijan Beg, if you would like to discuss this project in more detail.
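
For orientation, the sketch below integrates the Landau-Lifshitz-Gilbert equation in time for a single macrospin in a static field using a standard ODE solver; the actual backend would operate on a finite-difference discretisation of a full magnetisation field with several energy terms, and the parameter values here are purely illustrative.

```python
import numpy as np
from scipy.integrate import solve_ivp

gamma = 2.211e5                   # gyromagnetic ratio (m A^-1 s^-1)
alpha = 0.1                       # Gilbert damping
H = np.array([0.0, 0.0, 1e5])     # static effective field along z (A/m)

def llg(t, m):
    """Explicit Landau-Lifshitz-Gilbert right-hand side for a single macrospin."""
    mxH = np.cross(m, H)
    return -gamma / (1 + alpha**2) * (mxH + alpha * np.cross(m, mxH))

m0 = np.array([1.0, 0.0, 0.0])                      # start along +x
sol = solve_ivp(llg, (0.0, 2e-9), m0, max_step=1e-12)
print("final m:", sol.y[:, -1])                     # precesses and relaxes towards +z
```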

Metropolis-Hastings Monte Carlo simulator for exploring magnetic nanosystems#

  • Project code: mabe-110

  • Main supervisor: Marijan Beg, m.beg@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Martin Lang, martin.lang@mpsd.mpg.de, Max Planck Institute for the Structure and Dynamics of Matter, Germany

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Computational science is emerging as the third pillar of research and development in academia and industry across all science and engineering disciplines. Computational studies complement experimental and theoretical studies and are sometimes the only feasible way to address different research questions or commercial challenges. For example, in nanomagnetism, simulations are often the only possible technique for exploring magnetic phenomena in fundamental physics and their applications in data storage, information processing, medicine, sensing, and many others. Several models are used in computational magnetism, each with advantages and disadvantages. For instance, when we want to simulate magnetic phenomena at elevated temperatures, we often need Metropolis-Hastings Monte Carlo simulations. However, compared to other models, their disadvantage is that they are computationally expensive. We will develop a Python-based Metropolis-Hastings Monte Carlo simulation tool in this project. Since each Monte Carlo simulation consists of millions of iterations, it is crucial to optimise the execution time of each iteration. Therefore, we will explore different optimisation techniques and possibly parallelise it on CPU or GPU. Finally, we will explore integrating it into open-source computational magnetism simulation frameworks to make it available to the research and industrial community. The main deliverable of this project will be a well-tested, reliable, and documented Python-based simulation package. No prior physics knowledge is necessary - all required physics concepts will be covered in the introductory learning sessions with the supervisor during the first couple of weeks of the project. Supervision consists of introductory group learning sessions and group and individual supervision meetings. Please get in touch with the main supervisor, Marijan Beg, if you would like to discuss this project in more detail.
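
As a minimal illustration of the accept/reject core of the Metropolis-Hastings algorithm, the sketch below runs Monte Carlo sweeps on a small 2D Ising lattice; the project would target more general vector-spin Hamiltonians and focus on optimising and parallelising exactly this kind of inner loop, and all parameters here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_sweep(spins, beta, J=1.0):
    """One Metropolis-Hastings sweep of a 2D Ising lattice with periodic boundaries."""
    n = spins.shape[0]
    for _ in range(n * n):
        i, j = rng.integers(0, n, size=2)
        neighbours = (spins[(i + 1) % n, j] + spins[(i - 1) % n, j]
                      + spins[i, (j + 1) % n] + spins[i, (j - 1) % n])
        dE = 2.0 * J * spins[i, j] * neighbours      # energy cost of flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1                        # accept the proposed flip
    return spins

spins = rng.choice([-1, 1], size=(32, 32))
for _ in range(200):
    metropolis_sweep(spins, beta=0.6)                # below the critical temperature
print("magnetisation per spin:", spins.mean())       # the lattice tends to order
```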

Next-Gen Sedimentology: Automating Grain-Size Measurements with Machine Learning#

  • Project code: rebe-061

  • Main supervisor: Rebecca Bell, rebecca.bell@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Nahin Rezwan, n.rezwan@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Grain size is a key physical parameter in sedimentology, preserving information about sediment transport processes, tectonics and climate. Historically, grain size measurements relied on manual field methods, which, while effective, are labor-intensive and limited in scope. With the advent of machine learning and image-based methods, it is now possible to analyse vast quantities of sedimentary grain data with unprecedented speed and precision. This project aims to evaluate stratigraphic grain size distributions and modern channel deposits across temporal and spatial scales. Understanding these variations will significantly improve insights into sediment routing and Earth history. The dataset will include stratigraphic images and aerial or ground-based photographs from active and ancient river systems, processed using photogrammetry for scaling and alignment. The student(s) will deliver a validated computational workflow for grain size segmentation, calibrated with field data. This project offers the opportunity to contribute to methodological advancements in sedimentary geology, with outcomes potentially influencing industry and academic practices.

Deep Learning for Detecting Thrust Faults in Subduction Zones Using 3D Seismic data#

  • Project code: rebe-055

  • Main supervisor: Rebecca Bell, rebecca.bell@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Wenhao Zheng, w.zheng23@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

Thrust faults are a key feature of subduction zones, presenting as a narrow (~100-200 km wide), shallow (<50 km deep) dipping surface crucial for generating great (M > 8) earthquakes and tsunamis. This study will develop an innovative deep learning-based approach for detecting thrust faults in subduction zones using 3D seismic data. We leverage a diverse training dataset comprising both real and synthetic seismic data, each accompanied by precisely labelled fault annotations. This project aims to leverage cutting-edge deep learning models to analyze 3D seismic data for accurate detection of thrust faults in subduction zones, contributing to the creation of reliable fault-detection tools and offering valuable insights for seismology, geology, and disaster preparedness.

Synthetic seismic data generation for deep learning applications in seismic interpretation#

  • Project code: rebe-054

  • Main supervisor: Rebecca Bell, rebecca.bell@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Wenhao Zheng, w.zheng23@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: EDSML GEMS

  • This project may accept multiple students.

Synthetic seismic data plays a crucial role in training deep learning models for seismic interpretation, providing diverse and high-quality datasets to improve model accuracy and robustness. This study explores efficient methods for generating synthetic seismic data, focusing on the challenges of accurately simulating diverse and complex geological structures. By leveraging physics-based simulation techniques and data-driven approaches, the student(s) will create realistic seismic datasets that capture key geological variations. The generated datasets will be evaluated for their effectiveness in training neural networks for fault detection, horizon tracking, and other interpretation tasks.

One avenue that could be explored in the project is the use of simple 1D convolution to produce synthetic seismic images versus more complex methods that attempt to simulate the realities of wave propagation (e.g. Point Spread Functions (PSFs), which involve 2D convolution kernels; Lecomte et al. 2015). Are deep learning models more successful when they are trained with synthetic seismic data produced in a way that more closely mimics real seismic data acquisition, or can we safely continue with simple synthetic seismic data generation?
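
For concreteness, the sketch below shows the simple end of that spectrum: a 1D convolution of a reflectivity series, derived from a toy impedance model, with a Ricker wavelet. The layer values and wavelet frequency are arbitrary, and a PSF-based approach would instead involve 2D convolution kernels.

```python
import numpy as np

def ricker(f, dt, length=0.128):
    """Ricker wavelet of peak frequency f (Hz) sampled at interval dt (s)."""
    t = np.arange(-length / 2, length / 2, dt)
    a = (np.pi * f * t) ** 2
    return (1 - 2 * a) * np.exp(-a)

# Toy three-layer acoustic impedance model sampled at 2 ms.
dt = 0.002
impedance = np.concatenate([np.full(100, 4.0e6),
                            np.full(100, 5.5e6),
                            np.full(100, 4.8e6)])

# Reflectivity from impedance contrasts, then convolve with the wavelet to get a trace.
reflectivity = np.diff(impedance) / (impedance[1:] + impedance[:-1])
trace = np.convolve(reflectivity, ricker(f=25.0, dt=dt), mode="same")
print(trace.shape, float(trace.min()), float(trace.max()))
```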

The generation of high-quality synthetic seismic data will have broad applications across various aspects of seismic interpretation, significantly advancing the field and enabling more accurate and automated analyses.

Analysis of imbibition in mixed-wet media#

  • Project code: mabl-002

  • Main supervisor: Martin Blunt, m.blunt@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Branko Bijeljic, b.bijeljic@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: GEMS

  • This project may accept multiple students.

In this project you will first construct semi-analytical solutions for spontaneous imbibition in water-wet and mixed-wet media. You will then review the literature to find data on spontaneous imbibition rates. Using reasonable assumptions, you will be asked to determine capillary pressures and relative permeabilities consistent with the results. You will be asked to comment on the results in comparison with conventional relative permeability measurements.

The work involves having a good knowledge of and interest in multiphase flow in porous media, some coding skills and a willingness to look carefully through the literature.

Effect of reservoir mechanical properties and operational properties on reservoir integrity during high-injection rate underground hydrogen storage (or carbon dioxide)#

  • Project code: jabu-070

  • Main supervisor: James Burtonshaw, james.burtonshaw16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Adriana Paluszny, apaluszn@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: GEMS

  • This project may accept multiple students.

In a renewable energy world, there will exist an annual energy supply dichotomy in which for half of the year, plentiful renewable energy to meet demand is produced, but for the other half of the year, a major deficit exists. In order to avoid satisfying this deficit with traditional fossil fuels, excess electrical energy during the periods of energy excess can be diverted to convert water into hydrogen, and subsequently store the hydrogen into subsurface porous media to later be withdrawn during the periods of energy deficit. Therefore, enormous volumes of hydrogen will need to be stored to meet nationwide demand. Hydrogen has a very low density at reservoir conditions, and thus, must be injected at very high rates in order to deliver sufficient volumes over the relatively short periods of energy excess. Therefore, we need to understand whether reservoir and caprock will be hydraulically fractured during hydrogen storage and under what conditions this may become problematic. The student will run hydromechanical simulations in our in-house C++ finite element simulator - the Imperial College Geomechanics Toolkit – to study the influence of different reservoir mechanical properties and operational properties on the growth of pre-existing natural reservoir fractures and the nucleation and growth of novel hydraulic fractures in the reservoir and caprock.

Effect of reservoir mechanical properties and operational properties on caprock integrity during high-injection rate underground hydrogen storage (or carbon dioxide storage)#

  • Project code: jabu-071

  • Main supervisor: James Burtonshaw, james.burtonshaw16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Adriana Paluszny, apaluszn@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: GEMS

  • This project may accept multiple students.

In a renewable energy world, there will exist an annual energy supply dichotomy in which for half of the year, plentiful renewable energy to meet demand is produced, but for the other half of the year, a major deficit exists. In order to avoid satisfying this deficit with traditional fossil fuels, excess electrical energy during the periods of energy excess can be diverted to convert water into hydrogen, and subsequently store the hydrogen into subsurface porous media to later be withdrawn during the periods of energy deficit. Therefore, enormous volumes of hydrogen will need to be stored to meet nationwide demand. Hydrogen has a very low density at reservoir conditions, and thus, must be injected at very high rates in order to deliver sufficient volumes over the relatively short periods of energy excess. Therefore, we need to understand whether reservoir and caprock will be hydraulically fractured during hydrogen storage and under what conditions this may become problematic. The student will run hydromechanical simulations in our in-house C++ finite element simulator - the Imperial College Geomechanics Toolkit – to study the influence of different reservoir mechanical properties and operational properties on the growth of a pre-existing natural vertical array of caprock fractures and the nucleation and growth of novel hydraulic fractures in the reservoir and caprock. Alternatively, the student can perform this analysis for CO2 storage rather than hydrogen storage if they prefer.

Hydraulic fracturing of ultra low permeability reservoir rock and compaction-driven fracture nucleation and growth during natural hydrogen production#

  • Project code: jabu-069

  • Main supervisor: James Burtonshaw, james.burtonshaw16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Adriana Paluszny, apaluszn@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: GEMS

  • This project does not accept multiple students.

It is now becoming understood that the Earth’s crust contains at least hundreds of natural hydrogen reservoirs globally. Natural hydrogen can form from a number of geochemical processes including the serpentinisation of iron-rich minerals and radiolysis of water in uranium, thorium and potassium-rich rocks. The hydrogen rises upward and resides in the pore and fracture space with brine, much like oil and brine in oil reservoirs. Unlike petroleum reservoirs, many natural hydrogen reservoirs are very low permeability consisting not of conventional sandstones and limestones but of crystalline granites and metamorphic marbles and quartzites. Therefore, in many cases, the rock will need to be hydraulically fractured to allow for sufficient production rates. The student will assess with numerical hydromechanical simulations in our in-house C++ rock mechanics simulator – the Imperial College Geomechanics Toolkit - the conditions required for such hydraulic fracturing and how the subsequent fractures propagate. They will then simulate the production phase assessing if any further compaction-derived fractures are nucleated or if the initial hydraulic fractures are extended. Ultimately, the student will externally code a script to assess the evolving permeability of the reservoir through time during the stimulation and production phases to determine whether sufficient permeability can be created.

Data inpainting for ultrasound brain imaging#

  • Project code: osca-084

  • Main supervisor: Oscar Calderon, oc14@ic.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Lluis Guasch, lluis.guasch@sonalis-imaging.com, Sonalis Imaging Ltd.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Ultrasound full-waveform inversion (FWI) has recently emerged as a safe, affordable, portable and high-resolution imaging alternative to conventional MRI and CT imaging of the brain. Despite its advantages, the quality of the reconstructed images is often constrained by the limitations in the acquisition systems. Specifically, the challenges include the narrow bandwidth of the sensors and the limited number of transducers that can be practically arranged in a compact device. These factors restrict the ability to reproduce the recorded data during FWI, a key element of image reconstruction that limits the quality of the recovered images.

This project will explore the use of machine learning (ML) techniques to address these limitations. First, we will utilise ML models to predict the missing high- and low-frequency components of the ultrasound data and improve the convergence and accuracy of full-waveform inversion. We will then utilise ML algorithms for wavefield inpainting to reconstruct the missing spatial data between ultrasound transducers to create a more complete representation of the wavefield.

Developing AI based Sub-Grid-Scale Models for Particles moving in a Fluid using AI4PDEs, AI4Particles and Scale Independent Convolutional Autoencoders#

  • Project code: boch-178

  • Main supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

Recently, AI4PDEs [1,2,3] has shown great success in solving very large fluid systems in great detail. The AI4Particles modelling method has also been able to describe particle motion in turbulent flows. AI4Particles can also be used as a Discrete Element Method (DEM) and as a molecular dynamics model. AI4PDEs and AI4Particles offer potentially revolutionary advantages over conventional modelling methods. However, there is still a need to incorporate error-prone correlations for the drag forces between the particles and the fluid flow. This project attempts to overcome this drawback by using a Sub-Grid-Scale (SGS) model of the fluid flow around the particles and thus resolve the drag forces. A new breed of scale-independent convolutional autoencoders will be used to resolve the difference between the coarse-grid AI4PDE model and a resolved model of the flow around the particles. This autoencoder, after training, can be applied to very large systems even when the original training was conducted on much smaller systems.

AI4PDEs is an in-house computational fluid dynamics (CFD) solver, which solves discretised systems of equations using neural networks. The weights of the networks are determined in advance through the choice of discretisation method (e.g. first-order finite elements, second-order finite differences etc), so no training is needed. The solutions are the same as those obtained from Fortran or C++ codes (to within solver tolerances). The AI4Particles model is similar in its architecture to AI4PDEs but resolves the interaction and motion of particles using convolutional neural networks. The AI4PDEs and AI4Particles codes run on GPUs as well as CPUs and AI processors.
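
A minimal sketch of this idea - discretisation weights hard-coded into a convolution rather than learned - is shown below, using a second-order finite-difference Laplacian as a fixed PyTorch kernel; this illustrates the principle only and is not the AI4PDEs code itself.

```python
import torch
import torch.nn.functional as F

# Second-order finite-difference Laplacian expressed as a fixed 3x3 convolution
# kernel: no training, the weights come directly from the discretisation.
laplacian = torch.tensor([[0.0,  1.0, 0.0],
                          [1.0, -4.0, 1.0],
                          [0.0,  1.0, 0.0]]).reshape(1, 1, 3, 3)

# A toy 2D field on a 64x64 grid spanning the unit square.
x, y = torch.meshgrid(torch.linspace(0, 1, 64), torch.linspace(0, 1, 64), indexing="ij")
field = (x**2 + y**2).reshape(1, 1, 64, 64)

# Applying the convolution evaluates the discrete Laplacian everywhere at once,
# which is how an AI library can serve as a building block of a PDE solver.
h = 1.0 / 63
lap = F.conv2d(field, laplacian, padding=1) / h**2
print(lap[0, 0, 32, 32])   # ~4 in the interior, the analytic Laplacian of x^2 + y^2
```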

The student(s) will use CFD data and build AI models in Python and PyTorch. For this project, an interest in several of the following would be beneficial: computational fluid dynamics, particle dynamics, neural networks and numerical models. The coding will be done in PyTorch and Python.

[1] Chen, Nadimy, Heaney et al. (2024) Solving the Discretised Shallow Water Equations Using Neural Networks, Advances in Water Resources, accepted. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4956116

[2] Chen, Heaney, Pain (2024) Using AI libraries for Incompressible Computational Fluid Dynamics, https://arxiv.org/abs/2402.17913.

[3] Chen, Heaney, Gomes, Matar, Pain (2024) Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries, Computer Methods in Applied Mechanics and Engineering 426: 116974. https://doi.org/10.1016/j.cma.2024.116974

Controlling ventilation in buildings using Generative Networks (GANs/VAEs/AAEs) to produce comfortable healthy environments that are energy efficient#

  • Project code: boch-181

  • Main supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Energy-efficient buildings are central to implementing successful carbon reduction strategies. In 2019, the UK’s total energy consumption was 142 Mtoe (energy equivalent to burning 142 Mt of oil), and buildings account for 40% of this (IEA 2021). [1] and [2] have shown that full automation can save 50-60% of HVAC (heating, ventilation, and air conditioning) energy consumption. An AI control system might be expected to achieve similar savings, with an associated reduction in CO2 emissions, and thus greatly contribute to net zero. Smart controls and connected devices could deliver 230 EJ in cumulative energy savings by 2040 (IEA), lowering the energy consumption of buildings and the associated carbon footprint by as much as 10% globally, while improving comfort for occupants. With this in mind, in this project, we will apply generative neural networks (GANs/VAEs/AAEs and the latest latent diffusion models) to (1) Predict the transient spatial distribution of temperature, relative humidity, CO2 concentration and pollution concentration within a building given the outside weather conditions and ventilation settings. (2) Assimilate sensor data (e.g. temperature, CO2) and room occupancy data into the AI model in order to predict the future room conditions. (3) Perform control of the ventilation settings in order to save energy and produce healthy living conditions. (4) Perform uncertainty quantification in order to determine uncertainties on controls and predictions. (5) Apply generative AI methods in order to produce priors of initial room conditions (much like weather prediction models) and integrate this with an AI4PDE model [3,4,5] of the detailed air flow and temperature, relative humidity and CO2 distribution within a room.

Project 1 will complete tasks 1 and 2, project 2 will complete tasks 3 and 4, and project 3 will address task 5. These projects would suit students with an interest in computational fluid dynamics, energy including net-zero objectives, neural networks (especially generative networks) and optimisation (such as data assimilation and control).

[1] Schiavon, Melikov, Sekhar (2010) Energy analysis of the personalized ventilation system in hot and humid climates. Energy and Buildings, 42(5):699-707.

[2] Khalil (2020) Computer Simulation of Air Distribution and Thermal Comfort in Energy Efficient Buildings.

[3] Chen, Heaney, Pain (2024) Using AI libraries for Incompressible Computational Fluid Dynamics. https://arxiv.org/abs/2402.17913

[4] Chen, Heaney, Gomes, Matar, Pain (2024) Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries, arXiv preprint. https://doi.org/10.48550/arXiv.2401.06755

[5] Phillips, Heaney, Chen, Buchan, Pain (2023) Solving the Discretised Neutron Diffusion Equations Using Neural Networks, International Journal for Numerical Methods in Engineering 124(21):4659-4686. https://doi.org/10.1002/nme.7321

Multiphase flow modelling using AI4PDEs#

  • Project code: boch-126

  • Main supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Nathalie Carvalho Pinheiro, n.pinheiro23@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

AI4PDEs is an in-house computational fluid dynamics (CFD) solver, which solves discretised systems of equations using convolutional neural network algorithms. Thus, the algorithm benefits from the high computational efficiency and parallelization implemented in AI libraries. However, it differs from traditional machine learning techniques since the network weights are determined by the discrete equations that govern the system rather than by training [1, 2].

This project aims to apply AI4PDEs to simulate multiphase flow, the mechanical phenomenon of two or more phases simultaneously flowing through a given space and interacting with each other. Examples of multiphase flow include collapsing dams, 3D flooding and multiphase jets, where the water interacts with the air, and many industrial applications where multiphase flow in pipelines occurs, such as nuclear power plants and some scenarios of geothermal energy and carbon dioxide transport. Different multiphase flow patterns can be assumed inside a pipeline when varying initial or boundary conditions [3]. Studying the expected flow patterns in an industrial application can improve system efficiency, leading to cost reductions.

This project will involve small changes to the AI4PDEs code, running test cases and writing pre- and post-processing functions (including generating a GUI, if there is enough time). It is suitable for students interested in numerical methods and applications in physics.

[1] B. Chen, C. E. Heaney, C. C. Pain, Using AI libraries for Incompressible Computational Fluid Dynamics, 2024. arXiv:2402.17913.

[2] B. Chen, C. E. Heaney, J. L. M. A. Gomes, O. K. Matar, C. C. Pain, Solving the discretised multiphase flow equations with interface capturing on structured grids using machine learning libraries, 2024. arXiv:2401.06755.

[3] Experimental Visualization and Numerical Simulation of Liquid-Gas Two-Phase Flows in a Horizontal Pipe, Volume 7: Fluids Engineering, ASME International Mechanical Engineering Congress and Exposition, 2017. doi:10.1115/IMECE2017-72113.

Predicting spatial-temporal systems from sparse and movable observation points using generative and deterministic machine learning methods#

  • Project code: sich-005

  • Main supervisor: Sibo Cheng, sibo.cheng@imperial.ac.uk, Data Science Institute, Imperial College London

  • Second supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Description

Accurately predicting weather and air pollution physical fields is essential for addressing challenges in urban planning, public health, and climate change mitigation. Air pollution, characterized by complex interactions of particulate matter (PM2.5, PM10), gases (e.g., NO2, O3, CO), and meteorological factors, presents significant spatial and temporal variability. Conventional methods, such as static, dense sensor networks and computationally intensive numerical simulations, face constraints including limited spatial coverage, high operational costs, and scalability challenges. Recent advancements in machine learning, particularly diffusion models, have demonstrated exceptional capabilities in modeling complex spatiotemporal dynamics. This proposal seeks to harness these advancements to reconstruct high-resolution air pollution fields and weather patterns using sparse and movable sensor data, facilitating more adaptive and dynamic environmental monitoring.

Objectives:

  1. Develop a Machine Learning Framework: Design and implement a machine learning framework based on diffusion models to predict high-resolution weather and air pollution fields, integrating data from sparse and movable observation points.

  2. Enhance Spatial and Temporal Coverage: Leverage movable sensor networks to improve the spatial resolution of key air pollution indicators (e.g., PM2.5, PM10, NO2, O3) and meteorological variables (e.g., temperature, humidity, wind speed).

  3. Performance Validation: Evaluate the model’s effectiveness in reconstructing air pollution and weather fields by benchmarking against traditional machine learning approaches (e.g., Gaussian Processes, CNN/ViT-based approaches) and numerical simulation techniques.

Publication Opportunity: The proposed research has strong potential for publication in leading machine learning conferences (e.g., NeurIPS, ICLR) and high-impact journals in computational geoscience and environmental modeling, such as Journal of Advances in Modeling Earth Systems (JAMES) and Geoscientific Model Development (GMD).

Potential PhD Opportunity: This project will be in collaboration with Institut polytechnique de Paris in France. There is a potential opportunity to transition into a PhD position, depending on the success of ongoing funding applications and the outcome of the master’s project.

References:

  • Finn, T.S., Durand, C., Farchi, A., Bocquet, M., Rampal, P. and Carrassi, A., 2024. Generative diffusion for regional surrogate models from sea-ice simulations. Journal of Advances in Modeling Earth Systems, 16(10), p.e2024MS004395.

  • Zhuang, Y., Cheng, S. and Duraisamy, K., 2024. Spatially-Aware Diffusion Models with Cross-Attention for Global Field Reconstruction with Sparse Observations. Computer Methods in Applied Mechanics and Engineering.

Requirements

  • Strong motivation and passion for research in AI.

  • Proficiency in coding, with experience in PyTorch.

  • Ability to work independently and engage deeply with complex machine learning and scientific concepts.

Machine Learning and Data Assimilation for Inverse Modeling in Multiphysics Nuclear Case Study#

  • Project code: sich-006

  • Main supervisor: Sibo Cheng, sibo.cheng@imperial.ac.uk, Data Science Institute, Imperial College London

  • Second supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

Description

This project focuses on leveraging machine learning and data assimilation techniques to address inverse modelling challenges in multiphysics nuclear systems. The work will build on state-of-the-art approaches as demonstrated in the selected nuclear test cases (e.g., IAEA 2D PWR and TWIGL2D benchmarks) described in the reference paper [2]. The aim is to integrate latent space methods for efficient and accurate model bias correction and parameter estimation in the context of nuclear reactor simulations. The project will involve:

  1. Latent Space Data Assimilation: development of data assimilation techniques using latent representations generated by autoencoders (AEs); implementation of variational data assimilation and/or ensemble Kalman filter (EnKF) methods in the latent space to enhance computational efficiency while maintaining high accuracy; and use of the PyTorch-based TorchDA package [1] to integrate neural network transformations into data assimilation workflows.

  2. Model Dynamics Propagation in Latent Space: employment of surrogate neural networks to propagate the dynamics of coupled multiphysics systems (e.g., neutronics and thermal-hydraulics) within the latent space, and optimisation of these latent models for accurate real-time predictions and corrections.
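
A minimal numerical sketch of the latent-space ensemble assimilation step (point 1 above) is given below; the linear encoder/decoder and random data stand in for the trained autoencoder and the nuclear benchmark fields, and all dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a high-dimensional state, a small latent space, a modest ensemble.
n_state, n_latent, n_ens, n_obs = 500, 8, 40, 20

# Stand-ins for a trained autoencoder (a neural network in the actual project,
# e.g. wrapped through TorchDA): linear encode/decode maps, purely illustrative.
W = rng.standard_normal((n_latent, n_state)) / np.sqrt(n_state)
W_pinv = np.linalg.pinv(W)
decode = lambda z: W_pinv @ z

# Observation operator: observe the first n_obs components of the decoded state.
def observe(z):
    return decode(z)[:n_obs]

# Prior ensemble in latent space and a synthetic observation vector.
Z = rng.standard_normal((n_latent, n_ens))
y_obs = rng.standard_normal(n_obs)
obs_err = 0.1

# Stochastic ensemble Kalman filter update performed entirely in the latent space.
Y = np.column_stack([observe(Z[:, i]) for i in range(n_ens)])
Za = Z - Z.mean(axis=1, keepdims=True)
Ya = Y - Y.mean(axis=1, keepdims=True)
C_zy = Za @ Ya.T / (n_ens - 1)
C_yy = Ya @ Ya.T / (n_ens - 1) + obs_err**2 * np.eye(n_obs)
K = C_zy @ np.linalg.inv(C_yy)
perturbed = y_obs[:, None] + obs_err * rng.standard_normal((n_obs, n_ens))
Z_analysis = Z + K @ (perturbed - Y)
print("analysis ensemble shape:", Z_analysis.shape)
```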

[1] Cheng, Sibo, Jinyang Min, Che Liu, and Rossella Arcucci. “TorchDA: A Python package for performing data assimilation with deep learning forward and transformation functions.” Computer Physics Communications 306 (2025): 109359.

[2] Riva, S., Introini, C. and Cammi, A., 2024. Multi-physics model bias correction with data-driven reduced order techniques: Application to nuclear case studies. Applied Mathematical Modelling, 135, pp.243-268.

Requirements

  • Strong motivation and passion for research in AI and physics.

  • Proficiency in coding, with substantial experience in PyTorch.

  • Ability to work independently and engage deeply with complex machine learning and scientific concepts.

Modelling complex dynamical systems with cutting-edge graph neural networks#

  • Project code: sich-004

  • Main supervisor: Sibo Cheng, sibo.cheng@imperial.ac.uk, Data Science Institute, Imperial College London

  • Second supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Description

Accurately predicting Computational Fluid Dynamics (CFD) behavior in systems with flexible geometries and boundary conditions is vital for advancing engineering design, optimization, and performance evaluation. CFD simulations involve solving highly nonlinear Partial Differential Equations (PDEs), such as the Navier-Stokes equations and the Advection-Diffusion equation, which describe intricate spatial-temporal interactions of physical quantities like velocity, pressure, temperature, and scalar transport. These equations are coupled, nonlinear, and sensitive to boundary conditions and geometry changes, making traditional simulation methods computationally expensive and often limited in scalability. Traditional methods, such as grid-based solvers or conventional machine learning models (e.g., CNNs, RNNs), face challenges including computational inefficiency, difficulties handling unstructured data, and limited adaptability to dynamic geometries.

Recent breakthroughs in Graph Neural Networks (GNNs) offer powerful capabilities for processing unstructured data and capturing relationships in complex systems. By representing CFD domains as graphs, where nodes represent spatial points and edges encode interactions, GNNs can model fluid dynamics with remarkable accuracy and scalability. This project aims to leverage GNNs to predict CFD behavior in systems with flexible geometries and dynamic boundary conditions. The approach will focus on predicting both the vector fields (e.g., velocity, pressure gradients) and scalar properties (e.g., species concentration) governed by non-linear Navier-Stokes and Advection-Diffusion equations over time, providing a robust and adaptable tool for multi-dimensional CFD modeling from 2D to 3D.
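As a deliberately minimal illustration of the graph representation described above, the sketch below implements a single message-passing layer in plain PyTorch: node features at mesh points are updated from messages computed on edges. All feature sizes and the toy graph are hypothetical; a real framework would likely use a library such as PyTorch Geometric and stack many such layers with encoders and decoders.

```python
# Minimal message-passing layer: edges carry messages built from the two endpoint
# nodes and an edge feature (e.g. relative position); nodes aggregate incoming
# messages by summation. Shapes and feature sizes are illustrative only.
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    def __init__(self, node_dim=32, edge_dim=3):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * node_dim + edge_dim, node_dim), nn.ReLU())
        self.node_mlp = nn.Sequential(nn.Linear(2 * node_dim, node_dim), nn.ReLU())

    def forward(self, h, edge_index, edge_attr):
        # h:          (num_nodes, node_dim)  node features (e.g. velocity, pressure)
        # edge_index: (2, num_edges)         sender/receiver node indices
        # edge_attr:  (num_edges, edge_dim)  edge features (e.g. relative coordinates)
        send, recv = edge_index
        messages = self.edge_mlp(torch.cat([h[send], h[recv], edge_attr], dim=-1))
        agg = torch.zeros_like(h).index_add_(0, recv, messages)   # sum messages per receiver
        return self.node_mlp(torch.cat([h, agg], dim=-1))

# Toy usage: 4 mesh nodes connected in a chain.
h = torch.randn(4, 32)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
edge_attr = torch.randn(3, 3)
h_new = MessagePassingLayer()(h, edge_index, edge_attr)
print(h_new.shape)  # torch.Size([4, 32])
```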

Objectives

  1. Develop a Graph Neural Network Framework: Design and implement a GNN-based predictive model for CFD, tailored to handle unstructured and dynamic geometries. The framework will incorporate advanced GNN architectures, such as Message Passing Neural Networks (MPNNs) or Graph Attention Networks (GATs), to model spatial-temporal fluid dynamics.

  2. Enable Flexibility in Geometry and Boundary Conditions: Construct graph representations that seamlessly adapt to changes in geometry and boundary conditions, ensuring the model can generalize to a wide range of CFD scenarios.

  3. Benchmark Performance: Validate the model’s performance by comparing its predictive accuracy, computational efficiency, and scalability against traditional CFD solvers and state-of-the-art neural network-based approaches (e.g., U-Net, ConvLSTM).

Publication Opportunity

This research holds significant potential for publication in leading conferences and journals in the field.

Requirements

Strong research interest and solid experience in PyTorch.

Some References:

  1. Pegolotti, L., Pfaller, M. R., Rubio, N. L., Ding, K., Brugarolas Brufau, R., Darve, E., & Marsden, A. L. (2024). Learning reduced-order models for cardiovascular simulations with graph neural networks. Computers in Biology and Medicine, 168, 107676. https://doi.org/10.1016/j.compbiomed.2023.107676

  2. Pfaff, T., Fortunato, M., & Battaglia, P. W. (2020). Learning Mesh-Based Simulation with Graph Networks. ArXiv. https://arxiv.org/abs/2010.03409

  3. Bonnet, F., Mazari, J. A., Munzer, T., Yser, P., & Gallinari, P. (2022). An extensible Benchmarking Graph-Mesh dataset for studying Steady-State Incompressible Navier-Stokes Equations. ArXiv. https://arxiv.org/abs/2206.14709

  4. Geometry-informed deep learning surrogate models for flow prediction, https://meetings.aps.org/Meeting/DFD24/Session/C02.13, Bulletin of the American Physical Society, 2024

Modernizing Impact Crater Simulation: Fast Equation of State Representation using Machine Learning#

  • Project code: gaco-067

  • Main supervisor: Gareth Collins, g.collins@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Ben Moseley, b.moseley@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

An important yet computationally expensive component of impact and shock physics simulators is the equation of state calculation. The equation of state is a material-specific description of how a substance responds to changes in density and internal energy, which can be very complex for geological materials, involving phase transformations and mixed-phase regions. M-ANEOS (isale-code/M-ANEOS) is a FORTRAN program for the construction of thermodynamic equations of state, which uses a suite of analytical approximations in different parts of thermodynamic phase space. While M-ANEOS can give a very accurate representation of material behaviour, using it directly in a shock physics calculation is prohibitively expensive. A common way to improve efficiency is instead to use M-ANEOS to generate an equation of state table and to use table interpolation in the impact simulator. Nevertheless, as the equation of state is called for every material, in every cell, in every timestep, even the tabular equation of state implementation can be a severe computational bottleneck for high-resolution tables, which often means that the accuracy of the description of material behaviour must be sacrificed for computational expediency. The aim of this project is to train a simple (C)NN-based model to replicate the output of M-ANEOS for a desired region of material phase space. The model will then be used to replace the M-ANEOS table lookup and interpolation. Performance of the new equation of state model will be compared with the baseline tabular equation of state in terms of accuracy and calculation speed.
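To indicate the kind of model envisaged (a small network mapping density and internal energy to pressure, trained on tabulated output), the snippet below fits an MLP to a synthetic placeholder table. The ideal-gas-like relation used to generate the data is purely illustrative and stands in for a real M-ANEOS table; normalisation, architecture and training settings are all assumptions.

```python
# Minimal sketch: train a small MLP surrogate p(rho, e) on a placeholder EOS table.
# The synthetic "table" below is NOT M-ANEOS output; it is only a stand-in so the
# snippet runs end-to-end. Inputs/outputs would normally be normalised or log-scaled.
import torch
import torch.nn as nn

# Placeholder table: pressure = (gamma - 1) * rho * e (ideal-gas-like, illustrative only).
rho = torch.rand(4096, 1) * 3000.0 + 500.0      # density [kg/m^3]
e = torch.rand(4096, 1) * 5e6 + 1e5             # specific internal energy [J/kg]
p = 0.4 * rho * e                               # "tabulated" pressure [Pa]

x = torch.cat([rho / 3500.0, e / 5.1e6], dim=1) # crude normalisation
y = torch.log10(p)                              # regress log-pressure for dynamic range

model = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

print("final training loss:", loss.item())
```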

Improving the functionality of iSALE3D - a widely-used planetary science code for simulating asteroid impacts [multiple projects]#

  • Project code: gaco-051

  • Main supervisor: Gareth Collins, g.collins@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Tom Davison, thomas.davison@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

iSALE3D is a shock physics code widely used in the planetary science community for simulating asteroid impacts on planetary bodies and collisions between asteroids. We have a range of projects that could support a team of students to add new functionality to the code.

iSALE3D is a finite difference code, written in modern FORTRAN, and parallelised with MPI.

Potential projects include:

  • Improvements to model setup, including more complex target and impactor geometries and material assignments.

  • Implementing higher-order advection schemes to improve solution accuracy.

  • Implementing new material models in the code. For example, the full “Melosh” model of acoustic fluidisation (Melosh, H. J., 1979. Acoustic fluidization: A new geologic process? Journal of Geophysical Research).

Note that some of these projects can accommodate multiple students.

Numerical modelling of non-linear contrast agents for vascular brain imaging#

  • Project code: cacu-121

  • Main supervisor: Carlos Cueto, c.cueto@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Oscar Calderon, oc14@ic.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Ultrasound tomography techniques like full-waveform inversion have shown great promise to image the structure of the human brain and, more recently, also to image the dynamic nature of the brain vasculature (its structure and blood flow).

Key to vascular brain imaging is the use of accurate numerical models of sound propagation in the brain, as well as the use of injected contrast agents: micron-sized gas bubbles that are innocuous to humans. These bubbles oscillate in a complex and non-linear manner in the presence of ultrasound waves, making their accurate numerical modelling a significant challenge.

This project will explore the efficient and robust numerical modelling of acoustic waves, bubble contrast agents, and their complex interaction, both in static configurations and within the dynamics of blood flow. This will play a critical role in the development and demonstration of ultrasound-based vascular imaging in the brain.
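By way of illustration only: one common starting point for the radial oscillation of a single microbubble is a Rayleigh-Plesset-type equation, which the sketch below integrates with SciPy for an assumed bubble and driving pulse. All parameter values are placeholders, and the project itself may adopt more sophisticated bubble models and couple them to the acoustic and flow solvers.

```python
# Illustrative integration of a Rayleigh-Plesset-type equation for one microbubble
# driven by a sinusoidal pressure pulse. All physical parameters are placeholders.
import numpy as np
from scipy.integrate import solve_ivp

rho, mu, sigma = 1000.0, 1e-3, 0.072      # water density, viscosity, surface tension (SI)
p0, gamma = 101325.0, 1.4                 # ambient pressure, polytropic exponent
R0 = 2e-6                                 # equilibrium bubble radius [m]
pg0 = p0 + 2 * sigma / R0                 # initial gas pressure inside the bubble
f, pa = 1e6, 50e3                         # driving frequency [Hz] and amplitude [Pa]

def p_drive(t):
    return pa * np.sin(2 * np.pi * f * t)

def rhs(t, y):
    R, Rdot = y
    p_gas = pg0 * (R0 / R) ** (3 * gamma)
    p_wall = p_gas - p0 - p_drive(t) - 2 * sigma / R - 4 * mu * Rdot / R
    Rddot = (p_wall / rho - 1.5 * Rdot ** 2) / R
    return [Rdot, Rddot]

sol = solve_ivp(rhs, (0.0, 5e-6), [R0, 0.0], max_step=1e-9, rtol=1e-8)
print("max radius / R0:", sol.y[0].max() / R0)
```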

Novel convolutional-based loss functions in deep learning#

  • Project code: cacu-120

  • Main supervisor: Carlos Cueto, c.cueto@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Deborah Pelacani Cruz, deborah.pelacani-cruz18@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project does not accept multiple students.

One of the key ingredients in the design of deep learning (DL) models is the choice of loss function, which determines the loss landscape to be navigated during the training process; yet, this is also one of the aspects receiving the least amount of attention from researchers in the field.

Existing loss functions (with MSE being the most popular among them) have well-known limitations, such as limited perceptual fidelity and accuracy. Recently, we have proposed the use of convolutional (Wiener) matching filters as a metric capable of capturing global relations in the data and enhancing data fidelity and training robustness.

The convolutional metrics, however, introduce a number of hyperparameters that make them very sensitive to specific dataset conditions, making their widespread applicability cumbersome. This project will address this by exploring how convolutional-based loss functions can be redesigned and extended. This will include improving the loss-function implementation so that their parameters are automatically learned instead of manually tuned.

The project will also explore strategies to improve the robustness and fidelity of these convolutional metrics by using multi-scale, multi-frequency and hierarchical implementations of the convolutional filter matching.
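A minimal sketch of the underlying idea is given below: for each example, estimate a matching filter that convolves the prediction into the target via a regularised least-squares (Wiener) solve in the Fourier domain, then penalise how far that filter is from an identity (delta) filter. This is an illustrative reimplementation, not the group's existing code; the regularisation and penalty are exactly the kind of hand-set hyperparameters the project aims to make learnable.

```python
# Illustrative convolutional (Wiener) matching-filter loss in PyTorch.
# For each 1-D signal pair (pred, target) we solve for the filter w minimising
# ||w * pred - target||^2 + eps ||w||^2 in the Fourier domain, then penalise the
# deviation of w from a delta filter. eps and the penalty form are placeholders.
import torch

def wiener_matching_loss(pred, target, eps=1e-3):
    # pred, target: (batch, n) real signals
    P = torch.fft.rfft(pred, dim=-1)
    T = torch.fft.rfft(target, dim=-1)
    W = (torch.conj(P) * T) / (torch.conj(P) * P + eps)   # Wiener matching filter (freq domain)
    w = torch.fft.irfft(W, n=pred.shape[-1], dim=-1)      # filter in the time domain
    delta = torch.zeros_like(w)
    delta[..., 0] = 1.0                                    # identity (zero-lag delta) filter
    return ((w - delta) ** 2).mean()

# Toy usage: identical signals give a near-zero loss, shifted signals do not.
x = torch.randn(4, 128)
print(wiener_matching_loss(x, x).item())
print(wiener_matching_loss(x, torch.roll(x, 5, dims=-1)).item())
```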

Optimal full-waveform inversion data embeddings#

  • Project code: cacu-085

  • Main supervisor: Carlos Cueto, c.cueto@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Lluis Guasch, lluis.guasch@sonalis-imaging.com, Sonalis Imaging Ltd.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Over recent years, full-waveform inversion (FWI) has significantly altered how we conceive of medical imaging. The recorded, time-resolved ultrasound data used in FWI reconstructions often contains a high level of redundancy, as well as significant information that is not critical to the reconstruction algorithms. This project will explore the use of machine learning (ML) to design optimal embeddings for this kind of data that not only compress the data (reduce their dimensionality) but also distill and highlight their most relevant features.

At the same time, FWI suffers from the ill-posedness of the inverse problem: the reconstruction is not guaranteed to converge to the right solution unless very narrow conditions are met. As an additional, complementary goal, this project will explore the feasibility of generating these embeddings in a space in which the underlying model that gives rise to the data shares the same representation, so that the ill-posedness of the FWI reconstruction can be alleviated.

The project lives at the intersection of signal processing, deep learning, and tomographic image reconstruction.
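As a minimal illustration of one possible embedding model (not necessarily the one the project would adopt), the sketch below defines a 1-D convolutional autoencoder that compresses recorded time traces into a low-dimensional code. The architecture, trace length and latent size are placeholders.

```python
# Minimal sketch of a 1-D convolutional autoencoder that compresses recorded
# ultrasound traces into a low-dimensional embedding. Sizes are placeholders.
import torch
import torch.nn as nn

class TraceAutoencoder(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(latent_dim),          # embedding of the whole trace
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * 64), nn.ReLU(),
            nn.Unflatten(1, (32, 64)),
            nn.ConvTranspose1d(32, 16, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=8, stride=4, padding=2),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Toy usage: a batch of 8 traces with 1024 time samples each.
x = torch.randn(8, 1, 1024)
recon, z = TraceAutoencoder()(x)
print(recon.shape, z.shape)
```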

Improving the performance of iSALE3D - a widely-used planetary science code for simulating asteroid impacts [multiple projects]#

  • Project code: toda-056

  • Main supervisor: Tom Davison, thomas.davison@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Gareth Collins, g.collins@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

iSALE3D is a shock physics code widely used in the planetary science community for simulating asteroid impacts on planetary bodies and collisions between asteroids. We have a range of projects that could support a team of students to improve the current performance of the code.

iSALE3D is a finite difference code, written in modern FORTRAN, and parallelised with MPI. Potential projects include:

  • Implementing a regridding method, to allow the code to run for longer more efficiently, and to add/remove cells to the mesh to allow a wider range of problem types to be simulated.

  • Modernising and parallelising the I/O of the code to take advantage of open-source file formats (e.g. VTK).

  • Implementing perfectly matched layer (PML) boundaries to reduce spurious reflected waves.

Note that some of these projects can accommodate multiple students.

Teaching Cases Generator for the Business School#

  • Project code: jade-107

  • Main supervisor: Jay Deslauriers, j.deslauriers@imperial.ac.uk, Business School, Imperial College London

  • Second supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Joint project proposal ESE (Rhodri Nelson) & Business School (Jay DesLauriers)

Problem Description

The case method, where students learn by analysing and discussing real-world business situations (cases), is central to business education. However, creating teaching cases is resource-intensive, and purchasing existing cases represents a significant cost for institutions and students. Oftentimes, published cases lack the nuance needed to support a specific module’s learning outcomes. Each case requires crafting compelling narratives, developing realistic datasets, and writing teaching notes that guide classroom discussion. Few faculty have the time to create customised cases for their courses. This project addresses case creation bottlenecks through AI assistance, potentially enabling wider adoption of case teaching across business schools and other disciplines. Harvard Business School and The Case Center are two excellent resources for case method teaching.

Computational Methodology

The student will develop methods to:

  • Generate structured case narratives that maintain internal consistency and educational value

  • Create coherent synthetic business data that realistically supports the case narrative

  • Validate output quality against established case-writing standards

  • Ensure generated cases align with specified learning objectives

The work requires original development of algorithms for text generation, data synthesis, and quality control. Key challenges include maintaining consistency between qualitative and quantitative elements and ensuring pedagogical effectiveness.

Expected Deliverables

  • Functional case generation system

  • Interface for case creation and editing

  • Quality validation tools

  • Technical documentation

  • Evaluation framework and results

  • Sample cases demonstrating

Identifying unknown emission sources for decarbonisation using convolutional neural networks#

  • Project code: fafa-104

  • Main supervisor: Fangxin Fang, f.fang@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Linfeng Li, l.li20@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Modelling carbon dioxide (CO2) in the atmosphere is essential to predict global climate change. The modelling requires data on CO2 emission sources. These sources are usually computed using a so-called bottom-up method, multiplying emission factors by human activities; this relies on the relevant activity inventories and thus often lacks timeliness. On the other hand, a top-down method uses atmospheric observations to quantify CO2 emissions, and such observations can provide frequent updates to emission data. To combine these two methods, effective numerical algorithms, or data assimilation methods, are necessary. Governed by a transport equation, the dispersion of CO2 in the atmosphere requires meteorology data to drive a transport model. While this can be solved with numerical methods such as a Lagrangian particle dispersion model [1] or an Eulerian atmosphere model discretised with finite differences [2], it is also possible to build a neural network surrogate model for the urban transport [3]. Once trained on known emission scenarios, such neural network models have the potential to identify emission sources given some observed CO2 concentrations in the region of interest. Due to their fast inference speed, these models can provide corrections to emission inventories derived from a bottom-up method in a timely manner. For the project, several known scenarios in an idealised domain will be generated as the training data; a neural network architecture will be chosen to fit the correlation between concentration observations and the emission hotspot in the domain; the model can be extended to a real urban environment if time permits. The model will be built with common machine learning packages.

[1] Sargent, et al., Proceedings of the National Academy of Sciences, 2018. 115(29): p. 7491-7496. [2] Maronga, et al., Geosci. Model Dev., 2020. 13(3): p. 1335-1372. [3] Cai, et al. Journal of Advances in Modeling Earth Systems 16.2 (2024): e2023MS003789.
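As an indication of one deliberately simple architecture for the idealised-domain stage, the sketch below defines a small fully convolutional network that takes a gridded 2-D concentration field (with, for example, wind components as extra channels) and predicts a non-negative source-strength map on the same grid. Channel counts, grid size and the architecture are assumptions, not the project's specified model.

```python
# Minimal sketch: a fully convolutional network mapping a gridded CO2 concentration
# field (and, e.g., wind components as extra channels) to an emission-source map on
# the same grid. Channel counts and grid size are placeholders.
import torch
import torch.nn as nn

class SourceIdentifier(nn.Module):
    def __init__(self, in_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),
            nn.Softplus(),                     # emission rates are non-negative
        )

    def forward(self, x):                      # x: (batch, channels, ny, nx)
        return self.net(x)

# Toy usage on a 64x64 idealised domain: concentration plus two wind components.
fields = torch.randn(4, 3, 64, 64)
sources = SourceIdentifier()(fields)
print(sources.shape)                           # torch.Size([4, 1, 64, 64])
```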

Enhancing Simulation Performance and Interactive Web Applications#

  • Project code: adfa-169

  • Main supervisor: Ado Farsi, ado.farsi@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

This MSc project, supervised by Dr Ado Farsi in collaboration with the startup Tanuki Technology Ltd., focuses on advancing computational mechanics and interactive web applications for simulation tools. Students will be assigned one of two key objectives:

  1. Improving the Performance of Mechanics Simulation Code

  • Analyse and optimise existing simulation codes to enhance computational efficiency and scalability.

  • Implement advanced numerical techniques, parallelisation strategies, or GPU acceleration to improve performance.

  • Investigate profiling tools to identify bottlenecks and propose solutions for real-time or large-scale simulations (a minimal profiling sketch is given after this list).

  • Validate performance improvements through benchmarking and comparison with existing solutions.

  2. Development of Interactive Web Applications Using Trame

  • Convert Python-based simulation tools into interactive web applications using Trame.

  • Design an intuitive web-based interface allowing users to set up, run, and visualise simulations.

  • Implement dynamic visualisation of simulation results with real-time updates.

  • Develop a structured workflow with documentation and example applications for future scalability.
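As a small illustration of the profiling step mentioned under the first objective, the sketch below uses only the Python standard library to profile a placeholder kernel; in practice the actual simulation code would be profiled, complemented with line-level or GPU profilers.

```python
# Minimal profiling sketch using the standard library: time a placeholder
# simulation kernel and print the most expensive calls. A real workflow would
# profile the actual solver and complement this with line/GPU profilers.
import cProfile
import pstats
import numpy as np

def placeholder_kernel(n=200, steps=50):
    """Stand-in for an expensive mechanics update (dense matrix-vector products)."""
    a = np.random.rand(n, n)
    x = np.random.rand(n)
    for _ in range(steps):
        x = a @ x
        x /= np.linalg.norm(x)
    return x

profiler = cProfile.Profile()
profiler.enable()
placeholder_kernel()
profiler.disable()

pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)  # top 10 hotspots
```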

Topics Covered:

  • High-Performance Computing Techniques: Optimisation strategies, parallel computing, and GPU acceleration.

  • Web-based Simulation Platforms: Integration of Trame for interactive visualisation and user interaction.

  • Scalable Infrastructure: Implementation of containerised simulation environments and server-client architecture.

This project offers hands-on experience in computational mechanics, scientific computing, and web-based simulation platforms, providing valuable skills for careers in research and industry.

Agent-Based Framework for Full-Waveform Inversion (FWI) Workflows#

  • Project code: gego-205

  • Main supervisor: Gerard Gorman, g.gorman@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Edward Caunt, edward.caunt15@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Objective: Develop an agent-based framework for automating Full-Waveform Inversion (FWI) workflows using Devito. The framework will coordinate multiple agents to handle different stages of FWI, including forward modeling, gradient computation, optimization, and convergence validation.

Scope:

  • Implement a LangGraph-based multi-agent system to structure the FWI pipeline.

  • Design agents for forward simulation, adjoint-state gradient calculation, misfit evaluation, and iterative optimization.

  • Integrate self-verification steps, including numerical tests (e.g., dot-product tests for adjoint verification; see the sketch after this list).

  • Develop strategies to automatically adjust optimization parameters based on convergence diagnostics.
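The numerical self-check referred to above is, at its core, a dot-product test: for a linear forward operator F and its adjoint F^T, the inner products <F x, y> and <x, F^T y> must agree to round-off. A generic sketch follows, with a toy matrix standing in for the Devito-generated forward/adjoint solver pair.

```python
# Generic dot-product (adjoint) test: for a linear operator F with adjoint F^T,
# <F x, y> must equal <x, F^T y> up to round-off. The matrix below is a toy
# stand-in for a Devito-generated forward/adjoint solver pair.
import numpy as np

rng = np.random.default_rng(42)
F = rng.standard_normal((80, 60))          # toy linear forward operator

def forward(x):
    return F @ x

def adjoint(y):
    return F.T @ y

x = rng.standard_normal(60)
y = rng.standard_normal(80)

lhs = np.dot(forward(x), y)
rhs = np.dot(x, adjoint(y))
rel_err = abs(lhs - rhs) / abs(lhs)
print(f"<Fx, y> = {lhs:.12e}, <x, F^T y> = {rhs:.12e}, relative error = {rel_err:.2e}")
assert rel_err < 1e-10                      # the pair passes the adjoint test
```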

Deliverables:

  • A modular framework for AI-assisted FWI workflows.

  • Demonstration of the system on synthetic and real seismic data.

  • Technical report comparing AI-assisted FWI workflow efficiency with traditional approaches.

AI Assistant for Efficient Solver and Adjoint Code Generation in Devito#

  • Project code: gego-204

  • Main supervisor: Gerard Gorman, g.gorman@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Edward Caunt, edward.caunt15@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

The goal of this project is to develop an AI-powered assistant capable of generating optimized solvers and their adjoints using Devito. The assistant will leverage symbolic reasoning and AI-based self-verification to automate the generation of finite-difference solvers, particularly focusing on complex wave physics such as acoustic tilted transverse isotropy (TTI).
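To indicate the target of the code generation, the snippet below shows the sort of Devito specification the assistant would need to emit for a simple constant-density acoustic wave equation (the TTI case is considerably more involved). It follows the standard Devito symbolic workflow rather than any assistant output, and exact API details may differ between Devito versions.

```python
# Hand-written example of the kind of Devito solver specification the assistant
# would generate: a 2-D constant-velocity acoustic wave equation, second order in
# time and fourth order in space. Grid size, velocity and time step are placeholders,
# and no source term is injected here, so it is the structure that matters.
from devito import Grid, TimeFunction, Eq, Operator, solve

grid = Grid(shape=(101, 101), extent=(1000., 1000.))      # 1 km x 1 km domain
u = TimeFunction(name='u', grid=grid, time_order=2, space_order=4)
c = 1.5                                                    # velocity (placeholder)

pde = u.dt2 - c**2 * u.laplace                             # acoustic wave equation
stencil = Eq(u.forward, solve(pde, u.forward))             # explicit update for u at t+dt

op = Operator([stencil])                                   # generates and compiles C code
op(time=200, dt=0.8)                                       # run 200 time steps
```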

Scope:

  • Design a LangChain-based assistant that takes a high-level mathematical description of a PDE and generates Devito code for forward and adjoint solvers.

  • Implement AI-based self-verification to check the correctness of generated solvers using symbolic differentiation and adjoint tests.

  • Optimize generated solvers for performance, considering loop fusion, memory locality, and vectorization.

  • Validate the assistant by generating solvers for seismic imaging applications, comparing performance and accuracy against reference implementations.

Deliverables:

  • A working prototype of the assistant integrated with Devito.

  • Benchmarks comparing generated solvers to reference optimized implementations.

  • A technical report.

AI Agent Framework for HPC Benchmarking and Performance Tuning#

  • Project code: gego-206

  • Main supervisor: Gerard Gorman, g.gorman@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Jack Betteridge, j.betteridge@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Objective: Develop an AI-powered agent framework for benchmarking, analyzing, and tuning HPC performance for Devito-based solvers. The system will autonomously execute benchmarking experiments, analyze performance bottlenecks, and suggest optimizations.

Scope:

  • Design a LangGraph-based agent system to automate benchmarking, including profiling computational kernels and memory access patterns.

  • Implement AI-based performance diagnostics to identify bottlenecks and suggest optimizations (e.g., cache blocking, OpenMP tuning, memory layouts).

  • Use LLMs to review HPC technical reports and synthesise code to help develop strategies.

  • Validate the framework with performance experiments on various HPC architectures (CPUs, GPUs, ARM-based systems).

Deliverables:

  • A benchmarking framework integrated with Devito for automated performance analysis.

  • Reports detailing performance scaling characteristics and recommended tuning strategies.

  • Technical report.

AI Agent for Reverse-Engineering Legacy Finite-Difference Code and Translating to Devito#

  • Project code: gego-207

  • Main supervisor: Gerard Gorman, g.gorman@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Jack Betteridge, j.betteridge@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Objective: Develop an AI-based agent framework to analyze legacy finite-difference (FD) implementations, infer their numerical schemes, and generate equivalent Devito-based solvers.

Scope:

  • Implement a LangChain-based system that takes legacy Fortran/C finite-difference code as input and automatically extracts PDE formulations, discretization schemes, and boundary conditions.

  • Design an AI verification module to ensure fidelity between the legacy implementation and the reimplemented Devito solver.

  • Introduce symbolic analysis to detect errors in the legacy code and suggest corrections before translation.

  • Validate the system by reverse-engineering known finite-difference solvers and comparing their outputs with Devito-generated equivalents.

Deliverables:

  • A working prototype of the AI agent for code translation.

  • Demonstrations with real-world legacy codes from seismic imaging.

  • A technical report on translation accuracy, performance improvements, and limitations.

Weighted Essentially Non-Oscillatory Finite Difference Schemes for Tsunami and Seismic Wave Modelling#

  • Project code: gego-010

  • Main supervisor: Gerard Gorman, g.gorman@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Edward Caunt, edward.caunt15@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Accurate simulations of tsunami and seismic wave propagation are dependent upon accurate numerical solvers which correctly capture the dynamics of the phenomena of interest. Whilst finite-difference (FD) methods are an efficient means of solving the governing partial differential equations (PDEs), conventional FD operators are prone to introducing spurious oscillations at discontinuities in the solution, such as steep gradients or boundaries. Weighted essentially non-oscillatory (WENO) schemes achieve high-order accuracy where solutions are smooth, while also offering enhanced stability, non-oscillatory behaviour, and accuracy at discontinuities in the solution. In the context of geohazard assessment, it is crucial that behaviours observed in numerical models are inherent to the phenomena of interest, rather than being artefacts of the numerical solver. However, irregular bathymetry in the case of tsunami modelling, or topography in the case of seismic waves, can yield complex wavefields in which it is difficult to discern numerical artefacts from true wavefield behaviour. This project aims to explore the application of WENO schemes to either the shallow water equations (for tsunami modelling) or the elastic wave equation (for seismic wave modelling). This project will also demonstrate the implementation of WENO FD schemes via the domain-specific language (DSL) and compiler Devito, leveraging high-level specification and code generation to accelerate the development and testing of new models, whilst taking advantage of the portability and performance benefits offered by this paradigm.
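To make the idea concrete, the sketch below implements the classic fifth-order WENO reconstruction (Jiang-Shu weights) of a left-biased interface value from five cell averages in plain NumPy; the Devito implementation developed in the project would express the same stencils symbolically rather than pointwise as done here.

```python
# Fifth-order WENO reconstruction (Jiang-Shu) of the left-biased interface value
# v_{i+1/2} from five cell averages v_{i-2..i+2}, in plain NumPy for illustration.
import numpy as np

def weno5_left(vm2, vm1, v0, vp1, vp2, eps=1e-6):
    # Candidate third-order reconstructions on the three sub-stencils.
    p0 = (2 * vm2 - 7 * vm1 + 11 * v0) / 6.0
    p1 = (-vm1 + 5 * v0 + 2 * vp1) / 6.0
    p2 = (2 * v0 + 5 * vp1 - vp2) / 6.0
    # Smoothness indicators.
    b0 = 13 / 12 * (vm2 - 2 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4 * vm1 + 3 * v0) ** 2
    b1 = 13 / 12 * (vm1 - 2 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2
    b2 = 13 / 12 * (v0 - 2 * vp1 + vp2) ** 2 + 0.25 * (3 * v0 - 4 * vp1 + vp2) ** 2
    # Non-linear weights from the optimal linear weights (1/10, 6/10, 3/10).
    a0, a1, a2 = 0.1 / (eps + b0) ** 2, 0.6 / (eps + b1) ** 2, 0.3 / (eps + b2) ** 2
    s = a0 + a1 + a2
    return (a0 * p0 + a1 * p1 + a2 * p2) / s

# Smooth data recovers the high-order reconstruction; near a jump the scheme leans
# on the smooth sub-stencils and avoids spurious oscillation.
x = np.linspace(0, 1, 6)
smooth = np.sin(2 * np.pi * x)
print(weno5_left(*smooth[:5]))
jump = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
print(weno5_left(*jump))
```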

Modelling pollution in the urban environment using neural networks#

  • Project code: clhe-212

  • Main supervisor: Claire Heaney, c.heaney@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Air pollution is damaging to our health and can cause a variety of adverse health outcomes such as an increased risk of respiratory infections, heart disease, lung cancer and an increased response to allergens. In urban environments, vehicles contribute significantly to air pollution through noxious gases such as nitrogen oxides and carbon dioxide, as well as through particulate matter.

The CFD models of air flows in urban environments will be obtained through an in-house CFD code, called AI4PDEs [1,2,3], which solves discretised systems of equations using neural networks. The weights of the networks are determined in advance through the choice of discretisation method (e.g. first-order finite elements, second-order finite differences etc), so no training is needed. The solutions are the same as those obtained from Fortran or C++ codes (to within solver tolerances) and the code runs on GPUs as well as CPUs and AI processors.

Project 1: integration of data with computational fluid dynamics models

In CFD (computational fluid dynamics) models, pollution can be modelled by specifying a pollution source and resolving the resulting pollution concentration field as it moves with the air flow. To improve the accuracy of the sources, data such as daily traffic flow and weather conditions can be integrated into the CFD model obtained from AI4PDEs. This will provide better predictions of a person’s exposure to pollution in an urban environment. The traffic and weather data will be obtained through the AI-Respire project (EP/Y018680/1) [4] from Open Weather [5].

Project 2: optimisation of green infrastructure

In this project, the location of green infrastructure (hedges and trees) will be optimised in order to reduce people’s exposure to pollution. The key advantage of using AI4PDEs is that, because it is a neural network, it is easier to form sensitivities than with traditional codes. The sensitivities can be used with the optimisation methods within AI software libraries in order to optimise the location of green infrastructure. As a precursor to this optimisation we will form a guess of some sensible positions of green infrastructure (known as a prior), using a generative model (VAE, GAN or latent diffusion model), in order to help constrain the optimisation to sensible solutions, for example, to prevent trees occurring in the middle of roads.

For these projects, an interest in the following would be beneficial: computational fluid dynamics, neural networks, air pollution (for both projects) and generative AI (for the second project). The coding will be done in PyTorch and Python.

[1] Chen, Heaney, Pain (2024) https://doi.org/10.48550/arXiv.2402.17913

[2] Chen, Heaney, Gomes, Matar, Pain (2024) https://doi.org/10.48550/arXiv.2401.06755

[3] Phillips, Heaney, Chen, Buchan, Pain (2023) https://doi.org/10.1002/nme.7321

[4] https://www.imperial.ac.uk/news/246893/government-funding-revolutionise-ai-healthcare-research/

Modelling the microclimate in storm-prone areas#

  • Project code: clhe-209

  • Main supervisor: Claire Heaney, c.heaney@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

This project aims to develop high-resolution microclimate models to enhance our understanding of regions frequently affected by storms. Microclimates describe localised climate variations, and are influenced by topography, vegetation and urban structures. They are crucial for determining storm intensities, precipitation patterns and temperature fluctuations. Results would be of interest to policymakers, urban planners and emergency responders, all of whom have to make informed decisions to mitigate storm impacts.

This project will use data from computational simulations of storms to train a neural network in order to make predictions of storms for regions not associated with the training data. To obtain a model that has good generalisation capabilities, we will experiment with the state-of-the-art networks for prediction, including U-Nets and neural operators. The latter have demonstrated their ability to enhance the speed and resolution of weather forecasting models, particularly in high-resolution, short-term global weather predictions [1].

This project would suit students with a keen interest in machine learning techniques and microclimates.

[1] Pathak, Subramanian, Harrington, et al. (2022) FourCastNet: A global data-driven high-resolution weather model using adaptive Fourier Neural Operators. arXiv preprint arXiv:2202.11214 https://arxiv.org/abs/2202.11214.

Using Neural Networks to Model Improvements in Urban Air Quality due to Photocatalytic Materials#

  • Project code: clhe-210

  • Main supervisor: Claire Heaney, c.heaney@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Andreas Kafizas, a.kafizas@imperial.ac.uk, Department of Chemistry, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Air pollution is damaging to our health and can cause a variety of adverse health outcomes such as increased risk of respiratory infections and heart disease. Nitrogen oxides (NOx) contribute to air pollution and are released when fossil fuels undergo combustion in vehicles and power stations. NOx also contributes to the formation of particulate matter and ground level ozone.

Photocatalytic materials offer an exciting potential for the removal of NO2 from the air by turning it into N2 and O2 in the presence of light [1]. A trial carried out on a section of the M1 motorway showed a reduction in NO2 due to a barrier coated with photocatalytic material [2]. Building materials including brick and glass can also be sprayed with photocatalytic coatings, and this project will attempt to analyse the effectiveness of photocatalytic coatings on buildings in removing NO2 from the air.

In this project, we will investigate a number of urban layouts and model the potential reduction of NO2 from the air due to the use of photocatalytic coatings. To model the air flows, we will use an in-house computational fluid dynamics code, called AI4PDEs [3,4,5], which solves discretised systems of equations using neural networks. The weights of the networks are determined in advance through the choice of discretisation method, so no training is needed. The deposition rate of NO2 onto the photocatalyst is known from experiment, and this will need to be included in the AI4PDEs code to model how much NO2 can be removed.

For this project, an interest in several of the following would be beneficial: computational fluid dynamics, neural networks and air pollution. The coding will be done in PyTorch and Python.

[1] Towards Purer Air https://eic-uk.co.uk/media/baecbnd4/towards-purer-air.pdf

[2] Kafizas, Rhys-Tyler (2023) https://nationalhighways.co.uk/media/qp3fr5mg/smogstop-report-final-002.pdf

[3] Chen, Heaney, Pain (2024) https://doi.org/10.48550/arXiv.2402.17913

[4] Chen et al. (2024) https://doi.org/10.48550/arXiv.2401.06755

[5] Phillips et al. (2023) https://doi.org/10.1002/nme.7321

Creating an App for Visualising Green Infrastructure in the Urban Environment#

  • Project code: clhe-211

  • Main supervisor: Claire Heaney, c.heaney@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Prashant Kumar, p.kumar@surrey.ac.uk, University of Surrey, UK

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Green infrastructure (GI), such as hedges, trees, green walls and roofs, has recently demonstrated the capability to reduce the exposure of individuals to air pollution. For example, hedges can form barriers to pollution [1], trees can encourage air flow and divert pollutant-laden air away from pedestrians, and plants can absorb groundwater, reducing the impact of flooding [3]. Many local councils are adopting strategies based on green infrastructure in towns and cities to assist in creating more resilient and healthier urban environments. Other benefits of GI include its ability to enhance mental health and well-being, and its capability to support biodiversity.

Working alongside researchers from the GP4Streets consortium [5], the student will develop an AI tool that can visualise GI for an individual household or street and produce a commentary on the benefits of the particular GI. With inputs describing where the GI is to be located, what type of GI is desired, what type of housing it is to be applied to, and possibly photos of the house or street, or images from Google Earth, a generative model will be used to produce an image of what the household or street would look like with the GI. In the first instance, the app will be developed using pre-existing tools for visualising GI, but as the project progresses, we will explore the production of our own generators.

For this project, an interest in the following would be beneficial: image generation, generative models and reducing exposure to air pollution. The coding will be done in PyTorch, Python and other languages.

[1] Abhijith et al (2025) https://doi.org/10.1016/j.scitotenv.2024.177959

[2] Tomson et al (2021) https://doi.org/10.1016/j.envint.2020.106288

[3] https://www.woodlandtrust.org.uk/trees-woods-and-wildlife/british-trees/flooding/

[4] https://x.com/UniOfSurrey/status/1899067721456775324

[5] GP4Streets consortium https://www.gp4streets.org/

Inverse problems for bio-physics using neural networks#

  • Project code: clhe-213

  • Main supervisor: Claire Heaney, c.heaney@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Pranav Mamidanna, p.mamidanna22@imperial.ac.uk, Department of Bioengineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

When assessing and promoting muscle health during rehabilitation, it is preferable to avoid invasive procedures which could increase risk, increase recovery time and increase the potential for complications. Surface electromyography measures the electric potential generated when muscles contract, which can be used to assess muscle health and guide therapy without the need for invasive techniques [1].

During this project, the student will solve Maxwell’s equations to determine the electrical signals present on the skin which are affected by the composition of the muscle (the “forward problem”). The next step is to take observations of the electrical signals and work out what composition of muscle could give rise to such observations (the “inverse problem”).

The numerical discretisation will be implemented using convolutional layers from PyTorch [2, 3, 4] and the resulting systems can be solved using Jacobi iteration or multigrid methods [2, 3]. Having a forward model that can be expressed as a neural network means that the model is differentiable, so that PyTorch’s backpropagation algorithm can be used to solve the inverse problem.
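To illustrate the "solver as a differentiable network" idea in general terms (written from scratch as a toy example, not taken from the AI4PDEs code of refs [2-4]), the sketch below solves a 2-D Poisson-type problem by Jacobi iteration, where the finite-difference stencil is the fixed, untrained weight of a convolution; because every step is differentiable, autograd gives gradients of any observation with respect to the source term, which is the mechanism exploited for the inverse problem.

```python
# Toy illustration of a differentiable solver built from a convolution whose weights
# are a fixed finite-difference stencil (no training). Jacobi iteration for a 2-D
# Poisson problem with zero Dirichlet boundaries; sizes and iteration count are
# placeholders, and this is written from scratch rather than taken from AI4PDEs.
import torch
import torch.nn.functional as F

n, h = 64, 1.0 / 64
# Fixed 5-point stencil that averages the four neighbours (Jacobi update kernel).
kernel = torch.tensor([[[[0., 0.25, 0.],
                         [0.25, 0., 0.25],
                         [0., 0.25, 0.]]]])

source = torch.zeros(1, 1, n, n, requires_grad=True)
with torch.no_grad():
    source[0, 0, n // 2, n // 2] = 1.0          # point source (illustrative)

u = torch.zeros(1, 1, n, n)
for _ in range(500):                             # Jacobi iterations
    u = F.conv2d(F.pad(u, (1, 1, 1, 1)), kernel) + 0.25 * h ** 2 * source

# Gradients of an "observation" with respect to the source are available via autograd.
obs = u[0, 0, 10, 10]
obs.backward()
print(obs.item(), source.grad.abs().max().item())
```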

For this project, an interest in several of the following would be beneficial: numerical methods, optimisation and inverse modelling, neural networks and bio-physics (no specialist knowledge of the latter is required). The coding will be done in PyTorch and Python. This project will show how knowledge and code applied to fluid dynamics can be re-applied to bio-physics.

[1] Maksymenko, Clarke, Mendez Guerra, et al (2023) https://doi.org/10.1038/s41467-023-37238-w

[2] Chen, Heaney, Pain (2024) https://doi.org/10.48550/arXiv.2402.17913

[3] Chen, Heaney, Gomes, Matar, Pain (2024) https://doi.org/10.48550/arXiv.2401.06755

[4] Phillips, Heaney, Chen, Buchan, Pain (2023) https://doi.org/10.1002/nme.7321

Development of an open-source software tool for rapid, probabilistic assessment of Aquifer Thermal Energy Storage (ATES) and Groundwater Heating and Cooling (GWHC) installations#

  • Project code: maja-031

  • Main supervisor: Matthew Jackson, m.d.jackson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Meissam Bahlali, m.bahlali@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Open-loop geothermal systems pump groundwater into and out of an aquifer to provide low carbon heating and/or cooling to buildings via a heat pump. Groundwater Heating and Cooling (GWHC) systems produce groundwater at ambient temperature and re-inject warmed or cooled groundwater after use. Aquifer Thermal Energy Storage (ATES) systems re-use the warmed groundwater to provide heating in winter, and the cooled groundwater to provide cooling in summer.
Initial scoping studies to assess the potential capacity of GWHC and ATES systems, prior to detailed modelling, typically use a simple, deterministic approach, choosing just a few values of key parameters such as the number of boreholes, borehole flowrates, and the groundwater temperature change. The problem with this approach is that it provides little or no information on the possible range of system capacities along with the probability that a given capacity can be achieved. A better option is to assess potential capacity probabilistically. A simple Monte-Carlo method implements numerous trials, drawing values of key system parameters from user-defined probability distributions. Such an approach has been implemented at Imperial in an Excel framework, using the CrystalBall™ plug-in. However, CrystalBall is a commercial package, which is a barrier to widespread uptake. The aim of this project is to develop a dedicated, open-source code that implements the Monte-Carlo method to assess the potential capacity of ATES and GWHC systems, via a simple-to-use interface and with some functionality to plot results. The code should be free to use (no third-party license requirements). Demand for such a code is large and continues to grow; widespread uptake is expected. The project will suit a student with an interest in green energy provision and a desire to make a direct contribution to the uptake of new technology through the provision of a practical, useable software tool.
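A minimal sketch of the Monte-Carlo core such a tool would wrap is given below (NumPy only, so no third-party licence is needed): draw borehole count, flow rate and temperature change from assumed distributions, compute a thermal capacity per trial, and report percentiles. The distributions and the per-borehole capacity relation (P = rho * c_p * Q * dT) are generic placeholders, not the existing Imperial Excel/CrystalBall implementation.

```python
# Minimal Monte-Carlo sketch for probabilistic ATES/GWHC capacity assessment.
# Thermal capacity per trial: P = n_boreholes * rho * cp * Q * dT, with parameter
# distributions chosen purely for illustration (not the existing Excel workbook).
import numpy as np

rng = np.random.default_rng(1)
n_trials = 100_000

rho, cp = 1000.0, 4184.0                                    # water density [kg/m3], heat capacity [J/kg/K]
n_boreholes = rng.integers(2, 7, n_trials)                  # number of boreholes per trial
flow_rate = rng.triangular(0.005, 0.015, 0.03, n_trials)    # per-borehole flow [m3/s]
delta_T = rng.normal(6.0, 1.5, n_trials).clip(min=1.0)      # temperature change [K]

capacity_MW = n_boreholes * rho * cp * flow_rate * delta_T / 1e6

p10, p50, p90 = np.percentile(capacity_MW, [10, 50, 90])
print(f"P10 = {p10:.2f} MW, P50 = {p50:.2f} MW, P90 = {p90:.2f} MW")
print(f"Probability capacity exceeds 2 MW: {(capacity_MW > 2.0).mean():.1%}")
```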

Diagnosing and rectifying a failed Aquifer Thermal Energy Storage System (ATES) installation in London#

  • Project code: maja-032

  • Main supervisor: Matthew Jackson, m.d.jackson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Carl Jacquemyn, c.jacquemyn@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: GEMS

  • This project may accept multiple students.

Aquifer Thermal Energy Storage (ATES) is a technology used to provide sustainable, low carbon heating and cooling to large buildings such as office blocks, hospitals and museums. In summer, waste heat is captured from the building’s cooling system and stored underground as warm water in a shallow reservoir (aquifer). In winter, the warm water is pumped out and used to provide heating via a heat pump, and the waste cool is captured and stored underground as cool water. A well-engineered system can provide heating and cooling indefinitely and save up to 96% of the CO2 emitted by conventional heating and cooling systems. However, not all systems are well engineered. This project will investigate the performance of a failing ATES system installed in London. Monitoring data from the system will be analysed and a number of reservoir simulation models will be constructed that can replicate the system behaviour, testing different hypotheses for the cause of failure. Based on the model predictions, remedial actions will be identified that could be implemented to rectify system operation. These actions will be communicated to the system operator. The main focus of the project will be on understanding the subsurface reservoir behaviour, as preliminary work suggests that this is the cause of system failure rather than the operation of the surface infrastructure.

Assessing the potential for Aquifer Thermal Energy Storage (ATES) to heat and cool Imperial’s South Kensington campus#

  • Project code: maja-033

  • Main supervisor: Matthew Jackson, m.d.jackson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Carl Jacquemyn, c.jacquemyn@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: GEMS

  • This project may accept multiple students.

Aquifer Thermal Energy Storage (ATES) is a technology used to provide sustainable, low carbon heating and cooling to large buildings such as office blocks, hospitals and museums. In summer, waste heat is captured from the building’s cooling system and stored underground as warm water in a shallow reservoir (aquifer). In winter, the warm water is pumped out and used to provide heating via a heat pump, and the waste cool is captured and stored underground as cool water. A well-engineered system can provide heating and cooling indefinitely and save up to 96% of the CO2 emitted by conventional heating and cooling systems.
Imperial is currently assessing options to decarbonize heating and cooling of its South Kensington Campus. ATES is a candidate technology, exploiting the Chalk aquifer. A test borehole drilled in Prince’s Gardens has confirmed the aquifer is present and has a large flow capacity. The aim of this project is to assess the potential capacity of an ATES system to heat and cool the South Kensington campus, using numerical simulation in IC-FERST (the Imperial College Finite Element Reservoir Simulator). A key aspect of the project will be to implement, in Python, a novel borehole flow control system in IC-FERST that can adjust the flow rates from multiple boreholes to optimize system efficiency.

Development of “MyDigitalTwin” for Early Detection and Risk Prediction of Gastrointestinal cancers#

  • Project code: paki-040

  • Main supervisor: Patrick Kierkegaard, p.kierkegaard@imperial.ac.uk, Department of Surgery and Cancer, Imperial College London

  • Second supervisor: Parastoo Salah, p.salah@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Background: Gastrointestinal (GI) cancers are often diagnosed at advanced stages, leading to limited treatment options and poor survival outcomes. Screening asymptomatic populations remains unfeasible, highlighting the need for innovative tools to enable early detection and continuous monitoring of high-risk groups.

Aims: This study aims to develop MyDigitalTwin, an intelligent health tool designed to create digital replicas of patients. By predicting individual GI cancer risks, it will facilitate early detection, identify disease precursors, and guide biomarker testing and personalized behavioural interventions.

Methodology: Machine learning techniques will be employed to integrate patient data from the All of Us dataset, including genetic, epigenetic, and clinical information. Modular design principles will drive the development, testing, refinement, and validation of digital twin models. The process will focus on identifying clinical and lifestyle risk factors as well as epigenetic signatures linked to GI cancers. Multi-stage participatory design will ensure usability in primary care, enabling tailored recommendations for screening and prevention.

Expected Outcomes: The primary outcome will be the creation of the MyDigitalTwin platform, capable of predicting GI cancer risks and simulating patient health scenarios. This tool will optimize early detection strategies, guide biomarker tests, and support personalized prevention and lifestyle modifications. By leveraging the diverse and longitudinal data within All of Us, MyDigitalTwin aims to address gaps in GI cancer care, improving early intervention and reducing the burden of late-stage diagnoses.

Forecasting injection-induced seismicity using machine learning techniques#

  • Project code: imki-057

  • Main supervisor: Iman Rahimzadeh Kivi, i.rahimzadeh-kivi@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Victor Vilarrasa, victor.vilarrasa@csic.es, Spanish National Research Council, Spain

  • Available to: EDSML GEMS

  • This project may accept multiple students.

Low-carbon geoenergies play a critical role in the energy transition necessary to achieve climate goals. Exploiting these resources commonly involves fluid injection/extraction into/from the subsurface, which has frequently induced earthquakes, sometimes large enough to be felt at the surface and even cause damage to infrastructure. Felt induced earthquakes have led to the premature termination of a number of geoenergy projects with the loss of investment in recent years. Forecasting and managing induced earthquakes are challenging tasks because they need (near) real-time processing of large, multi-physics datasets acquired during the operation to anticipate the spatiotemporal evolution of seismicity. The challenging nature of this problem is evidenced by the lack of any industrial protocol that can successfully forecast and mitigate the seismicity risk. In this project, the student(s) will develop improved understanding and forecasting capabilities of induced seismicity by learning relationships embedded in field data using machine learning methodologies. The data will be collected from geothermal energy exploitation projects, e.g., the Soultz geothermal project in France and the Espoo geothermal system in Finland, where injection-induced seismicity raised public concerns about the safety of these activities. The datasets comprise the recorded seismicity catalogues and operational data such as time-varying injection rate, bottomhole pressure and cumulative injected fluid volume, all of which are publicly accessible. The students will lead data collection, quality screening and processing. They will adapt open-source machine learning codes to extend their capabilities for the analysis of the collected data, to identify the leading operational controls on injection-induced seismicity and develop a seismicity forecasting tool. This innovative, data-driven seismicity forecasting approach will offer high gains for the scientific community, the geoenergy industry and the public as it paves the way for geoenergy projects to safely contribute to the mitigation of the climate change emergency.

Hydrogen storage modelling using StrataTrapper Kr-Pc upscaling applied to an underground gas storage reservoir#

  • Project code: sakr-041

  • Main supervisor: Sam Krevor, s.krevor@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Arthur Thenon, arthur.thenon@storengy.com, Storengy

  • Available to: GEMS

  • This project does not accept multiple students.

We are seeking a motivated and detail-oriented reservoir modeling & simulation intern to join our team at Storengy Headquarters in Bois-Colombes, France. The intern will join the Geosciences Department within the Storengy Technical Expertise division. The goal of this internship is to evaluate an innovative modelling workflow developed by Imperial College (https://imperialcollegelondon.github.io/StrataTrapper/) that could improve the prediction of gas movement within the reservoir. To fulfill the coding component requirements of the MSc project, the student may either (1) extend the StrataTrapper toolset by implementing hysteresis in the upscaling or (2) use the simulations to train a surrogate model of hydrogen storage using an ML approach.

SCALED-S: Accelerating SCALED Through Compressed Variables for Multiple Physics Problems (Heat Conduction (Geothermal), Single-Phase Flow, Multiphase Flow, Particle Flows, Radiation)#

  • Project code: yuli-171

  • Main supervisor: Yueyan Li, yl222@ic.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

SCALED is a grid-invariant, geometry-invariant, and scalable surrogate AI model for numerical solvers, built on a generative modeling framework. It takes the solutions from physics-based models from the previous timestep as input and predicts the solutions for the next timestep. While the model has been validated for urban fluid flow (single-phase flow), it faces a significant time-efficiency challenge due to the large domain sizes and the denoising timesteps required by the framework. Inspired by latent diffusion models, we aim to develop a compression model (e.g., VAE or VQ-VAE) that compresses physical quantities into a latent space, enabling SCALED to operate within this reduced space. This approach will increase the efficiency of SCALED’s inference. To preserve SCALED’s grid-invariance, geometry invariance, and scalability, we also plan to ensure that the compression model itself is grid-invariant. Physical problems of interest include single-phase flow, multiphase flow, particle flows, and radiation, among others.

We will provide:

  • The base SCALED model code

  • Baseline compression network code

  • Relevant datasets for various physics problems

Students will use the computing resources at IC-HPC to carry out experiments.

SCALED-X: Extending SCALED to Multiple Physics Problems (Heat Conduction (Geothermal), Multiphase Flow, Particle Flows, Radiation)#

  • Project code: yuli-170

  • Main supervisor: Yueyan Li, yl222@ic.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

SCALED is a grid-invariant, geometry-invariant, and scalable surrogate AI model for numerical solvers, built on a generative modeling framework. It takes the solutions from physics-based models from the previous timestep as input and predicts the solutions for the next timestep. This model has already been successfully validated for urban fluid flow. We now aim to extend its application to additional physical problems such as multiphase flows, particle flows, radiation, and more.

We will provide:

  • The base code for the SCALED model.

  • Relevant test datasets for the physical problems of interest.

Students will conduct experiments using the computing resources at IC-HPC. This project will suit students with an interest in machine learning, diffusion models and one of the applications listed.

Multimodal Learning in healthcare#

  • Project code: chli-009

  • Main supervisor: Che Liu, che.liu21@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Multimodal learning is essential in healthcare, as it enables the integration of diverse data types—such as medical reports, imaging data, and clinical notes—into cohesive, insightful models. This capability is crucial for improving diagnostic accuracy, personalized treatment plans, and overall healthcare efficiency.

In this project, we aim to explore the potential of multimodal large language models (LLMs) for processing and analyzing healthcare data. Our focus is on integrating and leveraging data from various modalities, including textual medical reports and medical images, to build a robust framework that enhances understanding, decision-making, and predictive capabilities in clinical settings.

This work holds promise for advancing multimodal AI applications in healthcare and addressing the unique challenges of interpreting complex, high-dimensional medical data.

Predictive Modelling of Geometallurgical Variables Using Machine Learning: Addressing Geospatial Uncertainty with Limited Data#

  • Project code: dime-047

  • Main supervisor: Diego Mesa, d.mesa@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Douglas Mazzinghy, douglasmazzinghy@ufmg.br, Universidade Federal de Minas Gerais, Brazil

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

In geometallurgy, accurately modelling variables such as comminution indices and metallurgical recovery is essential for optimising mine planning and processing strategies. Unlike ore grades, these variables are non-additive and are typically measured from a limited number of samples, making traditional geostatistical methods (e.g., kriging) unsuitable for their prediction. This project explores the use of machine learning (ML) to predict geometallurgical variables for block models by leveraging correlations with ore grades, which are spatially modelled using kriging.

The objective of the project is to develop an ML-based methodology capable of:

  1. Modelling complex, non-linear relationships between grades and geometallurgical variables.

  2. Quantifying spatial uncertainty in predictions.

  3. Producing geospatially consistent estimates to fill data gaps in block models.

Three ML approaches have been considered for this task:

  1. Gradient Boosting Machines (GBMs): Efficient algorithms like XGBoost and LightGBM, well-suited for small datasets, offering strong predictive performance and interpretable feature importance.

  2. Gaussian Processes (GPs): A probabilistic method that naturally quantifies uncertainty and models spatial relationships using covariance functions such as the Matérn kernel (see the sketch after this list).

  3. Random Forests (RFs): A robust and simple tree-based algorithm, ideal for sparse data, which can be extended to quantify uncertainty using methods like quantile regression forests.
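
To illustrate the Gaussian-process option above, the following minimal scikit-learn sketch fits a GP with a Matérn kernel to synthetic placeholder data and returns both a prediction and a pointwise uncertainty; the features and target are stand-ins, not the project's datasets.

```python
# Minimal sketch of option 2 (Gaussian process with a Matérn kernel); data are synthetic placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 4))            # e.g. ore grades plus block coordinates for 60 sampled blocks
y = 10 * X[:, 0] + np.sin(3 * X[:, 1]) + 0.1 * rng.standard_normal(60)   # stand-in for a recovery index

kernel = 1.0 * Matern(length_scale=np.ones(4), nu=1.5) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

X_new = rng.uniform(size=(5, 4))         # unsampled blocks to be filled in
mean, std = gp.predict(X_new, return_std=True)   # prediction plus pointwise uncertainty
print(mean, std)
```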

The student will have the flexibility to focus on one of these methods and compare it against a simple multivariate linear regression, or to implement multiple algorithms and compare their performance. The project will involve:

  1. Data preparation and pre-processing using publicly available geometallurgical datasets.

  2. Algorithm development and training based on the chosen approach(es).

  3. Performance evaluation in terms of prediction accuracy, geospatial continuity, and uncertainty quantification.

Traffic Congestion Prediction using Graph-Based Deep Learning#

  • Project code: almi-199

  • Main supervisor: Alessandro Micheli, am1118@ic.ac.uk, School of Public Health, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Traffic congestion remains a persistent challenge for modern urban transportation systems, significantly compromising both urban air quality and overall environmental health. In this project, we investigate an innovative approach to modeling and predicting traffic congestion. Our methodology will focus on modern graph-based deep learning architectures, e.g. Graph Convolutional Networks, which are specifically designed to process data structured as graphs. Traditional methodologies, such as Gaussian Processes on graphs, are effective at modeling spatial structure, but they struggle to incorporate temporal dependencies because of scalability issues when processing large datasets. We address this limitation by first benchmarking our approach against these established methods, focusing solely on spatial modeling to set a performance baseline. Once this benchmark is established, we extend our model by leveraging the capability of deep learning architectures to handle vast amounts of data and integrate temporal dynamics. This project requires experience in Python programming and data wrangling (e.g. pandas, PyTorch), as well as a good understanding of spatial statistics.
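
For orientation, the sketch below implements a single graph-convolution layer (the standard Kipf-and-Welling propagation rule) on a toy road-network adjacency matrix; the data are synthetic and the layer is only indicative of the kind of architecture the project would build on.

```python
# Minimal sketch of one GCN layer applied to node (road-segment) features; all data synthetic.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """H' = relu(D^{-1/2} (A + I) D^{-1/2} H W), the standard GCN propagation rule."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, H, A):
        A_hat = A + torch.eye(A.shape[0])              # add self-loops
        d = A_hat.sum(dim=1)
        D_inv_sqrt = torch.diag(d.pow(-0.5))
        A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt       # symmetric normalisation
        return torch.relu(self.lin(A_norm @ H))

# 5 road segments, 3 features each (e.g. current speed, flow, time of day).
A = torch.tensor([[0., 1, 0, 0, 1],
                  [1., 0, 1, 0, 0],
                  [0., 1, 0, 1, 0],
                  [0., 0, 1, 0, 1],
                  [1., 0, 0, 1, 0]])
H = torch.randn(5, 3)
layer = GCNLayer(3, 8)
print(layer(H, A).shape)   # torch.Size([5, 8])
```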

Diffusion Models for Inverse Problems in Time-Series Environmental Data#

  • Project code: memo-198

  • Main supervisor: Mélodie Monod, melodie.monod18@imperial.ac.uk, School of Public Health, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: EDSML

  • This project may accept multiple students.

Diffusion models, a class of generative models, have recently emerged as powerful tools for addressing inverse problems. In the context of Bayesian inference, inverse problems typically involve estimating unknown variables given observed data. Diffusion models offer a flexible way to encode prior information in such problems, with several advantages over traditional approaches such as Markov chain Monte Carlo (MCMC): in particular, they allow the use of empirical priors - where only sampling (rather than density evaluation) is required - and they scale to larger datasets. Recent advances have broadened their applicability further, allowing any likelihood from the exponential family [1] and making them well-suited to real-world data such as environmental data. While diffusion models have shown great promise for inverse problems with two-dimensional observations, where empirical priors are typically sampled from ImageNet, this project aims to develop a state-of-the-art diffusion model for time-series (one-dimensional) data. The key innovation will be to build an empirical prior for time series, similar to how ImageNet serves as a general-purpose prior for images. Repositories such as the Monash Time Series Forecasting Repository and the UCR/UEA Time Series Archive could provide the prior samples used to develop this model. A key application of the model will be photovoltaic (PV) energy nowcasting, the task of predicting short-term solar energy generation from historical data. The model will build on existing approaches from [2]. This project is ideal for students with a strong background in Bayesian analysis and machine learning, particularly those with proficiency in Python programming. A passion for AI and a deep interest in time-series analysis and its applications will be essential for success in this project. Knowledge of High-Performance Computing (HPC) at Imperial would be an added advantage.

[1] Diffusion Models for Inverse Problems in the Exponential Family. Alessandro Micheli, Mélodie Monod, Samir Bhatt. 2025. arXiv.

[2] Short-term Prediction and Filtering of Solar Power Using State-Space Gaussian Processes. Sean Nassimiha, Peter Dudfield, Jack Kelly, Marc Peter Deisenroth, So Takao. 2023. arXiv.
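
As a rough sketch of the kind of model involved, the following code performs one denoising-diffusion (epsilon-prediction) training step on synthetic one-dimensional series; the tiny convolutional network, the noise schedule and the data are placeholders, not the model this project will develop.

```python
# Minimal sketch of a denoising-diffusion training step for 1-D series; network and data are toy stand-ins.
import torch
import torch.nn as nn

T = 100
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)       # cumulative product \bar{alpha}_t

# Tiny noise-prediction network: input = noisy series + timestep channel, output = predicted noise.
eps_model = nn.Sequential(
    nn.Conv1d(2, 32, 5, padding=2), nn.GELU(),
    nn.Conv1d(32, 32, 5, padding=2), nn.GELU(),
    nn.Conv1d(32, 1, 5, padding=2),
)
opt = torch.optim.Adam(eps_model.parameters(), lr=1e-3)

x0 = torch.randn(16, 1, 128)                        # batch of "clean" series drawn from the prior dataset
t = torch.randint(0, T, (16,))
a = alpha_bar[t].view(-1, 1, 1)
noise = torch.randn_like(x0)
xt = a.sqrt() * x0 + (1 - a).sqrt() * noise         # forward (noising) process
t_channel = (t.float() / T).view(-1, 1, 1).expand(-1, 1, x0.shape[-1])
pred = eps_model(torch.cat([xt, t_channel], dim=1))
loss = ((pred - noise) ** 2).mean()                 # standard epsilon-prediction objective
opt.zero_grad()
loss.backward()
opt.step()
```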

Accelerating Multi-Scale Simulations with Finite Basis Physics-Informed Neural Networks#

  • Project code: bemo-062

  • Main supervisor: Ben Moseley, b.moseley@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE

  • This project does not accept multiple students.

Motivation: Multi-scale simulations are essential for modelling complex phenomena, from biological systems to the evolution of the universe. However, traditional numerical methods, like finite element modelling, often require highly detailed simulation meshes and supercomputers to achieve accurate results. Physics-Informed Neural Networks (PINNs) have emerged as a promising alternative for simulation. Unlike traditional methods, PINNs bypass complex simulation meshes and can integrate observational data directly into the modelling process. However, PINNs still face significant challenges, including computational efficiency and difficulty in handling multi-scale interactions. Our recent work on Finite Basis PINNs (FBPINNs) demonstrated a way to address these challenges. By combining PINNs with domain decomposition and multilevel modelling, FBPINNs effectively break down global problems into smaller, manageable subproblems and facilitate communication between scales. Despite these advances, training efficiency remains a bottleneck.

Method: This project will focus on improving the training efficiency of FBPINNs, enabling faster and more accurate multi-scale simulations. We will address this through two key innovations:

  1. Combining FBPINNs with Extreme Learning Machines (ELMs): We will investigate using ELMs (randomly initialised neural networks where only the last layer is trainable) to linearise the underlying FBPINN optimisation problem, making it faster to solve (see the sketch below).

  2. Learning basis functions: We will develop methods to learn better initialisations of the basis functions in the FBPINN, to improve the convergence rate of the model during training.

The student will design and test these methods on a range of partial differential equations, including examples from fluid dynamics and wave propagation. They will build upon the existing FBPINN library written in JAX to implement their methods.
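
To make the ELM idea concrete, the sketch below solves a toy 1-D Poisson problem with a single extreme learning machine: because the random hidden weights are frozen, the collocation system is linear in the output weights and can be solved directly by least squares. This is a simplified stand-in for the FBPINN setting, not the project's implementation.

```python
# Minimal sketch of the ELM idea on a toy 1-D Poisson problem u''(x) = f(x), u(0) = u(1) = 0.
# Random hidden weights are frozen, so the PDE collocation system is linear in the output
# weights and is solved by least squares instead of gradient descent.
import numpy as np

rng = np.random.default_rng(0)
M, N = 200, 100                                   # hidden units, collocation points
a = rng.normal(scale=10.0, size=M)                # frozen random hidden weights
b = rng.uniform(-10.0, 10.0, size=M)              # frozen random hidden biases
x = np.linspace(0.0, 1.0, N)[:, None]

t = np.tanh(a * x + b)                            # hidden activations, shape (N, M)
phi_pde = (a**2) * (-2.0 * t * (1.0 - t**2))      # second derivative of each hidden unit w.r.t. x
t_bc = np.tanh(a * np.array([[0.0], [1.0]]) + b)  # boundary rows

f = -np.pi**2 * np.sin(np.pi * x)                 # manufactured source; exact solution u = sin(pi x)
A_sys = np.vstack([phi_pde, 100.0 * t_bc])        # weight boundary rows to enforce u(0) = u(1) = 0
rhs = np.vstack([f, np.zeros((2, 1))])
w, *_ = np.linalg.lstsq(A_sys, rhs, rcond=None)   # linear solve for the output weights

u = t @ w
print(np.max(np.abs(u - np.sin(np.pi * x))))      # error of the ELM solution
```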

Research Questions:

  • How effectively can extreme learning machines accelerate the training of FBPINNs?

  • Do learned basis functions improve the convergence and accuracy of multi-scale simulations?

  • How do these methods perform on challenging PDEs such as the Navier-Stokes equations?

Outcome: The project will develop faster and more efficient approaches to multi-scale simulation using PINNs. The student will gain hands-on experience with cutting-edge methods in SciML and JAX-based programming, and contribute to the broader effort of creating efficient tools for scientific simulation.

Physics-Informed Neural Networks for Monitoring Enhanced Rock Weathering (Industry project with UNDO Carbon)#

  • Project code: bemo-059

  • Main supervisor: Ben Moseley, b.moseley@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Will Turner, will@un-do.com, UNDO Carbon

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

Motivation: Enhanced Rock Weathering (ERW) is a promising carbon dioxide removal (CDR) strategy with the potential to mitigate climate change by accelerating the natural weathering of silicate minerals. However, existing measurement techniques struggle to reliably quantify carbon capture in the complex and heterogeneous environments where ERW is applied. This limits our ability to understand and optimise its effectiveness. To address this challenge, advanced computational methods are needed to integrate diverse monitoring data—ranging from geochemical and environmental measurements to electrical geophysical imaging—into cohesive models of mineral weathering and carbon sequestration processes.

Method: This project will develop new computational tools for analysing and modelling ERW processes, focusing on integrating time-lapse Electrical Resistivity Tomography (ERT) and Spectral Induced Polarization (SIP) measurements alongside comprehensive auxiliary data (geochemical, environmental, microbiological, soil measurements). The project involves two key components:

  1. Physics-Informed Neural Networks (PINNs) for weathering modelling: PINNs will be developed to simulate mineral weathering, which will integrate concepts from established geochemical modelling approaches, such as those implemented in PHREEQC and PFLOTRAN. Initially, synthetic datasets will be used to train and validate these models under varying environmental and geochemical conditions.

  2. Automated workflows for electrical geophysical data analysis: The student will build workflows to apply their PINNs for processing and interpreting real time-series ERT and SIP data from a planned ERW monitoring field deployment in May 2025. These workflows will reconstruct subsurface property changes, providing insights into mineral weathering processes.

Available Data: Initial work will rely on synthetic datasets and established geochemical models for validation. Once field data becomes available, the developed methods will be applied to real-world measurements, evaluating their ability to track and quantify subsurface changes associated with ERW.

Research Questions: Can PINNs effectively simulate mineral weathering processes by integrating geophysical (electrical), geochemical, and environmental data? How well do automated workflows perform in reconstructing subsurface property changes? What role does domain knowledge play in improving the robustness and accuracy of PINNs for ERW monitoring?

Outcome: The student will develop new computational tools that combine geophysical imaging, biogeochemical modelling, and deep learning, contributing directly to advancing techniques for monitoring carbon capture in ERW. You will gain hands-on experience with cutting-edge computational methods in climate solutions while developing skills in scientific Python programming, deep learning frameworks, and environmental data analysis. The project offers opportunities for publication and contributes directly to advancing carbon dioxide removal monitoring techniques.

Diffusion models and ultrasound full-waveform inversion#

  • Project code: bemo-082

  • Main supervisor: Ben Moseley, b.moseley@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Carlos Cueto, c.cueto@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Motivation: Ultrasound full-waveform inversion (FWI) is a promising technique for high-resolution imaging of complex structures, such as the human brain. However, FWI often struggles with the ill-posedness of the inverse problem: there is insufficient observational data to uniquely determine the underlying velocity model. This lack of constraints results in reconstructions that are prone to errors and artifacts, limiting the clinical utility of FWI. Diffusion models have recently emerged as a powerful tool for learning complex data distributions. By generating realistic samples from a learned prior distribution, these models could provide a principled way to incorporate prior knowledge into FWI, addressing its inherent ill-posedness and improving reconstruction accuracy.

Method: In this project, the student will investigate using diffusion models as a learned prior to guide the FWI process. The workflow will consist of the following steps:

  1. Training a diffusion model: the student will train a diffusion model on a dataset of ground truth velocity models of the human brain, learning to generate random but realistic images of brain structures.

  2. Integrating the diffusion model into FWI: the diffusion model will be incorporated into a standard FWI algorithm. Specifically, as the FWI optimization takes gradient descent steps to minimize the data misfit, diffusion steps will be interleaved to push the reconstruction toward the distribution of realistic brain images generated by the diffusion model (see the structural sketch after this list).

  3. Exploring alternative diffusion processes: An extension of the project could involve investigating whether alternative diffusion processes (e.g., cold diffusion or other variations) lead to improved results by better aligning the prior with the physics of the inverse problem.
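
The following minimal sketch shows only the structure of such an interleaved scheme: a toy linear operator stands in for the wave-physics forward model, and a simple smoothing step stands in for a trained diffusion prior. Neither placeholder reflects the models that will actually be used.

```python
# Structural sketch of interleaving data-misfit gradient steps with prior ("denoising") steps.
# The linear operator A stands in for the wave-physics forward model, and the smoothing step
# stands in for a trained diffusion prior; both are placeholders, not the project's models.
import torch

torch.manual_seed(0)
n = 64
A = torch.randn(32, n) / n**0.5                   # toy under-determined forward operator
m_true = torch.sin(torch.linspace(0, 3.14, n))    # toy "velocity model"
d = A @ m_true                                    # observed data

def prior_step(m, strength=0.3):
    # Placeholder for one reverse-diffusion step: nudge m toward a locally smoothed version.
    m_smooth = torch.nn.functional.avg_pool1d(m.view(1, 1, -1), 5, stride=1, padding=2).view(-1)
    return (1 - strength) * m + strength * m_smooth

m = torch.zeros(n, requires_grad=True)
opt = torch.optim.SGD([m], lr=0.2)
for it in range(200):
    opt.zero_grad()
    misfit = 0.5 * ((A @ m - d) ** 2).sum()       # FWI-style data misfit
    misfit.backward()
    opt.step()                                    # gradient-descent step on the misfit
    with torch.no_grad():
        m.copy_(prior_step(m))                    # interleaved prior/denoising step
print(float(misfit))
```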

Data and Benchmarks: The student will use synthetic datasets of human brain velocity models for training and validation. The performance of the proposed method will be compared to traditional FWI approaches. Evaluation metrics will include reconstruction accuracy, convergence speed, and robustness to noise in the observational data.

Research Questions: Can diffusion models provide a useful prior to improve the accuracy and robustness of FWI? How does interleaving diffusion steps with FWI optimization affect convergence and reconstruction quality? Do alternative diffusion processes, such as cold diffusion, lead to better alignment between the learned prior and the true brain velocity distributions?

Outcome: The student will develop and evaluate a novel FWI framework that leverages diffusion models to address the ill-posedness of the inverse problem. The results could significantly improve the accuracy and reliability of ultrasound-based brain imaging, opening pathways for more effective diagnostic tools.

Meta-learning full-waveform inversion#

  • Project code: bemo-081

  • Main supervisor: Ben Moseley, b.moseley@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Carlos Cueto, c.cueto@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Motivation: Ultrasound full-waveform inversion (FWI) is a powerful imaging technique with the potential to provide high-resolution, safe, and portable alternatives to conventional brain imaging methods like MRI and CT scans. However, FWI is computationally expensive, requiring iterative numerical optimization to reconstruct high-quality images. Additionally, its accuracy is often limited by the ill-posedness of the inverse problem, and FWI can struggle to converge to high-fidelity solutions. Recent advances in machine learning (ML) and differentiable physics offer exciting opportunities to address these challenges. By leveraging data-driven methods, we can learn more efficient and accurate inversion algorithms, potentially transforming FWI into a faster and more reliable tool for medical imaging.

Method: In this project, the student will investigate the use of meta-learning and differentiable physics to improve FWI algorithms. The key idea is to learn better optimization strategies from a large dataset of synthetic brain velocity models, which serve as ground truth. Specifically, the project will explore two approaches:

  1. Learning gradient descent steps: using meta-learning techniques, we will optimize the steps of the gradient descent process to accelerate convergence and improve the reconstruction quality of FWI.

  2. Exploring alternative search algorithms: beyond gradient descent, the student will investigate more advanced search methods, such as tree search, which may provide better solutions for the highly non-convex FWI problem.

Data and Benchmarks: The student will train and test their methods in JAX using synthetic velocity models, such as open-source datasets of brain anatomy or other well-established benchmarks for FWI. Performance will be compared against traditional FWI optimization methods and existing ML-based approaches, evaluating key metrics like convergence speed, computational efficiency, and reconstruction accuracy.

Research Questions: Can meta-learning techniques significantly improve the efficiency of FWI? How do alternative search algorithms like tree search compare to gradient-based methods in terms of reconstruction quality?

Outcome: The student will develop and benchmark novel algorithms that could significantly enhance the accuracy and computational efficiency of ultrasound FWI. These innovations could lead to practical advancements in portable brain imaging systems, with implications for safer and more accessible medical diagnostics.

Modernizing Impact Crater Simulation: Rewriting iSALE in JAX for Differentiable and Accelerated Modelling#

  • Project code: bemo-066

  • Main supervisor: Ben Moseley, b.moseley@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Gareth Collins, g.collins@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Motivation: Numerical modelling plays a vital role in understanding the formation of impact craters, providing insights into planetary evolution and surface dynamics. However, simulating large impacts is computationally expensive, often requiring high-performance computing resources. Furthermore, existing codes, such as the widely used iSALE (impact-SALE), are written in Fortran, a legacy programming language that is difficult to integrate with modern machine learning frameworks. This creates barriers for leveraging advancements in machine learning and GPU-accelerated computing to enhance the modelling process. Rewriting legacy codes in modern, high-performance libraries like JAX offers an opportunity to make simulations faster, more flexible, and differentiable. This could enable hybrid approaches combining traditional numerical methods with machine learning, leading to more efficient and accurate simulations.

Method: In this project, the student will re-implement a simplified 1D version of iSALE in JAX, a modern machine learning and scientific computing library. The project will involve the following steps:

  1. Rewriting iSALE in JAX: The student will translate the core algorithms of the iSALE hydrocode, which uses the Simplified Arbitrary Lagrangian Eulerian (SALE) method for shock physics simulations, into JAX. The focus will be on implementing a 1D version of the code to simplify development while preserving the essential physics.

  2. GPU Acceleration: JAX’s automatic parallelization capabilities will be used to run simulations on GPUs, significantly reducing computational time compared to the original Fortran implementation.

  3. Making the Code Differentiable: The new implementation will be differentiable by design, allowing seamless integration with machine learning methods. For example, neural networks can be incorporated to learn material properties or optimize model parameters in a hybrid simulation approach.
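
As a flavour of the intended programming style (pure functions, jit compilation, and automatic differentiation in JAX), the sketch below time-steps a toy 1-D explicit finite-difference diffusion problem and differentiates a scalar observation with respect to a material parameter; it does not implement the SALE algorithm itself, and all names and sizes are illustrative.

```python
# Minimal sketch of the intended style, not of the SALE algorithm: a 1-D explicit
# finite-difference diffusion step written as a pure JAX function, jit-compiled and
# differentiated with respect to a material parameter via jax.grad.
import jax
import jax.numpy as jnp

def step(u, kappa, dx=1.0, dt=0.1):
    # Explicit update u_new = u + dt * kappa * d2u/dx2 (periodic ends for simplicity).
    d2u = (jnp.roll(u, -1) - 2 * u + jnp.roll(u, 1)) / dx**2
    return u + dt * kappa * d2u

@jax.jit
def simulate(kappa, n_steps=100):
    u = jnp.exp(-((jnp.arange(128) - 64.0) ** 2) / 50.0)   # initial pulse
    for _ in range(n_steps):
        u = step(u, kappa)
    return jnp.sum(u[60:68])                                # a scalar "observation"

grad_fn = jax.grad(simulate)          # sensitivity of the observation to the diffusivity
print(simulate(0.4), grad_fn(0.4))
```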

Data and Benchmarks: The student will validate the re-implemented code against benchmark simulations from the original iSALE codebase, ensuring the accuracy of the physics and shock dynamics. Performance metrics, such as runtime on CPUs and GPUs, will be compared to the legacy Fortran implementation.

Research Questions:

  • Can JAX be used to accurately replicate the core physics of the iSALE hydrocode in 1D?

  • How much computational speedup can GPU acceleration provide compared to the original Fortran implementation?

  • What new capabilities does differentiability introduce for hybrid approaches that combine machine learning and numerical modeling?

Outcome: The project will produce a modernized, GPU-accelerated, and differentiable version of the iSALE hydrocode in JAX. This will make impact crater simulations more accessible, flexible, and efficient, while opening the door to new research directions that integrate machine learning and shock physics modeling.

Electricity Smart Meter Data, Smart Energy Consumption and Decarbonisation#

  • Project code: mimu-201

  • Main supervisor: Mirabelle Muûls, m.muuls@imperial.ac.uk, Business School, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

This research project aims to harness machine learning and data science techniques to analyse electricity smart meter data, offering insights into energy consumption behaviours and the marginal carbon intensity of electricity generation. The ultimate goal is to facilitate intelligent load shifting, moving power consumption to times when the grid is supplied by cleaner energy sources, thereby reducing overall carbon emissions.

There are 3 potential objectives for the project(s), which will build on a rich dataset of smart meter data for thousands of consumers in India and the UK, as well as consumption data from smart plugs and switches:

  1. Behavioural Insights from Smart Meter Data: Use algorithms to segment households and businesses based on their consumption patterns (see the sketch after this list). Apply time-series forecasting models to predict future electricity demand, incorporating external factors like weather and energy prices. Analyse behavioural drivers (e.g., price sensitivity, work schedules) influencing peak consumption periods.

  2. Real-Time Carbon Intensity Estimation: Develop a model to estimate the marginal carbon content of electricity at different times of the day, integrating data on grid generation mix, fossil fuel dependency, and renewable energy availability. In addition, the project would explore the opportunity to use reinforcement learning to optimise demand response strategies based on real-time emissions data.

  3. Interactive Visualization & Decision Support: Build interactive dashboards to visualise consumption patterns by user type, carbon intensity variations over time, and potential cost and carbon savings from shifting demand. There would be the opportunity to explore what type of personalised, data-driven recommendations could be made to consumers and policymakers.
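
As a minimal illustration of the segmentation step in objective 1, the sketch below clusters synthetic daily load profiles with k-means; real smart-meter data and richer features would replace the synthetic profiles.

```python
# Minimal sketch of the segmentation step in objective 1: cluster synthetic daily load
# profiles (48 half-hourly readings) with k-means. Real smart-meter data would replace
# the synthetic profiles.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
hours = np.arange(48) / 2.0
morning = np.exp(-((hours - 8) ** 2) / 4)           # morning-peak households
evening = np.exp(-((hours - 19) ** 2) / 4)          # evening-peak households
profiles = np.vstack([
    morning + 0.1 * rng.standard_normal((500, 48)),
    evening + 0.1 * rng.standard_normal((500, 48)),
])

X = StandardScaler().fit_transform(profiles)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(np.bincount(labels))                          # size of each behavioural segment
```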

By empowering consumers and utilities with real-time data-driven insights, this/these project(s) will promote grid flexibility, enhance renewable energy integration, and drive effective policy interventions toward a cleaner energy future.

Improving Precision in RAG Systems for Granular and Niche Queries#

  • Project code: rhne-083

  • Main supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

This project investigates methods to enhance the precision and relevance of retrieval-augmented generation (RAG) systems, addressing challenges in retrieving fine-grained information from large document corpora. A key issue in RAG pipelines is that after chunking and vectorization, references to specific content—such as a section or a chapter—may be embedded within larger text vectors, making them difficult to retrieve accurately.
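
For context, the sketch below reproduces the baseline behaviour the project interrogates: fixed-size chunking followed by vector-similarity retrieval. TF-IDF cosine similarity stands in for a learned dense embedding model, and the document text is a placeholder.

```python
# Minimal sketch of a baseline RAG retrieval step: fixed-size chunking + vector similarity.
# TF-IDF cosine similarity stands in for a learned dense embedding model; the text is a placeholder.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

document = ("Chapter 1 covers governing equations. Chapter 2 describes the mesh generator. "
            "Section 2.3 explains the boundary-condition file format in detail. "
            "Chapter 3 reports benchmark results. ") * 3

def chunk(text, size=20):
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

chunks = chunk(document)
vec = TfidfVectorizer().fit(chunks)
chunk_vecs = vec.transform(chunks)

query = "Where is the boundary-condition file format described?"
scores = cosine_similarity(vec.transform([query]), chunk_vecs)[0]
best = int(np.argmax(scores))
print(best, chunks[best])   # the fine-grained reference may be buried inside a larger chunk
```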

Key objectives include:

  • Analyzing the Limitations of Existing RAG Pipelines: Investigating how chunking strategies and embedding models affect retrieval for niche queries.

  • Optimizing Chunking and Indexing: Exploring hierarchical chunking, metadata tagging, and hybrid retrieval (dense + sparse vectors) to improve precision.

  • Query Reformulation and Context Expansion: Enhancing retrieval strategies by dynamically reformatting user queries to better match indexed content.

  • Benchmarking on Niche Information Retrieval Tasks: Evaluating different methods across domains where precise retrieval is critical (e.g., legal texts, research papers, technical manuals).

This research contributes to improving the accuracy of AI-assisted knowledge retrieval, making it particularly valuable for applications in scientific research, law, and enterprise AI systems. Ideal for students interested in AI search, NLP, and knowledge representation.

Rotating Hollow Vortex Equilibria: Numerical and Analytical Exploration#

  • Project code: rhne-076

  • Main supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE

  • This project does not accept multiple students.

This project investigates the equilibrium configurations of rotating hollow vortices, building on the theoretical framework established by Nelson & Crowdy. Hollow vortices—regions of uniform vorticity surrounded by potential flow—exhibit rich dynamics, with applications in fluid mechanics, geophysical flows, and vortex stability analysis.

Key objectives include:

  • Mathematical Formulation: Reviewing the governing equations and equilibrium conditions for rotating hollow vortices.

  • Numerical Methods: Implementing computational techniques to solve for equilibrium shapes and assess their stability.

  • Comparison with Analytical Solutions: Exploring how numerical results compare with known exact solutions from complex analysis and conformal mapping approaches.

  • Extension to Multi-Vortex Systems: Investigating the dynamics and interactions of multiple hollow vortices in rotating equilibria.

The project provides an opportunity to explore applied mathematical techniques, computational fluid dynamics, and vortex dynamics, making it ideal for students interested in mathematical fluid mechanics, computational modelling, and nonlinear dynamical systems.

Evaluating the Effectiveness and Trustworthiness of RAG Systems and Fine-Tuned LLMs#

  • Project code: rhne-075

  • Main supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

This project investigates the performance, reliability, and trustworthiness of different retrieval-augmented generation (RAG) systems and fine-tuned language models across a range of tasks. With the increasing use of models such as DeepSeek, GPT-4o, and LLaMA variants, both with and without RAG and fine-tuning, it is critical to assess how these models retrieve, synthesize, and generate information in domain-specific applications.

Key objectives include:

  • Comparative Performance Analysis: Evaluating different models on factual consistency, coherence, and task-specific accuracy.

  • Trustworthiness and Bias Assessment: Analyzing the reliability of model outputs, susceptibility to hallucinations, and potential biases.

  • Impact of Fine-Tuning vs. RAG: Investigating whether domain adaptation through fine-tuning or external retrieval mechanisms leads to more robust outputs.

  • Benchmarking on Real-World Tasks: Using diverse datasets to test models across problem domains such as technical Q&A, scientific literature synthesis, and structured decision-making.

This research will provide quantitative and qualitative insights into the strengths and weaknesses of various approaches to enhancing LLMs. The findings will be valuable for AI practitioners, researchers, and industry stakeholders looking to optimize AI-assisted knowledge systems. Ideal for students interested in LLMs, AI evaluation, and applied NLP research.

LLM-Driven Abstraction Layer for Automating PDE Modelling in Devito#

  • Project code: rhne-074

  • Main supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

This project aims to develop an LLM-powered interface that acts as an abstraction layer for the Devito domain-specific language (DSL), enabling users to describe partial differential equation (PDE) problems in natural language and receive automatically generated Devito code.

Key objectives include:

  • Natural Language Interface: Implementing a chatbot-style interface where users describe PDEs, boundary conditions, and computational requirements in plain language.

  • Code Generation Pipeline: Developing a system that translates user input into syntactically correct Devito DSL code, handling key aspects like discretization, boundary conditions, and solver configuration.

  • Error Handling and Refinement: Allowing interactive refinement of generated code through iterative prompts and user feedback.

  • Performance Considerations: Ensuring that the generated code is efficient and optimally structured for execution on modern hardware.

This project sits at the intersection of AI-driven programming, scientific computing, and compiler design. It will enable researchers and engineers to model PDE-based problems more intuitively, lowering the barrier to high-performance numerical simulations. Ideal for students interested in LLMs, DSLs, and computational mathematics.

Epidemiological modelling of COVID-19 and other infectious diseases#

  • Project code: rhne-073

  • Main supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

This project focuses on computational epidemiology, specifically the spatially dependent SEIRD (Susceptible-Exposed-Infected-Recovered-Deceased) model. The SEIRD model extends the classic SIR framework by incorporating disease latency and mortality, making it well-suited for studying outbreaks such as COVID-19.

The project will leverage Devito, a domain-specific language (DSL) for solving partial differential equations (PDEs) using finite-difference methods. Devito enables efficient, high-performance simulations on modern architectures, making it an ideal tool for modelling spatial disease dynamics.
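
For reference, one common spatially explicit SEIRD formulation is the reaction-diffusion system below; the exact terms and parameters used in the project may differ.

$$
\begin{aligned}
\partial_t S &= \nabla\cdot(D_S\nabla S) - \beta S I / N,\\
\partial_t E &= \nabla\cdot(D_E\nabla E) + \beta S I / N - \sigma E,\\
\partial_t I &= \nabla\cdot(D_I\nabla I) + \sigma E - (\gamma + \mu) I,\\
\partial_t R &= \nabla\cdot(D_R\nabla R) + \gamma I,\\
\partial_t D &= \mu I,
\end{aligned}
$$

where $\beta$ is the transmission rate, $\sigma$ the rate of progression from exposed to infectious, $\gamma$ the recovery rate, $\mu$ the disease-induced mortality rate, $D_S,\dots,D_R$ the diffusion coefficients, and $N$ the living population.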

Key objectives include:

  • Implementing a spatially explicit SEIRD model using Devito, capturing how infections spread in heterogeneous populations.

  • Exploring different diffusion and mobility mechanisms, such as human movement patterns or environmental transmission factors.

  • Investigating the impact of intervention strategies (e.g., lockdowns, vaccination, travel restrictions) on disease spread.

  • Comparing numerical solutions to real-world epidemiological data where available.

This project will provide experience in mathematical modelling, high-performance computing, and numerical methods. It is suited for students with an interest in computational science, epidemiology, and parallel computing.

Agentic AI for Data Synthesis Using Vector Databases and LLMs#

  • Project code: rhne-072

  • Main supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

This project explores the development of an autonomous agent that leverages vector databases and large language models (LLMs) to synthesize new content from existing datasets. The goal is to create a tool capable of retrieving, reasoning over, and generating insights from a structured collection of documents, research papers, or other domain-specific data.

The key components of the project include:

  • Vector Database Integration: Storing and retrieving embeddings of textual data for efficient semantic search.

  • LLM-Powered Synthesis: Using LLMs to generate structured summaries, reports, or novel insights based on retrieved information.

  • Agentic Workflow: Implementing an agentic approach where the system autonomously iterates over search, retrieval, and synthesis steps to refine outputs.

  • Evaluation and Benchmarking: Measuring the effectiveness of the system in generating high-quality, coherent, and contextually relevant content.

This research will contribute to the intersection of retrieval-augmented generation (RAG), AI automation, and knowledge synthesis, with applications in fields like scientific research, legal analysis, and business intelligence. It is suitable for students with an interest in AI, natural language processing, and computational knowledge systems.

Interactive Visual Learning Assistant for Quantitative Concepts#

  • Project code: seo-109

  • Main supervisor: Sean O’Grady, s.ogrady@imperial.ac.uk, Business School, Imperial College London

  • Second supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Problem Description: Students often struggle with abstract quantitative concepts in finance, economics, and mathematics. While text explanations exist in textbooks and online resources, they frequently lack the dynamic, visual elements that can illuminate these concepts. Traditional learning resources are static and cannot adapt their explanations to a student’s specific point of confusion. Students need interactive tools that can provide immediate, visual explanations tailored to their questions, enhancing understanding through both textual and graphical representations.

Computational Methodology: The student will develop methods to:

  • Parse and understand student queries about quantitative concepts

  • Generate appropriate visualisations (graphs, charts, diagrams) dynamically

  • Create clear explanations that link visual elements to underlying concepts

  • Handle mathematical notation and equations appropriately

  • Adapt responses based on subject domain (finance/economics/maths)

The work requires developing algorithms for both natural language understanding and dynamic visualisation generation. Key challenges include ensuring mathematical accuracy, generating publication-quality visualisations, and maintaining coherence between textual and visual elements.

Expected Deliverables

  • Interactive chat interface

  • Visualisation generation system

  • Subject domain knowledge base

  • Query interpretation system

  • Performance evaluation framework

  • Documentation and user guide

Current Content Discovery for Module Teaching#

  • Project code: seo-108

  • Main supervisor: Sean O’Grady, s.ogrady@imperial.ac.uk, Business School, Imperial College London

  • Second supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Problem Description: Keeping course material current with real-world developments is crucial for engaging students, but manually searching and curating relevant content is time-consuming for teaching teams. Business topics evolve rapidly, with new case studies and examples appearing daily across news outlets, industry publications, and academic sources. While learning platforms effectively deliver static content, they lack mechanisms to automatically incorporate emerging real-world examples that could enhance student learning. This affects both the timeliness of teaching materials and students’ ability to connect theory with practice.

Computational Methodology: The student will explore autonomous agent architectures to develop methods for:

  • Extracting and structuring core topic areas from module specifications

  • Intelligently searching and filtering web content for relevance

  • Processing content to identify key themes and educational value

  • Generating contextualised summaries linking to learning objectives

  • Making decisions about content relevance and timing

The work involves developing information retrieval systems and exploring how agents can make autonomous decisions about content selection and presentation. Key challenges include designing effective agent behaviours and ensuring reliable content evaluation. The student may investigate both simple rule-based agents and more complex architectures that can learn from feedback.

Expected Deliverables

  • Content discovery and processing system

  • Search configuration interface for teaching teams

  • Content recommendation pipeline

  • Integration with common learning platforms

  • Evaluation metrics and analysis

  • Documentation and deployment guide

Applying convolutional autoencoders to data on unstructured meshes#

  • Project code: chpa-179

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Autoencoders have been in use for over 20 years [1,2], and are neural networks which aim to learn the identity map through an architecture that has a central 'bottleneck' layer (with a dimension lower than that of the input and output layers). Currently, they are used (i) to generate realistic-looking images (variational AEs); (ii) to extract noise from signals (denoising AEs); and (iii) in reduced-order modelling as a tool for dimensionality reduction. In the latter, dimensionality reduction (DR) methods aim to extract features from data (often solutions to PDEs, for example velocity or vorticity fields). One such DR method is Proper Orthogonal Decomposition (also known as Principal Component Analysis), which is based on Singular Value Decomposition (SVD) and a linear combination of basis functions. Autoencoders are a natural extension to the SVD, with the ability to better capture features in the flow due to the nonlinear activation functions.

Data from fluid flow simulations is often stored on unstructured, adapted meshes (as this can be a more efficient way of modelling these problems). However, many numerical techniques have been developed for structured grids, including convolutional neural networks. In order to apply convolutional autoencoders (CAE) to data on unstructured meshes, space-filling curves (SFC) can be used to reorder the data before application of the autoencoder (SFC-CAE) [3]. A number of projects are now proposed to extend the SFC-CAE, including (i) applying SFC-CAE to unstructured, adapted meshes; (ii) finding optimal mappings between unstructured and structured meshes based on the earth mover’s distance (the Wasserstein metric).
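
A minimal sketch of the SFC-CAE idea follows: nodal values are reordered along a space-filling-curve ordering so that a standard 1-D convolutional autoencoder can be applied. A random permutation stands in here for a genuine space-filling-curve ordering, and the network sizes are illustrative rather than those of the published SFC-CAE.

```python
# Minimal sketch of the SFC-CAE idea: reorder nodal values along a space-filling-curve
# ordering so a standard 1-D convolutional autoencoder can be applied. A random permutation
# stands in for a genuine SFC ordering of the mesh nodes.
import torch
import torch.nn as nn

n_nodes = 1024
sfc_order = torch.randperm(n_nodes)                 # placeholder for a precomputed SFC ordering

class SFCCAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv1d(1, 16, 5, stride=4, padding=2), nn.GELU(),
                                 nn.Conv1d(16, 4, 5, stride=4, padding=2))
        self.dec = nn.Sequential(nn.ConvTranspose1d(4, 16, 8, stride=4, padding=2), nn.GELU(),
                                 nn.ConvTranspose1d(16, 1, 8, stride=4, padding=2))

    def forward(self, field):
        x = field[:, sfc_order].unsqueeze(1)        # reorder nodes along the curve, add channel dim
        z = self.enc(x)                             # compressed representation
        y = self.dec(z).squeeze(1)
        out = torch.empty_like(y)
        out[:, sfc_order] = y                       # map back to the original node ordering
        return out, z

model = SFCCAE()
field = torch.randn(8, n_nodes)                     # e.g. a velocity component at each mesh node
recon, latent = model(field)
print(recon.shape, latent.shape)
```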

For these projects, an interest in several of the following would be beneficial: numerical modelling, computational fluid dynamics and neural networks.

[1] Bourlard, Kamp (1988) Auto-association by multilayer perceptrons and singular value decomposition, Biological Cybernetics 59:291-294.

[2] Brunton, Noack, Koumoutsakos (2020) Machine Learning for Fluid Mechanics Annual Review of Fluid Mechanics, 52(1):477-508.

[3] Heaney, Li, Matar, Pain (2021) Applying Convolutional Neural Networks to Data on Unstructured Meshes with Space-Filling Curves, arXiv preprint.

AI for Personalised respiratory health and pollution (AI-Respire (EP/Y018680/1) [1]) - using generative AI to develop a predictive tool for the impact of pollution on health#

  • Project code: chpa-182

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

The challenge: Up to 90% of the world’s population breathe air with high levels of both indoor and outdoor pollution that kills ~7 million people each year worldwide. In the UK, it is rated as one of the most serious threats to public health with only cancer, obesity and heart disease eclipsing it. The health risks associated with fine and ultrafine particulate matter (PM2.5 and PM0.1) include development and exacerbation of respiratory diseases such as chronic obstructive lung diseases including asthma, respiratory infections and lung cancer.

Maximising health of individuals: Using cohort data, we will develop an AI model of exposure and health response to pollution to make generic predictions. At the individual level, we will utilise the increasing range of sensor information from smart watches and mobile phones to personalise the predicted health response to levels of personal pollution exposure. Furthermore, as individual sensors collect more information about a person and how they respond to their environment, the AI model can be refined to make it specific to that individual, accounting for their medical condition and history where available, so that predictions and mitigation suggestions can be tailored to them. This information can then be uploaded to the more generic generative neural network [2] to greatly improve its predictive ability over time. In this way the system becomes increasingly smart over time and is better able to: (1) diagnose health issues (e.g. viral infections, respiratory problems), with the AI system learning nuances in the patterns from sensitivity to an individual's age and the season; (2) provide advice and develop new health insights about pollution exposure, for example whether or not to exercise.

Data from AI-Respire app: The data from wearable devices currently includes: temperature, blood oxygen saturation, respiratory rate, heart rate, movement, position (using GPS), plus user-added information on gender, weight and age. These are embodied in, for example, Fitbit, Garmin and Apple Watch software, with air quality data included from OpenWeather.

Projects: Using this data, the project will further develop the generative AI models [2] (e.g. GANs, VAEs, latent diffusion) in order to address challenges (1) and (2) above. Specific projects:

Project 1: Form a generative autoencoder that is able to predict health responses given pollution conditions for specific classes of individuals (e.g. asthmatics, healthy people). Then tailor this model (using transfer learning) so that it can predict the responses of a specific individual, e.g. an asthmatic or healthy person.

Project 2: Develop generative AI methods for diagnosing the potential benefits of actions that reduce pollution exposure, and provide uncertainties associated with those benefits. Further develop this for an individual.

Project 3: Perform optimisation of a person's environment in order to maximise their health, e.g. their commute through a city.

Project 4: Integrate personal health data with UK Biobank data in order to predict long-term consequences of pollution exposure for individuals.

Project 5: Integrate personal ECG data with pollution exposure for individuals.

PyTorch and Python skills needed.

[1] https://www.imperial.ac.uk/news/246893/government-funding-revolutionise-ai-healthcare-research/

[2] Goodfellow, Pouget-Abadie, Mirza, Xu, Warde-Farley, Ozair, Courville, Bengio (2014) Generative adversarial networks, arXiv preprint: arxiv:1406.2661.

Solving PDEs on unstructured and distorted meshes with AI libraries#

  • Project code: chpa-183

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Due to new hardware technologies, increases in computing power and developments in AI software, the benefits of combining AI techniques with traditional numerical methods for solving governing equations of dynamical systems are becoming apparent. For example, Cerebras have just released a new ‘AI computer’ which has about 1 million cores on a single chip with vastly increased computational speed, yet which requires much less energy than GPUs or CPUs, making it a tantalising prospect for researchers wishing to run demanding computations in an energy efficient manner. If the potential of combining the new AI computers and AI software with traditional numerical methods can be harnessed, one can expect a revolution in computational physics across disciplines and a ‘must have’ next generation technology. Based on the latest techniques in AI software that are particularly suited to exascale computing, this project develops a potentially revolutionary approach to the discretisation and solution of differential equations, and the formation of surrogate models. Our approach implements models, such as Computational Fluid Dynamics (CFD), using AI software with the aim of simplifying the software development and building on the very substantial developments already made in AI software. This simplification would greatly increase the number of developers capable of developing current CFD/nuclear codes, speeding up the implementation of developments and – crucially – their parallel scalability. This new approach also enables relatively simple development of digital twins, using the optimisation engine and sensitivities embedded in AI software. The digital twins can be used to optimise systems, form error measures, assimilate data and quantify uncertainty in a relatively straightforward manner with this approach. This enables the major deficiency (formation of an error estimate) in current modelling approaches for safety critical systems in nuclear engineering and the environment to be addressed.

Progress has been made with this approach on structured meshes [1,2,3], and in these projects we look to extend this approach to unstructured meshes. For these projects, an interest in several of the following would be beneficial: computational fluid dynamics, neural networks and numerical methods. The coding will be in PyTorch and Python.
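
To illustrate the core idea of placing discretisation weights in an untrained network, the sketch below puts the 5-point Laplacian stencil in a fixed convolution and solves a toy Poisson problem by Jacobi iteration expressed purely as tensor operations; it is an illustrative simplification, not the AI4PDEs code, and is shown on a structured grid rather than the unstructured meshes targeted by these projects.

```python
# Minimal sketch: the finite-difference stencil is placed in an untrained convolutional layer,
# and the discretised Poisson equation -lap(u) = f is solved by simple Jacobi iterations
# expressed entirely as tensor operations (so the same code runs on CPU or GPU).
import torch
import torch.nn.functional as F

n = 64
h = 1.0 / (n + 1)
f = torch.ones(1, 1, n, n)                                 # right-hand side
# 5-point Laplacian stencil as a fixed (untrained) convolution kernel.
stencil = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]]).view(1, 1, 3, 3)

u = torch.zeros(1, 1, n, n)
for _ in range(10000):
    # Jacobi update: u_ij <- (sum of neighbours + h^2 f_ij) / 4, with zero Dirichlet boundaries.
    neighbours = F.conv2d(F.pad(u, (1, 1, 1, 1)), stencil) + 4 * u
    u = (neighbours + h**2 * f) / 4.0

print(float(u.max()))   # approaches ~0.074, the peak of the exact unit-square solution for f = 1
```

In the AI4PDEs papers the same principle is combined with multigrid-style architectures to accelerate convergence; plain Jacobi is used here only to keep the sketch short.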

The individual projects on offer are: (i) Multi-grid for solving large systems of equations; (ii) Automatic formation of adjoint sensitivities using backpropagation; (iii) Parallelisation; (iv) Forming a Computational Fluid Dynamics solver; (v) Cube-sphere representations for capturing the dynamics of the global atmosphere using distorted structured grids; (vi) Mesh movement for capturing fluid dynamics using distorted structured grids.

[1] Chen, Heaney, Pain (2024) Using AI libraries for Incompressible Computational Fluid Dynamics, https://arxiv.org/abs/2402.17913.

[2] Chen, Heaney, Gomes, Matar, Pain (2024) Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries, Computer Methods in Applied Mechanics and Engineering 426: 116974. https://doi.org/10.1016/j.cma.2024.116974

[3] Chen, Nadimy, Heaney et al. (2024) Solving the Discretised Shallow Water Equations Using Neural Networks, Advances in Water Resources, accepted. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4956116

Compressing particles and converting particles to a continuum using AI4PDEs, AI4Particles and Convolutional Autoencoders#

  • Project code: chpa-177

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Recently AI4PDEs [1,2,3,4] has shown great success in solving very large fluid systems in great detail. The AI4Particles modelling method has also been able to describe particle motion in turbulent flows. AI4Particles can also be used as a Discrete Element Method (DEM) [5] and as a molecular dynamics model. AI4PDEs and AI4Particles offer potentially revolutionary advantages over conventional modelling methods. However, the compression of a cloud of particles is an unexplored area, as is the conversion of a cloud of particles to a continuum description, which could be more accurate. Here we will use the outputs of these two modelling approaches in order to compress the velocities and positions of particles, and also to relate particle descriptions of a concentration field (e.g. air pollution) to a continuum description.

AI4PDEs is an in-house computational fluid dynamics (CFD) solver, which solves discretised systems of equations using neural networks. The weights of the networks are determined in advance through the choice of discretisation method, so no training is needed. The solutions are the same as those obtained from Fortran or C++ codes (to within solver tolerances).

A new type of scale independent convolutional autoencoder will be used within this project. This autoencoder, after training, can be applied to very large systems even when the original training was conducted on much smaller systems.

Two student projects are offered here: (i) particle compression; (ii) converting particles to a continuum.

For these projects, an interest in several of the following would be beneficial: computational fluid dynamics, particle dynamics, neural networks and numerical models. The coding will be done in PyTorch and Python.

[1] Chen, Heaney, Pain (2024) Using AI libraries for Incompressible Computational Fluid Dynamics, https://arxiv.org/abs/2402.17913.

[2] Chen, Heaney, Gomes, Matar, Pain (2024) Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries, Computer Methods in Applied Mechanics and Engineering 426: 116974. https://doi.org/10.1016/j.cma.2024.116974

[3] Chen, Nadimy, Heaney et al. (2024) Solving the Discretised Shallow Water Equations Using Neural Networks, Advances in Water Resources, accepted. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4956116

[4] Phillips, Heaney, Chen, Buchan, Pain (2023) Solving the Discretised Neutron Diffusion Equations Using Neural Networks, International Journal for Numerical Methods in Engineering 124(21):4659-4686. https://doi.org/10.1002/nme.7321

[5] Naderi, Chen, Yang, Xiang, Heaney, Latham, Wang, Pain (2024) A discrete element solution method embedded within a Neural Network, Powder Technology 448: 120258. https://doi.org/10.1016/j.powtec.2024.120258

Generative Models, e.g., Score-based Models, Flow Matching, for Viscous Incompressible Flow#

  • Project code: chpa-152

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Donghu Guo, donghu.guo21@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

In recent years, generative models such as score-based models and flow matching have had a significant impact across various fields. Their application to fluid flow problems has also gained increasing attention. This project aims to leverage these state-of-the-art models for viscous incompressible flow. There are two main streams in the project.

Superresolution – The student(s) could explore the use of generative models to perform superresolution tasks, enhancing simulations by mapping between different grid sizes.

Prediction – The student(s) could investigate the application of generative models for time-step forecasting. There are multiple possible approaches, including: Autoregressive prediction, where the model takes data from the previous time step, predicts the next, and iteratively repeats this process to forecast further. Multi-step prediction, where the model directly predicts multiple future time steps in a single forward pass.

Additionally, different conditioning strategies and inference methods for the generative models could be explored and compared.

Incorporating automatic code generation to solve PDEs using neural networks#

  • Project code: chpa-176

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

AI4PDEs [1,2,3] is an in-house computational fluid dynamics (CFD) solver, which solves discretised systems of equations using neural networks. The weights of the networks are determined in advance through the choice of discretisation method (e.g. first-order finite elements, second-order finite differences etc), so no training is needed. The solutions are the same as those obtained from Fortran or C++ codes (to within solver tolerances). The code runs on GPUs as well as CPUs and AI processors.

In this project the aim is to solve a general system of discretised differential equations. We will use SymPy (as used within Modulus and Devito) for the interface into which the user types their desired differential equations. The resulting sets of coupled equations are solved with a multi-grid method applied through a U-Net neural network architecture. The focus of the work will be on solving the resulting linear systems of equations efficiently while maintaining the advantages of neural networks (e.g. the ability to run on CPUs or GPUs).
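
As a small illustration of the intended pipeline (a symbolic specification feeding a fixed neural-network operator), the sketch below uses SymPy's finite-difference utilities to generate a second-derivative stencil and loads it into an untrained convolution; it is illustrative only and independent of the actual AI4PDEs interface.

```python
# Minimal sketch: SymPy generates the finite-difference stencil for a user-specified derivative,
# and the weights are loaded into a fixed (untrained) convolution so the discretised operator
# can be applied inside a neural-network solver.
import sympy as sp
import torch
import torch.nn.functional as F

# Second-derivative weights on the three points {-1, 0, 1} (unit spacing): [1, -2, 1].
weights = sp.finite_diff_weights(2, [-1, 0, 1], 0)[2][-1]
kernel = torch.tensor([float(w) for w in weights]).view(1, 1, 3)

dx = 0.1
x = torch.arange(0, 6.28, dx)
u = torch.sin(x).view(1, 1, -1)
d2u = F.conv1d(u, kernel) / dx**2        # discrete d2u/dx2, interior points only
print(torch.allclose(d2u, -torch.sin(x[1:-1]).view(1, 1, -1), atol=1e-2))
```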

For this project, an interest in the following would be beneficial: computational fluid dynamics, neural networks and symbolic maths. During this project the student will use the SymPy, Modulus and PyTorch packages. On successful completion of the project, the solution of even complex PDEs will be made relatively easy, with impact across computational physics. There may be the possibility of linking with NVIDIA engineers on Modulus development.

[1] Chen, Heaney, Pain (2024) Using AI libraries for Incompressible Computational Fluid Dynamics, https://arxiv.org/abs/2402.17913.

[2] Chen, Heaney, Gomes, Matar, Pain (2024) Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries, Computer Methods in Applied Mechanics and Engineering 426: 116974. https://doi.org/10.1016/j.cma.2024.116974

[3] Chen, Nadimy, Heaney et al. (2024) Solving the Discretised Shallow Water Equations Using Neural Networks, Advances in Water Resources, accepted. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4956116

Solving PDEs using machine-learning techniques#

  • Project code: chpa-175

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Claire Heaney, c.heaney@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Neural networks struggle with long-term time series prediction, as the error in the solution accumulates over time. Physics-informed approaches attempt to enforce the neural network solutions to satisfy physical laws (such as conservation of mass and/or momentum) with the hope that this will improve the long-term forecasting ability of the network [1,2,3].

In this project we will integrate AI4PDEs, a solver for computational fluid dynamics (CFD) that is written as a neural network. This will be used to calculate the residual that will be included in the loss function of the physics-informed neural network (PINN). This offers an elegant approach to combining CFD (an untrained neural network) with a surrogate model (a trained neural network). We will also look at alternative methods to PINNs, such as minimising the discrete residual of the governing equations evaluated with the output of the neural network [4,5]; developing geometry- and grid-invariant foundational AI models that can be treated like a CFD model; and particle and molecular dynamics surrogates.
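
A minimal sketch of the kind of loss described above, assuming a hypothetical trainable surrogate `net` and a callable `discrete_residual` standing in for the residual evaluated by AI4PDEs:

```python
import torch

def physics_informed_loss(net, x_data, u_data, x_colloc, discrete_residual, weight=1.0):
    """Combine a data-misfit term with a penalty on the discretised-equation residual.

    net               : trainable surrogate model (hypothetical)
    x_data, u_data    : observed inputs and corresponding target fields
    x_colloc          : collocation points/grid states where the residual is evaluated
    discrete_residual : callable returning the residual of the governing equations
    """
    data_loss = torch.mean((net(x_data) - u_data) ** 2)               # fit the CFD data
    physics_loss = torch.mean(discrete_residual(net(x_colloc)) ** 2)  # penalise violations of the physics
    return data_loss + weight * physics_loss
```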

The student(s) will be provided with CFD data and build AI models in Python and PyTorch.

[1] Raissi, Perdikaris, Karniadakis (2019) Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686-707.

[2] Arthurs, King (2021) Active training of physics-informed neural networks to aggregate and interpolate parametric solutions to the Navier-Stokes equations. Journal of Computational Physics, 438.

[3] Chen, Wang, Hesthaven, Zhang (2021) Physics-informed machine learning for reduced-order modeling of nonlinear problems, Journal of Computational Physics, 446:110666.

[4] Xiao, Fang, Buchan, Pain, Navon, Du, Hu (2010) Non-linear model reduction for the Navier-Stokes equations using residual DEIM method, Journal of Computational Physics, 263:1-18.

[5] Sipp, de Pando, Schmid (2020) Nonlinear model reduction: A comparison between POD-Galerkin and POD-DEIM methods. Computers & Fluids, 208

Enabling CO2 capture and storage using AI#

  • Project code: chpa-174

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Background: Carbon Capture and Storage (CCS) is one of the viable solutions that can effectively reduce CO2 emissions from “hard to decarbonise” industries (e.g., steel, cement). Most aspects of CCS are technologically mature, but there are key barriers to large-scale CCS implementation, including: (a) the high energy demands of the capture process, and (b) the high cost of predicting and monitoring CO2 plume migration, storage conformance and caprock integrity. In this project, we aim to address these barriers by developing (i) an AI materials discovery pipeline for energy-efficient CO2 capture, and (ii) AI solvers for modelling CO2 flow in geological storage formations.

The project: Here we develop a number of key methods for modelling the detailed pore-scale structure and its interaction with multi-phase flows, e.g. water-CO2 systems. We will develop an AI4PDE multi-phase flow model that will take X-ray images of pore-scale structures, assign either solid or fluid to each node/cell, and apply the AI4PDE approach (see [1]) to solve the multi-phase CO2-water systems at low Reynolds numbers. The approach involves using surface tension models both for the contact angle of the fluids with the solids and for the multi-phase interface. It will also form new surrogate and sub-grid-scale (SGS) models using new scale-independent filters, which enable the convolutional filters to be trained at a smaller scale and then applied at a much larger scale.

The possible projects include: (1) AI and PDE solver: Large scale AI4PDE multiphase model formation to run in parallel on GPUs. (2) Surrogate AI solver: Surrogate model formation using scale independent convolutional filters and the latest latent diffusion generative neural network methods. (3) Sub-grid-scale AI Solver: Using scale independent filters to help form sub-grid-scale (SGS) models that will enhance the accuracy of the AI4PDE approach when used in combination with it. (4) Material discovery through molecular modelling: Generalisation of a recent AI4PDE related innovation that enables the modelling of particles using convolutional neural networks. These particles will be molecules within this project and the forces between them are defined by a potential function. This model will eventually enable the formation of novel solvents used for CO2 sequestration.

[1] Chen, Heaney, Gomes, Matar, Pain (2024) Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries, Computer Methods in Applied Mechanics and Engineering 426: 116974. https://doi.org/10.1016/j.cma.2024.116974

Flooding Modelling With Neural Networks#

  • Project code: chpa-173

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Recently, AI4PDEs has shown great success in simulating very large-scale floods across UK cities [1]. AI4PDEs [2,3] is an in-house computational fluid dynamics code that solves discretised systems of equations using neural networks. The weights of the networks are determined in advance through the choice of discretisation method (e.g. first-order finite elements, second-order finite differences etc), so no training is needed. The solutions are the same as those obtained from Fortran or C++ codes (to within solver tolerances). The code runs on GPUs as well as CPUs and AI processors. The benefits of reliable flood prediction are the ability to save lives by providing early flood warnings and to provide suitable flood resilience through flood defence measures.

Project 1: Coupling drainage systems and flood models. A next step in the development of AI4PDEs for flooding is its application to modelling the drainage systems under our cities. This project has two stages: (a) establishing a simplified drainage model from a multiphase AI4PDE model; (b) coupling the drainage models with the 2D shallow water equation models, both formed using AI4PDEs.

Project 2: Country-scale models. Another important development of AI4PDEs for flooding is to be able to model flooding events at the country scale. The project aims to take a step towards country-scale modelling of large-scale floods. The model is formed from a patchwork of shallow water equation models in different subdomains, covering the river catchment areas across a large region, for example, Britain.

Project 3: 3D flooding models. 3D flood modelling would enable interactions with the atmosphere, underground drainage and rapid flows (e.g. rivers) to be modelled accurately.

For these projects, an interest in several of the following would be beneficial: computational fluid dynamics, neural networks, flooding and numerical methods. The coding will be done in PyTorch and Python.

[1] Chen, Nadimy, Heaney et al. (2024) Solving the Discretised Shallow Water Equations Using Neural Networks, Advances in Water Resources, accepted. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4956116

[2] Chen, Heaney, Pain (2024) Using AI libraries for Incompressible Computational Fluid Dynamics, https://arxiv.org/abs/2402.17913.

[3] Chen, Heaney, Gomes, Matar, Pain (2024) Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries, Computer Methods in Applied Mechanics and Engineering 426: 116974. https://doi.org/10.1016/j.cma.2024.116974

Traffic modelling using Foundational Grid Invariant Neural Networks#

  • Project code: chpa-185

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

This project will model traffic using some of the transformative foundational surrogate modelling methods that have recently been applied to model fluid flows. These models include methods based on AI diffusion models (https://en.wikipedia.org/wiki/Diffusion_model) with an architecture that ensures the model can learn and be applied on any grid resolution. Thus the model can be applied to flows around a few buildings and then re-applied to flows at the city scale. The same approach will be attempted for the modelling of traffic. In this way it is hoped that the model will be able to learn the behaviour of vehicles and their interactions. The model will learn from (i) data collected from real traffic systems and (ii) the VSM traffic simulator.

After completion of the project we hope to have a model of vehicle movement that can be used to determine the pollution levels emitted by vehicles, as well as to assess drivers’ behaviour by comparing it with the neural network model, in order to provide advice to improve driving economy or to help determine whether there is a fault with a vehicle.

The coding will be in PyTorch and Python.

There is a possibility of working with colleagues in the Civil Engineering Department at Imperial College as well as working with colleagues at University College London.

OpenWeather developments to improve urban, regional and global weather and pollution predictions and access to these predictions#

  • Project code: chpa-200

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Olga Buskin, olgabuskin@openweather.co.uk, OpenWeather

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Weather-related ML projects

  1. LLM with Weather APIs. An assistant to create weather reports by location. Optional: make reports tailored to different industries. This can be several different projects, each tailored to a different industry.

  2. LLM with Weather APIs. A weather chatbot that answers users' questions related to weather.

Other ML projects

  1. A comprehensive review of cutting-edge knowledge graph/database approaches

  2. Creation of a knowledge graph to store and navigate across the database

  3. Vision transformers as a tool to automate computer screen navigation and interaction

See https://openweathermap.org/

Multi-Resolution Grid Decomposition for Generative AI Fluid Flow Modeling#

  • Project code: chpa-203

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

RAPIDS is a new generative AI surrogate fluid flow modelling software package, developed within AMCG, that implements a neural network-based approach to fluid flow prediction with various U-Net architectures. It processes 2D and 3D flow field data with solid boundaries, implementing progressive masking strategies and iterative refinement for accurate flow prediction. The system currently uses a uniform decomposition of the domain with fixed overlap regions, operating on structured grids for both training and inference.
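
For orientation, the current uniform fixed-overlap decomposition could be mimicked on a structured 2D field with something like the following (the sizes are illustrative and this is not the RAPIDS implementation):

```python
import torch

def decompose_uniform(field, patch, overlap):
    """Split a (channels, H, W) field into overlapping square subdomains of size `patch`."""
    stride = patch - overlap
    tiles = field.unfold(1, patch, stride).unfold(2, patch, stride)  # (C, nH, nW, patch, patch)
    return tiles.permute(1, 2, 0, 3, 4).contiguous()                 # (nH, nW, C, patch, patch)

field = torch.rand(3, 256, 256)                       # e.g. two velocity components and a mask
subdomains = decompose_uniform(field, patch=64, overlap=16)
print(subdomains.shape)                               # torch.Size([5, 5, 3, 64, 64])
```

The multi-resolution approach proposed below would replace the single `patch` size with a hierarchy of grid resolutions.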

This thesis will investigate a novel multi-resolution structured grid approach for domain decomposition, moving beyond the current fixed-resolution subdomain strategy. The proposed approach will maintain the simplicity of rectangular/square cells while allowing different resolution levels within the same domain. Rather than having uniformly sized subdomains, the domain will be adaptively subdivided into finer rectangular grids in areas requiring higher resolution (e.g., near boundaries or in regions with complex flow features) while using coarser grids in smoother flow regions. The student will implement and evaluate hierarchical grid structures similar to those used in AMR (Adaptive Mesh Refinement) frameworks like AMReX (see: https://amrex-codes.github.io/), while maintaining the structured nature of the grid. This approach presents unique challenges for Gen AI, particularly at resolution interfaces where neighbouring cells may have different sizes. The work will focus on developing effective data handling strategies to accommodate these multi-resolution structured grids while preserving the existing CNN architecture’s ability to learn flow physics.

The work will be added to the existing codebase to incorporate multi-grid integration, where solutions computed at coarse scales are progressively refined at finer scales, potentially improving both computational efficiency and prediction accuracy for complex flow scenarios with irregular domain boundaries. RAPIDS is a grid and geometry invariant architecture (evolving from https://doi.org/10.3389/fphy.2022.910381), and the student’s work will be published in leading ML conferences.

AI4PDEs for Viscous Incompressible Flow#

  • Project code: chpa-153

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Donghu Guo, donghu.guo21@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

The student(s) are expected to explore the use of AI4PDEs to generate simulations for viscous incompressible flow problems governed by the Navier-Stokes equations, such as flow past a cylinder, multiple cylinders, or buildings, starting in 2D and potentially expanding to 3D. Also, the students are expected to compare the results with those of traditional numerical solvers such as ICFERST. Furthermore, if time allows, the student could use these generated simulations to train some of the AI surrogate models.

AI4PDEs is an in-house computational fluid dynamics (CFD) solver, which solves discretised systems of equations using neural networks. The weights of the networks are determined in advance through the choice of discretisation method (e.g. first-order finite elements, second-order finite differences etc), so no training is needed. The solutions are the same as those obtained from Fortran or C++ codes (to within solver tolerances).

Producing optimal adaptive meshes for finite element methods using convolutional generative Neural Networks#

  • Project code: chpa-180

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Boyang Chen, boyang.chen16@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

To model fluid flow problems that are turbulent and/or have complex geometries or interfaces, meshes that change their resolution can be an efficient way to obtain accurate results (known as mesh adaptivity or mesh optimisation) [1,2,3]. In order to determine where the areas of high and low resolution should be, an error measure is formed, sometimes based on the gradient of solution variables. We use mesh optimisation to optimise the size and shape quality of every tetrahedral element in the finite element mesh so as to meet the demands of the error measure, e.g. achieving 1% accuracy in a solution variable. Here a convolutional autoencoder is proposed to form a new approach to adapting a mesh optimally in response to the error measure.

Convolutional methods have been used to represent differing material properties [4]. If these materials have convex shapes (e.g. grains in a sandstone), then the centres of the different materials can represent nodes of a finite element tetrahedral mesh. If different materials are next to one another then we assume there is an edge (in the tetrahedral mesh) between the associated nodes. In this way a tetrahedral mesh can be formed in 3D. Moreover, if the convolutional methods are able to gauge the size and shape quality of the tetrahedral elements, then these can have their sizes and shapes optimised, thus producing an optimal mesh.
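
A very rough 2D prototype of turning material centres into mesh nodes could look like the following; a Delaunay triangulation is used here purely as a stand-in for the adjacency-based edge construction described above, and the label image is synthetic:

```python
import numpy as np
from scipy import ndimage
from scipy.spatial import Delaunay

# synthetic label image: each integer value marks one material region
labels = np.random.randint(0, 20, size=(128, 128))

# use the centre of mass of each labelled region as a candidate mesh node
region_ids = np.unique(labels)
nodes = np.array(ndimage.center_of_mass(np.ones_like(labels, dtype=float), labels, region_ids))

# connect neighbouring nodes; here Delaunay stands in for the material-adjacency rule
mesh = Delaunay(nodes)
print(mesh.simplices.shape)   # (number_of_triangles, 3)
```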

For this project an interest in some of the following would be beneficial: computational fluid dynamics, numerical methods and neural networks. The coding will be done mostly in PyTorch and Python.

[1] Pain, Umpleby, de Oliveira, Goddard (2001) Tetrahedral mesh optimisation and adaptivity for steady-state and transient finite element calculations, Computer Methods in Applied Mechanics and Engineering, 190(29):3771-3796.

[2] Kampitsis, Adam, Salinas, Pain, Muggeridge, Jackson (2020) Dynamic adaptive mesh optimisation for immiscible viscous fingering, Computational Geosciences, 24:1221-1237.

[3] Salinas, Regnier, Jacquemyn, Pain, Jackson (2021) Dynamic mesh optimisation for geothermal reservoir modelling, Geothermics, 94:102089

[4] Gayon-Lombardo, Mosser, Brandon, Cooper (2020) Pores for thought: generative adversarial networks for stochastic reconstruction of 3D multiphase electrode microstructures with periodic boundaries. Computational Materials, 6(1):82.

AI4PDEs for Viscous Incompressible Flow#

  • Project code: chpa-151

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Claire Heaney, c.heaney@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

The student(s) are expected to explore the use of AI4PDEs to generate simulations for viscous incompressible flow problems governed by the Navier-Stokes equations, such as flow past a cylinder, multiple cylinders, or buildings, starting in 2D and potentially expanding to 3D. Also, the students are expected to compare the results with those of traditional numerical solvers such as ICFERST. Furthermore, the student could use these generated simulations to train some of the AI surrogate models if time allows.

AI4PDEs is an in-house computational fluid dynamics (CFD) solver, which solves discretised systems of equations using neural networks. The weights of the networks are determined in advance through the choice of discretisation method (e.g. first-order finite elements, second-order finite differences etc), so no training is needed. The solutions are the same as those obtained from Fortran or C++ codes (to within solver tolerances).

Foundational Wild Fire Modelling using Generative AI Surrogates#

  • Project code: chpa-214

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Importance of wildfire modelling: Recent wildfires (https://www.bbc.co.uk/news/topics/ce2gz9mdde3t) have highlighted the damage and threats to life that wildfires pose. Wildfire modelling aims to aid wildfire suppression, increase the safety of firefighters and the public, and minimise damage. Wildfire modelling can also aid in protecting ecosystems, watersheds, and air quality.

Physics-based modelling: Wildfire modelling is now a key part of weather forecasting models (for the WRF atmospheric model, see the WRF-Fire model https://ral.ucar.edu/model/wrf-fire-wildland-fire-modeling). Models like these are typically physics-based and allow users to model the growth of a wildland fire in response to terrain slope, fuel characteristics and atmospheric conditions, as well as the dynamic feedbacks with the atmosphere. Ultimately, there also needs to be a two-way coupling between the fire behaviour and the atmosphere, e.g. so that wildfires are able to generate their own weather conditions.

This project: Recent developments in generative AI surrogates (https://www.frontiersin.org/journals/physics/articles/10.3389/fphy.2022.910381/full) have been able to model fluid flows and geometries with convolutional neural networks and have foundational AI properties. That is, they are convolutional, grid and geometry invariant, and thus, after training, can in principle model any size of computational domain. Since they are foundational they are very flexible and may be able to model wildfires with good accuracy and speed. Also, since these generative AI surrogates are neural networks, they can be linked to AI workflows in order to assimilate data, design interventions, etc. In this project we will develop ‘Foundational Wildfire Modelling using Generative AI Surrogates’ and apply this, in collaboration with First Street (https://firststreet.org/), to model wildfires using physics-based modelling data provided by First Street.

Optimizing operational efficiency in multiphase fluid processes using AI#

  • Project code: chpa-122

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Nathalie Carvalho Pinheiro, n.pinheiro23@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Multiphase flow is the mechanical phenomenon of two or more phases flowing simultaneously through a given space. The phases interact with each other, resulting in a variety of flow patterns. Multiphase flow is part of our natural environment, such as smoke formation in the air, and it is the working mechanism behind many industrial applications, including flow in nuclear power plants, the chemical/food industry, and some scenarios of geothermal energy and carbon dioxide transport.

This research is based on analysing an open dataset that contains multivariate time series from sensors and valves in a system with multiphase flow through a pipeline (available in: petrobras/3W). The dataset contains data from normal operation and from a variety of abnormal events of multiphase flow in pipelines. These abnormal events can be, for instance, a restriction in an upstream or downstream valve, or flow instability. The research consists of comparing different machine learning techniques applicable to tabular data or time series to identify and classify the abnormalities [1, 2]. This algorithm could be used to set alarms and control systems, enabling interventions to increase operational efficiency, thus reducing carbon emissions from that process plant.
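
As a minimal sketch of the kind of comparison intended (assuming the sensor time series have already been windowed into a feature matrix `X` with event labels `y`; the placeholder data below are random and not from the 3W dataset):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# X: one row of summary features per time window, y: normal/abnormal event class
X = np.random.rand(1000, 20)              # placeholder windowed sensor features
y = np.random.randint(0, 3, size=1000)    # placeholder event labels
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("random forest", RandomForestClassifier(n_estimators=200, random_state=0))]:
    clf.fit(X_train, y_train)
    print(name)
    print(classification_report(y_test, clf.predict(X_test)))
```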

The algorithms to be developed are applicable to many multiphase flow scenarios. Some of them, such as carbon storage, suffer from a lack of available real data. Thus, research using similar processes is very relevant.

[1] B. G. Carvalho, R. Emanuel Vaz Vargas, R. M. Salgado, C. Jose Munaro, F. M. Varejao, Hyper-parameter tuning and feature selection for improving flow instability detection in offshore oil wells, in: 2021 IEEE 19th International Conference on Industrial Informatics (INDIN), 2021, pp. 1–6. doi:10.1109/INDIN45523.2021.9557415.

[2] A. P. F. Machado, C. J. Munaro, P. M. Ciarelli, R. E. V. Vargas, Time series clustering to improve one-class classifier performance, Expert Systems with Applications 243 (2024) 122895. URL: https://www.sciencedirect.com/science/article/pii/S0957417423033973.

Dreambirds in Motion: AI-Driven Temporal Consistency in Surreal Video#

  • Project code: chpa-087

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: James Coupe, james.coupe@rca.ac.uk, Royal College of Art, UK

  • Available to: ACSE

  • This project does not accept multiple students.

Building upon the Birds of the British Empire collaboration with the RCA, this project focuses on developing advanced techniques for transforming still images into temporally consistent video sequences. The candidate will join a multidisciplinary team of artists and researchers to explore generative models capable of preserving the eccentric and surreal qualities of avian representations across temporal dimensions.

The primary challenge lies in maintaining key outlier features and their contextual significance while ensuring smooth transitions and temporal coherence throughout the video sequence. To address this, the project will explore methodologies in generative AI, such as temporal consistency models, convolutional recurrent architectures, and neural networks for fluid motion simulation. The focus will also include analysing how anomalies in feature spaces can be preserved and contextualized dynamically in long-form video outputs.

This work will contribute to advancing AMCG’s understanding of video synthesis techniques and outlier feature preservation within generative frameworks. It also offers opportunities for collaborative publications and potential exhibitions, demonstrating the intersection of technical innovation and artistic exploration.

Dreambirds: Diffusion-Based Exploration of Surreal Avian Forms#

  • Project code: chpa-086

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: James Coupe, james.coupe@rca.ac.uk, Royal College of Art, UK

  • Available to: ACSE

  • This project does not accept multiple students.

This project aims to develop a text-to-image diffusion model tailored to generate eccentric and surreal avian representations. This work forms part of the Birds of the British Empire collaboration with the Royal College of Art’s (RCA) School of Arts & Humanities. The model will explore how text prompts can guide feature exaggeration, generating highly unconventional outputs while maintaining internal consistency and semantic coherence.

To achieve this, the project will build upon foundational generative AI frameworks, incorporating advanced components such as Transformers, U-Net architectures, and Diffusion Models. A particular focus will be placed on:

  • Expanding the model's capability to interpret and execute extreme semantic features within prompts.

  • Exploring how latent space can be manipulated to encourage the amplification of atypical or outlier attributes.

  • Balancing the generation of bizarre features with the preservation of coherence and recognizability.

This project offers a unique opportunity to contribute novel techniques to generative AI research, allowing for the consistent creation of surreal imagery. The outcomes will be directly aligned with AMCG’s objectives and will support publications in collaboration with the RCA, as well as potential exhibitions showcasing the creative synergy of AI and art.

AI-Surrogate models for carbon storage in porous media#

  • Project code: chpa-127

  • Main supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Nathalie Carvalho Pinheiro, n.pinheiro23@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Modelling solid-fluid interaction requires a set of partial differential equations (PDEs) to be solved together to predict the flow behaviour of the system throughout the domain. When numerically solving the equations that govern a solid-fluid interaction, a spatial discretisation is performed. Conventional high-fidelity physical models require high-resolution discretisation to obtain reliable results, leading to large time and memory requirements. Artificial intelligence (AI) algorithms have been applied to scientific computing to obtain more efficient models. One prominent approach is to use machine learning techniques to build surrogate models capable of performing a simulation similar to the physical one but in a shorter time or using less computational effort.

This work will train and evaluate a machine learning surrogate model to predict fluid flow in porous media in a carbon storage scenario, based on a previous simulation [1]. The models generated could be extended to many fluid flow applications.

A current model based on U-Net predicts the next timestep. This project seeks to improve the recursive predictions for future timesteps. It can compare different network structures or forecasting strategies, considering models with recursive single-timestep prediction and options with multiple-timestep prediction. This project will suit people interested in training neural networks and learning about carbon storage.

[1] J. Maes, C. Soulaine, H. P. Menke, Improved volume-of-solid formulations for micro-continuum simulation of mineral dissolution at the pore-scale, arXiv preprint 2204.07019 (2022).

Analyzing ESG Rating Variability: A Data-Driven Comparison of the GOLDEN Dataset and Traditional Frameworks#

  • Project code: fipe-159

  • Main supervisor: Filippo Pellegrino, f.pellegrino22@imperial.ac.uk, The Leonardo Centre on Business for Society, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: EDSML GEMS

  • This project does not accept multiple students.

Environmental, Social, and Governance (ESG) metrics are widely used to assess corporate sustainability performance. However, significant variability exists across different ESG rating frameworks, particularly in the Social (S) and Governance (G) dimensions. This heterogeneity creates challenges in data reliability, comparability, and decision-making for investors, policymakers, and researchers. This project aims to explore the GOLDEN dataset, comparing it to established ESG rating systems and providing insights into the dataset and into the factors contributing to these discrepancies.

The GOLDEN dataset, developed in collaboration with the Leonardo Centre on Business for Society, is the most comprehensive global repository of corporate sustainability initiatives. It consists of around 1 million sustainability actions extracted from more than 60,000 sustainability reports of over 12,000 publicly listed companies across more than 20 years. Machine learning algorithms identify and classify these initiatives based on the 17 Sustainable Development Goals (SDGs) and 14 behavioral components. This provides researchers with a novel level of granularity to analyze corporate sustainability strategies, track the evolution of sustainability initiatives across industries and geographies, and support evidence-based policymaking and investment decisions.

The student will conduct a comparative data analysis using statistical and machine learning techniques. She/he will have access to the GOLDEN dataset, together with ESG ratings provided by the College, and will conduct comparative data analysis across different GOLDEN dataset dimensions and related ESG ratings.
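
A minimal sketch of one such comparison (the dataframes, column names and join key below are hypothetical and do not reflect the actual GOLDEN or ESG schemas):

```python
import pandas as pd
from scipy.stats import spearmanr

# hypothetical company-level tables: GOLDEN initiative counts and an external ESG score
golden = pd.DataFrame({"company_id": [1, 2, 3, 4],
                       "n_sdg_initiatives": [120, 45, 300, 80]})
esg = pd.DataFrame({"company_id": [1, 2, 3, 4],
                    "esg_score": [71.0, 55.5, 80.2, 62.3]})

# join on the common key and measure rank agreement between the two views of sustainability
merged = golden.merge(esg, on="company_id")
rho, p_value = spearmanr(merged["n_sdg_initiatives"], merged["esg_score"])
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```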

Computational Fluid Dynamics projects using OpenFOAM#

  • Project code: jape-091

  • Main supervisor: James Percival, j.percival@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: EDSML

  • This project may accept multiple students.

OpenFOAM (https://openfoam.org/) is an open source, user-extendable C++ based solver for Computational Fluid Dynamics (CFD), solving variations of the Navier-Stokes equations governing viscous fluid flow. It comes with solvers for a number of classes of problem:

  • Steady state/time averaged

  • Rotating fluids

  • Moving mesh problems plus a number of others.

Depending on the interest of the student, you’ll run the code and analyse the results you obtain. The precise terms of the project are fairly flexible, based on the student’s interests within the overall topic and other data sets which become available. If you are interested in the project, you are strongly encouraged to discuss the topic with me before making your decisions on which projects to request; please feel free to email me to find a time to discuss.


Multiphase (or single phase) projects using OpenFOAM#

  • Project code: jape-130

  • Main supervisor: James Percival, j.percival@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE

  • This project may accept multiple students.

OpenFOAM (https://openfoam.org/) is an open source, user-extendable C++ based solver for Computational Fluid Dynamics (CFD), solving variations of the Navier-Stokes equations governing viscous fluid flow. It comes with solvers for a number of classes of problem:

  • Multiphase flows (e.g. gas liquid or liquid-liquid flows)

  • Rotating fluids

  • Moving mesh problems plus a number of others.

Depending on the precise interest of the student, you’ll run the code and analyse the results you obtain. The precise terms of the project are flexible, based on the student’s interests within the overall topic and other data sets which become available. If you are interested in the project, you are strongly encouraged to discuss the topic with us before making your decisions on which projects to request; please feel free to email me.

One potential project would create a series of benchmarks for multiphase flow in OpenFOAM, ultimately leading to a study of slugging flow in pipes. This could lead to collaboration with


Implementing a Simple Morphodynamic Numerical Model in Devito#

  • Project code: jape-090

  • Main supervisor: James Percival, j.percival@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

The action of fluids such as water flowing over sediments such as soils and sands can cause the bed surfaces themselves to change, via the erosion of materials in areas of strong flow and their subsequent deposition where materials sink out of the water column. These changes can then affect the fluid motion itself, creating a feedback loop which is potentially complex, nonlinear and highly coupled.

Various PDE-based models exist for these processes, accurate to varying degrees for different classes of problem. In this project you will implement a simple hydro-morphodynamic model in Devito (https://www.devitoproject.org), a fast, parallel, scalable Python-based framework for automatically generating finite-difference solvers for PDEs. You will compare this model’s accuracy to both experimental data and other numerical models.
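
A minimal Devito sketch of a time-dependent model on a structured grid is shown below; a linear diffusion equation is used purely as a placeholder for the actual bed-evolution equations, which would be agreed during the project:

```python
from devito import Grid, TimeFunction, Eq, Operator, solve

grid = Grid(shape=(101, 101), extent=(1.0, 1.0))
h = TimeFunction(name="h", grid=grid, space_order=2)   # e.g. bed elevation (placeholder field)
h.data[:] = 0.0
h.data[:, 45:55, 45:55] = 1.0                          # initial mound of sediment

nu = 0.1                                               # placeholder diffusivity
pde = Eq(h.dt, nu * h.laplace)                         # dh/dt = nu * laplacian(h)
update = Eq(h.forward, solve(pde, h.forward))          # Devito generates the stencil update

Operator([update])(time=100, dt=1e-4)                  # advance 100 time steps
```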

Using AI/RL for ship routing optimisation#

  • Project code: mapi-189

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

This project will consider how AI approaches (e.g. reinforcement learning), incorporating weather and maritime data (e.g. wind, waves, currents), can be used as part of decision making for increasing the efficiency and sustainability of ship navigation.

AI based regional weather modelling#

  • Project code: mapi-188

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Simon Warder, s.warder15@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

There is a move to implement higher-resolution, regional-scale models of the atmosphere, building on recent developments in global-scale AI models. This project will investigate this active area of research, building on ECMWF’s ANEMOI framework:

https://www.ecmwf.int/en/newsletter/181/news/introducing-anemoi-new-collaborative-framework-ml-weather-forecasting
https://www.ecmwf.int/en/about/media-centre/news/2024/anemoi-new-framework-weather-forecasting-based-machine-learning
https://anemoi.readthedocs.io/projects/models/en/latest/#

and motivated by recent progress in the use of stretch grids for regional scale modelling:

https://www.ecmwf.int/en/about/media-centre/aifs-blog/2024/data-driven-regional-modelling
https://waipangsze.github.io/2025/01/25/ML-ECMWF-Anemoi/#high-resolution-inside-the-nordic
https://arxiv.org/pdf/2409.02891

Although not confirmed, there is potential here for collaboration with ECMWF.

Integration of renewables into power/energy systems models#

  • Project code: mapi-063

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Simon Warder, s.warder15@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Operation and long-term planning of power grids require a reliable representation of renewable energy sources such as wind and tidal. This project will focus on the representation of these technologies within the popular model PyPSA. There is scope to focus on particular areas of interest, but this project will likely involve:

  • processing wind/tidal resource datasets

  • writing code to integrate these energy sources within PyPSA (a minimal sketch follows this list)

  • using the code to run experiments with a focus on long-term energy strategies/scenarios
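
The sketch referenced above might look like this in recent PyPSA versions (the capacity-factor series and cost numbers are placeholders, not recommended values, and an LP solver must be installed for the final step):

```python
import pandas as pd
import pypsa

snapshots = pd.date_range("2025-01-01", periods=24, freq="h")
wind_cf = pd.Series(0.3, index=snapshots)          # placeholder hourly capacity factors

n = pypsa.Network()
n.set_snapshots(snapshots)
n.add("Bus", "electricity")
n.add("Load", "demand", bus="electricity", p_set=100.0)     # constant 100 MW demand
n.add("Generator", "wind", bus="electricity",
      p_nom_extendable=True, capital_cost=1e5,              # let the optimiser size the farm
      p_max_pu=wind_cf)                                     # resource availability time series
n.add("Generator", "backup", bus="electricity",
      p_nom=200.0, marginal_cost=80.0)                      # dispatchable backup for comparison

n.optimize()                                                # solve the optimisation problem
print(n.generators.p_nom_opt)
```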

Model-data fusion for the simulation of lakes#

  • Project code: mapi-142

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Build on existing work at ICL on the use of satellite data for the analysis of lake thermodynamics and hydrodynamics using process-informed deep learning.

Also possible to consider links to numerical simulation and model-data fusion methods, e.g. building on https://www.alplakes.eawag.ch/

Possibility to work on model-data fusion, e.g. through the assimilation of data into numerical models, or through model calibration.

Overland Wave Height Prediction using Graph Neural Networks#

  • Project code: mapi-050

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Pablo Higuera, pablo.higuera@moodys.com, Moody’s Risk Management Solutions (RMS)

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

The proposed master thesis aims to address the critical issue of predicting the heights of waves generated by tropical cyclones (TCs) as they propagate inland when storm surge leads to inundation. These waves pose a substantial threat to structures within the affected areas. The traditional methods of simulating overland waves (e.g., MikeSW) are often computationally intensive and may not always provide timely or easily scalable solutions, hence the need for an alternative approach, particularly one that leverages machine learning (ML), to offer both efficiency and accuracy in predictions. This predictive capability is crucial as it provides essential information for assessing the impact of waves on coastal structures, facilitating better-informed decision-making and risk management practices. The core objective of this thesis is to develop a surrogate model that utilizes ML techniques, specifically Graph Neural Networks (GNNs), to accurately calculate maximum overland wave heights. In GNNs, nodes contain local information (e.g., terrain elevation) and are connected to their neighbours, in a similar way to conventional structured grids or unstructured meshes. Training data will comprise wave conditions at the coastline or nearshore: time series of significant wave heights, peak periods, and wave directions during TC events.
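
A minimal sketch of one message-passing layer of the kind such a GNN might use, written in plain PyTorch (the node features and edge list are illustrative; a practical implementation would more likely build on a library such as PyTorch Geometric):

```python
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    """Average neighbour features and update each node with a small MLP."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.update = nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU())

    def forward(self, x, edge_index):
        # x: (num_nodes, in_dim) node features, e.g. terrain elevation and nearshore wave statistics
        # edge_index: (2, num_edges) pairs of connected node indices
        src, dst = edge_index
        agg = torch.zeros_like(x).index_add_(0, dst, x[src])                         # sum incoming messages
        deg = torch.zeros(x.size(0), 1).index_add_(0, dst, torch.ones(dst.size(0), 1))
        agg = agg / deg.clamp(min=1)                                                 # mean over neighbours
        return self.update(torch.cat([x, agg], dim=-1))                              # combine self and neighbourhood

layer = MessagePassingLayer(in_dim=4, out_dim=16)
x = torch.rand(10, 4)                                    # 10 terrain nodes with 4 features each
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])  # toy connectivity
h = layer(x, edge_index)                                 # (10, 16) updated node embeddings
```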

The research will explore several key areas, including the application and performance of GNNs. Additionally, the study aims to assess the generalizability of the developed model across different coastal regions and TC scenarios, and compare the effectiveness of GNNs against traditional approaches in predicting overland wave heights. In addition, physics-informed ML mechanisms could be put in place to penalise unphysical solutions.

Parsimonious and explainable surge tropical cyclone surrogate model using statistical and machine learning techniques#

  • Project code: mapi-049

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: George Pouliasis, georgios.pouliasis@moodys.com, Moody’s Risk Management Solutions (RMS)

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Understanding and accurately predicting the impacts of tropical cyclones is crucial for mitigating their devastating effects on coastal communities. In the context of risk assessments, which serve crucial functions within the insurance industry, the impact of tropical cyclones is usually described by the maximum surge height and its distribution. To accurately estimate this distribution, especially for higher return periods, many thousands of simulations are often required. However, the accurate numerical modelling of such processes over regional scales is a time-consuming and computationally intensive task.

This project aims to develop a surrogate model designed to estimate the maximum surge heights caused by tropical cyclones along coastlines by utilising various track parameters and local coastline features. The project will involve defining the optimal training methodology in order to achieve the best model performance and faster convergence and, subsequently, the training of multiple model types, including a linear model, Random Forest (RF), Extreme Gradient Boosting (XGB), and a Gaussian emulator, among other potential models. The focus will be on achieving a fast surrogate model which strikes a balance between generalisability and reasonable accuracy, rather than solely maximising accuracy. Key input variables for the model can include the signed distance from the centre of landfall, atmospheric pressure differences between the centre of the cyclone and the prediction location, wind velocities, local depth, and the coastal slope. Additionally, a classification of specific points along the coastline may be incorporated to enhance the model’s accuracy.

An essential aspect of this research is to identify and understand the factors that contribute to the model’s success as well as its failure in certain areas. By doing so, the project aims to develop a fast, interpretable and transparent prediction model, thereby enhancing its reliability and utility for modellers. Ultimately, the success of this project will contribute to our ability to predict and manage the impacts of tropical cyclones, thereby protecting lives and property in vulnerable coastal regions.
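
A minimal sketch of the model comparison described above, using synthetic stand-ins for the listed input variables (distance from landfall, pressure deficit, wind speed, local depth, coastal slope); an XGBoost regressor could be added to the same loop via the xgboost package:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((500, 5))      # synthetic [distance, pressure deficit, wind speed, depth, slope]
y = X @ rng.random(5) + 0.1 * rng.standard_normal(500)   # synthetic maximum surge height

models = {
    "linear": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "gaussian process": GaussianProcessRegressor(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.2f}")
```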

Remote sensing and promptable segmentation for water-land interface problems: geospatial AI#

  • Project code: mapi-143

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Consider the use of modern, advanced AI approaches (foundation models, promptable methods, vision transformers vs CNNs etc) to address coastal/estuarine challenges.

Applications to:

  • sustainable development in small island developing states

  • coastline identification for tracking island evolution, sea level rise etc.

  • identifying hard vs soft sea defences

Wind farm modelling and design with machine learning#

  • Project code: mapi-064

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Simon Warder, s.warder15@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Wind turbines and farms extract kinetic energy, producing a wake which propagates downstream and reduces the power output from other nearby turbines/farms. Numerical models exist which can accurately model these wakes, but their computational cost makes optimal farm siting or array layout design challenging.

This project will focus on the use of machine learning to overcome these challenges, and is well suited to multiple students working on different parts of the problem. Example topics include:

  • The development of machine learning surrogate models to replace computationally expensive numerical models of individual wakes, or entire wind farms

  • The application of these efficient models to optimal array layout design or farm siting problems

  • Robust optimisation under uncertainty; a given wind farm design problem may be subject to significant uncertainties due to the potential for other wind farms to be developed nearby in the future. These uncertainties must be treated appropriately within the optimisation.

AI and remote sensing to support solar energy deployment#

  • Project code: mapi-144

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

Consider approaches to identify and optimise solar array design using remote sensing data and AI algorithms.

Potential to tackle the problem from a small, per-roof perspective, or to consider larger-scale questions of trends in deployment.

Potential to incorporate data from weather forecasting, e.g. on cloud cover.

Adaptive mesh methods [various]#

  • Project code: mapi-140

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE

  • This project may accept multiple students.

Adaptive mesh methods to improve the numerical solution of PDEs.

  • mesh optimisation and mesh movement methods.

  • error measure design.

  • acceleration via use of AI surrogates.

  • integration of adaptive methods within PDE-constrained optimisation problems.

  • finite element methods.

Happy to discuss if you want a numerical-methods-based project, with scope to use AI tools to improve classical numerical methods, specifically methods that update or move a mesh based on solution characteristics.

Weather pattern analysis using machine learning#

  • Project code: mapi-065

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Simon Warder, s.warder15@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Wind farm design problems require numerical simulations of a “representative” number of different wind/weather conditions. A balance must be achieved between the ability to faithfully represent the overall wind climate (which improves with a larger set of weather samples), and the computational cost of the wind farm simulations (which is proportional to the number of samples selected).

The selection of individual weather samples to build up a fully representative simulation is typically carried out in a simplistic way, e.g. random sampling. This project will explore alternative methods, such as the use of ML to analyse weather patterns in order to carefully select a representative set of weather conditions using the minimum number of samples. There may also be a climate change component to this project, e.g. assessing how wind patterns are projected to change in future.
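
One possible starting point, sketched with k-means clustering applied to hypothetical hourly wind records (the columns and cluster count are illustrative, not a recommendation):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# hypothetical hourly wind records: [speed, sin(direction), cos(direction)]
rng = np.random.default_rng(0)
speed = rng.gamma(2.0, 4.0, size=8760)
direction = rng.uniform(0.0, 2.0 * np.pi, size=8760)
samples = np.column_stack([speed, np.sin(direction), np.cos(direction)])

# cluster the records and keep the member closest to each centroid as a representative condition
scaled = StandardScaler().fit_transform(samples)
km = KMeans(n_clusters=20, n_init=10, random_state=0).fit(scaled)
representative_idx = [int(np.argmin(np.linalg.norm(scaled - c, axis=1))) for c in km.cluster_centers_]
representatives = samples[representative_idx]
print(representatives.shape)   # 20 representative wind conditions to simulate
```

Cluster occupancy counts can then be used to weight each representative condition when aggregating the wind farm simulation results.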

Computational Fluid Dynamics for flood risk analysis: numerical wave tanks#

  • Project code: mapi-138

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

The project will begin from a working OpenFOAM CFD-based numerical wave tank setup. There are options to progress the work in the direction of the hydrodynamics of waves, the engineering of e.g. sea wall defences, or consideration of the effects of different numerical schemes on the performance of numerical wave tanks. Data for comparison with experimental results is also available.

Numerical ocean modelling: coastal ocean processes and coastal engineering; offshore energy and flood risk#

  • Project code: mapi-139

  • Main supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Lots of projects in the following areas

  • Modelling and AI/remote sensing project - model-data fusion.

  • Tsunami and storm surge simulation.

  • Erosion and land reclamation analysis and modelling.

  • Dispersion of releases from desalination and nuclear accident scenarios (e.g. Fukushima).

  • Lagrangian vs Eulerian dispersion modelling.

  • CFD project - development of a numerical wave tank to understand wave over-topping and interaction with vegetation; comparisons with lab data.

  • Comparison of operational models with multi-scale models for tidal, flood and surge prediction; calibration/optimisation against observational data (including new satellite altimetry products).

  • AI + satellite data for coastline evolution tracking.

  • Automated mesh generation incorporating vegetation, defences etc.

Happy to discuss specifics if any of these areas are of interest.

Exploiting very high resolution lidar and multispectral drone imagery to map biodiversity and quantify natural capital.#

  • Project code: yvpl-193

  • Main supervisor: Yves Plancherel, y.plancherel@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Myriam Prasow-Emond, m.prasow-emond22@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Thanks to the international Bio+Mine project (https://bioplusmine.earth/), the opportunity exists to leverage recently obtained, repeated, very high-resolution multispectral and lidar drone data collected over an abandoned mining site in the Philippines to develop a digital twin of the site.

In order to fuel economic growth and the decarbonisation transition, humanity will need to mine more in the next 50 years than it has mined in the history of humankind. This implies that more mines will need to be opened, but also that a plethora of mines will eventually need to be closed. The problem of mine closure has received very little attention so far, and mine operators have had a tendency to leave closed mines in a poor ecological state and prone to collapse, representing significant physical and chemical hazards for local communities. Because of this poor record on mine closure, mine developers today are finding it hard to gain the required social licence to operate, which limits how, where and when new mines can be opened.

The overall aim of this project is to produce a virtual representation of the site, coupled with an intelligence layer, to help the local community and stakeholders make decisions in terms of local structural and economic development, including water management, farming practices and biodiversity conservation or interventions.

In addition, we aim to develop tools to quantify the natural capital of the site. Natural capital valuation of ecosystem assets, including automated recognition of plant species from imagery via machine learning, biomass and carbon, is critical, as this needs to be balanced with economic market-based measures of development to assist decision-making at the local level.

Students engaged in this project are expected to solve segmentation problems in 2D and 3D using a combination of multispectral imagery with a ground sampling distance of order 2-3 cm and 3D point cloud data, to map and study the density and distribution of targets in time and space for creative uses.

Further information: zyc00/Point-SAM https://medium.com/@BasicAI-Inc/3d-point-cloud-segmentation-guide-a073b4a6b5f3 https://link.springer.com/article/10.1007/s44267-024-00046-x

Leveraging machine learning, remote sensing and modelling to help low lying island nations adapt against and mitigate the effects of climate change and economic development.#

  • Project code: yvpl-195

  • Main supervisor: Yves Plancherel, y.plancherel@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Small Island Developing States (SIDS), such as the Maldives, the Cocos Islands or Kiribati, are of particular interest. Most are very low-lying, with elevation on average <1 m above mean sea level, making them extremely vulnerable to sea-level rise. Many witness rapid and ongoing urban and marine development associated with extensive creation of coastal infrastructure, resort islands and land reclamation, which can severely affect the sediment balance of these islands and therefore influence if and how islands cope with sea level rise. Others do not attract such development and are now seeing their populations having to migrate as islands are flooded. Due to low GDP, limited availability of data and scientific resources, and scattered populations, many SIDS do not possess the capability to monitor and forecast the effects of development and climate change on their landscapes, limiting evidence-based policy making in these communities. The goal of this project is to build data analysis and modelling infrastructures specific to islands, building on existing efforts and working closely with researchers in the department.

Possible projects around this theme include: (1) Development of machine learning algorithms to automatically detect coastal and urban development from remote sensing products, including the start and end dates of construction projects, the evolution of urban development, coastal erosion, flooding risk, and changes in biodiversity. (2) Analysis of environmental change to link cause and effect (is sea level rise really to blame for migrations in Cocos Keeling and Kiribati?). (3) Development of hydrodynamic ocean models to simulate the flow of ocean currents, sediments and pollution around islands with complex topography.
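As a hedged illustration of strand (1), the sketch below flags candidate development or erosion sites by differencing a vegetation index between two dates; the band arrays, assumed co-registration and the 0.3 threshold are stand-ins rather than a prescribed method.

```python
import numpy as np

# Minimal sketch: NDVI differencing between two co-registered acquisitions as a
# crude change-detection baseline before moving to learned models.
def ndvi(nir, red):
    return (nir - red) / (nir + red + 1e-9)

nir_t0, red_t0 = np.random.rand(2, 512, 512)   # stand-ins for real scenes
nir_t1, red_t1 = np.random.rand(2, 512, 512)

vegetation_loss = ndvi(nir_t0, red_t0) - ndvi(nir_t1, red_t1)
candidate_change = vegetation_loss > 0.3       # assumed threshold
print(candidate_change.sum(), "pixels flagged for inspection")
```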

Further information: https://storymaps.arcgis.com/stories/608e0f2830634711b0a4e4d8910716ad https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2020EA001207 https://nhess.copernicus.org/articles/24/737/2024/

Tracking Illegal Gold Mining Safely with Earth Observations and Machine Learning#

  • Project code: yvpl-194

  • Main supervisor: Yves Plancherel, y.plancherel@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Sesinam Dagadu, s.dagadu@snoocode.com, SnooCODE Limited

  • Available to: ACSE EDSML

  • This project may accept multiple students.

INTERPOL and UNEP estimate that illegal gold mining represents more than USD 48 billion a year in illegal gains. Globally, up to 80% of all small-scale artisanal mining is illegal. Recent events (e.g. Covid, Ukraine, political instability) have led to further increases in gold prices, boosting demand. Given deteriorating socio-economic conditions, illegal gold mining is booming across the globe, causing severe, widespread, transnational, and chronic environmental damage. It is also linked to organized crime, human trafficking, and other human rights violations. In regions that are particularly affected by illegal mining, corruption of policymakers and law enforcement officials can further impair local action.

This project aims to leverage satellite observations, trade and socio-economic data and machine learning tools to develop the new field of “environmental forensics and criminology” and to quantify the environmental footprint associated with illegal gold mining by looking at the spatial and chemical footprint of gold mining operations, starting in Ghana and Liberia.

Students will use machine learning and satellite-derived products to quantify the mass of gold extracted illegally, the mercury used/emitted and the spatial and temporal evolution of these quantities, looking to identify patterns of development and operations that could be linked to criminal organisations, and their effects on environmental deterioration. Results from this project will help quantify the environmental, social and financial footprint of illegal mining activities, including the degradation phase and the recovery phase after sites are abandoned; information that is critical to develop suitable management policies.

As part of this project, the students will collaborate with SnooCode.com, an IT company based in Ghana, and with other potential partners in South Africa and India.

Investigating the potential of isotopic tracing and resource stoichiometry for resource flow analysis and supply chain mapping.#

  • Project code: yvpl-187

  • Main supervisor: Yves Plancherel, y.plancherel@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Nicola Gambaro, nicola.gambaro20@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

The global demand for mineral resources is increasing rapidly, driven by technological advancements and the transition to a green economy. However, complex supply chains often obscure the origin of these minerals, creating opportunities for unethical practices such as conflict financing, human rights abuses, and environmental degradation. Knowledge gaps exist that prevent tracing minerals through complex supply chains, and databases relating to resource flows and stocks are sparse and prone to large uncertainties.

By analysing the isotopic composition of materials, it may be possible to link products back to their original ore source. Isotopic fingerprinting can be used to verify the origin of materials. This technology can be incorporated into certification schemes, providing a robust mechanism for tracking mineral resources from mine to market. It is, however, not clear how to effectively integrate isotopic tracing into existing certification schemes and supply chain management systems, and how to demonstrate the cost-effectiveness and value proposition of this integration to businesses.

In addition, it may be possible to use relationships between resources to cross-constrain the flows and stocks of multiple resources. For instance, a car is not built of aluminium alone, but if good data exist for aluminium, and some information is available about the amounts of other materials relative to aluminium, inferences can be made about the flows and stocks of these other resources.

The overall goal of this project is to explore the feasibility, potential and limits of isotopic fingerprinting for certification and supply chain mapping activities by adding isotope tracing to a Bayesian Material Flow Analysis system and to further improve the capability of that model by exploring how information from multiple resources can help constrain resource flows and stocks data of the system overall. Parallel work can also be done using a complementary dynamical systems approach that simulates resource flows explicitly, building on the ongoing work of PhD student N. Gambaro.
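As a minimal sketch of how isotope information can constrain a flow estimate, the toy Bayesian model below infers two source flows from an observed total and an observed isotope signature of the mixed product; all numbers, priors and the use of PyMC are illustrative assumptions, not the Centre's or the project's actual model.

```python
import pymc as pm

# Toy Bayesian mass balance: two mines supply a product stream; we observe the
# total flow and the isotopic signature of the mix, and infer the split.
DELTA_A, DELTA_B = -2.0, 1.5      # assumed source isotope signatures (per mil)

with pm.Model() as model:
    flow_a = pm.HalfNormal("flow_a", sigma=100.0)
    flow_b = pm.HalfNormal("flow_b", sigma=100.0)

    # Observed total flow (arbitrary units) with measurement noise.
    pm.Normal("total_obs", mu=flow_a + flow_b, sigma=5.0, observed=120.0)

    # Isotope mixing: product signature is the flow-weighted mean of the sources.
    delta_mix = (flow_a * DELTA_A + flow_b * DELTA_B) / (flow_a + flow_b)
    pm.Normal("isotope_obs", mu=delta_mix, sigma=0.1, observed=0.4)

    idata = pm.sample(1000, tune=1000, chains=2)

print(idata.posterior["flow_a"].mean().item(),
      idata.posterior["flow_b"].mean().item())
```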

Elevation-Aware Route Optimization System for Energy-Efficient Electric Vehicle Navigation#

  • Project code: yvpl-105

  • Main supervisor: Yves Plancherel, y.plancherel@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Sesinam Dagadu, s.dagadu@snoocode.com, SnooCODE Limited

  • Available to: ACSE

  • This project may accept multiple students.

This project aims to develop a mobile or web-based application that optimizes electric vehicle (EV) routes by considering elevation data to minimize energy consumption. The app will calculate the power necessary for different route options and suggest optimal power usage strategies, such as when to reduce acceleration or leverage regenerative braking. The final output will be simple and user-friendly, but the underlying computations will involve complex energy modeling and route optimization algorithms.
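One possible starting point, sketched below under simplifying assumptions (constant speed, a crude rolling-resistance plus potential-energy cost, partial regenerative recovery on descents), is to weight a road graph by estimated energy rather than distance; the graph, constants and helper function are illustrative only.

```python
import networkx as nx

# Hypothetical road graph: nodes carry elevation (m), edges carry distance (m).
G = nx.DiGraph()
G.add_node("A", elev=10.0)
G.add_node("B", elev=45.0)
G.add_node("C", elev=20.0)
G.add_edge("A", "B", dist=1200.0)
G.add_edge("B", "C", dist=800.0)
G.add_edge("A", "C", dist=2500.0)

MASS_KG = 1800.0     # assumed vehicle mass
GRAVITY = 9.81
ROLL_COEFF = 0.01    # assumed rolling-resistance coefficient
REGEN_EFF = 0.6      # assumed fraction of downhill potential energy recovered

def edge_energy(u, v, data):
    """Rough per-edge energy cost in joules; clamped at zero so Dijkstra's
    non-negative-weight assumption still holds."""
    climb = G.nodes[v]["elev"] - G.nodes[u]["elev"]
    rolling = ROLL_COEFF * MASS_KG * GRAVITY * data["dist"]
    potential = MASS_KG * GRAVITY * climb
    if potential < 0:                 # descent: recover part of the energy
        potential *= REGEN_EFF
    return max(0.0, rolling + potential)

route = nx.shortest_path(G, "A", "C", weight=edge_energy)
print(route)
```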

Building a LLM-driven platform for global climate innovation#

  • Project code: cequ-208

  • Main supervisor: César Quilodrán-Casas, c.quilodran@imperial.ac.uk, The Grantham Institute for Climate Change, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

You will design and build the MVP of a platform for a global network of accelerators who support climate innovators all round the world, which if successful will help more climate innovators achieve impact at scale more rapidly and make a tangible difference in the urgent fight against climate change.

The platform will enable the accelerators to find the best partners to collaborate with around the world, to attract the best applicants to their programmes and to increase the visibility of their innovators so potential investors and corporate adopters can find the climate solutions they need.

It will also enable innovators to find the right accelerator for them at whatever stage they are at in whatever field and wherever they may be and to access a global pool of mentors, experts and trusted suppliers who can help them develop and scale.

It will require a great user interface that makes it easy for all user groups to find who they need from a large database you will also design - it can start off small but will need to be scalable to handle >1,000 accelerators and >100,000 innovations from all around the world.

Identifying research that has the potential for significant impact against climate change#

  • Project code: cequ-058

  • Main supervisor: César Quilodrán-Casas, c.quilodran@imperial.ac.uk, The Grantham Institute for Climate Change, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

The Climate Solutions Catalyst (Grantham Institute) aims to unearth neglected climate solutions from the full breadth of the UK academic community. The challenge is to identify these solutions using a machine learning algorithm, based on graph neural networks, to access the right pre-innovation ideas as quickly as possible.

The team are developing a search tool using LLMs to identify the research through the content of research papers. However, we have identified an opportunity to expand the variety of datasets used to inform the tool and to conduct relational analyses.

The aim of the project is to design, build and test a relational algorithm using the datasets already identified, as well as any additional ones found during the course of the project - for example, matching papers to authors, to grants, and to social media - thus identifying climate solutions cross-referenced with the implementation and funding datasets that exist within the UK landscape.
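A minimal illustration of the relational idea, under the assumption that entity resolution has already been done, is to hold papers, authors, grants and social-media mentions in a single graph and query it for papers with attention but no funding nearby; the node names and the two-hop rule below are purely illustrative.

```python
import networkx as nx

# Toy relational graph linking papers, authors, grants and social-media posts.
G = nx.Graph()
G.add_node("paper:direct_air_capture_2023", kind="paper")
G.add_node("author:j_smith", kind="author")
G.add_node("grant:ukri_x01", kind="grant")
G.add_node("post:social_12345", kind="social")

G.add_edge("paper:direct_air_capture_2023", "author:j_smith", rel="written_by")
G.add_edge("paper:direct_air_capture_2023", "post:social_12345", rel="mentioned_in")
# Note: no grant edge for this author/paper in the toy data.

# Flag papers with social attention but no grant within two hops.
for node, attrs in G.nodes(data=True):
    if attrs["kind"] != "paper":
        continue
    neighbourhood = nx.ego_graph(G, node, radius=2)
    kinds = {G.nodes[m]["kind"] for m in neighbourhood}
    if "social" in kinds and "grant" not in kinds:
        print("candidate neglected solution:", node)
```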

If you’re excited about designing innovative algorithms that can identify impactful climate research, join us. Become part of a diverse network of individuals and researchers committed to the Grantham Institute’s mission of driving climate innovation. Together, we can build and test tools that harness the power of machine learning to uncover transformative solutions. Your contribution will help to create a more sustainable world where science, technology, and innovation meet.

The project is NLP/LLM-heavy.

Beyond subjectivity: A study on the influence of colourmaps for Machine Learning tasks#

  • Project code: anra-022

  • Main supervisor: Andrianirina Rakotoharisoa, andrianirina.rakotoharisoa19@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

In Computer Vision, machine learning (ML) models and deep learning architectures have been extensively studied in recent years. They are applied in real-life problems including medical imaging, video- or image-processing and earth observation. In the context of remote sensing data (e.g. atmospheric or chemical variables), colourmaps are often used to visually represent the range and variability of the features we investigate. An overlooked issue is the subjective choice of colourmaps and the lack of quantitative evidence as to which representation is better depending on the task at hand. This project will aim to understand and quantitatively evaluate the impact of colourmap-based representations on the performance of ML models and depending on time constraints, establish guidelines for remote sensing datasets.
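To make the experimental setup concrete, the sketch below renders the same scalar field through several matplotlib colourmaps to produce RGB inputs whose effect on a downstream model could then be compared; the random field and the particular colourmaps are placeholder assumptions.

```python
import numpy as np
from matplotlib import colormaps

# Render one scalar field (a stand-in for an atmospheric/chemical variable)
# through different colourmaps, producing RGB arrays for a vision model.
field = np.random.rand(64, 64)

def to_rgb(field, cmap_name):
    """Normalise a 2D field to [0, 1] and map it to RGB via a named colourmap."""
    norm = (field - field.min()) / (field.max() - field.min() + 1e-12)
    rgba = colormaps[cmap_name](norm)        # shape (H, W, 4)
    return rgba[..., :3].astype(np.float32)

inputs = {name: to_rgb(field, name) for name in ["viridis", "jet", "gray"]}
print({name: arr.shape for name, arr in inputs.items()})
```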

Enhancing Neural Network–Derived Electrocardiographic Features for predicting cardiovascular disease risk factors in Africa#

  • Project code: olra-186

  • Main supervisor: Oliver Ratmann, oliver.ratmann05@imperial.ac.uk, Department of Mathematics, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

The ECG is widely used for assessment of cardiovascular diseases (CVD) and CVD risk factors such as hypertension. Individual-level ECG time series data are ideally suited for statistical machine learning approaches due to their inherent complexity across 12-lead measurements, capturing heartbeats through a complex series of atrial depolarization, ventricular depolarization, ventricular repolarization, and time from atrial to ventricular activation, and nuances in these individual functions. Large-scale ECG data are available for unsupervised latent feature extraction and deep learning from 100,000s of people in the UK, Israel and Brazil. It is unclear, however, 1) if and how trained open-source models can be transferred to identify population phenogroups across Africa, as practice and quality of ECG reads differ; and 2) to what extent ECG features derived from people of different race & ethnicity have structurally distinct prediction properties, as for example QRS voltage tends to be higher in people of black heritage and could lead to false positive diagnoses. Open-source CNN architectures are available in Keras using a TensorFlow backend, but ideally you will expand available architectures within JAX, re-train on open-source data, and then transfer learning to available longitudinal cohort data from across Southern and Eastern Africa. You will explore these research milestones within the Machine Learning & Global Health network group in Maths at Imperial. We are dynamic, curious & diverse and look forward to hearing from you.
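A minimal Keras sketch of the kind of 1D CNN that could serve as a starting point is shown below; the input length, lead count, binary target and layer sizes are assumptions for illustration, not the published architectures referred to above.

```python
from tensorflow import keras

# Toy 1D CNN for 12-lead ECG classification (e.g. a binary risk-factor label).
def build_ecg_cnn(n_samples=5000, n_leads=12):
    inputs = keras.Input(shape=(n_samples, n_leads))
    x = keras.layers.Conv1D(32, kernel_size=16, strides=2, activation="relu")(inputs)
    x = keras.layers.MaxPooling1D(4)(x)
    x = keras.layers.Conv1D(64, kernel_size=8, strides=2, activation="relu")(x)
    x = keras.layers.GlobalAveragePooling1D()(x)
    outputs = keras.layers.Dense(1, activation="sigmoid")(x)
    return keras.Model(inputs, outputs)

model = build_ecg_cnn()
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[keras.metrics.AUC()])
model.summary()
```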

Surrogate Modelling of Mantle Dynamics using Physics Informed Neural Networks#

  • Project code: frri-045

  • Main supervisor: Fred Richards, f.richards19@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE

  • This project may accept multiple students.

The Earth’s mantle is a giant heat engine, transferring thermal energy from the core and the decay of radioactive isotopes to the surface via convection. As well as controlling tectonic plate motions and the distribution of volcanism, this process perturbs surface elevations, influencing the evolution of landscapes, ocean circulation, and ice sheet stability. Understanding flow patterns within our planet’s interior is therefore central to improving forecasts of natural hazards and the impacts of climate change.

Our ability to numerically model mantle convection has dramatically improved; however, these models have become increasingly computationally expensive, complicating assessment of uncertainties. In addition, the correct ‘rheology’ (i.e., the description of how rock deforms as a function of its physical state) to use in these models remains unclear, since experimental studies have come to differing conclusions about the flow laws governing mantle convection and these experiments are necessarily conducted at conditions that diverge significantly from those of the deep Earth.

This project aims to tackle these limitations by developing physics-informed neural networks (PINNs) to enable rapid prediction of mantle flow across a comprehensive range of rheological inputs. First, a training dataset of simulation outputs that assume different flow laws and initial conditions will be generated using the finite-element convection code ASPECT. Next, neural networks will be constructed and trained on the simulated data by minimising a loss function that accounts not only for agreement between predicted outputs and validation data, but also for the degree to which calculated fields are consistent with the governing equations (i.e., conservation of mass, momentum, and energy). Depending on the speed of progress, these PINNs may then be integrated into a probabilistic inverse framework to determine the flow law parameters most consistent with geophysical observations. If successful, this work will allow us to model mantle convection with unprecedented accuracy and quantified uncertainties.
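The composite loss described above can be sketched as follows; the network shape, the use of a single mass-conservation residual as the physics term and the PyTorch implementation are simplifying assumptions rather than the project's actual formulation.

```python
import torch

# Toy physics-informed loss: data misfit plus a mass-conservation residual
# (divergence of the predicted velocity field) for a 2D incompressible flow.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 3),          # outputs: (u, v, p) at a point (x, y)
)

def pinn_loss(collocation_xy, data_xy, data_uvp):
    xy = collocation_xy.clone().requires_grad_(True)
    uvp = net(xy)
    u, v = uvp[:, 0], uvp[:, 1]
    du = torch.autograd.grad(u.sum(), xy, create_graph=True)[0]
    dv = torch.autograd.grad(v.sum(), xy, create_graph=True)[0]
    divergence = du[:, 0] + dv[:, 1]                    # du/dx + dv/dy
    physics = (divergence ** 2).mean()                  # mass-conservation residual
    data = ((net(data_xy) - data_uvp) ** 2).mean()      # misfit to simulation outputs
    return data + physics

loss = pinn_loss(torch.rand(128, 2), torch.rand(32, 2), torch.rand(32, 3))
loss.backward()
```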

How Stable is the West Antarctic Ice Sheet? Impacts of Transient Viscoelasticity#

  • Project code: frri-044

  • Main supervisor: Fred Richards, f.richards19@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE

  • This project does not accept multiple students.

The West Antarctic Ice Sheet sits on bedrock that is low-lying and that slopes downwards towards its interior. As our planet continues to warm, these factors make the ice sheet particularly prone to the development of instabilities, which numerical models suggest could cause rapid retreat and sea-level rise of up to 2 m by 2100. This alarming possibility represents an immediate and serious threat to the 680 million people and $4 trillion of critical infrastructure currently occupying flood-susceptible coastlines.

While viscoelastic bedrock rebound following ice unloading can help to restabilise a retreating ice sheet, it has long been thought that sub-Antarctic mantle viscosity (‘runniness’) is high, rendering the rebound too slow to have much of an impact on ice-sheet evolution over human timescales. However, recent studies suggest that viscosity is very low beneath West Antarctica and that ‘transient’ viscoelastic deformation is occurring in this region. These two observations imply that the bedrock beneath Antarctica will: a) rebound faster than expected; and b) do so at a rate proportional to that of ice unloading, introducing a currently unmodelled negative feedback into the climate system.

The aim of this project will be to build numerical models that will enable the impact of this transient rheology on ice-sheet dynamics to be quantified for the first time. This will be accomplished by developing a new module for the state-of-the-art geodynamic finite-element software package ASPECT (C++) that will enable the complex rheology outlined above to be integrated into a geodynamic model. These thermomechanically self-consistent models of mantle deformation, when integrated into coupled ice-sheet–sea-level models, will produce transformational insights into the future stability of the West Antarctic Ice Sheet and its contribution to sea-level change.

Enhancing Efficiency in Natural Hazard Risk Assessment: Clustering-Based Scenario Reduction for Monte Carlo Simulations#

  • Project code: pasa-162

  • Main supervisor: Parastoo Salah, p.salah@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Monte Carlo simulations are widely recognized for their effectiveness in natural hazard risk assessment, providing valuable insights into the probabilistic outcomes of various disaster scenarios. However, the computational intensity and time requirements of running large numbers of scenarios pose significant challenges. This project aims to address these challenges by introducing a novel clustering approach to streamline scenario selection, thereby enhancing the efficiency and practicality of Monte Carlo simulations in natural hazard studies.

The research will explore the application of advanced clustering algorithms, such as k-means, hierarchical clustering, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), to categorize a broad range of potential natural hazard scenarios based on their characteristics and outcomes. By identifying representative scenarios within each cluster, the number of simulations required to achieve a comprehensive risk assessment can be significantly reduced without compromising the accuracy of the results.

A critical component of the project will involve the development of criteria for the evaluation and selection of the most representative scenarios within each cluster, taking into account factors such as frequency, severity, and spatial distribution of the hazards. The effectiveness of the clustering-based approach will be validated through comparative analysis with traditional Monte Carlo simulations, assessing both the reduction in computational demand and the preservation of result integrity.
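A minimal version of the clustering-and-representative-selection step, under the assumption of synthetic scenario features and a fixed k, might look like the following; real scenario descriptors and the choice of algorithm would come from the project itself.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy scenario reduction: cluster hazard scenarios, keep the member nearest each
# centroid as the representative, and carry cluster frequencies as weights.
rng = np.random.default_rng(0)
scenarios = rng.normal(size=(500, 6))   # e.g. magnitude, location, depth, ...

k = 20
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scenarios)

representatives = []
for c in range(k):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(scenarios[members] - km.cluster_centers_[c], axis=1)
    representatives.append(members[np.argmin(dists)])

weights = np.bincount(km.labels_, minlength=k) / len(scenarios)
print(representatives[:5], weights[:5])
```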

This research promises to make Monte Carlo simulations more accessible and feasible for a wider range of applications in natural hazard risk assessment. By improving the efficiency of scenario selection, this project will facilitate more informed decision-making in disaster management, urban planning, and insurance, contributing to the resilience of societies against natural hazards.

Shallow Water Equation Ocean Modeling with AI-Driven PDE Solvers#

  • Project code: pasa-167

  • Main supervisor: Parastoo Salah, p.salah@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Matthew Piggott, m.d.piggott@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

The integration of Artificial Intelligence (AI) with traditional numerical methods is reshaping the landscape of computational physics. This project focuses on leveraging AI-driven approaches to solve the Shallow Water Equations (SWE) in ocean modeling. The SWE play a fundamental role in simulating oceanic and coastal dynamics, including tsunami propagation, storm surges, and tidal movements. By harnessing AI techniques for solving Partial Differential Equations (PDEs), this research aims to improve computational efficiency, accuracy, and scalability in ocean modeling. Objectives include:

  • Develop AI-assisted numerical solutions for the SWE: implement AI-based surrogate models to approximate SWE solutions efficiently.

  • Explore the integration of Physics-Informed Neural Networks (PINNs) within AI-driven PDE solvers such as NVIDIA Modulus and AI4PDE to learn solutions directly from governing equations and observational data.
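For orientation, a few lines of a classical solver for the linearised 1D SWE are sketched below, of the kind that could generate training or validation data for an AI surrogate; the constant depth, grid and forward-backward time stepping are assumptions for illustration.

```python
import numpy as np

# Linearised 1D shallow water equations, forward-backward time stepping:
#   du/dt = -g d(eta)/dx,   d(eta)/dt = -H du/dx
g, H = 9.81, 100.0
nx, dx, dt = 200, 1000.0, 5.0          # dt well below dx / sqrt(g*H) for stability
eta = np.exp(-((np.arange(nx) - nx / 2) ** 2) / 50.0)   # initial surface bump
u = np.zeros(nx)

for _ in range(200):
    u[1:-1] -= dt * g * (eta[2:] - eta[:-2]) / (2 * dx)
    eta[1:-1] -= dt * H * (u[2:] - u[:-2]) / (2 * dx)

print("max surface elevation after 200 steps:", eta.max())
```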

Tsunami Forecasting: Leveraging Graph Neural Networks for Enhanced Early Warning Systems#

  • Project code: pasa-166

  • Main supervisor: Parastoo Salah, p.salah@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Gege Wen, g.wen@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

The criticality of timely and accurate tsunami warnings cannot be overstated, given the catastrophic potential of such events. This project proposes an innovative approach to tsunami forecasting by integrating Graph Neural Networks (GNNs) with traditional numerical modeling, establishing a more precise and efficient early warning system. The intricate and highly dynamic nature of tsunamis, combined with the vast and interconnected data involved, makes them an ideal candidate for the application of advanced deep learning (DL) methodologies, particularly GNNs.

This research will focus on the development of a hybrid forecasting model that leverages Graph Neural Networks to enhance predictive accuracy and computational efficiency. Unlike traditional deep learning models that treat data as sequential or grid-like structures, GNNs excel at capturing relationships and dependencies within complex, interconnected systems, such as the spatial-temporal patterns of tsunami propagation. The GNN-based approach will be trained on historical tsunami datasets, encompassing various initiating factors such as undersea earthquakes, volcanic eruptions, and landslides. By modeling these events as dynamic graphs, GNNs can effectively learn spatial correlations and temporal dependencies, providing faster and more accurate predictions compared to conventional methods. Additionally, the model will integrate real-time seismic and oceanographic data, allowing continuous updates to forecasts as new information becomes available.

A key component of this project will be the rigorous testing and validation of the GNN-based forecasting model against historical tsunami events. Comparative analysis with existing numerical and deep learning models, such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks, will assess improvements in prediction accuracy, computational efficiency, and response time. By combining the strengths of Graph Neural Networks and traditional numerical modeling, this research aims to establish a new benchmark for tsunami forecasting, offering a more reliable tool for disaster preparedness and response. The ultimate goal is to improve the speed and accuracy of tsunami alerts, potentially saving lives and mitigating economic losses in vulnerable regions.
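A minimal sketch of the graph component, assuming synthetic node features (e.g. recent sea-level and seismic readings at stations) and a random adjacency, is given below using PyTorch Geometric; it is a starting point, not the project's intended architecture.

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

num_nodes, num_feats = 50, 8
x = torch.randn(num_nodes, num_feats)                  # per-station features
edge_index = torch.randint(0, num_nodes, (2, 300))     # stand-in connectivity

class WaveHeightGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, 1)                # predicted wave height per node

    def forward(self, data):
        h = torch.relu(self.conv1(data.x, data.edge_index))
        return self.conv2(h, data.edge_index)

model = WaveHeightGNN(num_feats)
pred = model(Data(x=x, edge_index=edge_index))
print(pred.shape)    # (num_nodes, 1)
```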

Advanced imaging techniques to enhance energy efficiency in the processing of critical raw materials#

  • Project code: lusa-011

  • Main supervisor: Luis Salinas-Farran, l.salinas-farran@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Stephen Neethling, s.neethling@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

The green energy transition relies on a supply of critical raw materials to obtain metals that are essential to renewable energy technologies. It is thus necessary to improve conventional mineral processing methods to minimise their environmental impact, as well as to understand the effect of the different process variables in order to optimise energy and water consumption. One of the most energy-intensive processes in mining is comminution. Its energetic efficiency is commonly below 10%; however, it is an essential process in the mining industry. Comminution is understood as the reduction in size of raw materials, which is crucial to obtain a specific particle size range that allows for subsequent processing in downstream steps. The performance of comminution systems is highly affected by the interactions between the solid particles, the elements used as charge to promote breakage and, in some cases, the liquid in the system. Furthermore, the subsequent flotation stage benefits greatly when particles present a high surface exposure, as this process works on the particle surface chemistry to separate the valuable metal from the gangue.

However, the impact of the comminution variables is commonly assessed either by using destructive methods or by analysing the final product of very long and complex processes that involve multiple intermediate stages. This makes it impossible to track and quantify the performance of the system at each stage of the process, relying mostly on empirical knowledge.

This project will combine novel experimental and characterization techniques in order to better understand the effect that different methods used to achieve particle size reduction have on the efficiency of the process and the properties of the resultant particles. In particular, this will be assessed for mineral ores not only in terms of the resulting size but also based on the extent of surface exposure of the mineral grains. With this, textural properties of the material, the comminution method used, the reduction in size, and the extent of liberation of valuable material can be linked to one another. The use of X-ray micro tomography (micro-CT) will allow a 3D mapping of minerals in rocks before, during, and after comminution; this is not a trivial task, requiring the development of bespoke sample holders and advanced image processing and data analysis techniques. These 3D mineral maps will allow us to accurately determine the mineral liberation that is obtained when treating ores of different mineral texture and with various methods to reduce the particle size. Ultimately, this information will be used to tackle low efficiencies in comminution and flotation circuits, achieving a new understanding of this process both at the particle scale and the unit process scale that can inform strategies for substantial energy reduction in the process.

Developing, Testing and Implementing an Automated Defacing Workflow for Magnetic Resonance Imaging Research Data.#

  • Project code: jase-026

  • Main supervisor: Jan Sedlacik, j.sedlacik@imperial.ac.uk, Institute of Clinical Sciences, Imperial College London

  • Second supervisor: Carlos Cueto, c.cueto@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project does not accept multiple students.

Increased MRI scan quality and the capability of facial recognition software to search the internet make it necessary to remove facial features from neurological research scans to better protect subject privacy when sharing MRI data.

The aim of this project is to develop an automated workflow for defacing MRI scans using Python. The workflow needs to work with single- and multi-frame DICOM files and without writing temporary interim data, e.g. Neuroimaging Informatics Technology Initiative (NIfTI) formatted data, to the file system. The code should be implemented as a locally running command line script operating on an image series folder on the file system. Furthermore, the code should be implemented as a pipeline on the eXtensible Neuroimaging Archive Toolkit (XNAT) imaging data platform. The defaced DICOM data should then be archived as a separate image series on XNAT. The workflow should be optimised with respect to low computational and memory demands, since the XNAT archive server, as well as the local desktop PCs, have limited hardware specifications. Furthermore, it needs to be investigated which types of MRI scans (2D, 3D, structural, functional) have sufficient facial information for possible deidentification and, therefore, will need to be defaced. The defaced data needs to be checked and tested for the successful removal of facial features and for whether defacing affects the subsequent structural or functional analysis of the image data.
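A very small sketch of the core defacing step, assuming a single-frame DICOM slice and a precomputed boolean face mask, is shown below with pydicom; handling multi-frame DICOM, mask generation and the XNAT pipeline are the actual substance of the project.

```python
import numpy as np
import pydicom

def deface_slice(dicom_path, face_mask, out_path):
    """Blank out voxels covered by a boolean face mask and save a new DICOM file."""
    ds = pydicom.dcmread(dicom_path)
    pixels = ds.pixel_array.copy()
    pixels[face_mask] = 0
    ds.PixelData = pixels.astype(ds.pixel_array.dtype).tobytes()
    ds.save_as(out_path)

# Illustrative mask for a 256x256 slice (real masks would come from registration
# to a face atlas or a learned segmentation).
mask = np.zeros((256, 256), dtype=bool)
mask[180:, :80] = True
# deface_slice("series/slice_001.dcm", mask, "defaced/slice_001.dcm")
```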

The student is expected to work independently but in close communication with the supervisors. The student should conduct a thorough literature review and test all available and potentially useful defacing tools using anonymised image data scanned at our facility. The literature review and writing of the final report can be done remotely. However, testing and implementing the workflow needs to be done on site, since the image data is not available remotely.

Further Reading: DOI:10.1002/alz.093968 DOI:10.48550/arXiv.2205.15536 DOI:10.1101/2023.05.15.23289995 DOI:10.3389/fpsyt.2021.617997 DOI:10.1007/s10334-024-01170-x

Software Examples: poldracklab/pydeface neurolabusc/mydeface https://surfer.nmr.mgh.harvard.edu/fswiki/mri_deface https://neuroimaging-cookbook.github.io/recipes/pydeface_recipe/ https://wiki.xnat.org/xnat-tools/face-masking https://mri-defacing-pipeline.readthedocs.io/en/stable/index.html https://pypi.org/project/deepdefacer/

AI-Powered Conservation: Leveraging Open Source Geospatial Data for Biodiversity, Climate, and Sustainable Development Solutions#

  • Project code: misi-046

  • Main supervisor: Minerva Singh, minerva.singh07@imperial.ac.uk, Centre for Environmental Policy, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML

  • This project may accept multiple students.

The increasing availability of open-source geospatial data and advancements in AI offer unprecedented opportunities to address global conservation challenges. This project aims to develop an AI-driven conservation platform that harnesses open geospatial data to tackle diverse ecological and environmental issues. The platform will integrate data processing pipelines, advanced machine learning models, and visualization tools to enable students and researchers to work on unique, impactful themes.

The project will encompass a wide range of strands, allowing students to pursue specific areas of interest. These themes include:

  • Biodiversity Monitoring: Using AI to analyze satellite imagery for habitat mapping, species distribution, and wildlife corridor identification. Automating the detection of deforestation, illegal logging, or habitat fragmentation.

  • Climate Change Analysis: Developing models to quantify the carbon sequestration potential of forests or wetlands using aerial imagery and LiDAR data. Predicting the impact of climate change on ecosystems by integrating temperature, precipitation, and land-use datasets.

  • Disaster Resilience and Ecosystem Services: Mapping flood-prone areas or wildfire risks using geospatial data and AI-based prediction models. Assessing ecosystem services such as water purification, pollination, and soil health.

By developing modular AI models and open-source tools, this project empowers students to specialize in different strands while contributing to the overarching goal of data-driven conservation. It aims to provide scalable, replicable solutions for global challenges, fostering interdisciplinary collaboration and real-world impact.

Development of a Scoring System for Predicting Pressure Ulcers and Surgical-Site Infections Using Publicly Available Data#

  • Project code: miso-012

  • Main supervisor: Mikael Sodergren, m.sodergren@imperial.ac.uk, Department of Surgery and Cancer, Imperial College London

  • Second supervisor: Carlos Cueto, c.cueto@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Pressure ulcers (PUs) and surgical-site infections (SSIs) are among the most common hospital-acquired conditions, causing significant morbidity, prolonged hospital stays, and increased healthcare costs. Pressure ulcers, particularly affecting immobile or bed-bound patients, can lead to severe complications such as infections, sepsis, and even death. Similarly, SSIs, which occur in 2-5% of surgical patients, pose a considerable challenge for healthcare systems, accounting for nearly 20% of all hospital-acquired infections. The prevention of both conditions relies heavily on early identification of at-risk patients, yet current clinical tools for predicting these outcomes are limited. Risk factors such as patient demographics, comorbidities, surgical complexity, and postoperative care contribute to the likelihood of developing PUs or SSIs, but effective and comprehensive scoring systems remain underdeveloped. Publicly available healthcare datasets, including data from organisations like the National Health Service (NHS) in the UK or Centers for Medicare & Medicaid Services (CMS) in the USA, offer an opportunity to analyse large patient populations and identify key predictive factors. By leveraging machine learning techniques and statistical analysis, this project aims to create a scoring system that identifies patients at high risk of developing PUs or SSIs. This system could support clinical decision-making and targeted interventions, ultimately reducing the incidence and severity of these preventable complications.
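As a hedged baseline for the kind of scoring system described, the sketch below fits a logistic regression on synthetic data and reads its coefficients as the basis of an additive score; the feature names and effect sizes are invented for illustration and do not reflect NHS or CMS data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.integers(18, 95, n),      # age
    rng.integers(0, 2, n),        # diabetes (0/1)
    rng.integers(0, 2, n),        # immobile / bed-bound (0/1)
    rng.normal(120, 40, n),       # operation length (minutes)
])
logit = -8 + 0.05 * X[:, 0] + 0.8 * X[:, 1] + 1.2 * X[:, 2] + 0.01 * X[:, 3]
y = rng.random(n) < 1 / (1 + np.exp(-logit))   # synthetic PU/SSI outcome

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
print("coefficients (basis for score points):", np.round(clf.coef_, 3))
```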

Analysis of the UK Medical Cannabis Registry#

  • Project code: miso-013

  • Main supervisor: Mikael Sodergren, m.sodergren@imperial.ac.uk, Department of Surgery and Cancer, Imperial College London

  • Second supervisor: Carlos Cueto, c.cueto@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project does not accept multiple students.

In 2018, cannabis-based medicinal products (CBMPs) were rescheduled in the United Kingdom (UK) under the Misuse of Drugs Regulations 2001 (1). The National Institute for Health and Care Excellence (NICE) has provided guidance to support the utilisation of licensed CBMPs (nabilone, Epidyolex® and Sativex®) in the setting of chemotherapy-induced nausea and vomiting, treatment-resistant epilepsy in Dravet and Lennox-Gastaut Syndromes, and multiple-sclerosis-associated spasticity (1,2). However, it did not provide any recommendation regarding the prescribing of unlicensed CBMPs (1,2). Nonetheless, for conditions where sufficient prior clinical evidence exists, unlicensed CBMPs can be initiated by members of the General Medical Council’s Specialist Register for patients who have failed to achieve a satisfactory clinical response to licensed therapies (1). These conditions comprise an array of pain, psychiatric, gastrointestinal, neurological, and dermatological conditions, in addition to the symptoms of cancer and/or its treatment (3-10).

CBMPs are derived from species of the cannabis plant, which have previously been described as containing a ‘pharmacological treasure trove’ of active pharmaceutical ingredients (11). The major compounds contained within CBMPs are the cannabinoids cannabidiol (CBD) and (–)-trans-Δ9-tetrahydrocannabinol (Δ9-THC). However, there are potentially over 600 active pharmaceutical ingredients contained within the cannabis plant (12,13). Hence, the pharmacology and resultant effects of cannabis, and products made thereof, may vary between different chemovars of cannabis according to the concentrations and interactions of these compounds (12,13).

However, there is a paucity of high-quality research detailing the efficacy of CBMPs. This is in the context of significant challenges in conducting clinical trials incorporating CBMPs (14). A key challenge is the vast heterogeneity of available CBMPs (14). Considering the potentially 600 active pharmaceutical ingredients which may be found within the flowering head of the cannabis plant, each at varying concentrations due to underlying plant genetics, the environment in which the plant is grown, and how it is manufactured into a medicinal preparation, there is a potentially limitless spectrum of CBMPs. At present there have been only incomplete attempts to map the effects of specific types of CBMPs on the symptoms for which they are prescribed. Without this information, the choice of a specific CBMP for a patient is largely dictated by anecdotal experience and rudimentary characterisation of their physical and chemical properties.

The UK Medical Cannabis Registry was developed in 2019 to help address the current lack of evidence to guide the prescribing of CBMPs. Since then it has recruited in excess of 30,000 patients who have been prescribed unlicensed CBMPs under the supervision of consultant physicians. From this data, more than 20 studies have been published in peer-reviewed scientific journals detailing rudimentary analyses in conditions such as chronic pain and anxiety, among others (15-18). Considering this rich data source, the next phase of studies arising from the UK Medical Cannabis Registry will seek to conduct a much more thorough evaluation of the patient and medication factors which may predict or determine a specific clinical response.

We would use this multi-dimensional dataset to apply suitable machine learning algorithms to profile different therapies and other input factors in relation to clinical outcome data (clinical efficacy measures, patient-reported outcome measures, and adverse events). The outcomes would not only have clinical relevance but would also be valuable for policy makers and regulators in ongoing evaluations of providing more of these therapies on the NHS.

Developing a Retrieval System for Tracking the Long-Term Impact of Corporate Sustainability Initiatives#

  • Project code: llsp-161

  • Main supervisor: Llohann Dallagnol Speranca, ldallagn@ic.ac.uk, The Leonardo Centre on Business for Society, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

Identifying the long-term impact of corporate sustainability initiatives at scale is a challenging task, yet it is key for developing sustainability strategies. This project aims to design a retrieval system that processes past and future Corporate Social Responsibility (CSR) reports to extract relevant information regarding specific sustainability initiatives. The goal is to enable a structured approach to tracking corporate sustainability actions and assessing their effectiveness.

The Leonardo Centre has a leading global repository of corporate sustainability initiatives. It has extracted and categorized around 1 million sustainability actions from around 12,000 publicly listed companies. This dataset provides researchers with a novel level of granularity to analyze corporate sustainability strategies. In addition, the Centre has developed a data lake with nearly 300,000 Corporate Sustainability Reports.

The student will develop a retrieval pipeline using NLP and information extraction techniques. The system will ingest company sustainability initiatives and extract relevant details from CSR reports over time. Techniques may include Named Entity Recognition (NER), information retrieval algorithms, and vector-based similarity models. The project will leverage Leonardo Centre’s database of sustainability reports.

Key responsibilities:

  • Develop a retrieval pipeline capable of extracting information from CSR reports.

  • Develop a benchmarking framework for evaluating the accuracy and relevance of the extracted data.

  • Produce a research report detailing the findings and potential applications of the methodology.
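By way of illustration only, the sketch below shows a TF-IDF retrieval baseline over a handful of invented report snippets; real inputs would be passages from the Centre's CSR data lake, and the approach would likely move on to NER and embedding-based models.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

passages = [
    "In 2021 we launched a reforestation programme planting 2 million trees.",
    "Our factories switched to 100% renewable electricity in 2022.",
    "We report continued progress on the reforestation programme started in 2021.",
]
query = "reforestation programme long-term progress"

vec = TfidfVectorizer(stop_words="english")
doc_matrix = vec.fit_transform(passages)
query_vec = vec.transform([query])

scores = cosine_similarity(query_vec, doc_matrix).ravel()
for score, text in sorted(zip(scores, passages), reverse=True):
    print(f"{score:.2f}  {text}")
```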

Developing an NLP-Powered Name-Matching System for Corporate Sustainability Data Integration#

  • Project code: llsp-158

  • Main supervisor: Llohann Dallagnol Speranca, ldallagn@ic.ac.uk, The Leonardo Centre on Business for Society, Imperial College London

  • Second supervisor: Rossella Arcucci, r.arcucci@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Companies are often referred to by different names across various datasets, reports, and documents. This inconsistency makes it difficult to merge, compare, and analyze corporate data across datasets. The challenge lies in developing an effective name-matching algorithm that can standardize company names across different sources while minimizing false positives and negatives. A robust company name-matching system has a broad range of applications, including in banks, research institutions, and regulatory agencies for compliance monitoring.

The student will develop a company name-matching algorithm using Natural Language Processing (NLP), fuzzy matching and possibly machine learning techniques. Methods may include edit distance metrics (Levenshtein distance), phonetic algorithms (Soundex, Metaphone), and machine learning models trained on company name variations. The student will be given access to curated datasets with matching and (non-trivially) non-matching company name pairs. The project will involve the development of the pipeline and performance benchmarking against existing methods.
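A minimal pipeline of the kind described, using only the standard library, might normalise legal suffixes and score pairs with an edit-based similarity before moving to Levenshtein, phonetic or learned matchers; the suffix list, regexes and example names below are illustrative assumptions.

```python
import re
from difflib import SequenceMatcher

SUFFIXES = r"\b(incorporated|inc|corporation|corp|limited|ltd|plc|llc|gmbh|co)\b\.?"

def normalise(name: str) -> str:
    """Lower-case, strip common legal suffixes and punctuation."""
    name = re.sub(SUFFIXES, "", name.lower())
    return re.sub(r"[^a-z0-9 ]", " ", name).strip()

def similarity(a: str, b: str) -> float:
    """Edit-based similarity in [0, 1] on normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()

print(similarity("Acme Corp.", "ACME Corporation"))      # high
print(similarity("Acme Corp.", "Apex Materials Ltd"))    # low
```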

Building an Agent-Based Model to Assess the Maturity of Corporate Sustainability Initiatives#

  • Project code: llsp-163

  • Main supervisor: Llohann Dallagnol Speranca, ldallagn@ic.ac.uk, The Leonardo Centre on Business for Society, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: EDSML GEMS

  • This project does not accept multiple students.

Sustainability initiatives vary in complexity and impact. The Leonardo Centre co-develops a unique dataset with around 1 million of them and has defined a Business Impact Maturity Model to assess corporate actions. The student is invited to develop agentic best practices to rate sustainability initiatives according to this model. The agentic routine will then be used to explore a few use cases for face validity.

The Leonardo Centre has a leading global repository of corporate sustainability initiatives. It has extracted and categorized around 1 million sustainability actions from around 12,000 publicly listed companies. This dataset provides researchers with a novel level of granularity to analyze corporate sustainability strategies. In addition, drawing on its leading business research expertise, the Centre has developed a behavioural model to assess the maturity and impact of initiatives.

The student will implement an agentic scoring workflow using the Centre’s methodology. The approach will involve analyzing sustainability initiatives from Leonardo Centre’s database, assigning maturity scores, and validating the model against expert assessments. Techniques such as reinforcement learning or decision tree-based classification may be explored.

AI-Driven Analysis of Longitudinal Mental Health Trajectories in Cancer Patients#

  • Project code: bosu-038

  • Main supervisor: Bowen Su, b.su@imperial.ac.uk, Cancer Research UK Convergence Science Centre, Imperial College London

  • Second supervisor: Parastoo Salah, p.salah@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project does not accept multiple students.

Background and Rationale: Head, neck, and neuro-oncological cancers impose significant physical and mental health challenges due to aggressive treatments such as radiotherapy, chemotherapy, and immunotherapy. Existing studies largely overlook how anxiety and depression evolve longitudinally during pre-, peri-, and post-treatment phases, especially across diverse populations. Using the All of Us dataset, this study leverages AI and machine learning to investigate mental health trajectories, focusing on socioeconomic disparities, treatment-related stressors, and genetic predispositions.

Methodology: This retrospective observational study utilizes the All of Us dataset, incorporating GAD-7 and PHQ-9 scores, electronic health records (EHRs), and socioeconomic data to assess mental health changes over time. Advanced AI models, including longitudinal mixed-effects models (LME), time-to-event analyses, and Bayesian frameworks, will be applied to:

  • Predict mental health trajectories during treatment.

  • Identify vulnerable subgroups using machine learning clustering algorithms.

  • Explore gene-environment interactions using genomic data and predictive modelling.

Key Variables:

  • Mental Health Metrics: Anxiety (GAD-7) and depression (PHQ-9).

  • Treatment Data: Radiotherapy schedules, surgical procedures, and chemotherapy regimens.

  • Socioeconomic Factors: Income, education, and employment status.

  • Genetic Data: Markers for anxiety and depression, if available.

Expected Outcomes: AI and machine learning will reveal how SES, treatment types, and genetic predispositions influence anxiety, depression, and quality of life. These insights will identify high-risk groups and inform personalized interventions to optimize mental health care in cancer treatment. This study’s novel application of AI methods on a diverse dataset bridges gaps in psycho-oncology, addressing critical health equity issues and enabling precision care in vulnerable populations.
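A toy version of the longitudinal mixed-effects component, on synthetic data with invented variable names (not the All of Us schema), could look like the following with statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_patients, n_visits = 200, 5
df = pd.DataFrame({
    "patient": np.repeat(np.arange(n_patients), n_visits),
    "months_since_diagnosis": np.tile(np.arange(n_visits) * 3, n_patients),
    "radiotherapy": np.repeat(rng.integers(0, 2, n_patients), n_visits),
})
df["gad7"] = (
    6 + 0.4 * df["months_since_diagnosis"] + 2.0 * df["radiotherapy"]
    + np.repeat(rng.normal(0, 2, n_patients), n_visits)   # per-patient random effect
    + rng.normal(0, 1.5, len(df))                          # visit-level noise
)

# Random-intercept model of anxiety trajectories over treatment time.
model = smf.mixedlm("gad7 ~ months_since_diagnosis + radiotherapy",
                    df, groups=df["patient"]).fit()
print(model.summary())
```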

Multilevel Analysis of Comorbidities, Treatment Outcomes, and Survivorship in Ethnic Minority American Cancer Patients#

  • Project code: bosu-039

  • Main supervisor: Bowen Su, b.su@imperial.ac.uk, Cancer Research UK Convergence Science Centre, Imperial College London

  • Second supervisor: Parastoo Salah, p.salah@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Background and Rationale: Ethnic Minority Americans diagnosed with breast or prostate cancer under 50 often face adverse survivorship outcomes due to complex interactions among baseline comorbidities, lifestyle behaviours, and treatment modalities. Conventional models relying on clinical and demographic data provide limited insights into these disparities. Leveraging the All of Us dataset, this study applies AI and machine learning techniques to examine these multifactorial pathways and identify modifiable risks, with a focus on underserved populations.

Objectives:

  • Quantify the contributions of comorbidities, lifestyle behaviours, and treatments to survivorship.

  • Explore the role of genomic, transcriptomic, and proteomic data in mediating these relationships.

  • Incorporate geospatial and environmental variables to evaluate neighbourhood-level impacts on survivorship.

  • Use machine learning (ML) and natural language processing (NLP) to uncover latent subgroups and improve risk stratification.

  • Conduct health economic evaluations to assess the cost-effectiveness of interventions.

Methods: This retrospective cohort study utilizes longitudinal data from All of Us, including EHRs, multi-omics, and geospatial metrics. Key variables include comorbidities (e.g., diabetes, hypertension), treatments (e.g., chemotherapy, radiotherapy), and unstructured clinical data extracted via NLP. ML models (e.g., random forests, gradient boosting) will identify nonlinear interactions and latent subgroups, while structural equation modelling (SEM) and mediation analyses will clarify causal pathways. Bayesian models will assess uncertainty in multi-omics interactions.

Expected Outcomes: This study aims to define high-risk profiles and actionable pathways for Ethnic Minority American cancer survivors. Findings will inform individualized interventions, improve survivorship guidelines, and address long-standing health inequities, while providing evidence for cost-effective resource allocation in oncology care.

Machine learning-based image segmentation for analyzing interfacial fluid behavior in porous rocks#

  • Project code: atva-077

  • Main supervisor: Atefeh Vafaie, a.vafaie@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Benoit Cordonnier, benoit.cordonnier@esrf.fr, European Synchrotron Radiation Facility, France

  • Available to: EDSML GEMS

  • This project may accept multiple students.

A comprehensive understanding of flow and transport phenomena in porous rock is critical for a wide range of geoenergy applications. This behavior is largely governed by the microstructure of the rock, which is naturally heterogeneous and comprises a complex distribution and morphology of pores and solid grains. Breakthroughs in micro X-ray imaging of rocks have significantly enhanced our ability to investigate pore-scale phenomena. One of these mechanisms is chaotic fluid mixing. We have recently conducted a state-of-the-art experiment comprising real-time imaging of the co-injection of two miscible fluids into rock samples at the European Synchrotron Radiation Facility (ESRF) to directly visualize chaotic mixing in rock for the first time. A key step in the image processing is the segmentation of the two fluid phases within the pore space while effectively masking out the solid grains. However, traditional phase retrieval techniques struggle with heterogeneous, high-contrast, or multiphase systems, such as those found in natural rocks. We propose a project to enhance phase retrieval and segmentation algorithms to improve the accuracy of fluid behavior analysis in porous media. We will achieve this by integrating deep learning models with iterative phase retrieval methods to refine fluid interface visualization while excluding solid grains. Additionally, we will incorporate the so-called PagoPag technique, developed at the ESRF, which reconstructs phase information from a single defocused intensity image by optimizing phase shift-to-absorption ratios. The anticipated outcomes of this research include (1) a machine-learning tool for precise multi-phase fluid segmentation, (2) a well-documented codebase to support future studies in this field, and (3) high-resolution segmented images that reveal insights into chaotic fluid behavior in rocks, with broad implications for multiphase flow in porous materials.

Impact of Environmental Change on Biogeochemical Cycles#

  • Project code: dowe-030

  • Main supervisor: Dominik Weiss, d.weiss@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Kerry Gallagher, kerry.gallagher@univ-rennes.fr, University of Rennes, France

  • Available to: ACSE EDSML

  • This project may accept multiple students.

We are interested in understanding how environmental change affects biogeochemical cycles at global and at microscopic levels. We have a special interest in the reconstruction of past trace element and mineral dust deposition and in predicting the effect of sea level rise and ocean acidification on biogeochemical cycles and processes.

Three different projects we offer are

  1. Mapping the effect of soil salinity on biogeochemical processes in coastal and low-lying flood soils

  2. Testing connectivity between global mineral dust deposition since the last glaciation

  3. Identifying chemical composition of natural aerosols using machine learning

If you are interested, contact me well in advance so we can tailor the project to your interests. Projects will develop computational solutions from scratch, substantially extend the capabilities of an existing code, or build a model to analyse and interpret experimental data sets from our group or the literature.

Micronutrient and Pollutant Dynamics in natural and engineered systems#

  • Project code: dowe-029

  • Main supervisor: Dominik Weiss, d.weiss@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Mengli Chen, mengli.chen@nus.edu.sg, National University of Singapore, Singapore

  • Available to: ACSE EDSML

  • This project may accept multiple students.

We are interested in understanding processes that control micronutrient and pollutant dynamics in the environment. We have a special interest in persistent pollutants such as lead and PFAS and in the role that organic ligands play in their acquisition and mobility.
Possible projects include:

  1. Identifying lead hotspots using machine learning

  2. Mapping and identifying global atmospheric lead contamination

  3. Testing ligand exchange theory as the dominant mechanism for multiple siderophore excretion in plants

  4. Identifying the structural controls of uranium-siderophore stability in alkaline solutions

If you are interested, please contact me well in advance so we can tailor the project to your interests. Projects will develop computational solutions from scratch, substantially extend the capabilities of an existing code, or build a model to analyse and interpret experimental data sets from our group or the literature.

Sustainable Water Treatment#

  • Project code: dowe-028

  • Main supervisor: Dominik Weiss, d.weiss@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Jay Bullen, jbullen@anglianwater.co.uk, Anglian Water

  • Available to: ACSE EDSML

  • This project may accept multiple students.

Our group is in the process of creating an Imperial-based spin-off which is developing new solutions for the sustainable treatment of contaminated water. There are different problems we address, and we are interested in applicants with an interest in aqueous chemistry and environmental engineering. We offer four distinct projects:

  1. Development of open-source software for the analysis of experimental kinetic reaction data

  2. Cost benefit calculations for sustainable phosphorous and arsenic treatment using novel resins developed at Imperial College London

  3. Improvement of statistical data analysis in water treatment

  4. Development of revised pseudo rate equations for the kinetic processes of adsorption and photo-oxidation

If you are interested, please contact us well in advance so we can tailor the project to your interests.

Modelling stable isotope fractionation in the context of carbon storage#

  • Project code: dowe-118

  • Main supervisor: Dominik Weiss, d.weiss@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project does not accept multiple students.

We are studying geochemical reactions in the context of carbon sequestration. We conduct experiments in the lab and use isotopes to quantify relevant processes. We now want to develop an isotope fractionation model to test and confirm our experimental data. To this end, the student would develop the relevant equations, code them in Python and then compare theoretical calculations with experimental data. This work builds on previous MSc projects.
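As a starting point, the classical Rayleigh fractionation relation can be coded in a few lines; the fractionation factor and the single-reservoir assumption below are placeholders, and the project would replace them with the equations appropriate to the actual reactions being studied.

```python
import numpy as np

delta0 = 0.0      # initial isotope composition of the reservoir (per mil)
alpha = 0.998     # assumed product-reservoir fractionation factor
f = np.linspace(1.0, 0.05, 100)       # fraction of the reservoir remaining

# Rayleigh distillation: delta_reservoir ~ delta0 + 1000*(alpha - 1)*ln(f)
delta_reservoir = delta0 + 1000.0 * (alpha - 1.0) * np.log(f)
delta_product = delta_reservoir + 1000.0 * (alpha - 1.0)   # instantaneous product

print(delta_reservoir[-1], delta_product[-1])
```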

Rapid modelling of groundwater sourced heating and cooling (GWHC) systems using Machine Learning#

  • Project code: gewe-125

  • Main supervisor: Gege Wen, g.wen@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Matthew Jackson, m.d.jackson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Open-loop groundwater sourced heating and cooling (GWHC) systems are used to provide sustainable, low carbon heating and cooling to large buildings such as office blocks, hospitals and museums. Groundwater at approximately fixed temperature is pumped from an aquifer and used to provide heating and cooling via a heat pump. The warmed (in winter) and cooled (in summer) groundwater is re-injected into the aquifer, creating complex plumes of warm and cool water.
Numerical modelling of flow and heat transfer in the aquifer during storage and production is used to design the installation and assess its likely performance. However, numerical modelling is typically time consuming, requiring fine grid- or mesh-resolution and small time-steps. The time and effort available to run these simulations in practical deployments is usually short, so simulations are often run that are too coarse to be accurate, or do not capture the full range of design options and associated uncertainties.

This project will investigate the use of machine learning (ML) models to deliver fast but accurate simulations of GWHC. The project will build on earlier research in which a Graph Neural Network (GNN)-based ML approach, implemented on a purely data-driven basis, was developed to simulate the related technology of Aquifer Thermal Energy Storage (ATES). The ML proxy is trained using outputs from our in-house Imperial College Finite Element Reservoir Simulator (IC-FERST), an advanced code that uses dynamic mesh optimization to provide high solution accuracy at lower computational cost. The practical consequence is that the mesh changes between solution snapshots used for training. Conventional Convolutional Neural Network (CNN)-based ML models require a fixed mesh. This project will test and extend the application of the GNN approach to model GWHC systems.
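Purely as an illustration of the kind of message-passing update used by mesh-based GNN surrogates, the sketch below implements one aggregation step in plain PyTorch; the layer design, feature sizes, and toy graph are assumptions for demonstration and not the project's actual model.

```python
# Minimal sketch of one GNN message-passing layer over an unstructured mesh
# (plain PyTorch; no project data or trained model is assumed).
import torch
import torch.nn as nn

class MeshMessagePassing(nn.Module):
    """One round of neighbour aggregation followed by a node-wise update."""
    def __init__(self, n_features, n_hidden):
        super().__init__()
        self.message = nn.Linear(2 * n_features, n_hidden)
        self.update = nn.Linear(n_features + n_hidden, n_features)

    def forward(self, x, edge_index):
        # x: (n_nodes, n_features) nodal fields, e.g. temperature and head.
        # edge_index: (2, n_edges) mesh connectivity as (source, target) pairs.
        src, dst = edge_index
        msgs = torch.relu(self.message(torch.cat([x[src], x[dst]], dim=-1)))
        agg = torch.zeros(x.size(0), msgs.size(-1), device=x.device)
        agg.index_add_(0, dst, msgs)           # sum incoming messages at each node
        return x + self.update(torch.cat([x, agg], dim=-1))

# Toy usage on a 4-node graph; a different mesh can be supplied at every snapshot,
# which is what makes the graph formulation attractive for adaptive meshes.
x = torch.randn(4, 3)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
layer = MeshMessagePassing(n_features=3, n_hidden=8)
print(layer(x, edge_index).shape)  # torch.Size([4, 3])
```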

Particle Modelling using Neural Networks - AI4Particles#

  • Project code: jixi-184

  • Main supervisor: Jiansheng Xiang, j.xiang@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Christopher Pain, c.pain@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Our vision for AI4Particles is a general particle-based method that can model any particle system, from solid structures and fractured Discrete Element Modelling (DEM) structures to Monte Carlo radiation transport. The project builds on a new approach that solves for particles on a convolutional grid using a neural network with analytically defined weights [1]. The range of other applications is vast, including molecular dynamics, Smoothed Particle Hydrodynamics (SPH), Particle-in-Cell modelling, understanding turbulent mixing by seeding particles in flows, plankton in the oceans, and crowd behaviour. Particles are an example of complex systems with emergent dynamic behaviour, so the AI4Particles approach may be re-applied to systems modelling in general. We will form a focus team to help develop these other application areas and take advantage of the benefits of AI and associated workflows.

It is common to represent, or to discover, a system's dynamics in the form of a PDE, see [2], e.g. for forming a turbulence model closure. Particle systems could equally well be discovered together with the dynamics of interest, e.g. SPH particles or Lattice Boltzmann methods used to model fluids, and this work is a major step in that direction. Monte Carlo and DEM are developed as the extreme cases; the other examples are each closely related to one of these, so the approach is relatively universally applicable. Monte Carlo and DEM are also of particular interest in themselves, and other particle models can be developed from them. We also address the link to the continuum. The benefits of ANN-based solvers include not requiring training data, GPU-accelerated computation, enhanced programmability, platform interoperability, full differentiability, seamless multi-physics integration, and compatibility with surrogate models based on trained neural networks.

Projects on offer include: (1) Monte Carlo modelling of nuclear reactors, (2) SPH modelling, (3) molecular dynamics. The coding will be in PyTorch and Python.

[1] Naderi, Chen, Yang, Xiang, Heaney, Latham, Wang, Pain (2024) A discrete element solution method embedded within a Neural Network, Powder Technology 448: 120258. https://doi.org/10.1016/j.powtec.2024.120258.

[2] Chen, Heaney, Gomes, Matar, Pain (2024) Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries, Computer Methods in Applied Mechanics and Engineering 426: 116974. https://doi.org/10.1016/j.cma.2024.116974
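To give a flavour of the central idea in [1] and [2], namely expressing a discretised operator as a neural-network layer with analytically defined rather than trained weights, the generic sketch below writes a 1D explicit diffusion step as a fixed-weight convolution; it is deliberately simplified and is not the formulation used in those papers.

```python
# Minimal generic sketch: a 1D explicit diffusion step expressed as a convolution
# with analytically defined (untrained) weights. Parameters are illustrative.
import torch
import torch.nn.functional as F

dx, dt, kappa = 1.0, 0.1, 1.0
# Second-derivative stencil [1, -2, 1] / dx^2 as a fixed convolution kernel.
stencil = torch.tensor([[[1.0, -2.0, 1.0]]]) / dx**2

def diffusion_step(u):
    """One explicit Euler step of u_t = kappa * u_xx on a periodic 1D grid."""
    u_pad = torch.cat([u[:, :, -1:], u, u[:, :, :1]], dim=-1)  # periodic padding
    return u + dt * kappa * F.conv1d(u_pad, stencil)

# Gaussian initial condition on 64 grid points, diffused for 100 steps.
u = torch.exp(-0.5 * ((torch.arange(64.0) - 32.0) / 4.0) ** 2).view(1, 1, -1)
for _ in range(100):
    u = diffusion_step(u)
print(u.shape)  # torch.Size([1, 1, 64])
```

Because every weight is known analytically, no training data is needed, while the layer still runs on GPUs and remains fully differentiable, which are among the benefits listed above.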

Towards Trustworthy Chatbots: Multi-Strategy Indexing, Retrieval, and Ranking in RAG Using Reinforcement Learning#

  • Project code: weya-190

  • Main supervisor: Wenxia Yang, wenxia.yang@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor: Rhodri Nelson, rhodri.nelson@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Available to: ACSE

  • This project may accept multiple students.

Retrieval-Augmented Generation (RAG) is widely used in large language model (LLM)-based chatbots to answer questions that extend beyond the LLM’s training data. One of the key challenges in RAG systems is optimizing the retrieval process—ensuring that the most relevant information is extracted from the vector store to be used as context for generating responses. However, different document formats and content structures make it difficult to define optimal chunking and retrieval strategies through rule-based approaches. Additionally, selecting the most informative recalled chunks to generate high-quality answers remains an open challenge. This project aims to develop a multi-stage ranking system that integrates diverse chunking and retrieval strategies alongside a ranking model optimized using reinforcement learning (RL). The approach leverages LLM-based evaluation, where the model assesses the quality of RAG-generated responses and provides feedback for RL-based optimization. By iteratively refining retrieval strategies, this research seeks to improve the reliability and trustworthiness of chatbot-generated responses.
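A minimal sketch of the retrieve-then-rerank stage such a pipeline contains is shown below; the `embed` function is a hypothetical placeholder for any sentence-embedding model, and `rerank_score` stands in for the learned (e.g. RL-optimised) ranker that the project would develop.

```python
# Minimal sketch of retrieval followed by reranking in a RAG pipeline.
# `embed` and `rerank_score` are placeholders, not real project components.
import numpy as np

def embed(texts):
    # Placeholder: random vectors stand in for a real sentence-embedding model.
    rng = np.random.default_rng(len(texts))
    return rng.normal(size=(len(texts), 16))

def cosine(a, b):
    return (a @ b.T) / (np.linalg.norm(a, axis=1, keepdims=True) * np.linalg.norm(b, axis=1))

def retrieve(query, chunks, chunk_vecs, k=2):
    scores = cosine(embed([query]), chunk_vecs)[0]
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

def rerank_score(query, chunk):
    # Stand-in for a learned ranking model; here just keyword overlap.
    return len(set(query.lower().split()) & set(chunk.lower().split()))

chunks = ["RAG combines retrieval with generation.",
          "Chunking strategy affects recall.",
          "Reinforcement learning can tune the ranker."]
chunk_vecs = embed(chunks)

query = "How does chunking affect recall in RAG?"
candidates = retrieve(query, chunks, chunk_vecs)
best = max(candidates, key=lambda c: rerank_score(query, c))
print(best)
```

In the project, the heuristic reranker above would be replaced by a model whose parameters are refined from LLM-based feedback on answer quality.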

Dynamic Learning Path Recommendations via Knowledge Graphs Extracted from Lecture Content#

  • Project code: weya-192

  • Main supervisor: Wenxia Yang, wenxia.yang@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE

  • This project may accept multiple students.

Effective learning often requires structured guidance, with clear learning paths and well-defined milestones. This project aims to automatically generate personalized learning paths by extracting and analyzing latent knowledge graphs embedded within lecture content. Using natural language processing (NLP) techniques and large language models (LLMs), we will develop methods to mine implicit knowledge structures from course lectures. By mapping these concepts into a structured knowledge graph, we can dynamically recommend learning paths based on a user’s background, learning goals, and progress. This research will explore strategies for adapting learning paths in real-time, ensuring that learners receive targeted recommendations that help them progress efficiently while reinforcing prerequisite knowledge. The system could be particularly useful for self-paced learning environments, where students may have diverse starting points and knowledge gaps.
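As a small illustration of the final step, once prerequisite relations between concepts have been extracted, a first-cut learning path can be obtained by topologically sorting the concept graph and skipping what the learner already knows; the edges below are invented examples, not mined lecture content.

```python
# Minimal sketch: a learning path from a (hypothetical) concept-prerequisite graph.
import networkx as nx

# Hypothetical prerequisite edges: (prerequisite, dependent concept).
edges = [("linear algebra", "PCA"),
         ("probability", "Bayesian inference"),
         ("linear algebra", "neural networks"),
         ("probability", "neural networks"),
         ("neural networks", "transformers")]

graph = nx.DiGraph(edges)
known = {"linear algebra"}  # concepts the learner already has

path = [c for c in nx.topological_sort(graph) if c not in known]
print(path)  # a valid ordering, e.g. ['probability', 'PCA', 'Bayesian inference', ...]
```

In the project the extracted graph would be far richer, and the ordering would also account for the learner's goals and progress rather than prerequisites alone.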

Conversational Agents with Personalized Guidance: Generating Context-Aware Reasoning Steps Description#

  • Project code: weya-191

  • Main supervisor: Wenxia Yang, wenxia.yang@imperial.ac.uk, Department of Earth Science and Engineering, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE

  • This project may accept multiple students.

Effective explanations are tailored to an individual’s background and prior knowledge. For instance, explaining the word “intricate” to a software engineer might involve an analogy such as: “Optimizing the performance of an intricate system, such as a distributed database, requires fine-tuning the complex interactions between servers, caches, and network protocols.” This project will focus on designing a structured representation model for user profiles, incorporating factors such as background knowledge, learning history, and behavioral patterns. Using this personalized profile, we will apply prompt engineering techniques—particularly those based on Chain-of-Thought (CoT) reasoning—to generate customized reasoning steps. The goal is to create an intelligent conversational agent that dynamically adapts explanations to users’ needs, helping them overcome conceptual roadblocks in their learning journey. This research will explore methods for modeling user understanding and for dynamically adjusting explanations to maximize engagement and comprehension.
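As a minimal sketch, the snippet below shows one way a structured user profile could be turned into a chain-of-thought style prompt; the profile fields and the template are illustrative assumptions, not a finished design for the project.

```python
# Minimal sketch: assembling a CoT-style prompt from a (hypothetical) user profile.
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    background: str
    known_concepts: list = field(default_factory=list)
    goal: str = ""

def build_cot_prompt(profile: UserProfile, question: str) -> str:
    known = ", ".join(profile.known_concepts) or "none stated"
    return (
        f"The user is a {profile.background} who already understands: {known}.\n"
        f"Their goal is: {profile.goal}.\n"
        f"Question: {question}\n"
        "Explain step by step, anchoring each step in a concept the user already "
        "knows, before giving the final answer."
    )

profile = UserProfile(background="software engineer",
                      known_concepts=["distributed databases", "caching"],
                      goal="understand the word 'intricate' via a technical analogy")
print(build_cot_prompt(profile, "What does 'intricate' mean?"))
```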

Categorizing and Evaluating Corporate Sustainability Actions Using NLP and Impact Assessment#

  • Project code: mazo-165

  • Main supervisor: Maurizio Zollo, m.zollo@imperial.ac.uk, Business School, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: ACSE EDSML GEMS

  • This project may accept multiple students.

Together with a dataset of more than 1 million corporate sustainability initiatives, the Leonardo Centre developed a model for their maturity assessment, dividing the initiatives into five groups.

We seek a highly motivated student to support a project analysing the text of the initiatives in each group to understand the evolution, diversity, and impact of their actions. This will allow the five groups to be further divided into sub-categories with refined results.

The GOLDEN dataset, developed in collaboration with the Leonardo Centre on Business for Society, is the most comprehensive global repository of corporate sustainability initiatives. It consists of around 1 million sustainability actions extracted from more than 60,000 sustainability reports of over 12,000 publicly listed companies across more than 20 years. Machine learning algorithms identify and classify these initiatives based on the 17 Sustainable Development Goals (SDGs) and 14 behavioral components. In addition, drawing on its leading business research expertise, the Centre developed a behavioural model to assess the maturity and impact of initiatives.

The student will use NLP techniques, including topic modeling, text clustering, and sentiment analysis, to analyze sustainability initiatives from Leonardo Centre’s database, integrating financial, environmental, and social impact data for initiative assessment. A second desired output is a machine learning model that classifies initiatives into the identified sub-categories.
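As a small illustration of the text-clustering step, the sketch below groups initiative texts into candidate sub-categories with TF-IDF features and k-means; the example texts and number of clusters are invented and are not taken from the GOLDEN dataset.

```python
# Minimal sketch: clustering initiative texts into candidate sub-categories.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

initiatives = [
    "Installed rooftop solar panels across all manufacturing sites.",
    "Launched an employee volunteering programme with local schools.",
    "Switched the logistics fleet to electric vehicles.",
    "Donated to community health clinics in rural areas.",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(initiatives)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
for label, text in zip(labels, initiatives):
    print(label, text)
```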

Understanding the Evolution and Impact of Corporate Sustainability Actions Over Time through Text Analysis#

  • Project code: mazo-164

  • Main supervisor: Maurizio Zollo, m.zollo@imperial.ac.uk, Business School, Imperial College London

  • Second supervisor is not yet assigned.

  • Available to: EDSML GEMS

  • This project may accept multiple students.

Sustainability initiatives evolve over time, with varying levels of innovation and impact. This project will investigate the behavioral trends of sustainability initiatives, analyzing how different approaches (e.g., innovation-driven vs. philanthropic) impact global challenges.

The GOLDEN dataset, developed in collaboration with the Leonardo Centre on Business for Society, is the most comprehensive global repository of corporate sustainability initiatives. It consists of around 1 million sustainability actions extracted from more than 60,000 sustainability reports of over 12,000 publicly listed companies across more than 20 years. Machine learning algorithms identify and classify these initiatives based on the 17 Sustainable Development Goals (SDGs) and 14 behavioral components. This provides researchers with a novel level of granularity to analyze corporate sustainability strategies, track the evolution of sustainability initiatives across industries and geographies, and support evidence-based policymaking and investment decisions.

The student is invited to analyse the behavioural dimension over the years, focusing on a main SDG topic. That may consist of a time-series analysis of the behavioural changes, followed by text analysis and integration with other data for impact assessment. NLP methods such as clustering and topic modeling will be used to classify initiatives, while financial and sustainability impact data will be incorporated to assess outcomes.
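As a small illustration of the time-series step, the sketch below counts initiatives per behavioural component per year and looks at year-on-year change; the column names and values are invented and do not reflect the GOLDEN schema.

```python
# Minimal sketch: behavioural-component counts per year from a toy table.
import pandas as pd

df = pd.DataFrame({
    "year": [2015, 2015, 2016, 2016, 2016, 2017, 2017, 2017],
    "behavioural_component": ["innovation", "philanthropy", "innovation",
                              "innovation", "philanthropy", "philanthropy",
                              "innovation", "innovation"],
})

counts = (df.groupby(["year", "behavioural_component"])
            .size()
            .unstack(fill_value=0))
print(counts)               # initiatives per behavioural component per year
print(counts.pct_change())  # simple year-on-year trend measure
```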