Frontiers in Graph Generation and Representation

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)

Abstract

Graph data is highly common and flexible, expressing everything from small molecules to billions of users in social networks, but only recently have Graph Neural Networks (GNNs) been developed to expressively learn from such data. Given the nascency of GNNs, the field has many fundamental frontiers, with more research gaps than answered questions. This thesis adopts an iterative approach to developing research questions, addressing extant frontiers in graph learning.

Our first research question evaluates whether GNNs can generate realistic social networks for synthetic datasets. We show that GNNs outperform rule-based models but are constrained by in-memory costs, reaching only hundreds of nodes where real networks are orders of magnitude larger. We address this in our second research question, developing a hierarchical factorisation and the HiGGs framework, which produces graphs orders of magnitude larger than any other GNN-based work.

In answering both questions, we identify that graph generator metrics are both insufficiently expressive and overly sensitive to parameter choices. More expressive metrics require multi-domain graph learners, leading to our third research question, where we develop the pre-training method ToP. Key findings show that pre-training without features enables consistently useful transfer learning, and that out-of-domain pre-training significantly outperforms in-domain pre-training. These findings contradict contemporary assumptions and open broad research avenues into Graph Foundation Models (GFMs).

Such findings merit investigation of why ToP pre-training works, but the field lacked methods to interrogate the information sources used by graph learners. To answer this final research question, we develop Noise-Noise Analysis, which measures the information balance between features and structure. We find that many GNNs are heavily feature-biased. Applied to ToP encoders, Noise-Noise Analysis shows that ToP pre-training induces a structural bias, which is lost when features are more useful downstream.

The combined findings constitute significant contributions to graph learning, presenting both methodological and empirical advances while identifying key frontiers for future research.
Date of Award: 9 Dec 2025
Original language: English
Awarding Institution
  • University of Bristol
Supervisors: Nirav Ajmeri & Telmo de Menezes e Silva Filho

Keywords

  • Artificial Intelligence
  • Machine Learning
  • Graph Neural Network
  • Deep Learning
  • Social Networks
  • Chemistry
  • Representation Learning
  • Generative Models
