Abstract
IMPRESSION Generation 2 (IMP-G2) is a Graph Transformer Network capable of simultaneously predicting NMR parameters, including chemical shifts and coupling constants, within milliseconds. It can predict δ¹H, δ¹³C, δ¹⁵N, δ¹⁷O, and δ¹⁹F chemical shifts, along with coupling constants for all combinations up to four bonds away (¹⁻⁴J(HCNOF–HCNOF)), achieving a level of performance previously unattained.
IMP-G2 is trained on an expansive dataset of molecules containing H, C, N, O, F, Cl, P, Br, S, and Si atoms, with molecular weights up to 500 Da. It delivers state-of-the-art accuracy for δ1H and δ13C chemical shift predictions. The thesis details the incremental expansion of the training dataset, architectural improvements to the model, and enhanced inference capabilities that enabled these milestones.
IMP-G2 represents the most robust machine learning model to date for NMR parameter prediction. It opens a new avenue for rapid generation of DFT-quality data, with the potential to replace density functional theory (DFT) methods for NMR prediction, reducing computation times from hours or days to milliseconds.
Date of Award | 4 Feb 2025 |
---|---|
Original language | English |
Awarding Institution | |
Sponsors | Genentech & Engineering and Physical Sciences Research Council |
Supervisor | Craig P Butts (Supervisor) |
Keywords
- AI
- machine learning
- DFT
- NMR
- Cheminformatics
- Structure Elucidation
- Graph Neural Network
- Graph Transformer Network
- Chemistry
- pharmaceutical
- Drug Discovery