Categorical Chemistry and Machine Learning for Pharmaceutical and Agrochemical Development

  • Gale, Ella M (Principal Investigator)

Project Details

Description

Key challenge: developing synthetic routes that contain novel reactions currently requires a human chemist who understands the ‘language’ of chemistry.

Retrosynthesis is the process of working out synthetic routes (methods to make a molecule) using either known or novel reactions, and it is critical in all pharmaceutical, fine chemical, agrochemical and materials research. To do retrosynthesis, chemists apply a series of ad hoc disconnection rules based on molecular partitions called synthons.

Chemical reactions follow rules that could be described as a language where synthons are word-stems, molecules are words, disconnection rules are a grammar, reactions are sentences and a synthetic route is prose.
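
The language analogy above can be illustrated with a minimal sketch. It is a toy lookup table, not the string-diagram formalism developed in this project: molecules are plain names rather than structures, and the `Disconnection` class, the `RULES` table, the `STARTING_MATERIALS` set and the `retrosynthesise` function are all hypothetical names introduced here for illustration. Each rule plays the role of a grammar rule, each applied rule is a reaction (a sentence), and the ordered list of reactions is the synthetic route (the prose).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disconnection:
    """A toy 'grammar rule': split a target molecule into its precursors."""
    name: str
    target: str
    precursors: tuple


# Hypothetical rule table for illustration; a real system would match
# substructures (synthons) rather than look up whole molecule names.
RULES = {
    "ethyl acetate": Disconnection(
        "ester C-O disconnection", "ethyl acetate", ("acetic acid", "ethanol")
    ),
    "acetic acid": Disconnection(
        "oxidation of a primary alcohol", "acetic acid", ("ethanol",)
    ),
}

# Molecules assumed to be available without further synthesis.
STARTING_MATERIALS = {"ethanol"}


def retrosynthesise(target):
    """Return the reactions needed to make `target`, earliest step first."""
    if target in STARTING_MATERIALS:
        return []
    rule = RULES.get(target)
    if rule is None:
        raise ValueError(f"no disconnection rule known for {target!r}")
    route = []
    for precursor in rule.precursors:
        # Work backwards through each precursor, skipping repeated steps.
        for step in retrosynthesise(precursor):
            if step not in route:
                route.append(step)
    route.append(rule)
    return route


route = retrosynthesise("ethyl acetate")
for step in route:
    print(" + ".join(step.precursors), "->", step.target, f"[{step.name}]")
```

A fixed table like this can only reproduce routes it already contains. Proposing novel reactions, which is the aim of the project, would need rules that operate on synthons and compose formally, which this sketch does not attempt.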

Category theory is a young field of mathematics which defines and describes the relations between mathematical concepts and systems in an abstract way: it is a mathematical system for encoding formal languages. Category theory has been successfully applied to linguistics, quantum computing, resource theory and database schemas, among other areas.

Until recently, there was no formal mathematical basis for describing synthons and disconnection rules. We have developed one using applied category theory [1, 2], specifically string diagrams [3, 4]; that is, we now have a formal encoding of the language of chemical reactions.

Aim: I propose to build a proof-of-principle retrosynthesis machine learning algorithm using our recently developed formalisation to demonstrate its potential.

By encoding a limited set of rules and reactions, we aim to demonstrate valid and complete synthetic routes, including novel reactions, in a subfield of chemistry.

Layman's description

By utilising the very new mathematics from category theory, we can make chemistry machine learning algorithms that are capable of creativity, allowing us to do drug development much faster.
Alternative title: Categorical Chemistry
Acronym: CatChem
Status: Active
Effective start/end date: 2/10/23 to 25/07/25
