Abstract

In this thesis, neural networks are applied to fitting potential energy surfaces (PES) and to de novo drug design, to study the current suitability and effectiveness of these algorithms for different chemical problems.
The first goal was to fit the PES of a long hydrocarbon chain (30 carbons) reacting with a cyano radical (CN). The size of this system makes creating a training set computationally expensive and time consuming. Consequently, a ‘fragment-learning’ approach was employed. The training data set was constructed using hydrocarbons no larger than hexane reacting with CN, as this reduces the time required both to generate the data and to train the neural network. Using the software developed during this project, the fitted PES showed mean absolute errors within 10 kJ mol⁻¹ of the reference data. In addition, the prediction times were a couple of orders of magnitude faster than the reference electronic structure calculations. This result is encouraging because it demonstrates the transferability of neural network potentials for reactive systems.
The second goal was to study the ability of recurrent neural networks (RNNs) to generate new drug candidates. Initially, multiple techniques described in the literature, such as fine-tuning and reinforcement learning, were used to design new kinase inhibitors. From this first exploratory phase it became clear that the quality of the fine-tuning data set has a heavy impact on the results. Consequently, a deeper investigation of the process of fine-tuning RNNs for medicinal chemistry projects was carried out. The results suggest that RNNs should not be fine-tuned with fewer than 250-300 samples, although more are needed if the molecules in the data set are very diverse. This means that, in their current form, RNNs may not be the best tool for the early stages of de novo drug design projects, and further development is needed.
|Date of Award|29 Sep 2020|
|Sponsors|Centre for Doctoral Training in Theory and Modelling in Chemical Sciences (TMCS)|
|Supervisor|David Glowacki|
- Machine Learning
- Potential Energy Surfaces
- Drug Design