Design, characterisation and applications of de novo proteins

  • Kathryn L Shelley

Student thesis: Doctoral Thesis, Doctor of Philosophy (PhD)

Abstract

The Woolfson lab uses simple, human-comprehensible sequence-structure relationships to direct the rational design of de novo protein structures. Previously the lab has mainly focussed upon the design of coiled coils, as the sequence-structure relationships underpinning this fold are very well understood. Accordingly, it is now possible (with a few limitations) to simply write down a sequence that will fold into a coiled-coil structure with the desired number and orientation of alpha-helices, including backbones not thus far observed in nature.

However, coiled coils are not the only protein fold that can be described by simple sequence-structure patterns. Past bioinformatics analyses, mainly carried out in the 1980s, 1990s and 2000s, found evidence across a wide range of protein folds of strong amino acid preferences for particular structural features. Owing to limitations in both the computational power and the experimental techniques available at the time, in many cases it remains untested whether these simple relationships encode sufficient information to reliably direct the design of new examples of these folds.

In this thesis, I present two projects that explore the boundaries and limitations, plus the benefits and drawbacks, of protein design with simple, human-comprehensible sequence-structure relationships. In the first of these, I attempt to use such relationships to guide the design of two previously well-characterised all-beta folds: (soluble) beta-sandwiches and transmembrane beta-barrels. After curating an updated dataset of natural examples of these structures from the Protein Data Bank, I use propensity calculations to identify amino acid preferences for different structural features in these folds. I then write a genetic algorithm that uses these propensity scores to generate new sequence designs. Lastly, I present the progress I have made towards experimentally characterising ten selected transmembrane beta-barrel designs.
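As a rough illustration of what a propensity calculation involves, the sketch below uses one common definition: the fraction of positions in a structural feature occupied by a given amino acid, divided by that amino acid's fraction across all positions in the dataset. The abstract does not state the exact scoring used in the thesis, so the function name `propensities`, the single-letter feature labels and the toy input are all assumptions made for illustration.

```python
from collections import Counter


def propensities(residues, labels, feature):
    """Return a dict mapping each amino acid to its propensity for `feature`.

    Assumed definition (not taken from the thesis):
        (fraction of `feature` positions that are aa)
        / (fraction of all positions that are aa)
    A value above 1 means the amino acid is enriched in the feature.
    """
    if len(residues) != len(labels):
        raise ValueError("residues and labels must have the same length")

    total = Counter(residues)
    in_feature = Counter(r for r, l in zip(residues, labels) if l == feature)

    n_all = len(residues)
    n_feature = sum(in_feature.values())
    if n_feature == 0:
        raise ValueError(f"no positions labelled {feature!r}")

    return {
        aa: (in_feature[aa] / n_feature) / (total[aa] / n_all)
        for aa in total
    }


# Hypothetical toy input: one label per residue,
# "E" = beta-strand position, "C" = any other position.
toy_residues = "VTVTGSGS"
toy_labels = "EEEECCCC"
toy_scores = propensities(toy_residues, toy_labels, "E")
print(toy_scores)
```

In this toy input, valine and threonine occur only at strand positions and so score above 1, whereas glycine and serine never occur there and score 0. A real dataset would need many more observations per amino acid before such ratios become meaningful.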

In contrast, the second project involves using an array of de novo alpha-helical coiled coils to identify different complex mixtures. The development of this biosensor has been a collaborative effort involving multiple students and postdocs in the Woolfson group; my role has been to develop the computational methods used to analyse the data output from the array. Here I describe how, following careful processing of the input data, I am able to train machine learning models to identify differences between subclasses of a range of sample types, including different brands of tea, plus urine and serum samples from healthy and diseased patients.

Together, these projects are a case study of some of the advantages and disadvantages of rational protein design. The former project explores the extent to which it is possible to design increasingly complex protein folds using simple, human-comprehensible sequence-structure relationships alone, whilst the latter demonstrates how our understanding of such relationships enables reliable fine-tuning of a design’s properties, plus how this degree of control can be exploited in a functional application.

Date of Award: 13 Oct 2022
Original language: English
Awarding Institution
  • University of Bristol
Supervisors: Dek N Woolfson (Supervisor) & Leo Brady (Supervisor)
