Gemma Danks
Protein Folding with L-systems
PhD thesis, University of York, 2008

Abstract

Protein folding can be viewed as an emergent phenomenon: the global fold of a protein emerges from underlying local interactions. These interactions may be modelled using a rule-based approach. L-systems are sets of parallel rewriting rules and are widely used as a mathematical framework for modelling the growth and development of plants.
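As a minimal illustration of parallel rewriting, the sketch below implements Lindenmayer's classic algae system (A -> AB, B -> A). It is not taken from the thesis; the function name derive and the rule table ALGAE_RULES are chosen here purely for illustration.

```python
def derive(axiom, rules, steps):
    """Rewrite every symbol of the word simultaneously, `steps` times."""
    word = axiom
    for _ in range(steps):
        # Every replacement is looked up from the old word, so all symbols
        # are rewritten in parallel rather than one after another.
        # Symbols without a rule are copied unchanged.
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


# Lindenmayer's algae system: A -> AB, B -> A
ALGAE_RULES = {"A": "AB", "B": "A"}
```

Starting from the axiom "A", four derivation steps give derive("A", ALGAE_RULES, 4) == "ABAABABA". The word length follows the Fibonacci sequence, a global property that is not written into either local rule.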

This thesis presents a proof of concept of the application of L-systems to the protein folding problem. Parallel rewriting rules alter the local conformation of each amino acid residue in a polypeptide chain, leading to global conformational changes. Three different L-systems models of protein folding have been developed.

A physics-based model uses parallel rewriting rules that operate on torsion angles according to local interatomic interactions. This model leads to the emergence of global conformations with protein-like compactness.

A knowledge-based stochastic model uses parallel rewriting rules that operate on the secondary structure states of residues according to probabilities that are statistically derived from native protein structures. This model leads to the emergence of global conformations with protein-like secondary structure patterns.
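The idea of stochastic rewriting over secondary structure states can be sketched as follows. This is a simplified, context-free illustration only: the three-state alphabet (H for helix, E for strand, C for coil), the probability values in PLACEHOLDER_RULES and the function names are all assumptions made here, not the rules or statistics used in the thesis.

```python
import random

# Hypothetical rewriting probabilities: current state -> {next state: probability}.
# The numbers are placeholders, NOT values derived from native protein structures.
PLACEHOLDER_RULES = {
    "H": {"H": 0.80, "E": 0.05, "C": 0.15},
    "E": {"H": 0.05, "E": 0.80, "C": 0.15},
    "C": {"H": 0.20, "E": 0.20, "C": 0.60},
}


def stochastic_step(states, rules, rng):
    """Rewrite every residue's state in parallel, sampling from its rule."""
    new_states = []
    for state in states:
        options = rules[state]
        # Draw one successor state according to the rule's probabilities.
        successor = rng.choices(list(options), weights=list(options.values()))[0]
        new_states.append(successor)
    return "".join(new_states)


def derive_stochastic(states, rules, steps, seed=0):
    """Apply `steps` stochastic derivation steps with a seeded generator."""
    rng = random.Random(seed)
    for _ in range(steps):
        states = stochastic_step(states, rules, rng)
    return states
```

For example, derive_stochastic("C" * 20, PLACEHOLDER_RULES, 10) returns a 20-residue string over H, E and C. Fixing the seed makes a run reproducible, which helps when comparing rule sets.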

A third model combines physics and knowledge to give an adaptive stochastic L-systems model of protein folding. Probabilities of rewriting secondary structure states are dynamically altered at each derivation step according to local interatomic forces. This model leads to protein-like convergence to a preferred global conformation.
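One possible way to alter rewriting probabilities at each derivation step is sketched below. The abstract does not state how forces are mapped to probabilities, so the exponential down-weighting used here is an assumption, as are the function name adapt_probabilities, the per-state penalty scores and the strength parameter.

```python
import math


def adapt_probabilities(base, penalty, strength=1.0):
    """Return rewriting probabilities adjusted by a local penalty score.

    base     -- {state: probability} before adaptation
    penalty  -- {state: non-negative score}; a hypothetical stand-in for
                an unfavourable local interatomic force
    strength -- hypothetical scaling factor for the penalty
    """
    # Assumed scheme: scale each probability by exp(-strength * penalty).
    weights = {
        state: p * math.exp(-strength * penalty.get(state, 0.0))
        for state, p in base.items()
    }
    # Renormalise so the adapted probabilities sum to one.
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}
```

Calling adapt_probabilities({"H": 0.5, "E": 0.3, "C": 0.2}, {"H": 2.0}) lowers the probability of rewriting to H and raises the other two in proportion. With an empty penalty table the base probabilities are returned unchanged.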

The physics-based, knowledge-based and combined models have been developed further to model the sequential growth of a polypeptide and the simultaneous folding of the partially formed chain. In the combined model, this leads certain residues to converge to different secondary structure preferences. L-systems provide a natural framework for modelling cotranslational protein folding.
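The interleaving of chain growth and folding can be sketched as a loop that adds one residue and then applies one parallel rewriting step to the partial chain. Everything in this sketch is hypothetical: the function names, the choice of one rewriting step per added residue, and the toy rule, which merely stands in for any of the three models.

```python
def cotranslational_derive(sequence, rewrite_step, initial_state="C"):
    """Grow the chain residue by residue, rewriting the partial chain each time.

    Only the length of `sequence` is used in this sketch; a fuller model
    would let the residue identities influence the rewriting rules.
    """
    states = ""
    history = []
    for _ in sequence:
        states += initial_state        # growth: a new residue joins the chain
        states = rewrite_step(states)  # folding: one parallel rewrite of the partial chain
        history.append(states)
    return history


def toy_step(states):
    """Toy rule (hypothetical): every residue except the newest becomes helix."""
    last = len(states) - 1
    return "".join("H" if i < last else s for i, s in enumerate(states))
```

With this toy rule, cotranslational_derive("MKV", toy_step) returns ["C", "HC", "HHC"]. Early residues begin to be rewritten before later residues exist.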