http://www.cs.ucdavis.edu/~koehl/
Protein Sequence Design

1. Specificity versus Stability

1.1 What is specificity?

The inverse folding problem was originally defined by Pabo
[1] as the problem of defining the sequences compatible with a given protein
fold. A successful protein design calculation should generate a sequence
compatible with the template fold (the "design in" procedure), and
incompatible with competing folds (the "design out" procedure, or specificity
problem). This problem can be reformulated as finding the sequence, S, such
that it has a high probability, P, to be in the template conformation, Cnat,
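at room temperature. P is given by:

$$P = \frac{\exp\left(-E(C_{nat},S)/kT\right)}{\sum_{C} \exp\left(-E(C,S)/kT\right)} \qquad (1)$$

where E(C,S) is the energy of sequence S in conformation C, T is
the temperature and k is the Boltzmann constant. The denominator in eq. (1)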
corresponds to a partition function, Z. A rigorous approach to the problem of
maximizing the probability P would require simultaneous and complete
explorations of all of sequence space and conformation space. While this may
be feasible for a short peptide chain with a simplified representation [2], it
cannot be applied to a longer protein chain with a detailed all-atom
representation.
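
As an illustration of the occupation probability discussed above (the Boltzmann weight of the template conformation divided by the partition function Z), the following minimal Python sketch evaluates P for a sequence whose energy is known in a small, explicitly enumerated set of conformations. The function name `native_probability`, the energy values and the choice of kT (about 0.593 kcal/mol, i.e. roughly 298 K) are assumptions made for this example only; they do not come from any of the cited studies.

```python
import math

def native_probability(energies, native_index, kT=0.593):
    """Boltzmann probability of the conformation at native_index.

    energies: E(C, S) for one sequence S over an enumerated set of
    conformations C (same units as kT; kcal/mol assumed here).
    """
    # Shift by the lowest energy so the exponentials cannot overflow;
    # the shift cancels between numerator and denominator.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)  # partition function (up to the constant shift)
    return weights[native_index] / z

# Hypothetical energies of one sequence in four competing conformations;
# index 0 plays the role of the template conformation Cnat.
energies = [-10.0, -8.0, -7.5, -3.0]
p_native = native_probability(energies, 0)
print(f"P(Cnat) = {p_native:.3f}")
```

The sketch only works because the conformations are listed explicitly. For a real protein chain the sum in the denominator runs over an astronomically large conformation space, which is exactly why the rigorous approach is impractical.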
In some studies, this problem has been ignored: Malakauskas
and Mayo [3] successfully redesigned the core of the B1 domain of protein G,
using a variant of the dead-end elimination algorithm, without explicit
consideration of specificity. Ignoring specificity, however, has not always
been successful. In the case of the HP model, for example, superstable
sequences have been designed with all H inside and all P at the surface [4,5].
These sequences, however, are not specific, and can fold into many "native"
conformations [6].
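
The lack of specificity of a sequence can be made concrete by counting how many conformations share its lowest energy. The sketch below is a toy illustration under stated assumptions, not the protocol of refs. [4-6]: it assumes a 2D square lattice, an energy of -1 for every non-bonded H-H contact, and exhaustive enumeration of self-avoiding walks, which is only feasible for very short chains. All function names and the two example sequences are hypothetical.

```python
def enumerate_walks(n):
    """Yield every self-avoiding walk of n beads (n >= 2) on the square
    lattice, with the first bond fixed along +x to remove rotations."""
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def extend(path, occupied):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))

def hp_energy(seq, path):
    """-1 for each pair of H beads that are lattice neighbours
    but not consecutive along the chain."""
    index_at = {p: i for i, p in enumerate(path)}
    energy = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Looking only at the +x and +y neighbours counts each contact once.
        for nb in ((x + 1, y), (x, y + 1)):
            j = index_at.get(nb)
            if j is not None and abs(i - j) > 1 and seq[j] == "H":
                energy -= 1
    return energy

def ground_state_degeneracy(seq):
    """Return (lowest energy, number of walks that reach it).
    Mirror images are counted separately, so a sequence with a single
    ground state up to reflection reports a degeneracy of 2."""
    energies = [hp_energy(seq, w) for w in enumerate_walks(len(seq))]
    e_min = min(energies)
    return e_min, energies.count(e_min)

# Two hypothetical 10-residue sequences with different H/P patterns.
for seq in ("HHHHHHHHHH", "HPPHPPHPPH"):
    e_min, degeneracy = ground_state_degeneracy(seq)
    print(seq, e_min, degeneracy)
```

A large degeneracy means that the lowest energy, however favourable, is shared by many conformations, so the sequence is stable without being specific for any one of them.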
Several criteria have been proposed to simplify or replace
the optimization of the occupational probability defined in equation (1).
These criteria all relate to the concept of "foldability", i.e. the degree to
which a particular sequence is likely to fold. To correlate the foldability
with the stability and kinetic accessibility of particular proteins, the
following parameters were suggested:
Some of these criteria have been used in model studies based
on lattice models. It has not been clear, however, how to integrate any of
these terms into a protein design algorithm that does not involve enumeration
of all sequences and structures.
Even though a systematic exploration of conformational space
is not possible, there have been attempts to include competing backbones in
full atom, off-lattice protein design procedures. Among the successful
results, it is worth mentioning the recent work of Harbury and co-workers [12]
who design families of

As an alternative, Shakhnovich and Gutin [17,18] proposed a
simple, approximate solution to the problem of specificity, based on the
random energy model (for review, see Pande et al. [19]). In this approach, the
partition function Z (denominator of eq. (1)) is assumed to depend only on the
amino acid composition and not on the ordered sequence itself. Given this
approximation, specificity can be achieved by optimization in sequence space
alone, provided that the amino acid composition of the sequence is held
constant. This procedure has been applied to protein design simulations on
lattice models [17,18]. A major feature of this approach is that it is
computationally feasible, even in the case of full-atom representations.
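
The fixed-composition idea described above can be sketched as a Monte Carlo search in sequence space: if Z depends only on composition, then at fixed composition maximizing P reduces to minimizing the energy of the sequence in the template conformation, and swapping two residues is a move that never changes the composition. The Python sketch below is a minimal illustration under stated assumptions, not the implementation of refs. [17,18]: the contact list, the two-letter alphabet, the pair energies, the "design temperature" `t_design` and all function names are hypothetical.

```python
import math
import random

def native_energy(seq, contacts, pair_energy):
    """Energy of a sequence threaded onto the template: sum of pair
    energies over the template's residue-residue contacts."""
    return sum(pair_energy[(seq[i], seq[j])] for i, j in contacts)

def design_fixed_composition(seq, contacts, pair_energy,
                             steps=2000, t_design=0.5, seed=0):
    """Metropolis search over sequences using residue swaps only,
    so the amino acid composition stays constant throughout."""
    rng = random.Random(seed)
    seq = list(seq)
    e = native_energy(seq, contacts, pair_energy)
    best_seq, best_e = seq[:], e
    for _ in range(steps):
        i, j = rng.sample(range(len(seq)), 2)
        if seq[i] == seq[j]:
            continue  # swapping identical residues changes nothing
        seq[i], seq[j] = seq[j], seq[i]
        e_new = native_energy(seq, contacts, pair_energy)
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t_design):
            e = e_new  # accept the swap
            if e < best_e:
                best_seq, best_e = seq[:], e
        else:
            seq[i], seq[j] = seq[j], seq[i]  # reject: undo the swap
    return "".join(best_seq), best_e

# Hypothetical template: contact pairs (i, j) of a 10-residue fold.
contacts = [(0, 3), (0, 9), (3, 8), (4, 7), (4, 9), (1, 8)]
# Hypothetical HP-like pair energies: only H-H contacts are favourable.
pair_energy = {("H", "H"): -1.0, ("H", "P"): 0.0,
               ("P", "H"): 0.0, ("P", "P"): 0.0}

start = "HPHPHPHPHP"
designed, e_designed = design_fixed_composition(start, contacts, pair_energy)
print(start, native_energy(start, contacts, pair_energy))
print(designed, e_designed)
```

Because only the energy in the template conformation is evaluated, the cost per step does not depend on the size of conformation space, which is what makes this kind of search tractable for detailed energy functions as well.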
References

1. Pabo, C. Designing proteins and peptides. Nature, 301, 200 (1983).
2. Seno, F, Vendruscolo, M, Maritan, A and Banavar, JR. Optimal protein design procedure. Phys. Rev. Lett., 77, 1901-1904 (1996).
3. Malakauskas, SM and Mayo, SL. Design, structure and stability of a hyperthermophilic protein variant. Nat. Struct. Biol., 5, 470-475 (1998).
4. Shakhnovich, EI and Gutin, AM. A new approach to the design of stable proteins. Protein Eng., 6, 793-800 (1993).
5. Shakhnovich, EI and Gutin, AM. Engineering of stable and fast-folding sequences of model proteins. Proc. Natl. Acad. Sci. (USA), 90, 7195-7199 (1993).
6. Yue, K and Dill, KA. Inverse protein folding problem: designing polymer sequences. Proc. Natl. Acad. Sci. (USA), 89, 4163-4167 (1992).
7. Shakhnovich, EI. Theoretical studies of protein-folding thermodynamics and kinetics. Curr. Opin. Struct. Biol., 7, 29-40 (1997).
8. Goldstein, RA, Luthey-Schulten, ZA and Wolynes, PG. Optimal protein folding codes from spin glass theory. Proc. Natl. Acad. Sci. (USA), 89, 4918-4922 (1992).
9. Socci, N and Onuchic, J. Folding kinetics of protein-like heteropolymers. J. Chem. Phys., 101, 1519-1528 (1994).
10. Abkevich, V, Gutin, A and Shakhnovich, E. Improved design of stable and fast-folding model proteins. Fold. Des., 1, 221-230 (1996).
11. Melin, R, Li, H, Wingreen, N and Tang, C. Designability, thermodynamic stability, and dynamics in protein folding: a lattice model study. J. Chem. Phys., 110, 1252-1262 (1999).
12. Harbury, P, Plecs, J, Tidor, B, Alber, T and Kim, P. High-resolution protein design with backbone freedom. Science, 282, 1462-1467 (1998).
13. Coldren, CD, Hellinga, HW and Caradonna, JP. The rational design and construction of a cuboidal iron-sulfur protein. Proc. Natl. Acad. Sci. (USA), 94, 6635-6640 (1997).
14. Pinto, AL, Hellinga, HW and Caradonna, JP. Construction of a catalytically active iron superoxide dismutase by rational protein design. Proc. Natl. Acad. Sci. (USA), 94, 5562-5567 (1997).
15. Hellinga, HW. The construction of metal centers in proteins by rational design. Fold. Des., 3, R1-R8 (1998).
16. Hellinga, HW. Construction of a blue copper analog through iterative rational protein design cycles demonstrates principles of molecular recognition in metal center formation. J. Am. Chem. Soc., 120, 10055-10066 (1998).
17. Shakhnovich, EI and Gutin, AM. A new approach to the design of stable proteins. Protein Eng., 6, 793-800 (1993).
18. Shakhnovich, EI and Gutin, AM. Engineering of stable and fast-folding sequences of model proteins. Proc. Natl. Acad. Sci. (USA), 90, 7195-7199 (1993).
19. Pande, VS, Grosberg, AY and Tanaka, T. Statistical mechanics of simple models of protein folding and design. Biophys. J., 73, 3192-3210 (1997).