RICE (US)—A computer program is allowing researchers to accurately simulate protein folding dramatically faster than previous methods.
Understanding the intricacies of protein folding is a crucial step in deciphering the genetic code that serves as the operating system of all living things. Protein misfolding is a critical factor in many diseases, including Alzheimer’s, cystic fibrosis, emphysema, and various cancers. The new computer program will allow scientists to peer deeper into the roots of these diseases.
The researchers describe their simulation of three short proteins using the new technique in the cover story of the current Journal of Chemical Physics.
“Protein folding is regarded as one of the biggest unsolved problems in biophysics,” says Jianpeng Ma, who is a professor of bioengineering at Rice University and of biochemistry and molecular biology at Baylor College of Medicine. “This is a technically challenging task, and many groups around the world have been competing for years to make the process faster and more accurate.”
Correctly folded proteins perform many roles: as enzymes vital to metabolism; as structural elements in bone, muscle, and cell scaffolding; as mechanisms in cell signaling and immune response; and much more.
Proteins start as amino acid molecules floating in a cell. Following DNA blueprints, the molecules are strung together like beads on a necklace into a chain called a polypeptide. Every polypeptide of a given sequence will fold precisely the same way into the shape, called the native state, that determines its function.
Like a river finding the shortest route to the sea, proteins always find their way to their native states in an instant. How that happens is one of life’s great mysteries.
“The question is how nature finds this final folded state so quickly,” Ma says.
Ma and graduate student Cheng Zhang reached unprecedented accuracy and speed in simulating the folding of three relatively short but well-understood proteins—trpzip2, trp-cage, and the villin headpiece—in the presence of water molecules, which Ma described as the best way to simulate physiological conditions.
Though the proteins assemble themselves in nature almost instantly, the team’s algorithm took weeks to run the simulation. Still, that was far faster than others have achieved. “And for trpzip and villin, nobody has reached this level of accuracy in the native state under similar simulation conditions—that is, in the presence of water, which is the most stringent condition,” Ma says.
The researchers employed two novel strategies: continuously variable temperature and single-copy simulation.
“In the process of simulation, called sampling, the computer has to search through many, many possible structures of the protein chain to find the lowest-energy solution,” Ma explains. “A polypeptide chain en route to its native state encounters many energy barriers, much like when one navigates through a rugged mountain landscape.
“Speeding up the process of crossing those barriers is the key to finding the true global minimum (energy state),” he says. “In our simulation, temperature is a variable that goes continuously up and down. When the temperature is higher, proteins can overcome energy barriers faster. It’s equivalent to speeding up the motion of atoms.”
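The effect Ma describes can be illustrated with a toy model. The Python sketch below is not the researchers' program. It assumes a one-dimensional double-well energy function, the standard Metropolis acceptance rule, and arbitrary units, and every name in it (energy, acceptance_probability, count_barrier_crossings) is hypothetical. It counts how often a single simulated walker hops between two energy wells at a fixed low temperature and at a fixed high temperature.

```python
import math
import random


def energy(x):
    """Toy landscape (an assumption): wells near x = -1 and x = +1, barrier at x = 0."""
    return (x * x - 1.0) ** 2


def acceptance_probability(delta_e, temperature):
    """Metropolis rule: always accept downhill moves, accept uphill
    moves with Boltzmann probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)


def count_barrier_crossings(temperature, steps=20000, step_size=0.2, seed=0):
    """Run one walker (a single copy) and count well-to-well transitions."""
    rng = random.Random(seed)
    x = -1.0          # start at the bottom of the left well
    side = -1         # which well the walker was last seen in
    crossings = 0
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)
        delta_e = energy(trial) - energy(x)
        if rng.random() < acceptance_probability(delta_e, temperature):
            x = trial
        # Count a crossing once the walker is well inside the opposite well.
        if x * side < 0 and abs(x) > 0.5:
            side = -side
            crossings += 1
    return crossings


if __name__ == "__main__":
    for t in (0.05, 1.0):
        print(f"temperature={t}: {count_barrier_crossings(t)} crossings")
```

In this toy setting the hotter walker crosses the barrier far more often than the colder one, which is the intuition behind letting the temperature rise and fall during a single simulation.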
Ma says the previous state of the art was to run multiple copies of a simulation in parallel on many computers—an intensive and expensive approach. “The single-copy approach uses only one simulation, essentially, to find the native state of the protein. This is a major plus, because anyone with reasonable computing power can run this method.”
The research was supported by the National Institutes of Health, the National Science Foundation, the Welch Foundation, the Welch Chemistry and Biology Collaborative Grant from the John S. Dunn Gulf Coast Consortium for Chemical Genomics, and the Rice Faculty Initiatives Fund.
More news from Rice: www.media.rice.edu/media/