From Symphony to Structure: Listening to Proteins Fold

ABOVE: Scientists sonified the folding-unfolding dynamics of a small protein structure (WW domain) surrounded by water molecules to understand how the protein gets to its final conformation. The WW domain structure is surrounded by a tubular wave that symbolizes the water and the sound wave. Martin Gruebele and Carla Scaletti, with use of Canva generative AI

When a protein folds, its string of amino acids wiggles and jiggles through countless conformations before it forms a fully folded, functional protein. This rapid and complex process is hard to visualize.

Now, Martin Gruebele, a chemist at the University of Illinois Urbana-Champaign, and his team have found a way to use sound along with sight to better understand protein folding. He teamed up with composer and software developer Carla Scaletti, the cofounder of Symbolic Sound Corporation, to convert simulated protein folding data into a series of sounds with different pitches. The scientists identified patterns in the sounds and inferred that the hydrogen bonds formed by the amino acids played a major role in orchestrating the folding process. The results, published in the Proceedings of the National Academy of Sciences, could help scientists unravel the mysteries behind protein folding.¹

“Vision is one of the most obvious and direct ways to process input, but when you think about it, you use your ears a lot for clues from the environment to get around. You aren’t even often aware of how you use sounds to navigate along with vision,” said Gruebele.

For their analysis, the team focused on hydrogen bonds, which are weak bonds that the protein forms internally between the atoms in its amino acids and with the water surrounding it. These bonds are dynamic, rapidly forming and breaking over time as the protein folds. Because it only takes nanoseconds to microseconds for a protein to morph along its path to its final structure, the scientists needed to slow down the process during their analysis to be able to catch the sounds.

“We were a good match because Martin and his group are interested in dynamics,” said Scaletti. “Not just the spatial structure of the protein but how it changes over time, and that seemed like a perfect match for sound as sound doesn’t even exist without time.”

Usually, scientists visualize protein folding with the help of molecular dynamics (MD) simulations, which model the physical movement of the atoms in the folding protein and the water surrounding it.²

“Water is a very difficult thing to visualize,” said Gruebele. “People solve MD simulations; there’s only one protein molecule in these simulations but there are thousands of water molecules. And it’s hard to see what they are doing. They just seem to be randomly moving around. Carla and I wanted to put some order in that chaos.”

The scientists used the WW domain, a small protein domain named for its two conserved tryptophan (W) residues, to model the folding process. Using data from MD simulations of the domain folding and unfolding, they determined where hydrogen bonds could potentially form. Scaletti then assigned a pitch to each hydrogen bond. Whenever the conditions were right for a bond to form, the software played that bond’s pitch. The resulting tunes, which arose from the series of bonds that emerged over time, told the scientists how the protein dynamically changed its conformation in water.
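The pitch mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's Kyma implementation: the likelihood threshold, the MIDI-style pitch spacing, and the names `bond_frequency`, `sonify`, `frame_ns`, and `seconds_per_ns` are all hypothetical. The sketch takes one row of bond likelihoods per simulation frame, gives each bond its own pitch, stretches simulation nanoseconds into audible seconds, and emits a note whenever a bond's likelihood rises above the threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoteEvent:
    start_s: float       # playback time in seconds (after slowing the simulation down)
    bond: int            # index of the hydrogen bond that triggered the note
    frequency_hz: float  # pitch assigned to that bond


def bond_frequency(bond_index, base_midi=57, step=2):
    """Assumed mapping: each bond gets a MIDI note, converted to Hz (A4 = 440 Hz)."""
    midi = base_midi + step * bond_index
    return 440.0 * 2 ** ((midi - 69) / 12)


def sonify(likelihoods, frame_ns, seconds_per_ns, threshold=0.5):
    """Emit one note each time a bond's likelihood rises to or above the threshold.

    likelihoods    -- list of frames; each frame lists one likelihood per bond
    frame_ns       -- simulated time between frames, in nanoseconds
    seconds_per_ns -- playback seconds per simulated nanosecond (the slow-down)
    """
    events = []
    n_bonds = len(likelihoods[0]) if likelihoods else 0
    was_on = [False] * n_bonds
    for frame, row in enumerate(likelihoods):
        for bond, likelihood in enumerate(row):
            is_on = likelihood >= threshold
            if is_on and not was_on[bond]:
                start_s = frame * frame_ns * seconds_per_ns
                events.append(NoteEvent(start_s, bond, bond_frequency(bond)))
            was_on[bond] = is_on
    return events
```

Feeding the resulting events to any synthesizer would play one tone per bond formation. A real sonification would likely also map the likelihood itself to loudness or timbre, which this sketch leaves out.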

The scientists could see, and hear, the different patterns of hydrogen bond (H-bond) formation as the protein folded or unfolded. This ‘piano roll’ heatmap shows H-bond likelihoods for each of the folding transitions, arranged in order from the shortest duration on the left to the longest duration on the right, with several transitions displayed side by side for comparison. Time runs from left to right on the x axis, and the different hydrogen bonds are on the y axis. The likelihood of a bond is mapped to color and intensity, with yellow horizontal lines indicating the transitions.

Image generated in Kyma by Carla Scaletti

As the team listened to the hydrogen bonds breaking and forming as the protein folded and unfolded, they picked up on patterns in the noise. “It’s like listening to a symphony orchestra where a lot of people are playing, but with some effort you can listen to the individual players,” said Gruebele.

The team used a combination of sonification, visualization, and physics calculations to understand how the hydrogen bonds contributed to the folding and unfolding of the protein. Based on their auditory analysis, they found that the protein took multiple trajectories as it either raced or ambled toward its folded structure. They called the slower transitions “Meanders,” where the protein and water appeared to interact erroneously, getting caught in a loop of wrong hydrogen bonds that prevented correct folding before the right bonds took over to produce the final protein structure. There were also “Highways,” where the correct bonds formed very quickly and everything fell into place. The water molecules played a major role in stabilizing the protein and regulating these transitions with their own hydrogen bonds. “We now understand why the protein would have evolved with those particular amino acids in it to make those 3D patterns that allow it to fold,” said Gruebele.

“This is a nice example of using sonification for discovery,” said Roseanne Ford, a chemical engineer at the University of Virginia, who was not involved in the research. “Visually, there are simultaneous things going in your field of view, with multiple hydrogen bonds in different locations of the protein, and your eyes can’t track all of that. But you can hear multiple tones or multiple pitches at a time, so you can get a sense of the temporal changes in the hydrogen bonding that are harder to get at visually.”

References

1. Scaletti C, et al. Hydrogen bonding heterogeneity correlates with protein folding transition state passage time as revealed by data sonification. Proc Natl Acad Sci U S A. 2024;121(22):e2319094121.
2. Piana S, et al. Computational design and experimental testing of the fastest-folding β-sheet protein. J Mol Biol. 2011;405(1):43-48.