Why does one song soar to the top of the charts while another plummets? The anatomy of a hit remains a stubborn mystery that researchers and the music industry at large have been longing to solve. A new study suggests that the secret to distinguishing a hit lies in the brains of listeners—and that artificial intelligence can analyze physiological signals to reveal that secret. But other “hit song science” researchers aren’t ready to declare victory just yet.
Researchers from Claremont Graduate University used a wearable, smartwatch-like device to track the cardiac responses of people listening to music. They used an algorithm to convert these data into what they say is a proxy for neural activity, focusing on responses associated with attention and emotion. A machine-learning model trained on these data was then able to classify whether a song was a hit or a flop with 97 percent accuracy. The finding was published in Frontiers in Artificial Intelligence earlier this month.
This study is the latest, and seemingly most successful, attempt to solve the decades-old problem of “hit song science”: the idea that automated methods such as machine-learning software can predict whether a song will become a hit before its release. Some commentators have suggested this technology could reduce music production costs, curate public playlists and even render TV talent show judges obsolete. The new model’s purported near-flawless accuracy at predicting song popularity dangles the tantalizing possibility of transforming the creative process for artists and the distribution process for streaming services. But the study also raises concerns about the reliability and ethical implications of fusing artificial intelligence and brain data.
“The study could be groundbreaking but only if it’s replicated and generalizable. There are many biases that can influence a machine-learning experiment, especially one that attempts to predict human preferences,” says Hoda Khalil, a data scientist at Carleton University in Ontario, who has researched hit song science but is not affiliated with the study. “And even if we have sufficient statistical evidence to generalize, we need to consider how this model could be misused. The technology cannot leap far ahead of the ethical considerations.”
Thus far, determining what qualities link popular songs has been more a matter of alchemy than science. Music industry experts have traditionally relied on large databases to analyze the lyrical and acoustic aspects of hit songs, including their tempo, explicitness and danceability. But this method of prediction has performed only minimally better than a random coin toss.
In 2011 machine-learning engineers at the University of Bristol in England developed a “hit potential equation” that analyzed 23 song characteristics to determine a song’s popularity. The equation was able to classify a hit with 60 percent accuracy. Khalil and her colleagues have also analyzed data from more than 600,000 songs and have found no significant correlations between various acoustic features and the number of weeks a song remained on the Billboard Hot 100 or Spotify Top 50 charts. Even the entrepreneur who coined the term “hit song science,” Mike McCready, was later scrutinized by researchers who determined there simply wasn’t enough science at the time to support his theory.
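For readers curious what this acoustic-feature approach looks like in practice, the sketch below shows its general shape: score each song on a handful of audio descriptors and fit a simple classifier that predicts whether it charted. The feature set, data and model choice here are illustrative assumptions, not the Bristol team’s or Khalil’s actual pipelines.

```python
# Illustrative sketch of the acoustic-feature approach to "hit song science."
# The features and data are stand-ins; real studies drew on large catalogs of
# descriptors such as tempo, energy, explicitness and danceability, labeled
# against charts like the Billboard Hot 100.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Stand-in data: each row is one song described by five acoustic features;
# the label marks whether the song became a hit.
n_songs = 1000
features = rng.random((n_songs, 5))          # e.g., tempo, energy, danceability...
labels = rng.integers(0, 2, size=n_songs)    # 1 = hit, 0 = non-hit

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0
)

# A simple linear classifier in the spirit of a "hit potential equation":
# a weighted sum of song characteristics passed through a logistic function.
model = LogisticRegression().fit(X_train, y_train)
print(f"Held-out accuracy: {accuracy_score(y_test, model.predict(X_test)):.2f}")
```

Because the stand-in labels are random, the sketch hovers around coin-toss accuracy, which incidentally echoes the article’s point: models built on acoustic features alone have historically performed only modestly better than chance.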
A fresh approach was overdue, says Paul Zak, a neuroeconomist at Claremont Graduate University and senior author of the new study. Rather than focus on songs themselves, his team sought to explore how humans respond to them. “The connection seemed almost too simple. Songs are designed to create an emotional experience for people, and those emotions come from the brain,” Zak says.
He and his team equipped 33 participants with wearable cardiac sensors, which use light waves that penetrate the skin to monitor changes in blood flow, similar to the way that traditional smartwatches and fitness trackers detect heart rate. Participants listened to 24 songs, ranging from the megahit “Dance Monkey,” by Tones and I, to the commercial flop “Dekario (Pain),” by NLE Choppa. The participants’ heart rate data were then fed through the commercial platform Immersion Neuroscience, which, the researchers contend, algorithmically converts cardiac activity into a combined metric of attention and emotional resonance known as “immersion.” The team says these immersion signals were able to predict hit songs with moderate accuracy, even without machine-learning analysis—hit songs induced greater immersion. In contrast, participants’ subjective ranking of how much they enjoyed a song was not an accurate proxy for its ultimate public popularity.
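The study’s own pipeline runs through the proprietary Immersion Neuroscience platform, whose internals are not described in detail. As a rough, assumption-laden sketch of the physiological approach, one can imagine collapsing each listener’s heart-rate trace into crude attention and arousal proxies and feeding those to a classifier, as below; the feature construction and model are placeholders, not the study’s actual method.

```python
# Hypothetical sketch of the physiological approach: derive a simple
# "immersion"-style score from heart-rate traces and use it to separate hits
# from flops. The two features (mean rate, beat-to-beat variability) are crude
# stand-ins; the study's real Immersion metric is proprietary.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)

def immersion_proxy(heart_rate: np.ndarray) -> np.ndarray:
    """Collapse one listener-song heart-rate trace into two crude features:
    average rate (a rough attention proxy) and mean beat-to-beat change
    (a rough emotional-arousal proxy)."""
    return np.array([heart_rate.mean(), np.abs(np.diff(heart_rate)).mean()])

# Stand-in data: 33 listeners x 24 songs, each trace three minutes of
# per-second heart-rate samples; labels mark whether the song was a hit.
n_listeners, n_songs, n_samples = 33, 24, 180
traces = 70 + 5 * rng.standard_normal((n_listeners, n_songs, n_samples))
song_labels = rng.integers(0, 2, size=n_songs)   # 1 = hit, 0 = flop

# Average the per-listener features for each song, then fit a classifier.
X = np.array([
    np.mean([immersion_proxy(traces[l, s]) for l in range(n_listeners)], axis=0)
    for s in range(n_songs)
])
model = LogisticRegression().fit(X, song_labels)
print(f"Training accuracy: {accuracy_score(song_labels, model.predict(X)):.2f}")
```

The sketch reports accuracy on synthetic training data only; nothing about it implies the 97 percent figure would hold up, which is precisely the replication question critics raise below.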
Zak—who co-founded Immersion Neuroscience and currently serves as its chief immersion officer—explains the rationale behind using cardiac data as a proxy for neural response. He says a robust emotional response triggers the brain to synthesize the “feel-good” neurochemical oxytocin, intensifying activity in the vagus nerve, which connects the brain, gut and heart.
Not everyone agrees. “The study hinges on the neurophysiological measure of immersion, but this measure needs further scientific validation,” says Stefan Koelsch, a neuroscientist at the University of Bergen in Norway and guest researcher at the Max Planck Institute for Human Cognitive and Brain Sciences in Leipzig, Germany. Koelsch also notes that although the study cited several papers to support the validity of “immersion,” many of them were co-authored by Zak, and not all of them were published in peer-reviewed journals.
This wouldn’t be the first time scientists have used brain signals to predict song popularity. In 2011 researchers from Emory University used functional magnetic resonance imaging (fMRI), which measures brain activity by detecting changes in blood flow, to predict the commercial success or failure of songs. They found that weak responses in the nucleus accumbens, a brain region involved in processing motivation and reward, accurately predicted 90 percent of songs that sold fewer than 20,000 copies. But even though this technique was good at pinpointing less successful music, it predicted hit songs only 30 percent of the time.
The fMRI approach, aside from having lower predictive power, is somewhat impractical. A typical fMRI session lasts at least 45 minutes and requires participants to endure the discomfort of being confined in a cold, sterile chamber that can make some people feel claustrophobic. So if a portable and lightweight smartwatch can truly measure an individual’s neural activity, it may revolutionize the way researchers tackle the field of hit song science.
It may also be too good to be true, Koelsch says. Based on his previous research on musical pleasure and brain activity, he’s skeptical not only of immersion but also of the very idea that machine-learning models can capture the intricate nuances that make a song a hit. For instance, in 2019 Koelsch and his colleagues performed their own study of musical enjoyment. It involved using machine learning to determine how predictable a song’s chords were and fMRI scans to study how participants’ brains reacted to those songs. Although the initial study uncovered a relationship between predictability and emotional response, Koelsch has since been unable to replicate those findings. “It’s very difficult to find reliable indicators for even the crudest differences between pleasant and unpleasant music, let alone for the subtle differences that make a nice musical piece become a hit,” he says. “So I’m skeptical.” As of publication time, Zak has not responded to requests for comment on criticisms of his recent study.
If these recent results are successfully replicated, however, the new model might hold immense commercial potential. To Zak, its primary utility lies not necessarily in creating new songs but in efficiently sorting through the vast array of existing ones. According to him, the study originated when a music streaming service approached his group. Zak says that the streamer’s team had been overwhelmed by the volume of new songs released daily—tens of thousands—and sought to identify the tracks that would truly resonate with listeners (without having to manually parse each one).
With the new model, “the right entertainment could be sent to audiences based on their neurophysiology,” Zak said in a press release for the study. “Instead of being offered hundreds of choices, they might be given just two or three, making it easier and faster for them to choose music that they will enjoy.” He envisions the technology as an opt-in service where data are anonymized and shared only if users sign a consent form.
“As wearable devices become cheaper and more common, this technology can passively monitor your brain activity and recommend music, movies or TV shows based on that data,” Zak says. “Who wouldn’t want that?”
But even if this approach works, the prospect of combining mind reading and machine learning to predict hit songs remains fraught with ethical dilemmas. “If we train a machine-learning model to understand how different types of music influence brain activity, couldn’t it be easily exploited to manipulate people’s emotions?” Khalil says. She points out that relying solely on an opt-in approach for such services often fails to safeguard users from privacy breaches. “Many users just accept the terms and conditions without even reading them,” Khalil says. “That opens the door for data to be unintentionally shared and abused.”
Our favorite songs may not seem like intimate, personal data, but they can offer a window into someone’s moods, tastes and habits. And when these details are coupled with personalized data on brain activity, we’re forced to consider how much information we’re willing to relinquish for the perfect playlist.