Can AI Learn to Understand Emotions?

Growing up in Egypt in the 1980s, Rana el Kaliouby was fascinated by hidden languages—the rapid-fire blinks of 1s and 0s computers use to transform electricity into commands and the infinitely more complicated nonverbal cues that teenagers use to transmit volumes of hormone-laden information to each other.

Culture and social stigma discouraged girls like el Kaliouby in the Middle East from hacking either code, but she wasn’t deterred. When her father brought home an Atari video game console and challenged the three el Kaliouby sisters to figure out how it worked, Rana gleefully did. When she wasn’t allowed to date, el Kaliouby studied her peers the same way that she did the Atari.

Rana el Kaliouby, who grew up hacking Ataris, is now helping AI understand human emotion.

“I was always the first one to say ‘Oh, he has a crush on her’ because of all of the gestures and the eye contact,” she says.

Following in the footsteps of her parents, both computer scientists, el Kaliouby knew that her knowledge of programming languages would be a foundational skill for her career. But it wasn’t until graduate school that she discovered that her interest in decoding human behavior would be equally important. In 1998, while looking for topics for her Master’s thesis at the American University in Cairo, el Kaliouby stumbled upon a book by MIT researcher Rosalind Picard. It argued that, since emotions play a large role in human decision-making, machines will require emotional intelligence if they are to truly understand human needs. El Kaliouby was captivated by the idea that feelings could be measured, analyzed, and used to design systems that can genuinely connect with people. The book, called Affective Computing, would change her career. So would its author.

Today, el Kaliouby is the CEO of Affectiva, a company that’s building the type of emotionally intelligent AI systems Picard envisioned two decades ago. Affectiva’s software measures a user’s emotional response through algorithms that identify key facial landmarks and analyze pixels in those regions to classify facial expressions. Combinations of those facial expressions are then mapped to any of seven different emotions as well as some complex cognitive states such as drowsiness and distraction. Separate algorithms also analyze voice patterns and inflections.
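
To make the last step of that pipeline concrete, here is a minimal sketch of how combinations of detected expressions might be mapped to an emotion label. It is not Affectiva's method. The expression names, the three rules, the 0.5 threshold, and the function `classify_emotion` are all hypothetical, chosen only for illustration, and a production system would learn such mappings from data rather than hard-code them.

```python
# Hypothetical rules: each emotion is triggered by a combination of expressions.
# These names and pairings are illustrative assumptions, not Affectiva's mapping.
EMOTION_RULES = {
    "joy": {"smile", "cheek_raise"},
    "surprise": {"brow_raise", "jaw_drop"},
    "anger": {"brow_furrow", "lip_press"},
}


def classify_emotion(expression_scores, threshold=0.5):
    """Return the best-matching emotion label, or None if no rule is satisfied.

    expression_scores maps an expression name to a confidence in [0, 1],
    as an upstream facial-expression classifier might produce.
    """
    # Expressions considered "present" in this frame.
    active = {name for name, score in expression_scores.items() if score >= threshold}

    best_label, best_score = None, 0.0
    for label, required in EMOTION_RULES.items():
        if required <= active:  # every required expression is present
            mean = sum(expression_scores[name] for name in required) / len(required)
            if mean > best_score:
                best_label, best_score = label, mean
    return best_label


frame = {"smile": 0.9, "cheek_raise": 0.7, "brow_raise": 0.1}
print(classify_emotion(frame))
```

A frame in which only one of a rule's expressions is present, such as a smile without raised cheeks, returns no label under these toy rules.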

Rosalind Picard pioneered the field of affective computing.

Affectiva’s software allows market researchers to gauge a response to ads and TV shows. It powers furry social robots that help children stay engaged in learning. And, in the near future, it will allow cars to detect when drivers are dozing off.

By creating AI systems that incorporate emotion data, el Kaliouby and others in the affective computing field envision a world where technologies respond to user frustration or boredom, or even help alleviate human suffering.

“I see that our emotional AI technology can be a core component of online learning systems, health wearables even,” el Kaliouby says. “Imagine if your Fitbit was smart about when it told you to go to sleep or when you needed to get snacks. It could say, ‘Oh, I see that today is going to be a really busy day for you and you’re going to be stressed. How about you take three minutes to meditate?’ ”

Calculating Emotion

Analyzing emotions in real time is a mathematical problem of astronomical proportions—an equation our brains solve in microseconds over and over and over again throughout the day.

“If you take chess or Go, these games in AI that people think are so hard to solve, those are nothing compared to what can happen in a few minutes in facial expressions,” says Rosalind Picard, founder and director of the Affective Computing Research Group at the MIT Media Lab.

By conservative estimates, a single chess game can have up to 10^120 possible moves, a colossal challenge for earlier artificial intelligence systems. AI is more sophisticated today (last year Google’s AlphaZero algorithm taught itself the game and defeated a world champion chess program called Stockfish in just four hours), but analyzing metrics like facial expression in real time “isn’t even in the same league,” Picard says.

Humans start an interaction with any of 10,000 possible combinations of facial muscle movements that can create a facial expression. Each expression is created through a combination of more than 40 distinct muscle movements ranging from eyebrow furrowing to nose wrinkling to lip puckering. Those expressions are often accompanied by any of roughly 400 possible aspects of vocal inflections along with several thousand potential hand and body gestures. These face-voice-hand permutations change continuously throughout a single conversation, creating an ocean of data that zips from one person to another instantaneously. While our brains subconsciously process complex emotions and their intensities, teaching an artificial neural network to wade through that tsunami of data is an extraordinary technological challenge, one that’s further complicated by the fact that nonverbal communication varies between cultures.
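
A rough back-of-the-envelope calculation gives a sense of that scale. The sketch below multiplies the figures quoted above for facial, vocal, and gesture possibilities, treating the three channels as independent, which is a simplifying assumption. It uses 3,000 as a stand-in for "several thousand" gestures, also an assumption. It then counts how many successive moments of interaction it takes for the number of possible sequences to exceed the 10^120 figure cited for chess.

```python
FACIAL_COMBINATIONS = 10_000   # from the article
VOCAL_ASPECTS = 400            # from the article ("roughly 400")
GESTURES = 3_000               # assumed stand-in for "several thousand"
CHESS_FIGURE = 10 ** 120       # the chess estimate cited earlier

# Possible face-voice-gesture combinations at a single moment,
# assuming the three channels vary independently.
per_moment = FACIAL_COMBINATIONS * VOCAL_ASPECTS * GESTURES

# Smallest number of successive moments whose possible sequences
# outnumber the chess figure (exact integer arithmetic).
moments = 1
while per_moment ** moments <= CHESS_FIGURE:
    moments += 1

print(f"{per_moment:,} combinations per moment")
print(f"{moments} moments to exceed 10^120")
```

Under these assumptions, a dozen successive moments are enough to pass the chess figure, and a real conversation contains far more than that.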

Despite the challenges, artificial emotional intelligence is a technological brass ring for a growing number of companies and researchers. While the field is in many ways still in its infancy, serious resources are being devoted to developing tools that can analyze and predict emotional response. These emerging tools include apps that forecast when students will be stressed out, vocal analysis software that helps diagnose mania and schizophrenia, and programs that predict suicide risk based on social media posts. These tools come with serious privacy and ethical questions that haven’t yet been answered as well as significant technical challenges.

“There’s just a huge, huge amount of data and research that has to happen before it’s going to be something that our computers are smart about,” Picard says.

Making the Field of Feelings

While el Kaliouby was fighting to be taken seriously as a computer scientist in Egypt, Rosalind “Roz” Picard was in Boston waging a somewhat similar war. Picard spent her early days at the MIT Media Lab building mathematical models that emulate how the brain detects patterns from data it collects from the outside world. Emotions, she discovered, have more to do with it than one might suspect.

“As I learned more and more about the role of feelings I went, ‘Oh shoot. This looks really important for AI and computer intelligence, and I sure don’t want to do it,’ ” Picard says. “This would totally destroy my career as a woman. Who wants to be associated with emotion?”

Picard tried to recruit male researchers, but no one bit. She began to do it herself, testing ways to capture data on genuine, spontaneous emotions and applying the same machine learning techniques she had used in previous research. Her first papers were rejected and criticized, with one reviewer writing that one article about engineering emotional intelligence was perhaps best suited for an in-flight magazine.

Like el Kaliouby, Picard persisted, turning what began as a small collection of academic papers into her groundbreaking book, Affective Computing, which was first published in 1997.

Seven years later, Picard met a starstruck el Kaliouby, then a Ph.D. student who was designing facial analysis software that could recognize emotional states. The system, called MindReader, was trained using video footage from the Autism Research Centre at the University of Cambridge. It featured actors making hundreds of different facial expressions—a sort of library originally compiled to teach children on the autism spectrum how to read nonverbal cues. El Kaliouby was planning to return to her husband and home country after finishing school. Instead, Picard offered to collaborate with her in Boston.

“I was like ‘I would love that. That would be a dream come true; however, I’m married. I have to go back,’ ” el Kaliouby recounts. “She actually said, ‘Commute from Cairo.’ It was insane.”

El Kaliouby finished her Ph.D. and embarked on a three-year stint at the MIT Media Lab, flying between Egypt and Boston while creating the next iteration of MindReader. Picard, in the meantime, had already developed several new tools for capturing emotions in data computers could read, including a set of sweatbands embedded with sensors to measure skin conductance. Worn on the palm of the hand, the sensors picked up changes in electrical conductivity that happen when someone becomes psychologically aroused and begins to sweat. Believing that MindReader and the biometric sensors could be used to help children on the autism spectrum learn to navigate social situations and control their emotional responses, el Kaliouby and Picard began a multi-year study.

As the project progressed, the pair demonstrated both technologies for corporate sponsors visiting the Media Lab. They were overwhelmed by how many organizations in industries ranging from retail to banking to robotics were interested in real-time data on their target audience’s emotional states. In 2008, they asked Frank Moss, then director of the Media Lab, to expand their research team. He refused, but offered a different proposition: Form a company. Reluctantly, they agreed, and Affectiva was born.

Affectiva's software maps a person's face and uses a series of neural networks to judge their emotion.

Nearly a decade later, neuroscientist Dr. Ned T. Sahin is using Affectiva software to fulfill Roz and Rana’s early dreams of using the technology to help people on the autism spectrum. Sahin is the founder of Brain Power, a company that makes wearable life coaching technologies for people with brain and cognitive challenges. Sahin’s team has developed a suite of Google Glass augmented reality applications, some of which are powered by Affectiva algorithms, and many of which were originally designed for children but have applications for wider audiences.

One game, called Emotion Charades, prompts a partner sitting across from the user to make a specific facial expression. Affectiva algorithms identify the emotion and show the user one augmented reality emoji representing that feeling and another that doesn’t. Users earn points by picking the correct emotion while prompts encourage players to discuss how they experience that feeling in their lives.
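
The round structure described above can be sketched in a few lines. This is an illustration of the game logic only, not Brain Power's code. The emotion labels and the functions `build_round` and `score_pick` are hypothetical, and the detected emotion is passed in directly rather than coming from a real classifier.

```python
import random

# Hypothetical label set; the real game's emotions are not listed in the article.
EMOTIONS = ["happy", "sad", "angry", "surprised"]


def build_round(detected, rng):
    """Return two choices in random order: the detected emotion and one distractor."""
    distractor = rng.choice([e for e in EMOTIONS if e != detected])
    choices = [detected, distractor]
    rng.shuffle(choices)
    return choices


def score_pick(pick, detected):
    """Award one point for picking the emotion the partner actually showed."""
    return 1 if pick == detected else 0


rng = random.Random(0)          # seeded so the example is repeatable
detected = "happy"              # stand-in for the classifier's output
choices = build_round(detected, rng)
print(choices, score_pick(choices[0], detected))
```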

Like all Brain Power apps, Emotion Charades is designed to be used in short, daily spurts, just enough for users to practice skills they can use in their everyday lives.

“It’s like training wheels on a bike that then get removed,” Sahin says.

El Kaliouby and Picard agree that affective computing should focus on human needs. People should be able to decide whether and when to use the technology, understand how their data is being used, and maintain a level of privacy. Affectiva’s licensing agreement prohibits the software from being used in security or surveillance, and it requires partner organizations to obtain explicit consent from users before deployment.

But as the field expands, the potential for misuse ratchets up. Groups like the IEEE Standards Association have issued guidelines for affective design that include calls for explicit consent and data transparency policies. When a system is likely to elicit an emotional response, it should be easily modifiable in case it’s misunderstood or unexpectedly hurts or upsets users. Whether and how organizations will implement those guidelines is still up in the air.

Automating Mental Health

Answering those questions now is crucial, says Munmun De Choudhury, an assistant professor of interactive computing at Georgia Tech. Back in 2010, while completing her dissertation, De Choudhury unexpectedly lost her father. As she processed her shock and grief, she began thinking about the loss from a more scientific perspective—how do users change their social media behaviors when a major life event happens?

De Choudhury began analyzing how and what new mothers post on Twitter after they’ve had a baby. She expected to see shifts in positive social media activity, but her data also revealed that some new moms were expressing negative emotions, too, and posting less often than they were during pregnancy. Suspecting that these might be indicators of postpartum depression, De Choudhury, then working at Microsoft Research, conducted a separate study that compared data from Facebook posts to interviews with mothers before and after their children were born. She found that data from social media posts could not only detect when a user had postpartum depression, but it could also predict which users would become depressed after giving birth.

Since then, De Choudhury has used social media to identify mental illness risk, including psychosis symptoms among patients with schizophrenia, while other researchers have created algorithms that detect signs of anxiety, depression, and post-traumatic stress disorder. Another team at Vanderbilt has built algorithms to predict suicide risk and is currently seeking ways to translate them into medical practice. Late last year, Facebook rolled out several suicide prevention tools, including an artificial intelligence program that scans posts and comments for words related to suicide or self-injury.

“Social data can be helpful to clinicians and psychiatrists as well as public health workers because it gives them a sense of where are the risks,” De Choudhury says. But, she adds, “currently the landscape is really, for lack of a better word, ‘primitive,’ in how algorithmic inferences can be incorporated into interventions.”

Chris Danforth, co-director of the University of Vermont’s Computational Story Lab, believes that conversations around when and how to deploy predictive mental health algorithms are especially important as opaque organizations like Facebook move further into the field. Danforth has designed one proof-of-concept computational model that can predict whether users are depressed by observing their Twitter feeds and another that does so from their Instagram photos.

Rosalind Picard is also focused on mental health. She left Affectiva in 2013 and has since concentrated on several health-minded projects, including work with MIT research scientist Akane Sano to build predictive models of mood, stress, and depression using data from wearable sensors. The goal is to create models that anticipate changes in mood and physical health and to help users make evidence-based decisions to stay happier and healthier, she says. Picard has also launched Empatica, a start-up that makes wearable devices for medical research. Earlier this year, Empatica received FDA clearance for the Embrace smartwatch, a device that uses skin conductance and other metrics combined with AI to monitor for seizures.

Meanwhile, el Kaliouby spends much of her time developing Affectiva tech. Since launching the software development kit in 2014, the company has licensed its software to organizations in healthcare, gaming, education, market research, and retail, to name a few. The company is currently focused on automotive applications as well as incorporating voice analysis into its “Emotion AI” software. Last year, Affectiva also joined the Partnership on AI—a technology consortium developing ethics and education protocols for AI systems—and el Kaliouby is currently working with the World Economic Forum to design an ethics curriculum for schools. She envisions a future where machines are tuned into our feelings enough to make our lives happier, healthier, maybe even more human.

“I just have this deep conviction that we’re building a new paradigm of how we communicate with computers,” el Kaliouby says. “That’s been the driving factor of my work. We are changing how humans connect with one another.”