Cliff Nass explains that when the face on a computer screen and the voice we hear appear mismatched, we automatically distrust it
Talk to the Screen
Artificial voices may sound mechanical, but they’re getting better all the time, and people seem to respond to them as if they came from real people. Cliff Nass explains how old brains respond, exquisitely, to new technologies, drawing on what he learned while working on new BMWs.
Humans are evolved to talk: Speech involves more parts of the brain than any other activity. People with IQ scores as low as 50 or brains as small as 400 grams (one-third the size of a normal human brain) can speak. By the age of 18 months, children start learning a new word every two hours and keep going at that rate through adolescence.
Humans are also evolved to listen. Four days after birth, babies can distinguish their native language from other languages. Humans are so tuned to speech that the right ear (left brain hemisphere) shows a clear advantage in processing native language, while the left ear (right hemisphere) attends to all other sounds.
Language arguably evolved primarily to transmit social information. People rapidly categorize voices in terms of gender, personality, emotion, speaker identity and place of origin from such speech characteristics as pitch, cadence, volume, pitch range and word speed. Each of these categories guides us on whom to like, whom to trust and with whom to do business. Sensitivity to voice and language cues has played a critical role in interpersonal interactions for as long as we have lived in social groups.
Technology is adding a new dimension to language. People routinely use voice input and voice output systems to check airline reservations, order stocks, control cars, play games, dictate text into a word processor and perform a host of other tasks. Consumers can “converse” with handheld or mobile devices as well as household appliances. Because voice interfaces tap into our highly developed speaking and listening skills, they are intrinsically comfortable, easy to use and efficient.
How does our old brain react to new technologies? More than 50 research studies done in my lab and others around the world show that people behave toward and draw conclusions about voice-based technology using the same rules and shortcuts that they normally apply to other people. Technological voices, just like the voices of other people, activate the parts of the brain associated with social interaction.
Voice interfaces turn out to be social interfaces. Here are some findings of interest about technology-based voices, which inform the work of designers and the choices of consumers:
To draw these conclusions, we start with an appreciation of the evolutionary grounding of speech. To confidently provide a design idea or a psychological principle, we test and refine numerous hypotheses. In the accompanying essay, I’ll talk about the research behind the selection of a voice for use in the BMW Five Series.
© COPYRIGHT 2005 MACNEIL/LEHRER PRODUCTIONS. All Rights Reserved.