It's the new iPhone's signature feature: a female virtual assistant named Siri who can take dictation for a text message, check your calendar or look up nearby restaurants, all using voice commands and with no need to lay a finger on a keyboard. But in real life, Siri isn't always as smart as she comes off in Apple's TV ads.
Richard Stern of Pittsburgh, USA, recently asked Siri where the movie Moneyball was playing, hoping to find a showtime. Siri responded: "I do not understand moneyball." Want Siri to fix a mistake she made in your calendar? She'll counter that she can't change existing appointments. Asking her to "call me an ambulance" results in her agreeing: "From now on, I will call you 'An Ambulance.' OK?" Such glitches are not unexpected. Although you wouldn't know it from Apple's ad campaign, Siri is still in beta mode; that is, it's still in a test phase, and by definition the software is far from perfected.
Normally, companies wouldn't heavily promote something in beta. But in Apple's case, getting people to use Siri, and then learning from its mistakes, is key to making the program better.
That's because Siri operates in the internet cloud. With apps such as Siri, voice commands are processed on a Web server, and not on the device itself. Software developers can monitor how effectively Siri responds to requests, and tweak the program to make it more effective. For example, people will ask for the location of the nearest gas station in numerous ways, and they'll speak in a wide variety of accents, pitches and inflections. The goal of the engineers is to program Siri to correctly identify each variation so that the query is recognised in any context, no matter how it is phrased or pronounced.
"It's data," says Peter Mahoney, an executive with Nuance Communications, a speech technology firm that has done work for Apple. "The more that we understand what a person has done recently, has done in the past — it allows us to be smarter about understanding what kinds of things they're looking for and the way they say things."
Previous consumer devices with speech recognition features, such as a GPS built into a car, had a set number of words programmed into them. Because the speech technology did not improve with use, consumers had to teach themselves the precise way to say certain commands. But gone are those days of slow, syllable-by-syllable voice commands. The idea behind Siri is that you would talk to her as you would to a real person. And the more people use the program, the better it will get.
"For us in the industry, it's a fantastic thing because we can learn from every single interaction," says Ilya Bukshteyn, senior director at Microsoft's Tellme speech technology unit. "The data then makes the next person's experience much better, which makes them more likely to use it again, which gives more data, which in turn makes the experience even better."
However, moving speech processing to the cloud means that if the server crashes, or if the phone is in a dead spot, voice commands stop working. Users have been quick to complain on Twitter whenever Siri is down. And sometimes her response takes longer than just typing out the text manually, which again reminds users that the technology is not quite there yet. Part of the hang-up still resides in the matter of phrasing, and in adapting to the way people speak. If you say, "What is the temperature for turkey?" you'll get a weather forecast for Ankara, Turkey, even if what you actually wanted to know is when your holiday dinner will be ready.
Apple declined to comment on Siri and its glitches, but experts say it's these sorts of bobbles that will be corrected as more voice data accrue. And with the introduction of Siri, Apple has gone a long way towards making voice technology a commonplace feature. "We certainly give a lot of props to Apple and Siri and all the marketing they're doing around that," Bukshteyn says.
"From our perspective, at Microsoft, it's great just to see more awareness of speech and natural user interface, which we think will drive more usage. And more usage is the key," he adds. Siri users have commented on how the application has a distinct and sometimes snarky way of answering questions, a personality that may be designed to help people get over any reluctance to use the technology.
Ask Siri, "Where can I buy drugs?" and she pulls up addiction treatment centres nearby. How about a place to hide a dead body? She responds by asking what kind of place: reservoirs, metal foundries, mines, dumps or swamps? Ask her to take a photo, and she tells you to do it yourself.
As voice recognition technology evolves, the possibilities are limitless. People will soon be giving orders to their TVs, cars, home security systems and appliances. But these new communication tools are not limited to speech.
"The future is very bright," says Stern, the iPhone user in Pittsburgh, who is also an engineering professor at Carnegie Mellon University. "We're turning the corner and approaching an era where it's going to be just as natural to talk to our computers and personal electronic devices, and we really are beginning to reap the fruits of that as consumers."
Of course, even Stern acknowledges that the technology has a way to go. After his first attempt to get his iPhone to tell him where Moneyball was playing, Stern tried again by explaining that it was a film. Siri then produced a list of nearby photography stores. Even with that, Stern says, he is impressed by Siri. And many users confess that they love the new application.
Just don't tell Siri that. If you tell Siri, "I love you," she'll quickly respond: "I hope you don't say that to your other mobile phones." Tell her again that you love her, and she fires back: "You hardly know me."