Around 15 years ago Flash technology was in the ascendancy. One of the odd conventions to emerge at the time was the ‘Flash intro’. Very often, to build your anticipation for the website awaiting you, you would be entertained with what was essentially an opening title sequence. And if you were really unlucky, on the other side of it was a website fully rendered in Flash.
What you wanted was content; what you got was an extended journey through a designer’s ego trip (and yes I should know, I was one of those designers). The basic premise of a Flash-built website was that tricking out the interface would make for a better user experience. That assumption turned out to be wrong.
With Siri, Alexa et al. entering our lives, our interfaces now have personalities. If a digital assistant misunderstands our requests, we are likely to learn about it through a witty quip. TV ads featuring virtual assistants often make a particular show of droll one-liners emanating from the device.
But as Nielsen Norman Group research shows, voice interfaces are falling far short of user expectations. It seems that priorities need to be reassessed.
A little less conversation
As part of a project last year I began designing for a command line interface. With no previous experience, a terminal window or console can be a daunting place. Initially I was puzzled why user prompts and feedback in this world were so clinical and abrupt. Why would command line users not want to be addressed in a more human fashion? The answer lies in task efficiency.
The command line interface evolved from single-line dialogue between two human teleprinter operators. Over time, one end of the human-human dialogue became a computer, and the conventions remained. These interfaces provide users with a more efficient method of performing tasks. In short, command line users are just like the rest of us: trying to perform a lot of tasks in as short a time as possible, without surplus dialogue or clutter getting in the way.
This method of working is totally in keeping with our tendency towards ever more concise communication. Email is on the wane due to the long, unwieldy threads it encourages. The rise of chat apps such as Slack is due in large part to the tendency towards more concise messages. We’re making fewer mobile calls, opting instead for text messages using abbreviations, acronyms and emojis.
Many rivers to cross
As designers we are not always trying to mimic a conversation. We are creating an exchange which delivers for the user as efficiently as possible. To re-cast all human-computer interactions as conversations is to misunderstand our relationship with machines and devices.
The obstacles to success with voice UI are many. Users need to think more than once about the commands they give. They are required to speak in a manner that often isn’t natural for them. Even relatively simple queries may need to be broken down into smaller questions before reaching anything like the right answers.
When barriers are placed between a user and the outcomes they want, the end result is predictable: they will simply opt out. A report from The Information suggests that only 2% of Alexa speakers were used to make a purchase from Amazon in 2018. Additionally, 90% of the people who try to make a purchase through Alexa don’t try again.
We are still some distance away from the dream that voice UI promised. Perhaps this is voice’s Flash period, where the user needs to work hard to access the content they want. And I’m willing to bet that most frustrated users would be willing to trade every ounce of their virtual assistant’s sassy responses for just a little more efficiency.
The fact is that voice UI is still pretty hard work, no matter how hard Siri or Alexa try to entertain us.