As of 2021, voice technology lets us control our smart homes and call up a song or news report as if we had human assistants at our beck and call. But current platforms can struggle with more complex tasks. This is, of course, an interim state of affairs: across the world, new language models are being trained and refined every day.
Voice assistants are becoming much better at understanding not only the words a user says but also what the user is trying to accomplish. With this natural language understanding, an assistant can respond more naturally, without the user having to spell out every step. It can also sustain longer conversations with relevant follow-up questions. And it can maintain context, so if you back up and change your mind, it can gracefully pick up where you left off.
Dunkin, for example, lets a customer tell its voice assistant that they will have their usual order. The assistant then confirms a location and time and places the order for their preferred beverage, all from a simple voice command.
In addition, synthetic voice tools promise a bold frontier where a virtual voice is indistinguishable from that of a real person, even a famous person or voice actor. This would let brands operate at scale, automatically adjusting on the fly the voice with which they speak to each customer. An early example of this tech at work is the option to make Samuel L. Jackson your Alexa voice. Amazon obviously didn’t record him saying every word Alexa could possibly say. It recorded him pronouncing enough sounds that its software could, in a split second, assemble a realistic simulation of Jackson’s natural speaking voice saying whatever needs to be said.
We live in an age where much of what we want to accomplish is done on a screen, and the deluge of open tabs and apps makes multitasking especially hard. The ability to act on a “call to action” while leaning back, without opening yet another tab or picking up your phone, can be extraordinarily powerful. Voice interaction can deliver this capability. Because voice interactions, like those on connected TV (CTV), are timestamped, cross-platform attribution can be a valuable new measurement tool, given proper user privacy opt-ins. A consumer in the market for a truck who sees a CTV ad for a new model could simply say to a voice assistant, “Find a time on my calendar for me to test drive the 2023 Ford F150 and make the reservation with my local dealer.”
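For readers who want to see how timestamp-based attribution might work in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any vendor's system: the `Event` record, the `attribute_voice_action` function, the household-level matching, and the 30-minute window are all hypothetical choices made for this example. It matches a voice action to the most recent CTV ad exposure in the same household, and only when both events carry a privacy opt-in.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical record for a CTV ad exposure or a voice interaction."""
    household_id: str
    timestamp: datetime
    opted_in: bool  # True only if the user consented to measurement
    label: str


def attribute_voice_action(
    voice_event: Event,
    ad_exposures: List[Event],
    window: timedelta = timedelta(minutes=30),  # assumed attribution window
) -> Optional[Event]:
    """Return the most recent opted-in ad exposure in the same household
    that happened before the voice action and within the window."""
    if not voice_event.opted_in:
        return None  # no consent, no attribution

    candidates = [
        ad
        for ad in ad_exposures
        if ad.opted_in
        and ad.household_id == voice_event.household_id
        and timedelta(0) <= voice_event.timestamp - ad.timestamp <= window
    ]
    # Last-touch rule: credit the latest qualifying exposure, if any.
    return max(candidates, key=lambda ad: ad.timestamp, default=None)
```

In this sketch, a truck ad shown at 8:00 p.m. followed by a test-drive request at 8:05 p.m. from the same opted-in household would be matched, while a request from a household that has not opted in would return no match at all.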
The IAB is paying close attention to the role voice will play in the overall marketing ecosystem. In some ways, the future is already here. Most smartphones ship with a voice assistant. Smart speakers are becoming more advanced, while more models roll out at ultra-low price points. Device penetration is here and growing, giving brands the reach to converse with consumers. That’s the science. The art is in how to listen and what to say. If you get that right, it will feel like magic.