From “Okay Google” to “Jarvis, you there?”
Personal Digital Assistants.
How can we create the perfect in-vehicle voice assistant?
The key will be a combination of proactivity, personality and the ability to manage tasks, based on learning from your personal data.
Diana Chaush, Product Manager
From “Okay Google” to “Jarvis, you there?”: precisely where are we, and what challenges does the future hold for the autonomous personal assistant? As always, not one but multiple companies have taken on the ambitious challenge of creating such a system. Siri, Google Assistant, Alexa and many more are on our phones and in our homes.
Having a Personal Assistant (PA) is crucial for the car market, as driver distraction, caused by attention on in-car controls, is a serious safety concern. So, introducing natural language voice assistance is a logical evolution of the in-car experience for every major OEM.
There are 3 trends:
1. OEMs develop their own voice-powered assistants and partner with companies that provide natural language processing, both offline and online. An example of this is Mercedes MBUX.
2. OEMs integrate third-party Digital Assistants into their HMI as the primary voice assistant. An example of this is Ford with Alexa.
3. Home assistants like Alexa and Google get used in the car via aftermarket peripherals or in-screen applications linked to users’ phones. Examples of this are CarPlay and Android Auto.
Though these assistants are third-party, they have their own pros: there is no need to develop AI from scratch or to tackle a zoo of assistants (the fewer assistants, the better, right?). Despite the variety, all variants share the ability to respond to your basic questions, e.g. “What’s the weather like?”.
Some assistants can keep up a conversation backed by rule-based logic, meaning certain words or phrases trigger specific responses. This often works well enough to produce simple logic that enables your car to give you a witty and entertaining response or even play a game. When you allow logic-powered NLU to control functions on the car’s CAN bus, you can enhance the experience and make it feel quite premium. Mercedes MBUX and BMW iDrive are good examples of this.
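To make the idea of rule-based logic concrete, here is a minimal sketch in Python. It is an illustration only, not how MBUX or iDrive work internally: the rule table, the action names such as `climate.cooler`, and the `respond` function are all hypothetical, and no real CAN bus library is involved.

```python
import re

# Hypothetical rule table: each trigger phrase maps to an optional vehicle
# action name and a canned spoken reply. The action names are invented for
# this sketch; a real system would translate them into vehicle commands.
RULES = [
    (re.compile(r"\blower the temperature\b"), "climate.cooler", "Cooling the cabin down."),
    (re.compile(r"\braise the temperature\b"), "climate.warmer", "Warming the cabin up."),
    (re.compile(r"\bplay a game\b"), "games.start", "Let's play twenty questions."),
    (re.compile(r"\btell me a joke\b"), None, "I would tell you a joke about traffic, but it takes too long."),
]


def respond(utterance):
    """Return (action, reply) for the first rule whose phrase appears in the utterance."""
    text = utterance.lower()
    for pattern, action, reply in RULES:
        if pattern.search(text):
            return action, reply
    # No rule matched: a purely rule-based assistant has nothing else to fall back on.
    return None, "Sorry, I didn't catch that."
```

Calling `respond("Please lower the temperature")` returns the hypothetical `climate.cooler` action together with its reply, while any phrase outside the table falls through to the apology. That fall-through is exactly the limitation of this approach.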
One important, only partially resolved task on the way to perfection is for the assistant to stay in context and understand commands in connection with it. By context, we mean the ability to recognize a phrase’s indirect meaning, understand if a user made a mistake and generally be in the moment. For example, the driver could say “I’m hot” instead of “Lower the temperature”. Phrases like “Set the radio station to 102… no, to 103 FM” can cause a simple assistant to fail, whereas for advanced contextual assistants this will not be a challenge.
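The radio example can be sketched in a few lines of Python. This is a deliberately simple heuristic, assuming that a spoken correction is introduced by a marker word such as “no” or “I mean”; the marker list and the `radio_frequency` function are hypothetical, and a production assistant would rely on a trained language-understanding model rather than regular expressions.

```python
import re

# Assumed correction markers; anything said after the last one wins.
CORRECTION = re.compile(r"\b(?:no|sorry|i mean|actually)\b[,\s]*", re.IGNORECASE)
FREQUENCY = re.compile(r"\d+(?:\.\d+)?")


def radio_frequency(utterance):
    """Return the frequency the speaker finally settled on, or None if none was given."""
    # Split the utterance at correction markers and scan the pieces from last
    # to first, so a later correction overrides an earlier value.
    segments = CORRECTION.split(utterance)
    for segment in reversed(segments):
        match = FREQUENCY.search(segment)
        if match:
            return float(match.group())
    return None
```

A naive extractor that takes the first number in the sentence would tune to 102. Scanning from the last correction marker backwards picks 103 instead, and still works when no correction was made.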
At its best, a human-AI conversation will sound as natural as a human-to-human one because of how well the AI handles context. The best example of such advanced technology is Google Duplex. It is not a vehicle assistant, but it shows how impressive this tech can become. Just listen to its conversations with businesses when it calls to make a reservation.
Responding in context and controlling car functions are still only part of the way along the scale from simple voice assistant to Jarvis.
To see what other stages there are, let’s look through the prism of a use case. As there are no references on the market, let’s imagine a situation where a digital personal assistant (DPA) would significantly simplify the driver’s life.
“I’m heading home. It’s a stormy winter day, and I am exhausted.” Ideally, what needs to happen to make my trip safe and enjoyable starts with my DPA being aware of my intentions. And I don’t necessarily have to voice them. The DPA knows I’m heading to my car now, so the preheating of the cabin starts. The DPA knows I’m on my way home. It checks routes and traffic. One of the main roads is closed due to a snowstorm. As soon as I sit in the car, the DPA tells me …
“We are heading home, the traffic is heavy, your usual road is closed, so I have built a new optimal route. And by the way, your favorite grocery store is on the way. Do you want to jump in before everything gets closed?”
I arrived home after stopping at the store. My house greets me by turning ON the driveway lights and turning UP the thermostat. My journey home was simple, stress-free and I didn’t waste time, thanks to the proactive actions of the DPA. Sounds amazing? Let’s see what’s under the hood of such a system.
1. Being connected across all devices
Think about how much data each digital space possesses – your phone, home, and car. They all know you in their own way. Now imagine that this knowledge is connected between them. Look at the visualized use case to understand how many systems are involved in enabling the perfect DPA.
So the first test for DPAs of the future is operating in the Internet of Things (IoT) ecosystem. IoT connectivity is partially solved by assistants like Alexa, Google and Siri, but so far they can only control simple functions in the car, depending on the level of integration. Connected cars are starting to go mainstream, with around 25 million expected in 2021, rising to around 470 million by 2025 [2].
As more and more cars receive direct access to the internet, ideas like vehicle-to-vehicle (V2V), vehicle-to-pedestrian (V2P), vehicle-to-infrastructure (V2I) and many more arise. True IoT connectivity will enable use cases such as safety monitoring, autonomous vehicles, smart city management, seamless mobility and simplified payment [3].
2. Building a personalized profile
When all possible information sources are connected and available, the DPA can build your personalized profile. The profile provides enough insight into what you are used to and what you would like for the DPA to suggest options on a variety of matters. Is it your usual time to drink coffee? How about we check on this coffee shop nearby? Is today your aunt’s birthday? Let’s call her. Your favorite grocery store is on your way home; would you like to go there? There are dozens if not hundreds of use cases you can come up with when you have so much available data about the driver. What is the limit here? Ethics and possibly computing power.
3. Implicit context awareness
But let’s say that there are no ethical issues. Now, when the DPA is connected to all possible digital sources and has enough data in the driver’s profile, what comes next is building an even deeper understanding of context. They say human consciousness comes from us separating the past from the present and the future. With the prescribed amount of data, we can enable an AI with knowledge about past events. Connected with current stats, it can now build hypotheses about the future. The weather is becoming worse → in the past, stores were closing due to bad weather conditions → it is better to suggest a store visit just in case. To be perfect, the assistant needs to know how to connect the dots, which will make it aware of the things that are not on the surface.
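The chain of reasoning above (worsening weather, past store closures, a suggestion) can be sketched as a small inference over past events. This is only an illustrative assumption about how such a hypothesis might be formed: the history format, the `threshold` value and the function name are all hypothetical.

```python
def should_suggest_store_visit(history, weather_worsening, threshold=0.5):
    """Decide whether to suggest a store visit before shops may close.

    history is a list of (bad_weather, stores_closed_early) pairs, one per
    past day. Both the format and the threshold are assumptions of this sketch.
    """
    if not weather_worsening:
        return False
    # Past: look only at the days that had bad weather.
    bad_days = [closed for bad, closed in history if bad]
    if not bad_days:
        return False  # no past evidence to build a hypothesis on
    # Future: if stores closed early on most bad-weather days, expect it again.
    closed_share = sum(bad_days) / len(bad_days)
    return closed_share >= threshold
```

With a history in which stores closed early on two out of three bad-weather days, the function recommends the store visit once the weather starts to worsen, and stays silent on a clear day.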
4. Predictive schedule management
As the DPA can now predict the future, we can enable perhaps the most classical purpose of a human assistant, which is managing the boss’ schedule. Like a live assistant, the digital assistant needs to know your day to help organize your routines. Need to remember to pick up your child from school? No problem, a real personal assistant will remind you. A perfect personal assistant will let you know precisely when to leave so you are never late due to a commonplace traffic jam.
At CloudMade, we are enabling this particular use case with a time-to-leave notification. The algorithm learns your regular trips and routes and generates predictions for your next destination. A time-to-leave notification takes traffic into account to be sure about your arrival time. Integrating your personal and work calendars with this algorithm allows a rich daily agenda to be created.
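The final arithmetic behind a time-to-leave notification can be sketched as follows. This is not CloudMade’s algorithm: it leaves out the learning of trips and the destination prediction entirely, and the `traffic_factor`, `buffer_minutes` and `lead_minutes` parameters are hypothetical stand-ins for values a real system would obtain from live traffic data and user preferences.

```python
from datetime import datetime, timedelta


def time_to_leave(arrive_by, free_flow_minutes, traffic_factor=1.0, buffer_minutes=5):
    """Latest departure time that still meets the arrival deadline.

    traffic_factor scales the free-flow travel time (1.5 means 50% slower);
    buffer_minutes covers parking and walking. Both are assumed inputs.
    """
    travel = timedelta(minutes=free_flow_minutes * traffic_factor)
    return arrive_by - travel - timedelta(minutes=buffer_minutes)


def should_notify(now, leave_at, lead_minutes=10):
    """True once the driver is within lead_minutes of the departure time."""
    return now >= leave_at - timedelta(minutes=lead_minutes)


# Example: school pickup at 15:30, a 20-minute drive, traffic 50% slower than usual.
pickup = datetime(2021, 1, 14, 15, 30)
leave_at = time_to_leave(pickup, free_flow_minutes=20, traffic_factor=1.5)
```

In this example the drive takes 30 minutes in traffic, so with the 5-minute buffer `leave_at` is 14:55, and the notification would fire from 14:45 onwards.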
5. Proactive conversation
When we think about Siri or Alexa, what kind of assistance can we expect? They answer our basic Wikipedia questions, call Mom, and get info on the weather. All of this, when they are asked. Being reactive is a significant trait of all personal assistants nowadays. But when we think about the ideal DPA, it doesn’t only know us, it can monitor the surroundings and dive into context (which we have just discussed). Like a child, it learns from your previous experiences and can proactively suggest things.
‘It’s Thursday night, today is the day I drive from the office to the gym. I might need to buy a fresh bottle of water and some energy snacks on my way.’ The ideal assistant will help build the perfect route, demonstrating both the knowledge of my habits and care for my well-being. “Would you like to buy some energy snacks on your way to the gym?” – “Sure I’ll do that, thanks” – “I’ll find the perfect store on our way, so you are not late”.
To be fair, even now digital assistants can demonstrate care for your health and well-being. Apple Watch will warn you about too much surrounding noise or call an ambulance if you fall. But we still can’t call it a proactive conversation. It is a reaction or a rule-based logic. We could honestly ask the systems to notify us about everything, but wouldn’t it be annoying? It would, because most of the time we silence or switch off notifications like this. At the same time, having an artificial assistant talking to us at random times would be at the very least annoying, perhaps even terrifying. So this is not exactly what we mean by proactive conversation skills.
In that case, what is a true proactive conversation with an assistant?
The assistant should speak when spoken to. Without a prompt, it should speak only in scenarios so precise and exceptional that the user is never irritated. A use case such as “Hey Mercedes, I’m bored” could be made proactive by the assistant using the driver monitoring system to know the level of boredom in advance. It could then suggest playing a game or offer to play some music to improve the driver’s condition.
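One way to picture “speaking only in precise and exceptional scenarios” is a gate that every proactive prompt must pass. The sketch below is an assumption for illustration: the boredom score, the thresholds and the `should_speak_up` function are hypothetical and are not taken from any real driver monitoring system.

```python
def should_speak_up(boredom_score, seconds_since_last_prompt, driver_is_busy,
                    min_score=0.8, cooldown_seconds=1800):
    """Gate for unprompted suggestions; all inputs and thresholds are assumed.

    boredom_score is a value between 0 and 1 that a driver monitoring system
    is imagined to provide.
    """
    if driver_is_busy:
        return False  # never interrupt a demanding driving situation
    if seconds_since_last_prompt < cooldown_seconds:
        return False  # spoke up recently: stay quiet to avoid irritating the user
    return boredom_score >= min_score  # only act on a strong, clear signal
```

The high threshold and the long cooldown are the point of the design: the assistant stays silent by default and speaks without a prompt only when the signal is strong and it has not interrupted recently.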
Of course, there are many implications and exceptions to such advanced use cases. The driver could just be thinking about something. Or asking this question could be considered impolite.
It eventually becomes apparent that an all-knowing and predictive personal assistant will have to adopt some kind of personality to be helpful and wanted.
Individualized personal assistants would take the form of a personality type, acting in different ways according to the use case, or always acting in a certain way that the driver likes. For example, a caring mamma, or a professional secretary.
In the case of the caring mamma, the use cases could be planning lunch, checking health stats, warning about speeding, or a reminder to put on your hat today.
For the professional secretary, the use cases could be reminding you to call a regular contact, providing energy consumption stats for your house and submitting expenses.
It can act as a particular type of person.
Even more than just predefining a personality at the development stage, the perfect assistant would eventually learn on its own what it needs to be for its owner. A bit of a far-fetched example, but think about Joi from “Blade Runner 2049”, who was trying to guess how to entertain the main character by changing her looks and outfits. After this stage, there is no need to worry about naming your assistant Jarvis, because it will eventually act like the character with which you wish to interact.
CloudMade has been building components to add context to voice assistants for many years and they are ready for market.
If you are building a new digital assistant for your product at the moment, get in touch to see how CloudMade can help you add context and create a truly personalized experience for all your customers.