Power of Voice Interfaces with Preston So

We are beginning to understand the power of voice interfaces: the power they hold to make our lives easier by responding to the things people normally say every day. We are seeing a digital transition from screen dominance to speech dominance. We are moving from interacting with our fingers to the most fundamental human form of contact, talking, and that is a revolution in the making.

In our season one finale, Episode 6, we speak with Preston So, Senior Director, Product Strategy at Oracle, about his recently launched book "Voice Content and Usability". Let's listen as he unpacks excerpts from his book and discusses upskilling as a voice interface designer, voice interface ethics, and accessibility.

Subscribe to Designwise on Apple Podcasts and Spotify.

If you think about Alexa, Siri, Cortana, a lot of these devices, you think about who the person is that you're drawing in your mind as you talk to this person. And you're generally thinking about a white woman who is potentially an executive assistant or in a secretarial role, which is a very, very sexist way to think about a lot of these voice assistants and really is very, very striking. A very impolite or disrespectful approach to treating these voice interfaces might lead somebody to think that they can do that in real life as well, with somebody who's actually human.

— Preston So

"Voice Content and Usability" is a book that will give you the techniques and insights you need to make voice content tangible—and talkable. Learn from the real-world example of Ask GeorgiaGov, the first-ever Alexa skill for residents of the state of Georgia and one of the earliest content-driven voice interfaces. Get your copy, NOW! Talk to Preston in person and know more about voice interfaces through LinkedIn and Twitter.

Transcript

Priyanka — Hi! Welcome to QED42's podcast, Designwise. I am Priyanka Jeph, and here we are with our sixth episode, with Preston So. Preston could be introduced as someone who sits perfectly at the intersection of design and technology. He is a multilingual speaker who speaks more than 8 languages. He is an editor at A List Apart, a columnist at CMSWire, and a contributor to Smashing Magazine. Preston launched and led the Acquia Labs innovation center and directed voice-driven experiences for clients like the State of Georgia and Nestlé. He wrote the first and only comprehensive guide to decoupled Drupal, which was published in 2018. He has also written "Gatsby: The Definitive Guide", which is all set to launch in November 2021. He is currently with Oracle as Product Strategy Director, and today the prime focus of our conversation will be Preston's recently launched book, "Voice Content and Usability". Let's get right to it, then. Hi Preston, welcome to Designwise. How are you?

Preston — Hey Priyanka. I'm doing very well. Thank you so much for having me here today on designwise, it's such a pleasure to speak with you today and talk about some of these amazing things that we've got on the agenda.

Priyanka — Oh, the pleasure is completely ours. So, Preston, the first and most important thing I would like to ask you today is about what we told the Designwise audience: that you've got this perfect balance between design and technology. So how did you get here, and what is your story?

Preston — That's a great question, Priyanka, and there are a lot of people out there, I think, who can identify with this. Sitting in between the worlds of design and technology is something that I think a lot of us have dealt with, especially those of us who work on the web.

I know a lot of people, not only those who are part of QED42 but also those who listen to Designwise, are involved in web development and web design. I started out as a web and print designer, which means that I actually began my venture into web design and graphic design through the print medium and through the web medium. I actually combined my interest in computer programming, which I started when I was very young, with my experience to begin working on web design.

I've really had the opportunity and the privilege to work on all sorts of different sides of the equation when it comes to web architectures and design architectures, especially on the web. Not only have I had my own independent web design studio, which is no longer in operation (though I do have a new consultancy today that works on voice interface design and things of that nature), but I've also worked for agencies and consultancies, as well as on the platform side of content-driven architectures and on software products and SaaS products.

So my story is very much an interesting journey, because as you can see from the books that you just mentioned, Priyanka, I've worked on books that are directly in the realm of technology, really focusing on things like how people build websites with Drupal or with Gatsby. But "Voice Content and Usability", which just came out last week with A Book Apart, is my first book on design and user experience, and specifically on voice interface design, which is an area that I've been wanting to write about for a very long time.

And I'm very excited about it, not just because it is the first book on voice content strategy and voice content design. It's also the first-ever book on voice interfaces in general from A Book Apart, my publisher. So it's a very exciting topic. And I think one of the things that's really difficult, obviously, is to maintain that equilibrium between the design and technology worlds. I think we all try our best, however.

Priyanka — So in your book, what ground does the content cover? Does it even cover pressing issues like privacy? Statistics reveal that Siri is the most used voice assistant among mobile users. Alexa, being a multipurpose assistant, has millions of users. Then there are Google Home and Microsoft Cortana, and according to a Microsoft study, 41% of voice assistant users are concerned about trust, privacy, and passive listening. So can design help with this concern? Is Alexa hearing all our conversations?

Preston — That's a very good question, Priyanka. Let me start with the case study that really underpins and serves as the foundation for my book, "Voice Content and Usability", because it ties into a lot of the privacy issues that many of us have.

You know, somebody that I follow online, and whom I've had the pleasure of sharing the stage with as well, Sara M. Watson, has written extensively and done a lot of work on privacy in voice assistants and conversational interfaces that operate in voice, and on how it is that we don't pay as much attention to some of these devices, potentially, as we do to devices that are more visual or have screens.

So Ask GeorgiaGov was the very first voice interface for the residents of the state of Georgia here in the United States. It was part of the state of Georgia's efforts to focus on people besides web users: people who might be using voice interfaces like Amazon Alexa, or who might be elderly or members of disabled communities, and who want to be able to access content from the georgia.gov website without necessarily incurring the cognitive burden or some of the barriers that come about when you're using a screen reader or a website, and would rather use a voice interface that they can have a conversation with.

Now, one of the big issues with these voice assistants, of course, is that you can certainly imagine situations where an interface is much more helpful, much more tailored and personalized to the user's requirements or needs at that given moment, based on certain traits you know about the user. For example, knowing their email, their location, their name, or other information that might be personally identifiable information, or PII, could be really useful in helping these voice interfaces conduct transactions on behalf of the user or deliver information or content that is relevant to that user.

However, the privacy concerns are very large. And I think, you know, in addition to the really interesting reporting you mentioned, Priyanka, there's also the fact that there have been instances in which Amazon Alexa has been found to actually be recording conversations that it's not supposed to be privy to, that it's not supposed to have access to.

And one of the decisions that we made at the very beginning of the Ask GeorgiaGov project for the state of Georgia was to say, okay, we're not going to collect any information. It's possible. You know, you certainly could do it. It's feasible. But because of the really big concerns and uncertainty around privacy, and the very strong risks that a lot of these voice assistants carry when you're potentially handing a lot of this data over to a large corporation and you don't know where it's going, we opted to not collect location, not collect email addresses, and not collect any information that's identifiable about that particular resident or user, even though we could potentially help them in a much more personalized fashion if we had.

But I think this really comes into question when you think about all of the privacy regulations that are now coming about, not just in Europe and the United States, but all over the world.

We now have GDPR. We now have HIPAA. You know, there are so many new approaches that I think are very important to keep in mind. We've just finished this long process of understanding how privacy really impacts the user experience on a website. Now we're just about to begin the journey of understanding those things in the context of a voice interface.

Priyanka — So what I've understood from everything you've told us is that it's up to a good business, a designer, or a developer to decide how much information they would like to record in order to be able to help the users. It's something that can be achieved. Right. So the next question is back to basics, and it's something our Designwise audience would definitely like to hear about from you: what exactly is voice UX, and what is the difference between designing for voice UX and designing for on-screen experiences?

Preston — Sure, and you really captured it right there with that last little bit, Priyanka. The thing I will say is that voice user experiences, voice user interfaces, and voice interface design, in terms of how we think about the universe of user experience and the world of design today, are a very, very different island. Voice is really on its own kind of continent over here, and one of the reasons for that is exactly as you mentioned, Priyanka: there's a very big difference between voice and the visual or physical mediums that we work in as designers, like screens and mobile devices and wearable technology and televisions nowadays, and the physical interfaces that we use on a daily basis as well, like computer keyboards and video game controllers. All of these things that mediate our human-computer interaction are primarily tactile or visually rooted interfaces.

Of course, voice user interfaces are very different. Unlike written conversational interfaces like chatbots, text bots, or WhatsApp Messenger bots, these voice interfaces, especially the pure voice interfaces that do not have any screen or visual component whatsoever, are really unique. As opposed to having something that we can touch, see, point at, or click, the interaction is mediated entirely through the realm of speech, which means that all of the interactions that we have as users have to take place along the dimension of time and not in a dimension of space, as Erika Hall notes in her book, "Conversational Design".

One of the most important aspects of pure voice interfaces, however, is also, well, context. What that means is that you really don't have the capability to give a user a visual rendition of a mental model, something that's memorable and works for them. For example, I can't give a user a site map. I can't give a user a navigation bar. I can't give a user breadcrumb links. I can't even give a user a link itself, because there's no way to color text blue and underline it in a voice interface.

So voice UX is a very, very thrilling but also very challenging area, because we have to really remove ourselves as designers and user experience practitioners from the entirety of the world that we've been operating in over the last 50 to 100 years when it comes to these manual and visual interfaces, and move towards more of these human, organic, and conversational approaches that really distinguish voice interfaces from the others that we work with.

Priyanka — Absolutely. So another thing related to the same topic that our audience would love to know is: how do you onboard a new designer, moving from on-screen to voice UX? How do they upskill themselves?

Preston — Absolutely, Priyanka. So what I'll say is this, for those who are looking into getting into voice interface design, especially those who are already working in web design and user experience design: I actually just had a question yesterday at a talk I gave, from somebody asking, oh, should I do XYZ first, is there some kind of foundation I need? But the reality today is that, unlike 20 or 30 years ago, when designing voice interfaces really required you to have a computational linguistics degree or a very deep understanding of computer science, there are now all sorts of low-code or no-code platforms emerging that allow those who are much more comfortable in tools like Photoshop to use a visual tool to build a voice interface or a conversational interface.

Now, what that means with these what-you-see-is-what-you-get, or WYSIWYG, tools is that nowadays no designer needs to learn a technology, or learn the particularities of standards like VoiceXML or SSML, to really be able to build a chatbot or a voice interface. That makes this a very compelling new industry for those who are already very well versed in design, and also for those who may not have really gotten into web design.

My book, "Voice Content and Usability", actually very much focuses on this latter trend that's been happening over the past few years, for a lot of these conversation design tools to not demand any technical approaches and to be more agnostic to them. That means my book doesn't focus on a single technology or a single approach, because all of the principles that I talk about for voice interface design, like flow diagrams, dialogue writing, or usability testing, apply to all voice interfaces, regardless of what technology platform you're building on.

So my biggest advice is to find a use case, find something that's interesting, upskill your current design toolbox, and try out some of the tools that are out there, like Dialogflow, Botsociety, or Oracle Digital Assistant, that might give you the ability to design a voice bot or a chatbot without necessarily writing any code whatsoever.

Priyanka — Right. So this is an in-between question about the book. I mean, it comes out of what you said about all these tools and techniques that designers can practice with. When you thought of this topic and decided you had to write about it, which was mainly because you've worked with this technology for a very long time, how much time did you take to write the book, and what was your research process?

Preston — My process when writing books is very unusual, and I think this is one of those things that is a little bit tricky: I really don't plan things out very far in advance in terms of the actual things I want to talk about or write. A lot of it is very stream of consciousness. This is how I write all of my articles and my other work. I do usually have some sort of outline at the beginning, but the book itself really came together through a general sense of what I knew I wanted to talk about in every chapter. You know, basically: okay, chapter one is going to be about voice content. Chapter two is going to be about actually taking that voice content into voice. Chapters three and four are going to be about dialogues and flows, respectively. Chapter five is about launching your voice content, and chapter six is about the future, or the outlook ahead.

And, you know, I think one of the challenges, of course, is that it was during a pandemic, an ongoing pandemic, of course. And I do want to make sure to hold space for all the Designwise listeners who are still dealing with this ongoing struggle of ours. But the general approach that I took with the book was just to devote as much time as I could outside of my normal workday to really thinking about these ideas and how I wanted to delve into them. It's really tough, I think, when you're writing a book of this size, to focus on the narrow perspective that an individual chapter has, as opposed to thinking about the big picture and how you want the entire book to go, because you can very quickly lose the forest for the trees, or vice versa, when you write a book. And I know that you know what that's like as the host of a podcast as well.

Priyanka — That's it. I think there are a lot of people out there who have the talent and want to write; it's just that something stops them. It's probably fear. So this question was really important for them to hear. Right. So what are the things that most people don't know about voice design? I mean, are there any lesser-known insights that you discovered while you were working with voice design technology?

Preston — You know, it's really interesting you say that, because I definitely think there are some insights here with voice interface design that are really worth keeping in mind. The first is that, you know, I think a lot of people think that nowadays, especially with all of these platforms emerging, with so many new techniques and so many people getting involved with voice interface design, it's kind of a mature landscape, that it's very much as well developed as the web. But nothing could be further from the truth in that regard, because voice interface design is still a very, very early-stage field.

I'll share one example of this from the case study in the book, Ask GeorgiaGov for the state of Georgia. One of the biggest issues that we faced was actually not even the fault of our design. You know, I think one of the things that a lot of web developers and web designers remember is that back in the early or mid-2000s, there were a lot of issues with how browsers would use the code that we wrote in order to display certain things through CSS. There were a lot of issues with things like quirks mode, compatibility, box model hacks, all sorts of things that a lot of us know from working on the web back in those days. Nowadays, of course, the web is so mature that you don't have to worry about any of those things. We're still in those early-stage years when it comes to voice interface design.

One example of this: we worked very hard on the voice interface itself and the application itself, which ultimately ended up being an installable Alexa skill for your Amazon Alexa. And there was one result that kept on popping up in the logs and analytics and reports that we built for the Georgia.gov editorial team.
And that was a result that kept on coming up over and over again in these 404 errors. Now, just to give the listeners a sense of what this actually does: every single time you interact with the Ask GeorgiaGov voice interface, you're really conducting a search across all of the frequently asked questions content that's available on georgia.gov. So you can ask things like: How do I renew my driver's license? How do I get a fishing license? How do I register to become a new... how do I register for a small business loan? How do I register to vote? Obviously very important topics.

Now, one of the queries that kept on coming up returned an error. So basically there were no search results returned, and the user wasn't able to find any information or content relevant to the topic. It was this particular word, this keyword, that kept on coming up in the logs as "Lawson's", as in L-A-W-S-O-N, apostrophe, S. We had a retrospective about eight months after the Alexa skill went live and people started using it, and we found that this really strange word kept on coming up. Why is somebody trying to search for this thing that has absolutely no relationship to anything in the state government of Georgia?

We thought about it for a long time. And then suddenly one of the native Georgians in the room, who has lived in Georgia all her life, kind of perked up and said, oh, you know what? I think that might be somebody who's trying to say, like, "driver's license", but in a very Southern or Georgian dialect or accent. And, you know, this is not actually the fault of the application, right? Because we did usability studies. We did all sorts of research to make sure that it would go well as soon as we launched it. But this is actually an example of where Alexa fell apart, or where Amazon fell apart, because the underlying natural language understanding mechanism that Alexa was using was not developed enough yet to be able to hear American accents that are not the same as the American accents it was trained on. And if you think about all of the different dialects that people have in English all over the world, you can see how this can become a very big problem very quickly.

Now, the other big thing, which didn't really come as a surprise to me but I think will come as a surprise to a lot of the people in the Designwise audience and the audience for this book, is something we often forget about in the world of accessibility: voice user interfaces are really intended to accelerate interactions with content, or with tasks and transactions that you need to perform. But on the web, one of the big problems and disadvantages of the screen reader, which we use all the time to do this sort of work, is that it is fundamentally visually rooted. That means screen readers rely on the visual structures of websites in order for people who use them, disabled people, to be able to actually interact with the UI elements or the text that's on a page.

There's a voice interface designer named Chris Maury who has written extensively about this, and he says, you know, I never really understood why people built screen readers this way, where they rely on the approach of the web design in order for content to be delivered in a way that makes sense to the screen reader user. It should be the other way around. There should be a voice interface that allows for a much more efficient interaction. And being a blind person himself, he has a good amount of insight into why people should really think about the fact that, well, screen readers are kind of the normal or default way that many of us think about web accessibility, but they're actually still not the optimal or ideal experience for disabled users who really want to be able to interact with a voice interface to access that content.

So I think those two insights were both very interesting to me. You know, voice interfaces are not as ready for prime time as we think; they're not quite ready to beat us at our own game of conversation. At the same time, there are also areas where voice interfaces can far outperform some of the things that we already have that are meant as accessibility solutions for disabled communities. So I think both of those insights might come as a bit of a surprise to some of the folks who are going to read my book and also listen to Designwise.