May 23-25, 2016 | Omni Shoreham Washington DC

Wednesday, May 25, 2016

Sunrise Discussions

SD301 - Emotional Intelligence-Driven Voice User Interface Design

8:00 a.m. - 8:45 a.m.
Michael Mendelson, Senior Technical Lead, PTP

Emotional Intelligence (EI) is the ability to recognize emotions, to discriminate between different feelings, to label them appropriately, and to use emotional information to guide thinking and behavior, according to Daniel Goleman, a psychologist and writer. By applying general themes from EI to the VUI design process, designers can ameliorate users’ negative experiences by anticipating their emotional reactions and addressing them either subtly or head-on. Common EI themes such as empathy, optimism, adaptability, gaining trust, and managing conflict can be infused into the design to help negotiate specific pain points in speech interfaces.

SD302 - Challenges and Solutions of Identity & Voice

8:00 a.m. - 8:45 a.m.
Mary Constance Parks, Principal Experience Designer, Automation and Control Solutions, Honeywell

Researchers have shown that we dislike inconsistencies in others’ identities and that we like those whose identities are similar to our own. This poses two challenges. One is the need to create an experience that promotes the perception of a consistent identity. The other is that people may dislike a voice experience because the identity doesn’t match their own. What to do? This session talks about these challenges and possible solutions.

SD303 - The Danger of Standards

8:00 a.m. - 8:45 a.m.
Carrie Claiborn, Senior VUI Architect, Interactions

User-centric design is built on the premise that each user group and context of use is unique. Does adherence to standards limit how centered an experience can be? Standards can be seen as codifying existing knowledge to prevent us from reinventing the wheel with each design, but may also limit innovation. Join us for a rousing discussion of the benefits and possible dangers of standards for voice interaction design.

SD304 - Overview of Speech Recognition Applications to Support Air Traffic Management

8:00 a.m. - 8:45 a.m.

Robert Tarakan, Senior Principal Engineer, Center for Advanced Aviation System Development, The MITRE Corporation

Controller-pilot voice communications are critical to the Air Traffic Management system but are an underutilized source of information for improving aviation safety and efficiency. Automatic Speech Recognition (ASR) provides a means of mining the voice communications for information of interest for use in real time or post-event analysis. The success of applying ASR technology to improve aviation depends on taking advantage of domain-specific opportunities and overcoming domain-specific challenges. This session highlights the opportunities and challenges.


Driven to Distraction? Speech Technologies in the Automobile

9:00 a.m. - 10:00 a.m.
Prof. Richard Young, Research Professor, School of Medicine & College of Engineering, Wayne State University, Detroit, MI, USA / President, Driving Safety Consulting, LLC

Driver distraction is a persistent problem that can result in dire consequences, but new interactive technologies continue to be introduced into automobiles, giving drivers access to a wide range of information and entertainment options. Perhaps more significantly, drivers bring a variety of connected mobile devices into the car and are increasingly unwilling to disengage in the car. Speech technologies hold the promise of providing a safer interface by enabling eyes-free, hands-free interactions. However, the safety record for speech technologies in the car is still very much in question. Join us for a critical evaluation of the research, a discussion of its limitations, and an exploration of what is still unknown about the use of speech technologies in the car.


10:00 a.m. - 10:45 a.m.


A301 - New Applications for Using Voice, Part 1

10:45 a.m. - 11:30 a.m.

Voice-Enabled Remote TV Control

Moderator: Dr. William Meisel, President, TMA Associates
Jeanine Heck, Senior Director, Product Development at Comcast Technology and Product, Comcast

What are the challenges of developing and deploying a voice-enabled remote control for a television, and how do you meet these challenges? How will it affect user behavior? How can the remote control device accommodate the ever-increasing number of functions available on televisions, including telephone/Skype and smart home devices? What is the relationship between the TV remote control device and other devices, such as smartphones and automobiles?

Voice Capture of Agricultural Data

Moderator: Dr. William Meisel, President, TMA Associates
Bruce Rasa, CEO, AgVoice
Bruce Balentine, Chief Scientist Emeritus, Enterprise Integration Group

Field notes are essential for agronomists, farmers, and other inspectors in the data-intensive world of agriculture. But the outdoor environment is harsh, and users’ hands are busy manipulating soil, samples, and tools. In this session we discuss the technology and human factors challenges for AgVoice, a fully hands-free, voice data entry application.

A302 - New Applications for Using Voice, Part 2

11:45 a.m. - 12:30 p.m.

Intelligent Mobile Apps in the Enterprise

Moderator: Dr. William Meisel, President, TMA Associates
Vinay Dwivedi, Director Product Management, Product Management, Emerging Tech, UX, Oracle

We speak three times faster than we type. So why are we constantly filling out forms and navigating complex menus to get to the information we need? This presentation is about the productivity revolution that speech technology and intelligent virtual assistants are bringing to enterprise users, from sales teams looking to work with their sales data on-the-go, to employees looking to enter expenses or time off from anywhere. The presentation includes demos, lessons learned, challenges faced, and insights into the next generation of enterprise virtual assistants.

Speech-to-Speech Mobile Translation for Medical Triage

Moderator: Dr. William Meisel, President, TMA Associates
Adam Sutherland, CEO, AppTek

Municipal and health workers are often faced with a lack of adequate interpreter support or advanced language skills to effectively communicate and provide aid to diverse populations. AppTek has developed and deployed a speech-to-speech mobile translation application to the New Jersey Health Officer’s Association to provide medical triage to patients speaking different languages.


12:30 p.m. - 1:45 p.m.

A303 - How the Google Wallet App Leverages NLU

1:45 p.m. - 2:30 p.m.
Dr. Erhan Onal, Speech IVR Developer, Google
Nandini Stocker, Sr. Product Design Manager, FlipKart

Running on both Android and iOS devices, the Google Wallet app is a fast, free way to send, request, receive, and use your money. Learn how this natural language understanding (NLU) app was designed, implemented, and deployed. Learn how utterances were collected and tagged to create a statistical language model. This session about Google’s NLU IVR application is designed to benefit anyone who wants to gain expert knowledge on natural language understanding and learn useful tips for creating one.

A304 - Talking Toys: Technology & Outlook

2:45 p.m. - 3:30 p.m.
Amy Stapleton, Analyst, Opus Research / Founder, Hutch.AI

There is a growing trend among both traditional toy manufacturers and startups to develop sophisticated conversational toys that are connected to cloud-based information systems. This presentation explores techniques used by today’s toy makers to create talking toys that are engaging and educational and leverage existing speech recognition, natural language processing, and dialogue scripting technologies. The presentation also explores challenges facing talking toys, from both a technical and a social perspective.


B301 - The Art and Architecture of Focused Prompting

10:45 a.m. - 11:30 a.m.
James Giangola, Creative Lead, Conversation & Persona Design, Google

Call flows and detailed design specifications are essential to building successful speech solutions, but are suboptimal for writing dialogue that accounts for users’ unconscious expectations of how information is naturally structured. The result is often an awkward and unfamiliar listening comprehension task at odds with users’ inherent evolutionary, neurological, cognitive, and developmental makeup. This session demonstrates how to leverage the theory of information structure to optimize users’ comfort and comprehension, while minimizing cognitive load.

B302 - Bridging the Gap to the Sci-Fi Promise

11:45 a.m. - 12:30 p.m.
Nandini Stocker, Sr. Product Design Manager, FlipKart

From Star Trek’s computer to Jarvis and Samantha, science fiction has promised us a world integrated via seamless voice interactions, in which speech technology is synonymous with artificial intelligence. Yet our current reality is filled with less-than-brilliant voice interactions that we love to hate. Will we ever fulfill the sci-fi promise? What’s slowing us down? This multimedia presentation explores these questions and identifies how the speech industry is hindering the possibility of realizing this fantastical future.


12:30 p.m. - 1:45 p.m.

B303 - Designing the Conversation

1:45 p.m. - 2:30 p.m.
Aaron Gustafson, Web Standards Advocate, Microsoft

Users are gradually becoming more accustomed to and reliant on voice-based interactions, so enabling users to complete critical tasks without a visual user interface is crucial for the long-term success of websites. This session shows how designing such a “headless” user interface is equivalent to designing the conversation you want to have with your users. Learn how to ensure that the technological decisions you make with respect to HTML, CSS, and JavaScript respect and support that conversation.

B304 - PANEL: The Future of Conversational User Interfaces

2:45 p.m. - 3:30 p.m.
Moderator: Roberto Sicconi, CTO, TeleLingo / CSO at Infermedica
Jonathan Bloom, Voice User Interface Designer, Jibo, Inc.
Dr. Ahmed Bouzid, Co-founder & President, The Ubiquitous Voice Society
Crispin Reedy, Director, User Experience, Versay Solutions

As virtual assistants and robots proliferate and improve their communication capabilities, conversations will evolve into hybrid communications between human and nonhuman actors. Panelists discuss the elements required to attain a truly conversational user interface, such as the ability to listen in to conversations, understand the topics, and correctly decide when to chime in politely with relevant information. Will technologies such as deep learning, dialogue management, reasoning, emotion detection, and synthesis allow us to achieve this reality?


C301 - An Examination of Speech-Enabled Technologies in the Car

10:45 a.m. - 11:30 a.m.
Dr. Juan E. Gilbert, Andrew Banks Family Preeminence Endowed Professor & Chairman, Computer & Information Science & Engineering Department, University of Florida

This presentation details results of studies using speech to reduce driver distraction. Gilbert’s team has researched the use of speech to send and receive messages, get vehicle information, and communicate with others in and outside the vehicle. The primary goal of this research is to determine if speech can be used in a true hands-free, eyes-free manner while driving without increasing distraction. Gilbert presents innovative uses of speech in the vehicle and results from driver distraction research.

C302 - Car Makers Challenge NHTSA Driver Distraction Guidelines

11:45 a.m. - 12:30 p.m.
Tom Schalk, Vice President of Voice Technology, Sirius XM

Consensus on how to best mitigate driver distraction is rapidly evolving. The National Highway Traffic Safety Administration (NHTSA) is soon expected to announce driver distraction guidelines for portable devices used within the vehicle. However, NHTSA guidelines may be de-emphasized due to proposed changes in how car manufacturers validate the safety of in-vehicle multimodal interfaces, including speech and gesture. Learn about proposed methods for assessing driver distraction and about the test to measure distraction levels for any human-machine interface.


12:30 p.m. - 1:45 p.m.

C303 - Innovations in Monitoring Driver Attention

1:45 p.m. - 2:30 p.m.
Roberto Sicconi, CTO, TeleLingo / CSO at Infermedica

Humans are poor judges of their own ability to drive safely, particularly under stressful conditions. LingoFit is a mobile app designed to monitor the attention of a driver and compare it with the level of attention required by traffic conditions, weather, speed, proximity of obstacles, driving skills and experience, and drowsiness, among other factors. LingoFit provides smart, context-aware, and nonjudgmental feedback about drivers’ attentiveness when margins approach unsafe levels to help eliminate distracted driving.

C304 - PANEL: Designing for Drivers

2:45 p.m. - 3:30 p.m.
Moderator: Jenni McKienzie, Voice Interaction Designer, SpeechUsability
Lisa Falkson, Senior VUI/UX Designer, CloudCar
Alexandre Francis, Senior User Experience Designer, Genesys

Designing the voice user interface of an in-vehicle speech application obviously has intrinsic requirements that are different from those of customer-service IVR applications. Panelists discuss strategies for creating an effective user experience for systems designed to be used while driving, which may include both graphical user interfaces that are designed to reduce glance time and a simple and straightforward voice user interface. Join us to debate guidelines and best practices for multimodal in-car infotainment systems.



D301 - FIDO & You: How FIDO Standards Enhance Security & Protect Against Threats

10:45 a.m. - 11:30 a.m.
Paul Grassi, Senior Standards & Technology Advisor, Applied Cybersecurity Division, NIST

The FIDO (Fast Identity Online) Alliance’s focus on strong authentication is coming of age just when online security needs it the most to protect against threats. In this session, hear FIDO examine the current threat landscape and how FIDO is helping organizations to enable simple, strong authentication for their customers through biometrics such as speech recognition. Attendees leave knowing more about the FIDO Alliance, its steps in reducing fraud, what’s to come, and how to get involved.

D302 - PANEL: The Future of Speech Standards

11:45 a.m. - 12:30 p.m.
Moderator: Bruce Pollock, Vice President, Strategic Growth and Planning, West Interactive
Dr. Deborah Dahl, Principal, Conversational Technologies
Dr. Daniel C Burnett, President, StandardsPlay
Brian Susko, Vice President, Software Engineering, True Image Interactive

Which emerging standards, such as WebRTC, SCXML, and discovery and registration of multimodal modality components, should SpeechTEK attendees be aware of? What new standards and extensions for existing standards are needed to accelerate the development of new applications using speech technologies? Which standards would enable virtual agents to communicate with one another? What new speech standards are needed, such as statistical language models or JavaScript APIs in the browser? Which standards organizations should be involved? How can standards accommodate advances in spoken dialogue technology, such as statistical dialogue management or incremental speech processing?


12:30 p.m. - 1:45 p.m.

D303 - PANEL: The AVIxD Design Guidelines

1:45 p.m. - 2:30 p.m.
Moderator: Jim Milroy, Human Factors Solutions Consultant, West Interactive Services
Kristie Goss-Flenord, Consultant, Human Factors, Convergys
James R. Lewis, Senior Human Factors Engineer, IBM Corporation AVIxD
Jenni McKienzie, Voice Interaction Designer, SpeechUsability

The Association for Voice Interaction Design (AVIxD) has recently published a set of guidelines for voice user interface design in the IVR context. Experienced practitioners compiled guidelines that represent the collective best practices knowledge of the voice design community. Join this session for a discussion of the process of creating and vetting the guidelines, demonstrations of how to use guidelines to support design decisions, and considerations for future enhancements to the guidelines.

D304 - Knowing What Not to Do

2:45 p.m. - 3:30 p.m.
Kristie Goss-Flenord, Consultant, Human Factors, Convergys

Guidelines for how to design voice interactions are becoming readily available, but even the best of these guidelines lacks recommendations on what not to do. This presentation fills that gap by presenting design techniques that simply don’t work in specific scenarios and discussing why each approach has failed miserably. The audience is invited to recommend the best possible solution for the scenario and share their own “worst practices.”

