May 23-25, 2016 | Omni Shoreham Washington DC


Thursday, May 26, 2016

SpeechTEK University

STKU-4 - Deep Neural Networks in Speech Recognition

9:00 a.m. - 12:00 p.m.
David L Thomson, VP Speech Research, Interactions, LLC

Deep neural nets (DNNs) have transformed speech recognition technology, offering a new level of accuracy. Neural networks showed great promise in the mid-1990s, but results proved disappointing and the approach was all but dead until new DNN methods began to power breakthroughs in performance and training speed. This tutorial explains how DNNs work and why they are so much better than previous methods. We review the latest techniques from several research centers and explain how much accuracy improves with each method. The session includes live demos and practice exercises. Participants are encouraged to bring a smartphone or laptop for part of the training. Attend this session to gain an understanding of DNN fundamentals, how DNNs fit into speech recognition systems, the role of open source tools, and where the next advances are expected to appear.

STKU-5 - Technological and Design Challenges of Multimodal User Interfaces

9:00 a.m. - 12:00 p.m.
Dr. Nava A. Shaked, CEO, Brit Business Technologies Ltd (BBT); Engineering Faculty, Holon Institute of Technology

Mobile devices provide powerful computation and multiple sensors that enable users to perform tasks combining the modalities of voice, text, gesture, touch, typing, and more. A multimodal user interface requires the integration of several recognition technologies together with sophisticated user interfaces for data input and output. The workshop discusses both the technology and usability aspects of interactive multimodal user interfaces. Learn which technologies are used in various types of user-device interactions, as well as the available multimodal architectures and their design challenges. We discuss integration and deployment issues and give examples of real-world systems, including automotive, wearables, and smartphones.

STKU-6 - Usability Reboot - This workshop has been canceled.

1:00 p.m. - 4:00 p.m.
Susan L. Hura, Principal, SpeechUsability
Jenni McKienzie, Voice Interaction Designer, SpeechUsability

The role of IVR systems has changed dramatically in the past decade, and speech technologies are now being deployed in mobile apps, home automation, entertainment, and automotive contexts. However, many organizations rely on the same usability testing methodology they’ve used for years. This workshop demonstrates why it’s vital to reboot your usability program to ensure that you’re collecting user data that’s relevant to new contexts and actionable across platforms. Learn new techniques for recruiting test participants, selecting scenarios, and collecting data in, and out of, the usability lab. Attendees also plan and run a live usability test to deepen their new skills.

STKU-7 - Designing for Spoken Natural Language Understanding & Dialogue

1:00 p.m. - 4:00 p.m.
David Attwater, Senior Scientist, Enterprise Integration Group

Spoken natural language interfaces are becoming ubiquitous. They can be found in IVRs, in smartphone apps, and on set-top boxes and gaming devices. This session introduces attendees to the underlying technologies used for such interfaces and addresses design issues associated with spoken natural dialogue. Emphasis is placed on the conversational techniques required to build successful natural spoken interfaces, including building robust NLU grammars, designing semantic schemas, scripting responses, and dealing with ambiguous responses. People interested in natural language for customer service applications should find this session particularly relevant, but the techniques addressed apply equally to the emerging fields of smartphone intelligent assistants and other device-oriented applications.

STKU-8 - Developing Multimodal Applications for New Platforms

1:00 p.m. - 4:00 p.m.
Dr. Deborah Dahl, Principal, Conversational Technologies

Multimodal interfaces, combining speech, graphics, and sensor input, are becoming increasingly important for interaction with the rapidly expanding variety of nontraditional platforms, including mobile devices, wearables, robots, and devices in the Internet of Things. User interfaces on these platforms will need to be much more varied than traditional user interfaces. We demonstrate how to develop multimodal clients using standards such as WebRTC, Web Audio, and WebSockets, and the Open Web Platform, including open technologies such as HTML5, JavaScript, and CSS. We also discuss integration with cloud resources for technologies such as speech recognition and natural language understanding. Attendees should have access to a browser that supports the Open Web Platform standards, for example, the current versions of Chrome, Firefox, or Opera. Basic knowledge of HTML5 and JavaScript would be very helpful.
