SpeechTEK 2008 - Final Program
August 18-20, 2008 • New York Marriott Marquis • New York, NY
|
SUNRISE SEMESTER
|
Discussion 3 – Speaker Identification and Verification
8:00 a.m - 8:50 a.m
Dr. Judith Markowitz, President - J. Markowitz Consultants
|
An open meeting of the VoiceXML Forum's Speaker Biometrics Committee,
which is working on extending VoiceXML to include Speaker
Identification and Verification (SIV). Anyone interested in SIV is
welcome to attend. Discussion topics will include an update from the
VoiceXML Forum's Speaker Biometrics Committee followed by a Q&A.
|
|
Discussion 4 – It Ain't Shakespeare ... Or Is It?
8:00 a.m - 8:50 a.m
Alexandra Auckland, Voice Interaction Designer - Sotto Voce Consulting
Dr. Melanie Polkosky, Human Factors Psychologist/Consultant - IBM
|
Like playwrights, VUI designers write dialogue that is meant to be
spoken out loud. Yet recent research reveals that fewer than 10% of
designers in the field have an educational background in voice
performance, acting, coaching, voice pathology, or pedagogy. In this
lively session, we will share techniques about what to look for in
auditioning talent, how to write dialogue that translates well to the
spoken language, and how to effectively coach a voice actor so the
resulting application becomes more effective, natural, and usable.
|
|
Discussion 5 – W3C Multimodal Working Group
8:00 a.m - 8:50 a.m
Dr. Deborah Dahl, Principal - Conversational Technologies
|
Increasingly powerful mobile devices, along with improvements in
displays and speech technologies, are making possible innovative,
compelling, and robust applications that allow users to combine speech,
graphics, ink, and motion. Emerging standards for multimodal
interaction support and simplify the development and deployment of
these applications. Please join us for a discussion of current and
planned work in the World Wide Web Consortium's Multimodal Interaction
Working Group and learn how to get involved in the W3C standards process.
|
|
|
TRACK A: MEETING BUSINESS GOALS WITH SPEECH TECHNOLOGY
|
Preparing for the New Consumer
9:00 a.m - 10:00 a.m
Mr Lior Arussy, President - Strativity Group
 |
Despite technological innovations, companies are failing to meet increasing customer expectations. To stay competitive, organizations must meet customers’ growing demands by incorporating new ways of thinking, connecting, and doing business that will create profitable and delightful customer experiences. Lior Arussy is a renowned author; business visionary; creative catalyst; and president of Strativity Group, a consultancy which advises Global 2000 companies and emerging businesses around the world. Arussy will address the biggest customer strategy issues facing organizations today, such as the new rules of customer engagement, and offer suggestions on how to better prepare for the new consumer.
|
|
Break in the Exhibit Hall (Exhibit hall opens)
10:00 a.m - 10:45 a.m
|
A201 – Speech: Thinking Out of the Box
10:45 a.m - 11:30 a.m
Dr. Judith Markowitz, President - J. Markowitz Consultants
Whitney Quesenbery, Consultant/Researcher - WQ Usability
Steve Chirokas, Executive Director, Marketing - VoltDelta
|
What are the bleeding-edge possibilities for speech technology? In this panel we take three steps back and reexamine speech from the outside to get a new perspective on where we stand today and where we could be tomorrow. Join us for a forward-thinking conversation about how speech can be an empowering technology for tomorrow’s automated, mobile interfaces.
|
A202 – Video in the Call Center and Network
11:45 a.m - 12:30 p.m
MODERATOR: Dr. Valentine Matula, Director Multimedia Research - Avaya
|
Is there a place for video alongside your speech solution? Experts in this session make the case for how video technology offers new opportunities in the call center and next-generation networks. Come and learn how video will change the ways we approach self-service, and the new business opportunities video will enable.
Video-Based Call Centers and Self Service Applications: Is There an Untouched Need Out There?
Bob Cooper, CEO - Swampfox Inc.
How will the customer experience change when video-based call centers and self-service applications become available? What new services that we are not considering today will become commonplace and what do they mean to your call center? This session will look into a few scenarios and discuss the technical hurdles that must be overcome. You will also learn which international locales are best prepared to deploy these technologies.
|
Opportunities for Video in the Next-Generation Network
Rob Marchand, Senior Director, Product Management, Genesys Telecommunications - Alcatel-Lucent
This session will discuss opportunities for the deployment of speech-enabled video applications in the enhanced-services and managed-service environments. The discussion will include overviews of real-world deployments, plus a review of the challenges that remain. Topics will cover architecture, applications, and standards, including SIP and VoiceXML.
|
|
Attendee Lunch Sponsored by Tellme - Remember Me: A Case Study on Personalized Self-Service
12:30 p.m - 1:45 p.m
Tracy Griffith, Sr. Manager Customer Technology Reservations & Premium Services - American Airlines
|
In 2007, American Airlines launched an industry-leading telephone service for its frequent fliers to deliver proactive information based on the callers' upcoming needs. A year later, find out how American Airlines did it and whether the investment has paid off - both for travelers and for American Airlines. This Customer Case Study will be presented as part of the conference lunch sponsored by Tellme on Tuesday, August 19. AAdvantage members can enroll in "Remember Me" with the instructions at www.aa.com/callaa.
|
|
A203 – Cross-Channel User Experience
1:45 p.m - 2:30 p.m
MODERATOR: Eduardo Olvera, Senior User Interface Designer, Professional Services - Nuance Communications
|
The days of having a speech strategy separate from an overall user experience strategy are over. For a strategy to be meaningful, organizations need to consider speech technology as one element in a multichannel customer contact plan that encompasses all of the ways in which customers might interact. This session will provide valuable lessons in understanding how speech fits in with the Web, printed material, live customer service, and other channels through which organizations communicate with their customers.
Is Your Caller Experience Web-Aware?
Dr. Lizanne Kaiser, Sr. Principal Business Consultant - Alcatel-Lucent / Genesys
The rapid expansion and penetration of the Web has dramatically changed customer behavior not only online, but also during phone calls. Despite this, automated phone systems are often still designed as siloed channels. Speech systems that lack Web awareness provide incoherent customer experiences, damaging customer loyalty and business benefits. Drawing from client case studies, this presentation will explore the impact of the Web on customer behavior and expectations, metrics for assessing the Web-awareness of your caller experience, design guidelines for creating Web-aware speech systems, and predictions about the evolving convergence of the Web and phone, along with the ramifications for speech technology.
|
Speech and the Cross-Channel Customer Experience: Leveraging Speech to Improve Branding, Customer Satisfaction, and Self-Service Rates
Elaine Cascio, Vice President - Vanguard Communications Corp
Customers conduct business with companies through many different channels, but the experience is often disjointed and inconsistent. This creates customer confusion and frustration or, even worse, abandonment for the competition. Dissatisfied customers will also tell at least five others about their experience. In this session, you’ll learn how your speech application is a critical component to building a positive cross-channel customer experience. We will examine how speech can contribute to the overall multimedia brand experience and build customer loyalty, plus we’ll look at real-life examples of organizations that provide seamless cross-channel service (as well as some that don’t). Finally, we will outline the critical steps you should take to ensure your customer has an engaging, seamless, and positive experience.
|
|
A204 – Speech in CPG/Retail
2:45 p.m - 3:30 p.m
MODERATOR: Sunil Issar, Director, Architecture - Convergys Corporation
|
Consumer packaged goods companies and retailers rely on speech technology in a variety of ways. This session will focus on two: warehousing and customer self-service solutions. One presentation will feature Coca-Cola’s warehouse voice-picking system, which, based on inexpensive IP phones and network-based speech recognition, has yielded significant cost reductions. The second presentation will explain how human guides, working in the background, have improved service for a retail self-service application.
|
VoIP Creates New Approach to Speech-Based Warehousing
Michael Jacks, Senior Manager, Logistics & Transportation - Coca-Cola Enterprises
Coca-Cola Enterprises created the first-ever deployment of warehouse voice-picking based on inexpensive IP phones and network-based speech recognition. The results? Significant reductions in per-device costs and total cost of ownership. This presentation will provide insights into what drove this innovation, the results achieved, and future plans to leverage speech capabilities on a services-oriented architecture (SOA) basis throughout the company.
|
Guided Self-Service Solution for Call Centers
Tom Scott, CIO & Sr. VP Operations - Spiegel Brands
The Spiegel call center has replaced the traditional menu-driven touchtone approach to receiving and routing calls with guided self-service. This session takes an up-close look at the new solution, which enables a behind-the-scenes human guide to monitor and assist four or more simultaneous calls, resulting in a better caller experience with reduced ordering and customer service costs.
|
|
Break in the Exhibit Hall
3:30 p.m - 4:15 p.m
|
A205 – The VUI Backlash
4:15 p.m - 5:15 p.m
MODERATOR: Dr. Lizanne Kaiser, Sr. Principal Business Consultant - Alcatel-Lucent / Genesys
|
Touchtone IVRs used to be the technology people loved to hate—an object of ridicule in popular culture. Speech recognition was billed as the solution to the ills of touchtone, but now there are rumblings that speech-enabled IVRs are ready for the cultural hall of shame, too. In this session we will examine what the general public really thinks about speech, and why the technology hasn’t lived up to the initial hype. The speakers will also discuss the reasons for the public’s dissatisfaction and ways to alleviate their distress.
Why People Really Hate IVRs and What to Do About It
Dr. Ahmed Bouzid, Senior Director of Product Management - Angel.com Incorporated
IVR systems are generally perceived by users as obstacles installed by companies to keep callers from reaching expensive human agents, rather than helpful tools that can effectively serve callers’ needs. IVRs are not only failing to do their jobs, but they are also pushing some of users’ most sensitive hot buttons. This session will identify these emotional triggers and offer some clear guidelines on ensuring the deployment of highly usable IVR solutions.
|
Get Human, Get Real
Simonie J. Wilson, Speech Scientist - Convergys Corporation
Some key issues have been missed in the ongoing discussion about GetHuman, a movement founded by Paul English and based on his belief that a live agent is always better than an IVR of any kind. The truth is, it’s not that simple. From a desire for privacy, to needing assistance in the middle of the night, to a dislike for offshoring, callers have many reasons to prefer automated systems instead of live agents. We’ll discuss how to improve the automated part of a call, and maybe we can offer some suggestions for the GetHuman initiative.
|
Have Your Cake and Eat It Too
Nick Ezzo, Director of Marketing - TuVox
Many people believe that providing an excellent caller experience is an unnecessary expense—that "caller experience" and "business benefits" are contradictory terms. Based on data from actual implementations, this session will methodically build a case that proves a great caller experience leads to direct business benefits, including faster calls, happier customers, productive agents, and lower phone bills. Yes, you can have your cake and eat it, too.
|
|
Reception
(9th Floor)
5:30 p.m - 7:00 p.m
|
|
TRACK B: VUI DESIGN PRINCIPLES AND TECHNIQUES
|
Preparing for the New Consumer
9:00 a.m - 10:00 a.m
Mr Lior Arussy, President - Strativity Group
 |
Despite technological innovations, companies are failing to meet increasing customer expectations. To stay competitive, organizations must meet customers’ growing demands by incorporating new ways of thinking, connecting, and doing business that will create profitable and delightful customer experiences. Lior Arussy is a renowned author; business visionary; creative catalyst; and president of Strativity Group, a consultancy which advises Global 2000 companies and emerging businesses around the world. Arussy will address the biggest customer strategy issues facing organizations today, such as the new rules of customer engagement, and offer suggestions on how to better prepare for the new consumer.
|
|
Break in the Exhibit Hall (Exhibit hall opens)
10:00 a.m - 10:45 a.m
|
B201 – How Do You VUI?
10:45 a.m - 11:30 a.m
MODERATOR: Peter B Krogh, Director of Solutions Architecture - SpeechCycle
|
There’s more than one way to design a good voice user interface. Join us for a session in which VUI experts showcase new ways of thinking about VUI issues. Learn how the precise wording of VUI prompts can vastly influence the way users respond, and how new VUI standards will impact our industry. Come and participate in what promises to be a spirited debate regarding some of the most important issues in VUI design today.
Coffee? Tea? Yes, Please
David Suendermann, Principal Speech Scientist - SpeechCycle
Ethan Levine, Senior User Experience Designer - LogicTree
When designing a troubleshooting application, much effort is put into finding an optimal balance between eliciting in-grammar responses and brevity of dialogue. This session will present data from calls to live systems that used three styles for eliciting responses: giving the caller explicit instructions as to the set of valid responses; phrasing a question to include the set of valid responses, though not explicitly stating how the caller should respond; and phrasing a question without any guidance. Of note, certain formulations encouraged ambiguous responses; we will discuss relevant avoidance and recovery strategies for these.
|
Getting Serious About IVR Dialogue Standards: A Framework for Action
Bruce Balentine, EVP & Chief Scientist - Enterprise Integration Group
Ken Rehor, Voice Technology Group - Cisco
User interface standards offer both utility and disruption to the many players involved in IVR, including end users, enterprise stakeholders, call center agents, technology vendors, and VUI designers. Conflicts of interest and philosophy constitute a culture of resistance that has reduced IVR quality and raised costs worldwide. The goal of this presentation is to propose a framework through which an organized and representative community of stakeholders within the IVR industry can agree on a standardized set of IVR behaviors for telephone-based dialogues.
|
|
B202 – Inaugural Meeting of the Association for Voice Interaction Design
11:45 a.m - 12:30 p.m
MODERATOR: Susan L. Hura PhD, Principal - SpeechUsability
|
At SpeechTEK 2007, a group of user interface designers met and agreed that our profession needed an organization to promote excellence in voice interaction design, facilitate professional development, and provide an opportunity to share experiences with other designers. That meeting resulted in the Association for Voice Interaction Design (AVID). Join us for AVID’s inaugural meeting, during which you’ll have the opportunity to vote on the proposed charter and elect officers. You can read the charter online at www.avixd.org before the meeting; come prepared with questions and suggestions.
|
Attendee Lunch Sponsored by Tellme - Remember Me: A Case Study on Personalized Self-Service
12:30 p.m - 1:45 p.m
Tracy Griffith, Sr. Manager Customer Technology Reservations & Premium Services - American Airlines
|
In 2007, American Airlines launched an industry-leading telephone service for its frequent fliers to deliver proactive information based on the callers' upcoming needs. A year later, find out how American Airlines did it and whether the investment has paid off - both for travelers and for American Airlines. This Customer Case Study will be presented as part of the conference lunch sponsored by Tellme on Tuesday, August 19. AAdvantage members can enroll in "Remember Me" with the instructions at www.aa.com/callaa.
|
|
B203 – Business Problems, VUI Answers
1:45 p.m - 2:30 p.m
MODERATOR: David C Martin, Managing Principal, Self Service Solutions EMEA, Professional Services - Avaya
Mark Webb, IVR Process Engineer, VUI Design - Humana IVR User Interface Designs
Daniel Padgett, Senior Speech Consultant - Voice Partners/VoxGen
|
Hear the inside story about the VUI process from two perspectives. In the first presentation, the customer, Humana, will present the business challenges that led it to seek a speech solution. Next, the solution provider, VoxGen, will present its view of the situation and discuss the methodology it used to develop a highly effective solution. Join us for this unique opportunity to talk with both the customer and the vendor together to understand how to work as a team to solve business problems with speech.
|
B204 – VUI Design for Multimodal Applications
2:45 p.m - 3:30 p.m
MODERATOR: Jonathan Bloom, Senior Voice Interaction Designer - SpeechCycle
|
This session shows multimodal user interface design in action, giving real-life examples that incorporate speech technologies. The presenters will highlight the challenges of integrating speech with a visual interface and describe the techniques they used to design effective, pleasing user interfaces. View demonstrations of a multimodal voting system and multimodal games with speech controls, then decide for yourself if multimodal is ready for prime time.
Prime III: A Multimodal Approach to Electronic Voting
Dr. Juan E. Gilbert, IDEaS Professor & Chair, Division of Human-Centered Computing - Clemson University
Prime III is a multimodal electronic voting system designed, implemented, and tested by the Human Centered Computing Lab at Auburn University. The system has caught the attention of Congress, local and national media, lobbyists, etc. This session features a demonstration of Prime III, as well as a discussion about how speech was incorporated into this award-winning application, which could change the way we vote.
|
Talking Games: Toward Speech as a Mainstream Modality
David Thornton, Instructor - Auburn University
This session will describe the results of an ongoing study of speech-based cursor control mechanisms—research that is intended to provide user data to influence the design of future systems involving real-time demands, very small targets, and moving targets. One such application of this research is in the area of video games. These findings also have implications for physically impaired users whose primary or only control modality is speech.
|
|
Break in the Exhibit Hall
3:30 p.m - 4:15 p.m
|
B205 – Driving the Conversation Forward in the Face of Errors
4:15 p.m - 5:15 p.m
MODERATOR: Jim Milroy, Director, User Experience - West Interactive
|
Errors regularly occur in human-to-human communication, but we deal with them so seamlessly that they rarely interrupt the positive flow of conversation. The way we deal with conversational errors in VUIs, however, is rarely as smooth and often begins a needless progression toward the failure of the interaction. That's why VUI designers must plan for dealing with these inevitable errors. This session will explore new ways to look at error conditions and methods for dealing with them, with the goal of building VUIs that deemphasize errors and move the conversation forward.
A New Perspective on Speech Recognition Errors
Gregory Simsar, Vice President, Speech Services - Syntellect, Inc.
Many types of speech communication behaviors that humans handle effortlessly are not even detected by today’s speech recognition systems, let alone handled gracefully. Only by taking a new perspective on the types of errors that are encountered by speech recognition systems—and by mapping them to the types of errors that occur in human-to-human interaction—can we start to make significant improvements in how errors are handled. This presentation builds on a SpeechTEK VUI Workshop and subsequent article in Speech Technology magazine about a new taxonomy for speech recognition errors.
|
The Subtle Science of Failing: How to Retry
Jonathan Bloom, Senior Voice Interaction Designer - SpeechCycle
Dr. Jackson Liscombe, Speech Engineer - SpeechCycle, Inc.
Just like humans, speech recognition systems do not always understand their conversational partners. How should these systems recover when recognition confidence is low? Up until now, designers have based their retry strategy mostly on tradition, bombastic opinion, anecdotal evidence, or, at best, one-off opinions from usability test participants. In this presentation, we will discuss research based on tens of thousands of calls, each hitting one of four different retry styles: an apologetic preamble, a preamble without an apology, an abbreviated retry, or a retry avoiding explicit references to self. We will base the relative success of each strategy on turn success rate, average number of retries required per turn, percentage of out-of-grammar utterances, automation rates, and caller satisfaction.
|
Refocusing Caller Intent: Approaches for Building Fault-Tolerant Voice User Interfaces
Jessica Peterson, Speech Technology Consultant - Versay Solutions
Based on real-world deployments, this session will present detailed aspects of a framework for natural language routing applications. You will learn how to adapt confirmation states to specific types of initial caller requests to elicit usable input that leads to successful call routing, as well as strategies for handling callers who provide input to a natural language main menu that cannot be successfully mapped to a business category. You’ll gain a new approach that creates a second chance to harness the benefit of natural language recognition when warranted, with the goal of creating a smarter speech application and improving the overall user experience.
|
|
Reception
(9th Floor)
5:30 p.m - 7:00 p.m
|
|
TRACK C: ADVANCED SPEECH TECHNOLOGIES SYMPOSIUM
|
Preparing for the New Consumer
9:00 a.m - 10:00 a.m
Mr Lior Arussy, President - Strativity Group
 |
Despite technological innovations, companies are failing to meet increasing customer expectations. To stay competitive, organizations must meet customers’ growing demands by incorporating new ways of thinking, connecting, and doing business that will create profitable and delightful customer experiences. Lior Arussy is a renowned author; business visionary; creative catalyst; and president of Strativity Group, a consultancy which advises Global 2000 companies and emerging businesses around the world. Arussy will address the biggest customer strategy issues facing organizations today, such as the new rules of customer engagement, and offer suggestions on how to better prepare for the new consumer.
|
|
Break in the Exhibit Hall (Exhibit hall opens)
10:00 a.m - 10:45 a.m
|
C201 – Natural Language Processing: Automaticity
10:45 a.m - 11:30 a.m
MODERATOR: K.W. 'Bill' Scholz, President - NewSpeech LLC
|
The use of statistical natural language processing continues to expand, but its growth pace is inhibited by the cost of language model development and maintenance. This session describes a framework that automatically adapts and changes automatic speech recognition language models over time and considers the evolution of applications from pure speech recognition systems, to mixed-initiative systems with less rigid behavior, and to voice systems that incorporate proven technologies to become not only smarter but less complex to maintain.
CAVA: Continuous Automatic Vocabulary Adaptation
Mark Pfeiffer, Vice President Business Development & Communications - SAIL LABS Technology AG
CAVA from SAIL LABS is a framework that automatically adapts and changes automatic speech recognition language models over time to reflect current affairs. The user determines areas of interest that are to be harvested for domain- or topic-specific data, which subsequently forms the basis of automatic language model creation. Data sources are combined according to their relevance to yield a series of domain-, time-, and topic-specific language models. This mix of language model data and sources then adds and subtracts words and terms depending on their relevance in the current deployment environment. The processed data can then optionally be quality-checked by a human, or the process can run unattended. Key ideas include the automatic adaptation of the language model to the interests of the user, which allows deployed systems to adapt to dynamically and quickly changing environments without interaction by the vendor of the system, as well as identification of the ideal point in time to rebuild a language model.
|
Advancing Toward Intelligent Agents
Emmett Coin, Speech Scientist - ejTalk
Voice applications began as almost pure speech recognition systems: speaking choices, numbers, etc. Primitive directed dialogues followed. Now many applications support some sort of mixed initiative as they progress toward less rigid behaviors. Sadly, these methods are already failing to meet the expectations of users, not to mention leading to paralyzing complexity for developers. Going forward, voice systems will incorporate other existing and proven technologies to make applications smarter. These technologies have the potential to dramatically reduce the complexity of development for today’s sophisticated voice applications. This session will compare and contrast the benefits and limitations of these technologies ranging across natural language processing, rule-based and case-based reasoning, statistical, and connectionist (neural net) approaches.
|
|
C202 – Enhanced VUI: Context-Sensitive Dialog
11:45 a.m - 12:30 p.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - ATX Group, Inc.
|
Natural language understanding (NLU) technologies can significantly improve caller experience and improve business metrics such as routing accuracy and call completion rates. The ultimate goal is to deliver intelligent responses to natural speech input. Based on the current state of speech recognition technologies, NLU technologies need to be used selectively to achieve the best performance. For example, it may be better to use a menu followed by an open-ended prompt instead of the reverse. This session dives deep into NLU technologies and how to build robust statistical models for optimal performance. Interesting application performance results are discussed, including effective handling of spontaneous speech.
Building Robust NLU Applications
Sunil Issar, Director, Architecture - Convergys Corporation
Natural language understanding (NLU) technologies can significantly improve caller experience and improve business metrics like routing accuracy and call completion rates. These technologies allow the caller to speak naturally. In general, based on the current state of speech recognition technologies, NLU technologies need to be used selectively to achieve the best performance. For example, it may be better to use a menu followed by an open-ended prompt. We will describe when it is appropriate to use NLU technologies and how to build robust statistical models for optimal performance. We will also describe results comparing performance of directed dialogue and statistical grammars. Key Ideas: NLU can improve business metrics. NLU technologies need to be used selectively for optimal performance. Finally, VUI design and statistical models need to handle spontaneous speech, which introduces additional complexities.
|
Intelligent Responses to User Input
Susan Boyce, Principal User Experience Manager - Microsoft Tellme
Customizing voice applications for individual users can have several beneficial effects. Accurately anticipating the reason for a call can streamline the interaction. Other characteristics of the caller, such as location or previous calling history, can be used to enhance recognition performance. This presentation will review examples of effective use of personalization in voice applications and set forth some guidelines.
|
|
Attendee Lunch Sponsored by Tellme - Remember Me: A Case Study on Personalized Self-Service
12:30 p.m - 1:45 p.m
Tracy Griffith, Sr. Manager Customer Technology Reservations & Premium Services - American Airlines
|
In 2007, American Airlines launched an industry-leading telephone service for its frequent fliers to deliver proactive information based on the callers' upcoming needs. A year later, find out how American Airlines did it and whether the investment has paid off - both for travelers and for American Airlines. This Customer Case Study will be presented as part of the conference lunch sponsored by Tellme on Tuesday, August 19. AAdvantage members can enroll in "Remember Me" with the instructions at www.aa.com/callaa.
|
|
C203 – Enhanced VUI: Improving Speech Applications: Context and Evaluation
1:45 p.m - 2:30 p.m
MODERATOR: K.W. 'Bill' Scholz, President - NewSpeech LLC
|
As speech applications grow in sophistication, they are moving beyond rigid preconceived dialogue flow toward user interaction that responds dynamically to the overall context of the conversation. Speakers in this session consider recent success in building applications with this evolving behavior and will review experiments designed to evaluate the benefits of this growing sophistication in user interaction using a large subject pool.
Shifting the Intelligence Burden: The Limits and Possibilities of VUI Design
Dr. Ahmed Bouzid, Senior Director of Product Management - Angel.com Incorporated
Voice User Interfaces are restrictive in three crucial ways: they are time linear—you must patiently listen to one word before you can hear the one that follows it; unidirectional—when you hear something you can’t easily go back and listen to it again; and invisible—you can't easily figure out where precisely you are in the interaction and what exactly the system expects you to do next. This session will cover several types of contexts that can intelligently inform management of an interaction, such as user profile, recent caller history, call initiation context, and call population distribution. Being aware of the context of the call can help facilitate numerous intelligent adaptations.
|
The Intelligent Customer Front Door (iCFD)
Phil Shinn PhD, VUI Designer/Speech Scientist - Morgan Stanley Smith Barney
The iCFD not only greets a caller, but also gathers their intent, adds contextual information from their profile and history, and then makes a decision based on business rules on how to route the caller to the most suitable resource (either automated or live agent) to most effectively resolve the interaction. This dynamic approach is different from traditional static IVR applications in that it utilizes a blended strategy that considers an individual caller's value, the current state of call center queues, and information from CRM databases to construct, on the fly, a unique and personalized caller experience. This talk will focus on the methods and features used to build an iCFD, and discuss some case studies.
|
|
C204 – Advances in Automotive Speech
2:45 p.m - 3:30 p.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - ATX Group, Inc.
|
The human/machine interface in the vehicle is evolving quickly and now allows drivers to use speech to control music, phones, navigation systems, and other functionalities, making it possible to be more productive while driving. This session will reveal the latest in embedded natural language interaction and dialogue management technologies from a major research center, and how they can be applied in the vehicle. Beyond the user interface itself, proper microphone configuration and specialized acoustic models targeted at handling far-field speech under noisy driving conditions are also critical for speech enablement. This session is a must for those interested in what’s up and coming in automotive speech. |
Conversing with Your Car
Roberto Sicconi, Program Director, T.J. Watson Research - IBM
This presentation will focus on automotive interfaces that illustrate the latest in embedded natural language interaction and dialogue management technologies. Examples include systems that enable users to converse naturally with devices such as music players and GPS navigation systems to accomplish their objectives easily and efficiently. Key natural language understanding and dialogue management features include state-of-the-art statistical language modeling, natural language understanding technology, and free-form recognition of multiple items in a single request. Direct access to most functions avoids excruciating menus, and users do not have to remember specific commands that must be used at specific places in a dialogue.
|
Voice Automation in the Vehicle: Deployments and Trends
Dr. Thomas Schalk, Vice President, Voice Technology - ATX Group, Inc.
The automobile is designed to be safe to drive and speech interfaces fit nicely into the driving experience. Today, speech interfaces allow drivers to use speech to control music, phones, navigation systems, and other functionality, making it possible to be more productive while driving. This presentation will analyze significant automotive speech deployments and include topics such as: microphone configuration, acoustic models, language models, and multi-modal interface requirements.
|
|
Break in the Exhibit Hall
3:30 p.m - 4:15 p.m
|
C205 – Ancillary Transcription Techniques
4:15 p.m - 5:15 p.m
MODERATOR: K.W. 'Bill' Scholz, President - NewSpeech LLC
|
Recognition technology has matured to the point that recorded telephone-quality audio from unknown speakers can be transcribed with sufficient accuracy to support commercial applications such as medical data transcription, broadcast news captioning, and near-real-time conversion of voicemail to text. Recent improvements in transcription quality are achieved through the use of ancillary information such as caller-specific data serving to narrow that speaker’s domain, speaker-dependent acoustic models from frequent callers, and the use of CFGs to identify words and phrases that occur with high frequency in the target domain. This session will review the most beneficial ancillary factors contributing to transcription accuracy and illustrate the effects of their incorporation. |
Ancillary Transcription Techniques
Michael Picheny, Manager - IBM
As applications of large vocabulary speech recognition proliferate, there are increasing opportunities to use various types of side information to enhance recognition accuracy. This presentation will explore several kinds of side information, from the talker's identity to more subtle information from extrinsic sources, such as questionnaire responses and cross-language constraints. The talk will also address the issues involved in incorporating these additional information sources into practical, deployable systems.
|
Speech and Speaker Recognition Systems Can Make Use of More Information
Jordan Cohen, Senior Scientist - SRI
Speech and speaker recognition systems can make use of more information than is normally included in an acoustic model. We have been experimenting with the use of back-channel information, word choice, channel modeling, and other measures to assist in performing better speech recognition. In addition, we have started a project to collect data in which speaker style is explicitly controlled. Appropriately modeling the aspects of style is expected to enhance our recognition performance substantially.
|
|
Reception
(9th Floor)
5:30 p.m - 7:00 p.m
|
|
TRACK D: DEVELOPMENT AND DEPLOYMENT
|
Preparing for the New Consumer
9:00 a.m - 10:00 a.m
Mr Lior Arussy, President - Strativity Group
|
Despite technological innovations, companies are failing to meet increasing customer expectations. To stay competitive, organizations must meet customers’ growing demands by incorporating new ways of thinking, connecting, and doing business that will create profitable and delightful customer experiences. Lior Arussy is a renowned author; business visionary; creative catalyst; and president of Strativity Group, a consultancy which advises Global 2000 companies and emerging businesses around the world. Arussy will address the biggest customer strategy issues facing organizations today, such as the new rules of customer engagement, and offer suggestions on how to better prepare for the new consumer.
|
|
Break in the Exhibit Hall (Exhibit hall opens)
10:00 a.m - 10:45 a.m
|
D201 – Securing CCXML and VoiceXML Applications
10:45 a.m - 11:30 a.m
MODERATOR: James A. Larson, Vice President - Larson Technical Services
Dan York, Director of Conversations - Voxeo
|
How secure are your speech applications? As the usage of both VoiceXML and CCXML continues to explode, and VoIP usage continues to grow dramatically, especially within enterprise environments, it is increasingly important that you ensure that applications and services are not open to attack. Learn about the potential vulnerabilities in a system using VoiceXML or CCXML, what you can do to secure these systems, and how you can develop a strong architecture. |
|
D202 – Personalization and Context
11:45 a.m - 12:30 p.m
MODERATOR: R.J. Auburn, Chief Technology Officer - Voxeo
|
Many consumer Web sites offer a personalized experience. But what about over the phone? This session will review approaches to using context to improve the user interface dialogue. The first presentation describes new research on the contextual handling of transfer requests. The second presentation describes recent applications for the airline, entertainment, and financial services industries, showing advanced personalization features, ANI authentication, and context-specific menus so callers are offered appropriate choices at any point. |
Optimizing a Caller’s Request for a Live Agent
Patrick Nguyen, Chief Technology Officer - Voxify
How your self-service application handles that desperate cry for an agent is a careful balancing act. The difficulty lies in conflicting objectives: contact centers want to help callers facing problems but avoid unnecessary transfers. If the application transfers without analyzing the situation, the caller might face long hold times and more frustration. New research on the contextual handling of transfer requests has resulted in a flexible approach for dialogue design where the speech application facilitates rather than impedes the completion of callers’ goals.
|
“Welcome Back, Steve,” Said the Friendly... Computer?
Steven Pollock, Executive Vice President & Co-Founder - TuVox
It’s amazing how quickly you get used to personalized customer service. Most consumer Web sites offer a personalized experience that is highly tailored with readily accessible account information. Can this happen over the phone? This session will describe a recent application at a major American airline with advanced personalization features, including ANI authentication, greet-by-name, and context-specific menus so that callers are offered appropriate choices at any point.
|
|
Attendee Lunch Sponsored by Tellme - Remember Me: A Case Study on Personalized Self-Service
12:30 p.m - 1:45 p.m
Tracy Griffith, Sr. Manager Customer Technology Reservations & Premium Services - American Airlines
|
In 2007, American Airlines launched an industry-leading telephone service for its frequent fliers to deliver proactive information based on the callers' upcoming needs. A year later, find out how American Airlines did it and whether the investment has paid off - both for travelers and for American Airlines. This Customer Case Study will be presented as part of the conference lunch sponsored by Tellme on Tuesday, August 19. AAdvantage members can enroll in "Remember Me" with the instructions at www.aa.com/callaa.
|
|
D203 – Speech in Healthcare
1:45 p.m - 2:30 p.m
MODERATOR: Raj Tumuluri, President - Openstream Inc.
|
Speech technology is being used in healthcare for more than routine appointment setting and medical dictation. The first case study in this session shows how a natural language VUI enables bidirectional access to the entire text-based component of electronic medical records. The second presentation highlights how healthcare companies can use speech technology for outbound calls that replace live nurse calls.
|
ICIPS: Integrated Clinical Information Phone Service
Dr. Val Nenov, Adjunct Professor, Division of Neurosurgery - UCLA Medical Center
ICIPS is an ongoing R&D project at the UCLA Medical Center. Its main objective is to provide bidirectional access to the entire text-based component of the electronic medical records using a natural language VUI. Learn how this application was designed, implemented, and deployed, and how it impacts the operation of a medical center. This presentation will summarize the strengths and weaknesses of the VUI as compared to an existing GUI.
|
When Is a Virtual Nurse Call Better Than a Live One?
David Englehardt, President - Advanced Technical Support
Current healthcare protocols often require regular patient follow-up after a hospital discharge or as part of a clinical study. Typically, doctors rely on nurses to call patients to monitor recovery or to gather data for statistical analysis. These calls are expensive and time consuming compared to an IVR application that can replace nurse calls. Automated calls were found to be just as effective as live nurse calls and had additional benefits—more consistent, reliable calls were possible. This presentation will discuss the study protocol and the results of the trial. Virtual nurse calls now are a proven, winning strategy for clinical studies and patient care.
|
|
D204 – Speech in Government/Utilities
2:45 p.m - 3:30 p.m
MODERATOR: Mr. David L Thomson, PMTS - AT&T Labs
|
A clear sign that speech has hit the mainstream is that late technology adopters, such as governments and utility companies, are implementing speech solutions. The first case study in this session will describe the San Francisco Public Utilities Commission's development and deployment of a new speech application to provide self-service to its San Francisco customers. The second case study discusses why NSTAR chose speech to replace its previous menu structure, how the system was integrated with other call center technologies, the importance of load and performance testing, and customer reaction.
|
Developing Call Center Applications
Marge Vizcarra, Customer Services Manager - San Francisco Public Utilities Commission
James Whitten, Assistant Section Manager, Customer Contact Center, Customer Services - San Francisco Public Utilities Commission
The San Francisco Public Utilities Commission will describe the development and deployment of the new speech application to provide customer self-service to its San Francisco customers. The goals of the new system are to increase use of automated offerings, free up agents for callers who can only be assisted by a live agent, reserve agent transfers for customers who really need assistance by offering an easy tool for self-service, and ensure that a live person is always reachable.
|
Improving Customer Experience Implementing a Speech-Enabled IVR at NSTAR
Michael Roberts, Business Integration Manager - Telecom, Customer Interaction Center - NSTAR
In this presentation, attendees will learn why NSTAR chose speech technology to replace the existing menu structure, how the interoperability with other call center technologies affected the implementation, how load and performance testing ensured a successful implementation, and what reaction customers had to the new system. NSTAR’s telecommunications team learned many lessons during the rollout of the project at both the business and technology levels. Getting management’s commitment early in the process and reviewing existing technology for any interoperability issues are two of the lessons learned that we will share during this presentation.
|
|
Break in the Exhibit Hall
3:30 p.m - 4:15 p.m
|
D205 – Speech in Telecom
4:15 p.m - 5:15 p.m
MODERATOR: Dr. Judith Markowitz, President - J. Markowitz Consultants
|
Telecommunication companies face a number of issues in applications that provide services to customers. The first presentation evaluates alternative designs for a technical support application. The other two presentations describe how speaker verification can minimize security and privacy issues.
|
A Champion-Challenger Experiment for Designing a Natural Language Question
Manohar Kesireddy, Business Solutions Architect - Verizon Digital Media Services
Verizon FiOS TV wanted to determine the best design for a natural language question and back-off menu for a technical support application. The company created three different approaches to challenge the then-existing design. All four were deployed in a limited production release and evaluated on two metrics: the number of callers successfully matching their need to a technical support topic, and the automation rate. One of the designs was a clear winner, improving both accuracy and automation by 10 percent. The characteristics of the winning design will be presented.
|
Improve the Customer Experience with Voice Authentication
Laura Phipps, General Manager and Executive Vice President - Leaco Rural Telephone Cooperative
Leaco is a telecommunications services provider that supplies wireless, wireline, and Internet services to approximately 175,000 bilingual and monolingual customers in Southeastern New Mexico. In November 2007, Leaco began rolling out customer-facing speaker-verification services to 12,000 of its customers. The system is designed to address the following issues:
• Improve the customer experience
• Regulatory compliance (with CPNI)
• Customer concern about security
• Brand differentiation
|
At Bell, My Voice Is My Password
Fred MacKenzie, Senior Business Solutions Advisor - Bell Canada
Charles Giordano, Associate Director, Privacy Marketing Strategy - Bell Canada
Bell Canada is the largest telecommunications services provider in Canada. It has rolled out a nationwide, customer-facing system, starting with residential wireline and wireless customers. By the start of 2008, Bell Canada had enrolled 500,000 customers. The company chose speaker verification for the following reasons:
• Secure the privacy of customer data
• Make privacy more convenient
• Reduce the average length of call-center calls
|
|
Reception
(9th Floor)
5:30 p.m - 7:00 p.m
|