SpeechTEK 2008 - Final Program
August 18-20, 2008 • New York Marriott Marquis • New York, NY
SpeechTEK 2008 - Monday, August 18
SUNRISE SEMESTER
Discussion 1 – Emotion in Speech Technology
8:00 a.m - 8:50 a.m
Paolo Baggia, Director of International Standards - Loquendo

Join us for a short overview of how emotions affect speech applications, including speech synthesis, speech recognition, and facial expressions. Possible discussion questions include the following:

  • What are the types of emotion that would be useful to detect during speech recognition?
  • How would an application respond when it detects various emotions?
  • How would an application express emotions?
  • How can existing applications be enhanced by emotions in speech technology?
  • What new applications are enabled by emotions in speech technology?
Discussion 2 – The Effectiveness of Marketing Messages, Legal Verbiage, and Website References in IVR Systems
8:00 a.m - 8:50 a.m
Renae Rogers, Dialogue Designer - West Interactive

Keep it simple. Don't forget to disclose. Revenue, revenue, revenue. If only everyone would use the Web. Marketing messages, legal verbiage, and web references in speech applications all compete for the time your customer has to listen and speak. Huh? Improper placement and verbiage can distract the caller and reduce the effectiveness of even the best speech application. Learn how customers feel about these messages and some practical lessons to maximize their effect while not hurting your prospects for voice self-help.
TRACK A: MEETING BUSINESS GOALS WITH SPEECH
Keynote: Speech in the Mainstream and Beyond
9:00 a.m - 10:00 a.m
Ray Kurzweil, Author of The Age of Spiritual Machines: When Computers Exceed Human Intelligence and The Age of Intelligent Machines

We're living in a world of rapid technological innovation and increasing pervasiveness of information technology. Consumers not only appreciate these advances in their professional and personal lives, they are demanding them. What does this mean for those who are driving development and acceptance of speech technology? Glean insight from one of the most distinguished speech technology innovators of our time, Ray Kurzweil, who will share his views on how speech technology will work in conjunction with other emerging technologies to bring us to an age of intelligent machines.


A101 – Mainstream Speech 2008: The State of the Industry
10:15 a.m - 11:00 a.m
MODERATOR: Tim Moynihan, Vice President of Marketing, Enterprise Solutions - EMPIRIX

In 2008 speech is now a mainstream technology, used by millions across the world every day to get information, perform transactions, and manage their daily lives. What do end-users really think of speech technology? What are the hot buttons in the speech vendor community? What do organizations using speech have to say about the technology? Panelists in this session have completed ambitious surveys regarding the speech industry covering clients, vendors, and end users. The resulting data can tell us a great deal about the current state of the speech industry and where we need to be tomorrow to continue to grow and thrive. Come with questions and be prepared for a spirited debate of the findings on the state of speech 2008.



Year 2 Speech in the Mainstream - Stakeholder Views From 360 Degrees
Mike Bergelson, Director, Strategy, Voice Technology Group - Cisco
Tim Pearce, Global Solution Manager, Self Service - Dimension Data

Are vendors and end users on the same page about speech recognition applications? This discussion picks up from where SpeechTEK 2007 left off, showcasing the results of the global Alignment Index, which tracks the views of 1,200 speech-app consumers and 128 speech vendors to see how closely they're aligned.


VoiceXML and IVR Adoption for Self Service Portals: The European Mobile Operators Perspective
Bonnie Crater, SVP Marketing - VoiceObjects, Inc.

According to research by Datamonitor, VoiceXML revenues and port shipments in both North America and EMEA surpassed those of traditional IVR for the first time in 2007. VoiceObjects recently sponsored research to “take the temperature” of the European Union’s highly competitive market of mobile network operators. This presentation will highlight those survey findings, including operators’ customer service challenges, their plans for improving customer satisfaction, and how they will differentiate themselves through self-service offerings as they move from the customer acquisition to customer retention phase in a rapidly consolidating market.


State of the Speech Industry - Stakeholders' Snapshot View
Monique Bozeman, Marketing Consultant - Monique Bozeman Consulting
A102 – Planning for Speech
11:15 a.m - 12:00 p.m
MODERATOR: Dr. Nava Shaked, CEO - Brit Business Technologies Ltd (BBT)

Your organization has decided to make the leap into speech technology. Congratulations! But what happens next? This session will arm you with information about the steps you should take to get ready for a speech project. Learn what data will ease the transition to speech and who on your team you will need to keep a speech project on track. Speakers in this session will ensure that you’re ready for speech when your project begins.



Speech Is Coming: What Do I Do Now?
Catherine Zhu, Principal Consultant, Self Service Solutions, Avaya Professional Services - Avaya

The decision to move to speech has been made, and now it's just a matter of logistics before the project can officially start. How can you profitably use this time? Learning speech technology basics, beginning interdepartmental communication, and gathering data about your business and potential callers are just a few factors to consider for managing a smooth transition. This session will explore proactive steps to take as a company when you know speech is coming.


Preparation Leads to Success
Carrie Nelson, Speech Solutions Team Technical Lead - Nortel

What do customers need to know in preparation for a speech application deployment? An awareness of the caller base and a comprehensive understanding of your back-end data are just the beginning. Other elements, such as involving the right players and setting success criteria at the start of the project, also play a huge role. A detailed discussion of these items and more adds up to the top 10 tips for deploying speech.

Attendee Lunch Sponsored by VoiceObjects
12:00 p.m - 1:15 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.

A103 – Preventive Medicine: Keeping Your Speech System Healthy
1:15 p.m - 2:00 p.m
MODERATOR: Sunil Issar, Director, Architecture - Convergys Corporation

Your speech application is up and running. Now you can relax, right? Wrong! Speech applications are dynamic entities that require lots of ongoing TLC to perform at peak effectiveness. Experts in this session will teach you how to monitor and manage your speech application throughout its lifecycle, plus how to maintain its stellar performance without breaking the budget. This session also will tell you what you need to know to keep your speech application on track for the long haul.



Now What? Managing and Maintaining Speech Deployments
Fran McTernan, Managing Principal - Avaya

Businesses commonly turn to technology providers and vendors to get their initial speech deployments up and running. But once the system is earning its keep, how should you maintain it? What skills does a team need to keep up with changing and growing business needs? This session is geared toward IT managers responsible for managing and maintaining speech systems, focusing on the high-level skills and knowledge required to do so.


Strategies to Cost-Effectively Maintain Self-Service Phone Portals
Ingo Bors, Industry Consultant - Ibostar Consulting Services

Self-service phone applications often go unmodified due to increasingly complex IT and business process requirements. The result is a limited ability to adapt voice self-service dialogues in real time based on a caller’s behavior and a restricted flexibility to accommodate change requirements. This presentation will discuss business conditions driving modifications to self-service phone portals, presenting examples of how to use a modular approach and services-oriented architecture (SOA) for maintaining them.

A104 – Agile Speech Projects
2:15 p.m - 3:00 p.m
MODERATOR: Phillip Hunter, User Experience Designer, Microsoft Tellme - Microsoft

Agile software development is gaining wider acceptance every day for the benefits it offers over traditional waterfall development methods. Agile development allows an organization to have code up and running more quickly and to be more responsive to changes by developing software in discrete chunks that can be easily modified. How do agile development processes impact speech projects? Is it possible to design a VUI that fits within the constraints of agile development? Speakers in this session will explore the benefits and challenges of agile development for speech projects.



Adopting Software Engineering Practices for VUI Development
Charles Lewis, Senior Technical Trainer - Bloomberg LP

As call flows become more complex, the discipline of software engineering can offer guidance for the creation and maintenance of robust, scalable products. This session will describe the successful application of standard software development practices, including source-code control, coding standards, code walk-throughs, and the application of object-oriented programming principles to the development and ongoing improvement of highly complex call flows. The application of these processes and practices to VUI development presents unique challenges that will be discussed.


Flexible Design in Support of More Rapid Deployment
Leslie Carroll Walker, Principal VUI Designer - ethosIQ

As customers move away from waterfall methods of deployment, designers may need to have several different design initiatives under way simultaneously. This presentation will look at some practical approaches to the design process in support of more rapid deployment models. Documentation options, project management options, and other workarounds will be offered and discussed. This presentation will be particularly useful for consulting designers who are working on large, ongoing initiatives, but should also prove helpful for smaller initiatives and in-house designers.

A105 – Speech In Financial Services
3:15 p.m - 4:00 p.m
MODERATOR: Ken Rehor, Voice Technology Group - Cisco

Businesses use analytics technology to extract information from customer usage recordings and logs to gain insight about interactive systems and to identify trouble spots and opportunities in user interactions. In this session, financial services firms describe how they use analytics to identify customer behavior and experience so they can improve customer treatment and, ultimately, increase their financial benefit.



Understanding the Impact of Customer Experience on your Business
Glen Graham, SVP - Business Operations - Bank of America

This presentation will explain how Bank of America examines customer experiences across multiple channels (IVR, agent, Web, etc.) to identify the exact events resulting in customer confusion and frustration. Learn how Bank of America uses this information to prioritize enhancements that maximize user adoption and customer retention.


Leveraging Speech Analytics to Integrate the Voice of the Customer Into the Business
Ricardo de Carvalho Destro, Technology Director - Volans Technology

A text-to-speech (TTS) based solution for ATM equipment will be presented in this session. The solution aims to allow blind and low-vision customers to use ATMs for banking transactions such as withdrawals and balance inquiries. This multi-platform solution was designed to support many languages and to be customized with little development impact on the bank’s transactional applications.

A106 – The Business of Speech Analytics
4:15 p.m - 5:00 p.m
MODERATOR: Mr Dan Miller, Senior Analyst - Opus Research, Inc.
Michael Codini, Chief Technical Officer & Co-Founder - VoiceObjects, Inc.
Greg Borton, VP of Analytics - Nuance Communications
Cliff LaCoursiere, SVP Business Development - CallMiner

As self-service phone applications mature and become broadly deployed as a means of reducing costs and improving customer self-service, it’s critical for companies deploying these applications to understand how their customers are actually using their systems. Despite the tremendous interest in business intelligence and analytics among mainstream IT, call centers and speech industries remain woefully behind. This panel will investigate the role of analytics in three key areas of call center technology: self-service, speech, and customer care. Attendees will learn how new business intelligence and analytics technologies are helping customers build better applications, improve customer service, and drive business value.

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m

TRACK B: VUI DESIGN PRINCIPLES AND TECHNIQUES
Keynote: Speech in the Mainstream and Beyond
9:00 a.m - 10:00 a.m
Ray Kurzweil, Author of The Age of Spiritual Machines: When Computers Exceed Human Intelligence and The Age of Intelligent Machines

We're living in a world of rapid technological innovation and increasing pervasiveness of information technology. Consumers not only appreciate these advances in their professional and personal lives, they are demanding them. What does this mean for those who are driving development and acceptance of speech technology? Glean insight from one of the most distinguished speech technology innovators of our time, Ray Kurzweil, who will share his views on how speech technology will work in conjunction with other emerging technologies to bring us to an age of intelligent machines.


B101 – What End Users Really Want
10:15 a.m - 11:00 a.m
MODERATOR: Dr. Juan E. Gilbert, IDEaS Professor & Chair, Division of Human-Centered Computing - Clemson University

In speech projects, requirements are sometimes thought of as a laundry list of items to be handed from the client to the vendor building the speech application. Because clients don’t know VUI, the requirements are sometimes less than sensible. Moreover, requirements rarely include the perspective of the application’s end user, resulting in systems that are poorly adopted and, thus, fail to meet client goals. The experts in this session will explore why requirements assembled in a vacuum are insufficient and discuss methods for discovering truly meaningful requirements.



The Big Gap: What Customers Want Versus What Customers Get
Dr. Moshe Yudkowsky, President - Disaggregate

When someone calls your office and you're not there, he might want to know when you'll be back, whether you received a file he sent you, and what you thought about his proposal. What this person gets, however, is something entirely different: voicemail. In this session, we'll discuss the huge gap between what customers want and what customers get, and offer ideas on how to create services that actually deliver what they want.


Unearthing the Real Requirements
Jenni McKienzie, Voice Interaction Designer - Travelocity

VUI designers often receive puzzling requests from clients. Valid reasons typically drive their requests, but the businesses haven’t come up with the right solutions. We’ll dissect several real requests for IVR modifications to uncover their underlying problems, and then we’ll examine better alternatives than the ones proposed. Finally, we will look at how to go back to these clients and sell them on your plans.

B102 – Managing Client/Vendor Relationships
11:15 a.m - 12:00 p.m
MODERATOR: Darla Tucker, Director, Strategic Customer Solutions - Convergys

VUI designers are the keepers of a great deal of highly specialized knowledge about speech technology and the way users interact with it. Still, clients regularly challenge their decisions and insist on prompts and call flows that make VUI designers cringe. This session is about finding the balance between providing expert guidance to clients and simply accepting requests that will compromise the VUI design. You will learn how to decode clients’ requests to understand their real intent, include them in the design process, and gently steer them toward solutions that will meet their ultimate goals.



Caution Ahead: Voice User Interface Is a Two-Way Street
David C Martin, Managing Principal, Self Service Solutions EMEA, Professional Services - Avaya

The goal of this presentation is to facilitate critical thinking about deploying user-centric business solutions. The discussion will cite recent applications in which business processes had the potential to detract from the user experience, focusing on how a well-designed user interface accounts for both business and customer goals, and how the attitudes of business stakeholders and VUI designers can affect the success of a speech application. You also will be encouraged to challenge yourself and your vendors to make courageous design decisions that will empower customers with excellent self-service.


It’s OK to Tell Clients They’re Wrong
Jessica Stevens, Senior Engineer, VUI Design - Intervoice
Jenny Burr, Sr. Manager, Speech Science - Convergys Corporation

Dealing with customers is a large part of our jobs in VUI design and tuning. We have all run into differences with our customers, and both our success stories and cautionary tales can be informative. This session will provide real-world examples of trying to find that sweet spot between business and caller needs, demonstrating the importance of making the effort to represent our expertise in the industry and reminding clients why they hired us in the first place.

Attendee Lunch Sponsored by VoiceObjects
12:00 p.m - 1:15 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.

B103 – Reverse Navigation, Hands-On Session
1:15 p.m - 2:00 p.m
MODERATOR: Jenni McKienzie, Voice Interaction Designer - Travelocity
Peter B Krogh, Director of Solutions Architecture - SpeechCycle


How do you navigate to a previous state in a VUI? Unlike Web pages, VUIs have no button that transparently takes the user back a single step in an interaction. Does a simple speech analog to the back button make sense in a VUI? How—and when—should this functionality be offered to users? Should reverse navigation be universal, or are specific instantiations required for different applications? You will explore the nuances of reverse navigation in this hands-on session. After a brief introduction, be prepared to work together in small groups to explore this issue and present solutions to the audience.

B104 – Universal Commands
2:15 p.m - 3:00 p.m
MODERATOR: Charles Galles, Principal Member, Technical Staff - AT&T

Should all speech applications share a set of commands? If we offer users help universally throughout a speech application, does it do any good? This session explores the highly debated issues of whether we need universal commands across applications, how we should determine what these commands should be, and whether universal commands truly benefit users of speech applications. Join us as two VUI practitioners share evidence about how to handle universals in VUI design.



Should Universal Commands Be Universal?
Rita Dhruve, Speech Solutions Specialist - Nuance Communications

A small set of commands is becoming a de facto standard for universal commands in VUI design. However, guidelines for how and when to offer universal commands, and what response should be expected of the system, don’t exist. This session will present examples from current speech applications to demonstrate how to successfully implement a set of universal VUI commands.


Does Universal Help Help?
James Mesbur, Voice Interaction Designer/Speech Scientist Engineer - SpeechCycle
Karen Molye, Voice Interaction Designer - SpeechCycle

Universal commands are intended to give a caller control over a speech interaction and provide a consistent approach to tasks. But universal help, as it is typically implemented, has never delivered on this promise. As a result, we have implemented a set of context-sensitive help options, offered only when extra help makes sense. We’ll discuss how this approach allows for simpler grammars, lets callers educate themselves when they feel it’s necessary, and results in a marked increase in dialogue object success, call completion, and caller satisfaction.

B105 – Lessons in Multimodal Usability
3:15 p.m - 4:00 p.m
MODERATOR: Catherine Zhu, Principal Consultant, Self Service Solutions, Avaya Professional Services - Avaya

Applications that use speech plus other modalities have arrived, presenting special challenges to user interface designers. Ideally, multiple modalities work in concert, allowing the user flexibility in input and output of information. When multimodal goes wrong, however, the modalities can interfere with each other, competing for the user’s attention and making multimodal less effective than a unimodal application. Speakers in this session share their experiences in multimodal design and offer recommendations for helping multimodal applications live up to expectations.



Multimodal Speech Usability Lessons
Eduardo Olvera, Senior User Interface Designer, Professional Services - Nuance Communications

Multimodal user interfaces can combine visual, touch, gesture, and location-based features. But what happens when you add speech recognition capabilities to designs that are so new they lack a clear set of usability guidelines? Based on actual multimodal speech usability testing, this session will discuss the challenges of multimodal design and provide recommendations for successful implementations.


You Can’t Get There From Here, But Why Not?
Matt Prather, Staff Engineer, VUI Design - Intervoice

In an increasingly multimodal world, users expect a voice interface to do the same things, in the same ways, with the same results, as the desktop and Web-based interfaces they’re already using. But, all too often, functionality can be radically different across a company’s various interfaces, which leaves users perplexed, annoyed, and wondering why they can’t get there from here. Learn why interface symmetry is desirable, as well as how to plan for it, champion the cause, and educate both callers and clients as to the possibilities and limitations.

B106 – How May I Help You?
4:15 p.m - 5:00 p.m
MODERATOR: Dr. Elizabeth A. Strand, Director of UX Strategy - Microsoft Tellme

Statistical language models allow us to pose open-ended questions and let users provide more free-form responses. This sounds simple, but for “How may I help you?” types of questions, the devil is in the details. The question’s exact formulation can have a huge impact on how users respond; conversely, knowing how users respond can help craft more effective open-ended questions. In this session, we will explore the details of writing open-ended prompts and analyzing users’ responses.



People Say the Darnedest Things
Charles Galles, Principal Member, Technical Staff - AT&T

Open prompts (such as "How may I help you?") produce interesting responses. This discussion analyzes responses based on caller profiles and VUI design issues across several applications. The data presented will help you understand how to deliver an excellent caller experience, drive customer loyalty, and strengthen your brand.


Natural Language Questions for Technical Support Applications: A Winning Strategy
Mary Constance Parks, Principal Interaction Designer - Nuance Communications

For a recent technical support application, the goal was to improve the accuracy of an existing natural language question and back-off menu. Three different designs were created to challenge the existing flow, one of which significantly improved both accuracy and automation. The characteristics of the winning design will be presented, along with takeaways for designing technical support applications and natural language questions.

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m

TRACK C: ADVANCED SPEECH TECHNOLOGIES SYMPOSIUM
Keynote: Speech in the Mainstream and Beyond
9:00 a.m - 10:00 a.m
Ray Kurzweil, Author of The Age of Spiritual Machines: When Computers Exceed Human Intelligence and The Age of Intelligent Machines

We're living in a world of rapid technological innovation and increasing pervasiveness of information technology. Consumers not only appreciate these advances in their professional and personal lives, they are demanding them. What does this mean for those who are driving development and acceptance of speech technology? Glean insight from one of the most distinguished speech technology innovators of our time, Ray Kurzweil, who will share his views on how speech technology will work in conjunction with other emerging technologies to bring us to an age of intelligent machines.


C101 – Advanced ASR: A Global Perspective
10:15 a.m - 11:00 a.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - ATX Group, Inc.
Roberto Pieraccini, Chief Technology Officer - SpeechCycle


This session provides a global overview of the state of research in the many areas related to speech and language, such as speech recognition, language understanding, spoken dialogue systems, speech summarization, speech-to-speech translation, audio indexing, speech synthesis, speaker verification, and supporting technologies. Learn about the wider speech technology research community, the conferences, the organizations, and the publications. Finally, learn about breakthroughs in advanced technologies where academic research is currently making progress and their potential impact in the commercial world.

C102 – Application of Advanced ASR
11:15 a.m - 12:00 p.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - ATX Group, Inc.

This session focuses on speech recognition technology used to complete challenging tasks such as automated directory assistance, address entry, understanding newscasts, and the handling of conversational speech in mobile devices. In addition to exploring key advances in core speech recognition technology, performance results from real-world deployments will be presented and analyzed to show trends in automation rates, usage patterns, and where the technological gaps arise.



Using Phonetic Audio Search Technology to Streamline Video Search and Editing
Marsal Gavalda, Vice President of Incubation - Nexidia Inc.

The current explosive growth of online video necessitates improved ways to manage and navigate such audiovisual content. This presentation will discuss the role that semantic computing can play in the editing, publishing, syndication, and discoverability of online videos. The session will present a concrete, successfully-deployed application that applies speech, language, and semantic technologies to automatically convert a newscast from its full-length broadcast (long form) to segments containing single stories (Web clips) and prepares such clips for online publishing and semantically-aware syndication. Technologies discussed include: audio/speech, voice and music detection, phonetic indexing and search, language/semantic technologies, document classification, and search query analysis and validation.


Performance of Advanced Speech Technology
Vlad Sejnoha, Chief Technology Officer - Nuance Communications

Steady advances in core recognition technology over the past decade have resulted in an impressive expansion of speech application capabilities. The next few years promise to be even more dramatic, with speech interfaces poised to jump to new levels of power and usability. This presentation will explore factors shaping the next generation of speech technology, including:
• The unprecedented quantity of user data being generated by current deployments;
• The difference between core and user-perceived accuracy;
• The drive toward lowering application development and maintenance cost;
• Novel and challenging use cases such as Web search and voicemail-to-text; and
• New demands on ASR flexibility stemming from deployments in distributed mobile environments.

Attendee Lunch Sponsored by VoiceObjects
12:00 p.m - 1:15 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.

C103 – Success With Machine Translation
1:15 p.m - 2:00 p.m
MODERATOR: K.W. 'Bill' Scholz, President - NewSpeech LLC

As the importance of timely access to information grows, the necessity for overcoming language barriers becomes ever more prominent. Increased use of statistically based computational methods, together with techniques evolved from coupling speech recognition and understanding, has created solutions able to perform speech-to-speech translation in near real time. This session explores key research in this area and provides cogent examples of real successes in machine translation.



Speech Technology and Automated Translation: A Winning Combination
Hannah Grap, Marketing Communications Manager - Language Weaver, Inc.

Speech technology and translation technology have advanced to new levels in recent years, and it is now possible to deliver real-time human communication across multiple languages. This presentation will look at existing speech and automated translation solutions for video broadcast monitoring and speech-to-speech translation, along with requirements for success to ensure usable translations of spoken information. New opportunities for pairing these technologies in customer support and intelligence applications will also be discussed.


Translingual Automatic Language Exploration
Imed Zitouni, Senior Research Scientist, IBM Research - IBM

People are interested in easily searching and monitoring a wide range of foreign media in real time without language barriers. IBM made a big step toward this goal with the development of the Trans-lingual Automatic Language Exploration System, codenamed TALES. TALES performs video capture, automatic speech-to-text conversion, machine translation of foreign text to English, and information extraction. This presentation shows some of the technical challenges in building the TALES platform with an emphasis on cross-lingual information propagation and name spelling normalization to improve machine translation and information extraction components.

C104 – Advanced Dialogue's Growing Sophistication
2:15 p.m - 3:00 p.m
MODERATOR: K.W. 'Bill' Scholz, President - NewSpeech LLC

Speech applications continue to grow in complexity and sophistication, requiring both the deepening and broadening of context to meet the requirements for troubleshooting, call routing, and detailed information retrieval. To meet these needs, the dialogue design process has grown significantly in sophistication, requiring the integration of powerful data, grammar, and code infrastructures that support robust and flexible interaction—even utilizing inference and reasoning capability borrowed from the AI community. This session explores instances of this growing sophistication in dialogue design and looks at the evolution toward an integrated model for advanced dialogue services supported by a major standards organization.



More Than Call Steering - Managing Dynamic Contextual Complexity
Phillip Hunter, User Experience Designer, Microsoft Tellme - Microsoft

The last few years have seen growth in the need for highly complex speech applications characterized by either very deep (insurance or troubleshooting) or very broad (call routing and stocks) contexts. But when customer and caller demands call for more complexity with simpler access, interface and application design and construction must adapt accordingly. Drawing on a customer deployment involving call steering and multiple other applications, SpeechCycle discusses advances in powerful data, grammar, and code infrastructures that support robust and flexible interactions allowing callers to request hundreds of unique tasks, some of which involve dozens of turns and variously arranged contexts.


Architectures for Advanced Dialogs
Mr. David L Thomson, PMTS - AT&T Labs

The VoiceXML Forum Tools Committee is exploring ways to introduce standards to development of advanced speech services. (We define "advanced" as a system endowed with a higher level of reasoning than typically exists in menu-based or finite state machines.) This has been an area of active research and new methods have been deployed in impressive trials and services. Through a series of conference calls and workshops, we have been reviewing architecture options and are exploring what new or existing standards can be used. By standardizing system components, we hope to help reduce the development cost for building advanced speech services. The presentation will review methods for supporting advanced dialogues and will outline proposed architectures and standards under consideration.

C105 – Natural Language Processing Techniques
3:15 p.m - 4:00 p.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - ATX Group, Inc.

This session will examine robust linguistic dialogue tools applicable to voice application development. Proper use of these newly developed tools makes it easy and cost-effective to handle caller queries and execute transactions in a natural language fashion. Additionally, this session will discuss adaptive techniques designed to dynamically adjust prompting in terms of speaking rate and content. Ultimately, applying the methodology and products discussed in this session results in an improved user experience, better automation rates, and increased IT efficiency.



Improved Customer Experience with a Natural Language Solution
Peter Trompetter, Vice President, Global Development - GyrusLogic

The presentation will emphasize the benefits of developing ASR/VoiceXML applications with robust artificial intelligence (AI) tools that make it easy and cost-effective to answer a caller’s questions and execute transactions in a conversational, natural language fashion. The solution presented is a linguistic AI conversational dialogue product suite that complements ASR and/or VoiceXML application development to achieve first-contact resolution for the caller.


Adaptive Audio IVR Software For IT Efficiency
Daniel O'Sullivan, President/CEO - VUI Cloud

Adaptive audio IVR software can dynamically adjust the speaking rate (in words per minute) and audio message content of voice applications based on individual caller skills. This personalizes the call experience as it happens, creating more responsive and more productive customer experiences. The process emulates what humans do naturally to communicate more effectively with each other during normal conversation. The benefits of adaptive audio include decreased average handle times and increased average handle rates, ultimately resulting in increased customer satisfaction.

C106 – Virtualization and Intelligent Integration
4:15 p.m - 5:00 p.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - ATX Group, Inc.

Large scale deployments are evolving through virtualization and intelligent integration of multiple speech technologies. The idea is to enable each deployed speech application to operate in an independent space with its own speech resources, configuration and reporting. Virtualization is ideal for large companies, carriers and hosting and any environment in which there are many applications because it provides a natural efficiency for developers, user interface designers, and operational staff. Virtualization simplifies the deployment of multiple speech recognizers, multiple development tools, multiple applications, and teams of developers.



IVR Virtualization - Platforms for Parallel Speech Development and Deployment
Rob Kassel, Vice President, Marketing and Product - Holly Connects

Virtual IVR technology enables each speech application deployed to a platform to operate in an independent space with its own speech resources, configuration, and reporting. Virtualization is ideal for large companies, carriers and hosting, and any environment in which there are many applications because it provides a natural efficiency for developers, user interface designers, and operational staff. Virtualization simplifies the deployment of multiple speech recognizers, multiple development tools, multiple applications, and teams of developers.


The Challenges of Mixing Speech Technologies in the Development and Deployment of Speech Applications
Dr. Nava Shaked, CEO - Brit Business Technologies Ltd (BBT)

We are often asked by our customers both to enhance system efficiency and to maximize results. These goals can be reached by combining two or more speech technologies such as speech recognition and speech biometrics. Successful cross-technology integration requires not only the core technology interface but also development of the appropriate voice user interface. These will be discussed in the presentation by illustrating real customer problems and issues with some solutions suggested. In addition, the presentation will address regulation and business considerations that result from this approach.

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m

TRACK D: DEVELOPMENT AND DEPLOYMENT
Keynote: Speech in the Mainstream and Beyond
9:00 a.m - 10:00 a.m
Ray Kurzweil, Author of The Age of Spiritual Machines: When Computers Exceed Human Intelligence and The Age of Intelligent Machines

We're living in a world of rapid technological innovation and increasing pervasiveness of information technology.  Consumers not only appreciate these advances in their professional and personal lives, they are demanding it. What does this mean for those who are driving development and acceptance of speech technology? Glean insight from one of the most distinguished speech technology innovators of our time, Ray Kurzweil, who will share his views on how speech technology will work in conjunction with other emerging technologies to bring us to an age of intelligent machines.


D101 – Design Methodologies and Tools
10:15 a.m - 11:00 a.m
MODERATOR: Ingmar Kliche, Project Manager - T-Systems Enterprise Services GmbH

A wrench in hand does not make you a mechanic, a hammer does not make you a carpenter, Microsoft Word does not make you a novelist, and speech development tools do not make you a successful speech application developer. In the attempt to realize the promise of speech, our industry has made a continual effort to simplify speech application development and move it from a small number of specialists to a wider development audience. How can systematic design methodologies and usable tools decrease the time and effort to create useful user interfaces? How can developers select tools that will hide complexity yet assist the developer to create world-class VUIs?



A Multimodal Interface for Call Center Agents
Dr. Matthew Yuschik, Executive Board Member - AVIOS

How can systematic design methodologies decrease the time and effort to create useful multimodal user interfaces? This presentation describes developer experiences designing a multimodal user interface using a standard design methodology, modified to include special steps for designing the voice and graphical components. It also describes a formal procedure for evaluating and selecting voice-enabled features to include with the GUI, and discusses product development stages and test results.


It’s All Connected: The Dysfunctional Relationship Between the Toolbox and Successful Speech Applications
Leah Eyler, Speech Application Consultant, Customer Interactive Solutions - Dimension Data

Today speech application developers have a mixed bag of tools targeted at a range of skills, resulting in a range of solutions, but very few that support a comprehensive approach to development. This presentation will discuss the evolution of speech application development tools and the gaps created by using disjointed technologies. Because of the connections between VUI, development, QA, and analysis, there is a need for a new generation of tools that reduce the overall complexity of delivering speech applications and strike the right balance between giving customers control over an application and maintaining its integrity. Learn what to consider when selecting development tools.

D102 – Speaking and Listening to Mobile Devices
11:15 a.m - 12:00 p.m
MODERATOR: Dr. Ahmed Bouzid, Senior Director of Product Management - Angel.com Incorporated

Speech technologies will be an integral part of the user interface on cell phones, PDAs, and other mobile devices. This session also discusses how to collect and use customer information to design and improve user interfaces for speech applications.



Steps to Determine Multimodal Mobile Interactions
Dr. Matthew Yuschik, Executive Board Member - AVIOS

By observing call center agents using a multimodal tool that overlays speech onto an existing GUI, multimodal procedures are identified that reduce the complexity and increase the efficiency of the transaction. This process becomes an early step in migrating an agent-enabled transaction in the call center to a self-service transaction performed by the caller on their mobile hand-held device. Yuschik describes the steps in the design procedure that highlight how call center agents become a testbed for end-user multimodal UIs.


Delivering High-Speed Customer Service
Chris Weeks, Division Vice President, Customer Care - Comcast

Delivering customer service in the most effective and efficient way is necessary for a successful customer service platform. Developing applications that identify call types, patterns, and repeat customers drives customer calls to quick resolutions. This session will explore using speech applications, customer surveys, and caller analysis to understand the voice of the customer and meet customer needs in a high-speed environment.

Attendee Lunch Sponsored by VoiceObjects
12:00 p.m - 1:15 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.

D103 – New Standard Languages for Developing Speech Applications
1:15 p.m - 2:00 p.m
MODERATOR: Yves Normandin, CEO - Nu Echo Inc.

The World Wide Web Consortium (W3C) will soon publish recommendations for two new standard XML languages for developing speech applications. The Pronunciation Lexicon Specification is a standard XML interface for speech recognition engines and speech synthesis engines to access lexicons of words and their pronunciations. The State Chart XML (SCXML) describes an XML syntax for representing dialogue flow control using the popular state chart notation. SCXML will be an important component of the forthcoming VoiceXML 3.0 as well as in a variety of nonspeech applications.



How to Improve TTS and ASR Performance
Paolo Baggia, Director of International Standards - Loquendo

Accurately specifying the correct pronunciation is critical to the success of many speech applications, especially multilingual ones. For example, the incorrect pronunciation of proper names and place names can confuse and mislead users. The Pronunciation Lexicon Specification (PLS) is a new W3C voice browser standard designed to enable interoperable specification of pronunciation information for both ASR and TTS engines. The presentation will demonstrate how PLS enables the efficient development and deployment of speech applications that share pronunciation lexicons. Extensive examples and case studies will be presented.
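
As an illustration of the kind of shared lexicon described above, here is a minimal Python sketch that builds a PLS 1.0 document with the standard library. The helper name `build_lexicon` and the two sample entries are assumptions made for this example and are not taken from the presentation; the element names (`lexicon`, `lexeme`, `grapheme`, `phoneme`) and the namespace come from the W3C specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C Pronunciation Lexicon Specification 1.0.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
# Clark-notation name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_lexicon(entries, lang="en-US"):
    """Return a PLS document (as a string) for (grapheme, IPA phoneme) pairs.

    `build_lexicon` is a hypothetical helper written for this sketch.
    """
    # Serialize PLS elements in the default namespace (no prefix).
    ET.register_namespace("", PLS_NS)
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", XML_LANG: lang},
    )
    for grapheme, phoneme in entries:
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = phoneme
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Illustrative place names whose spelling is ambiguous for TTS and ASR.
    print(build_lexicon([("Nice", "niːs"), ("Reading", "ˈrɛdɪŋ")]))
```

Because the same file can be referenced by both the recognizer and the synthesizer, a correction made once in the lexicon applies to both engines.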


Modular Data Components in SCXML
James Barnett, Director - Alcatel Lucent

The State Chart XML (SCXML) describes XML syntax for representing dialogue flow control using state chart notation (an extension of state transition diagrams). SCXML will be an important component of the forthcoming VoiceXML 3.0 and will also be used for a variety of non-speech applications. The latest W3C working draft of the SCXML specification provides a modularization of the language. This talk will provide an overview of this modularization with particular emphasis on its pluggable data model. Specifically, platforms can plug in different data languages, such as XML and ECMAScript, to suit different applications.

D104 – Advanced Techniques for Using Grammars
2:15 p.m - 3:00 p.m
MODERATOR: Rob Kassel, Vice President, Marketing and Product - Holly Connects

Developers must provide grammars that describe the words and phrases that speech recognition engines listen for, thus increasing the speed and accuracy of the speech recognition engine. This session addresses two problems with managing grammars: testing grammars that are generated dynamically from information in databases and other sources of changing information, and adjusting the weights of grammar items without thousands of transcriptions.



Developing Robust and Efficient Dynamic Speech Recognition Grammars
Dominique Boucher, Lead Software Developer - Nu Echo Inc.

The W3C standard ABNF grammar format is extended for the design and implementation of dynamic grammars. Using this approach, dynamic grammars can be viewed as a seamless evolution of existing static grammars instead of a separate set of resources in a project. These dynamic grammars can be edited, tested, and debugged using the same set of tools that help a speech scientist build effective static grammars, including a feature rich grammar editor, coverage tools, phrase interpretation, semantics stepping, intelligent sentence generation, and so on.
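
To make the idea of a dynamic grammar concrete, the following Python sketch generates a plain W3C SRGS grammar in ABNF form from a list of phrases, such as names pulled from a database at run time. It does not reproduce the ABNF extension described in the abstract, whose syntax is not given here; the function name `build_abnf_grammar` and the sample phrases are assumptions for illustration only.

```python
import re

# Accept only simple words separated by single spaces, so that no phrase
# can inject ABNF syntax (quotes, semicolons, rule references) into the grammar.
_SAFE_PHRASE = re.compile(r"^[A-Za-z]+( [A-Za-z]+)*$")


def build_abnf_grammar(rule_name, phrases, lang="en-US"):
    """Return SRGS ABNF text with one public rule listing `phrases` as alternatives.

    `build_abnf_grammar` is a hypothetical helper written for this sketch.
    """
    if not phrases:
        raise ValueError("at least one phrase is required")
    alternatives = []
    for phrase in phrases:
        normalized = " ".join(phrase.split())  # collapse stray whitespace
        if not _SAFE_PHRASE.match(normalized):
            raise ValueError(f"unsupported characters in phrase: {phrase!r}")
        alternatives.append(normalized)
    body = "\n    | ".join(alternatives)
    return (
        "#ABNF 1.0 UTF-8;\n"
        f"language {lang};\n"
        "mode voice;\n"
        f"root ${rule_name};\n"
        "\n"
        f"public ${rule_name} =\n"
        f"      {body};\n"
    )


if __name__ == "__main__":
    # In a real deployment these phrases might come from a database query.
    print(build_abnf_grammar("city", ["New York", "Boston", "San Francisco"]))
```

Because the output is an ordinary grammar file, it can be checked with the same editing and coverage tools used for static grammars, which is the testing concern the session raises.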


Leveraging Existing Reporting to Rapidly Tune Speech Applications
Pete Slabek, Voice Portal Operations Specialist - United Healthcare

A common misconception in speech is that one can only tune an application with thousands of transcriptions. When standard transcribed data is not available, algorithms can generate grammar weights from standard reporting data. This presentation will cover techniques for using non-traditional data to deliver a rapid round of tuning.
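
The abstract does not say which algorithm is used, so the Python sketch below shows only one simple possibility: turning per-option selection counts from an IVR report into smoothed relative-frequency weights, formatted with the `/weight/` syntax that SRGS ABNF grammars use for alternatives. The function names, the smoothing constant, and the sample counts are all assumptions for illustration.

```python
def weights_from_counts(counts, smoothing=1.0):
    """Convert selection counts into weights that sum to 1.

    Additive smoothing keeps rarely chosen options recognizable instead of
    assigning them a weight of zero. This is an assumed, simplified scheme.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    if smoothing <= 0:
        raise ValueError("smoothing must be positive")
    total = sum(counts.values()) + smoothing * len(counts)
    return {option: (n + smoothing) / total for option, n in counts.items()}


def weighted_alternatives(weights):
    """Format weights as SRGS ABNF alternatives, most likely option first."""
    ordered = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return " | ".join(f"/{weight:.2f}/ {option}" for option, weight in ordered)


if __name__ == "__main__":
    # Hypothetical menu-selection counts taken from a standard IVR report.
    report = {"billing": 6, "claims": 3, "benefits": 0}
    print(weighted_alternatives(weights_from_counts(report)))
```

Weights derived this way are only a starting point; they reflect what callers selected, not what they said, so they are best treated as a first tuning pass to be revisited once transcriptions are available.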

D105 – Integrating Speech Technologies with Enterprise Applications
3:15 p.m - 4:00 p.m
MODERATOR: Rob Kassel, Vice President, Marketing and Product - Holly Connects

Services-oriented architecture (SOA) will be a significant area of technology investment for enterprises. New technologies make it possible for contact centers to tap into the SOA to drive more advanced speech applications with shorter development times and less risk. This session will focus on the technical standards and practices that allow speech-enabled IVRs to benefit from existing and upcoming SOA investments. The session will also emphasize practical tips and real-world examples.



Integrating Speech Applications with Enterprise IT Assets Using Web Services
Michael Codini, Chief Technical Officer & Co-Founder - VoiceObjects, Inc.

One shortcoming of many speech applications deployed today is the lack of integration between these applications and the rest of an enterprise’s IT infrastructures. As a result, valuable opportunities to enhance speech applications using intelligence contained in an organization’s Web, CRM, and other IT assets are lost and IT efficiency suffers. Web services integration between speech applications and other enterprise IT assets can remedy this situation. Learn how Web services can be applied to call center and other speech-enabled environments and how to fully leverage CRM, BI, and other IT assets via Web services.


How Speech Recognition Works in the Service-Oriented Architecture
John Oh, Technical Lead, Customer Contact Business Unit - Cisco

According to Gartner, spending on service oriented architecture is expected to grow from $14 billion in 2005 to $189 billion in 2009, making it the single most significant area of technology investment for the enterprise. New technologies make it possible for contact centers to tap into the SOA to drive more advanced speech applications with shorter development times and less risk. This session will focus on the technical standards and practices that allow a speech-enabled IVR to tap into the benefits of an existing or upcoming SOA investment. The presentation will include practical how-tos and real-world examples drawn from the experiences of two large financial services companies.

D106 – The Impact of W3C Standard Languages
4:15 p.m - 5:00 p.m
MODERATOR: James Barnett, Director - Alcatel Lucent

The publication of W3C standard languages, such as VoiceXML and CCXML, has dramatically changed the speech application design process. This session discusses some of the efforts to extend and validate the use of standard languages. Learn how the call control language can work with SIP and VoIP to implement an extensible SIP softswitch. Discover how the VoiceXML Forum’s certification program has affected cross-vendor interoperability among VoiceXML platform vendors.



SIP Applications Using CCXML
R.J. Auburn, Chief Technology Officer - Voxeo

Developers are increasingly using the W3C Call Control language, CCXML, to add call control features to their telephony applications. At the same time, Session Initiation Protocol (SIP) and Voice over IP (VoIP) have found widespread acceptance and deployment in everything from long distance networks to enterprise call centers and consumer telephony. Learn how SIP, VoIP, and CCXML can work together in next-generation telephony deployments, plus the advantages and disadvantages of combining them. Learn how CCXML can be used to implement an extensible SIP softswitch.


VoiceXML Platform Certification: What Every Customer Should Know
Ken Rehor, Voice Technology Group - Cisco

VoiceXML platform certification has helped to stabilize and mature the speech and contact center industry by encouraging the cross-vendor interoperability of platforms, tools, and applications. Along the way, certification has become a key speech system RFP criterion for savvy companies adopting speech. This presentation will discuss why VoiceXML platform certification is an indispensable requirement of a complete speech system specification. It will also detail the pitfalls that customers may encounter in choosing self- or vendor-certified platforms. A summary of what is covered and what is not covered by the VoiceXML platform certification will be presented.

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m




