SpeechTEK 2007 - Advance Program
New York Marriott Marquis • New York, NY
A CD-ROM is available for purchase through The Digital Record,
featuring audio and supplemental materials (such as PowerPoint slides) for most of the sessions at SpeechTEK 2007.
Visit www.digitalrecord.org to purchase the SpeechTEK 2007 Conference CD, or call toll-free 1-800-338-2111.
SpeechTEK 2007 - Monday, August 20
TRACK A: MEETING BUSINESS GOALS WITH SPEECH
Soho (7th Floor)
Speech Technology at Google
(Broadway Ballroom)

9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google


Hear about Google’s vision for speech technology. Following months of development and speculation, Google recently released its first speech application, GOOG411. Mike Cohen will describe Google’s experience with GOOG411, discuss Google’s general philosophy and approach to speech services, and review some of the lessons learned thus far. 

A101 – Speech & Self-Service Strategy
10:15 a.m - 11:15 a.m
MODERATOR: Ron Owens, Director, Multimedia Applications PSO - Nortel


Speech-enabled applications in the call center make a myriad of self-service options available to the end user. However, the idea of “if we build it, they will come” has proven false for many organizations deploying speech. Why are some speech applications well-tolerated and some avoided at all costs? What are the factors that cause users to abandon automated systems in favor of live agents? Experts in this session consider speech technology as a part of an overall self-service strategy. Learn techniques for strategic planning, data collection, and analysis that will help create self-service applications that end users actually want to use.

How to Increase Self-Service Containment Without Sacrificing Customer Satisfaction
Nancy Gardner, Senior Engineer, Contact Center Services - Verizon Business

Want to know why callers abandon automated systems? Ask them. At the main transfer points, we asked callers to state the reason for their call. By matching what callers told us to the self-service options they chose, we discovered key application performance issues that led to changes in design, verbiage, and the introduction of “supercharged” grammars. A rough illustration of this kind of analysis appears below.
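
The sketch below (in Python) cross-tabulates caller-stated reasons against the self-service options callers actually chose. The field names, categories, and data are invented for illustration and are not Verizon's actual data model.

```python
# Illustrative only: cross-tabulating stated call reasons against chosen options.
from collections import Counter

calls = [
    {"stated_reason": "billing question", "option_chosen": "pay_bill", "abandoned": False},
    {"stated_reason": "billing question", "option_chosen": "agent", "abandoned": True},
    {"stated_reason": "tech support", "option_chosen": "agent", "abandoned": True},
    {"stated_reason": "tech support", "option_chosen": "troubleshoot", "abandoned": False},
]

# How often does each stated reason end up at each self-service option?
crosstab = Counter((c["stated_reason"], c["option_chosen"]) for c in calls)

# Reasons that frequently escape to an agent point at menu wording or grammar
# coverage worth revisiting.
escapes = Counter(c["stated_reason"] for c in calls if c["option_chosen"] == "agent")

for (reason, option), n in sorted(crosstab.items()):
    print(f"{reason:20s} -> {option:15s} {n}")
print("Top agent-escape reasons:", escapes.most_common(3))
```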

Organic Growth Through Speech: Cross-Selling & Up-Selling
Dr. Lizanne Kaiser, Sr. Principal Business Consultant - Genesys

How do you grow customer relationships when so many calls are automated? How do you convert service into sales without annoying customers? Explore best practices for promoting organic growth and customer loyalty using speech-automated cross-selling and up-selling. Learn specific techniques for designing timely and relevant offers.

Defining a Telephony Self-Service Strategy
Tony Lorentzen, Vice President, Consulting Services - Viecore

This session looks at defining a self-service strategy from a holistic perspective: externally from the consumer’s perspective and internally from the business and technical perspective. Learn how to find the pitfalls in the design of a self-service strategy, how to meet the objectives of consumers and call center business and technical teams, and how to use technology to meet the objectives of self-service.

A102 – Beyond Usability: How Good Is Your Speech Application?
11:30 a.m - 12:30 p.m
MODERATOR: Phillip Hunter, User Experience Designer, Microsoft Tellme - Microsoft


Usability is widely recognized as a measure of the quality of a voice user interface, and usability testing is a must-have in all VUI design projects. But does usability tell the whole story? These experts agree that excellent speech applications are more than just easy-to-use. In this session, hear cutting-edge ideas about what to measure beyond usability and how it can improve your speech application.

Beyond Usability: It Ain’t the Only Outcome that Matters!
Dr Melanie Polkosky, Human Factors Psychologist/Consultant - IBM

You’ve heard it over and over again, you’ve tested for it, you’ve thought about it, you’ve designed your application to get it. But when is usability not enough? This session focuses on usability plus other outcomes you need to consider when you’re designing your next application.

Beyond Usability: How Good Is Your Speech Application?
Dr. Silke Witt-Ehsani, Vice President, VUI Design Center, Design Center - TuVox

This presentation offers an overview of best practices for a) how to define speech application success criteria; b) how to instrument a speech application so that the desired numbers can be measured; and c) how success criteria influence the application design. Examples will be shown using several case studies in which different success criteria have greatly influenced the final application.

Attendee Lunch
12:30 p.m - 1:45 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.


More of the world is going mobile and a new generation, the mobile generation, is using their wireless phones for more than just voice communication. The recent introduction of the iPhone is one such example. In this lunch presentation, Beatriz Infante, CEO of VoiceObjects, will introduce you to this mobile generation and show the next generation of applications they expect, not just on the iPhone but on every phone.

A103 – Success Criteria for the Speech Customer Experience
1:45 p.m - 2:45 p.m
MODERATOR: Dr. Lizanne Kaiser, Sr. Principal Business Consultant - Genesys


How do you know if your speech application is living up to your objectives? Is the application meeting the goals you set when you started the project? You’ll only know the answer to these crucial questions if you establish success criteria, tied to specific metrics, before the project begins. In this session, learn how to develop rigorous, meaningful criteria that will allow ongoing evaluation and improvement of your speech applications.

Success Criteria for the Speech Customer Experience
Carrie Nelson, Principal Speech Software Engineering Consultant - Avaya

What defines a successful speech application? The answer may involve many different elements. Some are measurable analytics, and other aspects are more qualitative, such as caller satisfaction and customer perception. Further, success criteria definitions are not the same for every application. The key challenge is to clearly identify early on the business goals from the customer perspective and use them to drive the definition of success metrics.

Measuring Speech Applications from a Caller Perspective and a Business Perspective: Four Dimensions of Success
Scott Taylor, General Manager, Business Consulting - Nuance Communications, Inc.

In this session we’ll examine key dimensions of success for speech applications: effectiveness, efficiency, utility, and attractiveness. We’ll examine some of the successful methods customers have employed for measuring these dimensions, including both data-based measurement and experiential measurement through direct customer feedback. We’ll also review strategies for migrating from “the old metrics” to the new metrics.
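
A minimal sketch of the data-based side of such measurement, assuming a hypothetical call-log layout; the metrics shown are only simple proxies for some of these dimensions, not the speakers' definitions.

```python
# Hypothetical call records: completion flag, call duration, post-call survey score.
calls = [
    {"task_completed": True,  "duration_s": 95,  "survey_score": 4},
    {"task_completed": False, "duration_s": 210, "survey_score": 2},
    {"task_completed": True,  "duration_s": 120, "survey_score": 5},
    {"task_completed": True,  "duration_s": 80,  "survey_score": 4},
]

effectiveness  = sum(c["task_completed"] for c in calls) / len(calls)   # task-completion rate
efficiency     = sum(c["duration_s"] for c in calls) / len(calls)       # mean handle time (seconds)
attractiveness = sum(c["survey_score"] for c in calls) / len(calls)     # mean post-call rating

print(f"effectiveness {effectiveness:.0%}, efficiency {efficiency:.0f}s, "
      f"attractiveness {attractiveness:.1f}/5")
```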

A104 – New Business Models for Speech
3:00 p.m - 4:00 p.m
MODERATOR: Gregory Simsar, Vice President, Speech Services - Syntellect, Inc.


In years past, the decision to deploy speech was all about cost reduction—companies used speech applications to offload tasks from more-expensive live agents. Many organizations are realizing that this simplistic model does not always work as advertised, and that speech can do more than just reduce costs. Experts in this session detail new ideas for maximizing the value of speech applications and using speech for more than cost savings.

Innovate or Saturate: Applying the Web Model of Innovation to Speech
John Amein, Senior Vice President, Strategic Partnerships - Voxeo, an Aspect Company

To reach its full potential, speech must enable more than higher automation rates in traditional IVR applications. Triggered by maturing standards and a broadening audience of developers, a new movement of creative speech development is emerging as a significant market segment. Learn how the Web model of innovation has been applied to speech applications.

Role of Speech Recognition in Free Directory Assistance
John Roswech, Senior Vice President of Sales - Jingle Networks, Inc.

With 411 fees rising to $2 or more per call, 1-800-FREE411’s ad-supported free directory assistance has saved millions of consumers millions of dollars in needless charges. With higher success rates and lower costs than before, speech recognition is critical to 1-800-FREE411’s caller experience, making free 411 an exciting new media opportunity.

A105 – Simulating the Personal Touch
4:15 p.m - 5:00 p.m
MODERATOR: John Roswech, Senior Vice President of Sales - Jingle Networks, Inc.
Debbie Harris, Vice President - Ayalogic
Albert Kooiman, Group Product Manager, Unified Communications - Microsoft
Brad Schorer, Senior VP Marketing & Business Development - VoltDelta


Sixty percent of calls fail to achieve productive results. Incessant routing by automated systems keeps callers longing for the good old days of talking to human agents. How can we make good use of automation without losing the personal touch that’s so important to customers? In this session, panelists consider all customer communications as one flow, fusing contact with live agents with automated processes. Attendees will learn from the panelists’ real-world experiences about how customer service organizations are using new technologies to bridge the human-automation divide.

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m

TRACK B: VUI FOR VUI DESIGNERS
Empire (7th Floor)
Speech Technology at Google
(Broadway Ballroom)

9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google


Hear about Google’s vision for speech technology. Following months of development and speculation, Google recently released its first speech application, GOOG411. Mike Cohen will describe Google’s experience with GOOG411, discuss Google’s general philosophy and approach to speech services, and review some of the lessons learned thus far. 

B101 – Whose VUI Is It, Anyway? User Versus Business Requirements
10:15 a.m - 11:15 a.m
MODERATOR: Jenni McKienzie, Voice Interaction Designer - SpeechUsability


A voice user interface is a balancing act between the goals of the business and those of the end users. These goals are often in conflict—businesses want to push more calls to self-service, users want total access to live agents—often to the detriment of the success of the application. When should user requirements win out? In what cases are business requirements more important? The experts in this session provide the knowledge you need to answer these questions.

Customers Request the Darndest Things: 10 Challenges for VUI Designers
Eduardo Olvera, Sr. Manager & Global Emerging Technology Lead, UI Design, Professional Services - Nuance Communications, Inc.

Business owners have business goals, objectives, and requirements. Designers bring experience and advocate user needs throughout the design process. So how can we create outstanding experiences when objectives may seem to clash or customers have preconceptions about “how the system should work”? Explore some common challenges, understand the real issues behind resistance, and discover how to focus instead on successful systems.

Successfully Combining User & Business Goals
Erin Smith, Senior VUI Designer - Genesys

By the time an application has the go-ahead from executives, requirements are driven by the business and not the caller. Learn how to find out who the caller really is and how to take several steps back to design for the true caller, so your application is actually used and liked. Business requirements are important, but it’s essential to find the right balance.

B102 – Usability Surveys: Practical Techniques
11:30 a.m - 12:30 p.m
MODERATOR: Susan L. Hura PhD, Principal - SpeechUsability
Peter Leppik, President and CEO - Vocal Laboratories Inc.


Surveys are an important method of getting opinion feedback from users of speech applications. At best, surveys provide quantifiable data that clarifies user opinions, but many do-it-yourself surveys do not achieve this result. In this session, you will learn how to craft surveys that deliver reliable, accurate data to improve the performance of your speech application. Attendees will gain a basic understanding of survey theory, methods, techniques, and analysis.

Attendee Lunch
12:30 p.m - 1:45 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.


More of the world is going mobile and a new generation, the mobile generation, is using their wireless phones for more than just voice communication. The recent introduction of the iPhone is one such example. In this lunch presentation, Beatriz Infante, CEO of VoiceObjects, will introduce you to this mobile generation and show the next generation of applications they expect, not just on the iPhone but on every phone.

B103 – Controlling Prompts for Maximum Usability
1:45 p.m - 2:45 p.m
MODERATOR: Erin Smith, Senior VUI Designer - Genesys
Tom Houwing, Director - voiceandvision B.V.


Prompts are at the heart of any VUI design. The embodiment of the sound and feel of the application, prompts convey both affective and informational content. In a very real sense, the usability of a speech application is largely determined by the quality of its prompts. Crafting effective prompts is a creative and scientific endeavor, requiring a diverse skill set. This expert VUI designer outlines an approach for writing, recording, coaching, and processing prompts to ensure the highest quality possible.

B104 – You Be the Expert! Speech and the End-to-End Customer Experience
3:00 p.m - 4:00 p.m
MODERATOR: Dr Melanie Polkosky, Human Factors Psychologist/Consultant - IBM
Dr. Lizanne Kaiser, Sr. Principal Business Consultant - Genesys


Come share your experience! Audience members will actively participate in this session, sharing insights and anecdotes on the do’s and don’ts of how to use speech automation to create a better end-to-end customer experience. End users don’t evaluate speech automation in isolation—they view it as part of an integrated customer service chain. So in designing the optimal VUI, it’s important to take into account what might happen before, during, and after the automated speech interaction in order to create a seamless customer experience.

B105 – Communication Strategies for Speech Projects
4:15 p.m - 5:00 p.m
MODERATOR: Judi Halperin, Principal Consultant and Team Lead, Global Speech Engineering - Avaya Inc.


Speech projects always involve multiple contributors, often with diverse backgrounds and differing levels of understanding of project goals and speech technology itself. The voice user interface designer often sits squarely in the middle of a group of project sponsors, developers, call center and telephony managers, and others who have a stake in the success of a speech project. In this session, experts suggest effective techniques for facilitating communication both within the team delivering the speech application and between the team and project sponsors.

Does Your Customer Know What You Are Doing?
Dr. Maria Aretoulaki, Director / VUI Design Consultant - DialogCONNECTION Ltd (UK)

This presentation stresses the importance of incremental and modular descriptions of system functionality for targeted and phased reviews and testing. This strategy ensures clarity, consistency, and maintainability beyond the project lifetime and eliminates the need for changes mid-project, thus both managing customer expectations and protecting the service provider from ad hoc requests.

The Habits of Highly Effective Speech Development Teams: What You Don’t Know Might Be Hurting Your Projects
Dr Melanie Polkosky, Human Factors Psychologist/Consultant - IBM

Teaming is an essential, complicated, and stressful aspect of technology development. This session focuses on what makes a team function well, the most common teaming problems in speech projects, and ideas for troubleshooting to make your team highly effective!

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m

TRACK C: ADVANCED SPEECH TECHNOLOGY SYMPOSIUM
Shubert (6th floor)
Speech Technology at Google
(Broadway Ballroom)

9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google


Hear about Google’s vision for speech technology. Following months of development and speculation, Google recently released its first speech application, GOOG411. Mike Cohen will describe Google’s experience with GOOG411, discuss Google’s general philosophy and approach to speech services, and review some of the lessons learned thus far. 

C101 – Advances in Speech Recognition Processing
10:15 a.m - 11:15 a.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - Agero


Advances and improvements in core speech recognition technology are difficult to demonstrate, since accuracy is strongly dependent on the application, particular speakers, background noise, and other variables. Beyond accuracy, speech recognition technology can be improved by better handling of complex or “natural” dialogs. Audio channels and speech platforms are important components of today’s speech applications. In this session, speakers explore advances in core speech technology, audio channel processing, and speech platform integration, and go behind the scenes of Vista to expose interesting aspects of speech technology integration.

Speech Technology in Vista
Fil Alleva, Engineering GM, Speech @ Microsoft - Microsoft

Windows Speech Recognition (WSR) in Vista is a practical solution for speech-enabled access to Windows-based PCs for users who find keyboard and mouse interfaces to be less productive than they would like. The technology behind WSR includes automated personalization, the Microsoft Speech Recognizer, SAPI 5.3, the accessibility framework, the text services framework, and Windows Desktop Search, all employed to deliver the Windows Speech user experience.

Speech Processing for DSR Versus NSR
Veeru Ramaswamy, Chief Technology Officer - Vianix

There are two methods for compressing and transmitting digital speech for server-based automatic speech recognition. Distributed Speech Recognition (DSR) schemes gained popularity in the late 1990s due to limited data channel bandwidth. The evolution of higher-bandwidth channels and advances in voice compression now allow Network Speech Recognition (NSR) applications to achieve the speech recognition accuracy of DSR at similar bandwidth while providing additional benefits. This presentation compares voice-based NSR with feature-based DSR recognition schemes.

C102 – Advances in Text-to-Speech Processing
11:30 a.m - 12:30 p.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - Agero


Text-to-speech synthesis is getting better and more flexible, and it is now used globally in a wide spectrum of speech applications. Advances in standards have improved text-to-speech quality. The Speech Synthesis Markup Language (SSML) provides a standard way to control speech synthesis and text processing parameters. The Pronunciation Lexicon Specification (PLS) is designed to enable interoperable specification of pronunciation information. This session reviews some much-needed clarifications about how text in multiple languages should be annotated and describes work being done to link SSML and PLS more seamlessly.

Applying the Pronunciation Lexicon Specification to ASR & TTS
Patrizio Bergallo, Senior System Architect - Loquendo

Many speech applications demonstrate the need to define the pronunciation of certain words (for instance proper names, locations, etc.) or to expand acronyms/abbreviations, both for ASR and TTS usage. This presentation describes the W3C PLS (Pronunciation Lexicon Specification) that defines lexicon documents to be referenced by SRGS grammars and SSML prompts.
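
For readers unfamiliar with PLS, here is a small illustrative lexicon plus a few lines of Python that read it back, roughly as a grammar or prompt processor referencing the lexicon might. Element names follow the W3C PLS drafts; the words and pronunciations are placeholders, not examples from the talk.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"

lexicon_xml = f"""<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" xmlns="{PLS_NS}" alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Nguyen</grapheme>
    <phoneme>ŋwiən</phoneme>                     <!-- explicit pronunciation -->
  </lexeme>
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>     <!-- acronym expansion -->
  </lexeme>
</lexicon>"""

# Parse the lexicon back and list how each grapheme should be spoken.
root = ET.fromstring(lexicon_xml.encode("utf-8"))
for lexeme in root.findall(f"{{{PLS_NS}}}lexeme"):
    grapheme = lexeme.findtext(f"{{{PLS_NS}}}grapheme")
    said_as = lexeme.findtext(f"{{{PLS_NS}}}phoneme") or lexeme.findtext(f"{{{PLS_NS}}}alias")
    print(f"{grapheme} -> {said_as}")
```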

The Internationalization of the W3C Speech Synthesis Markup Language
Dr. Daniel C Burnett, Director of Standards - Voxeo, an Aspect Company

In SSML, how do you mark tones, or use pinyin for pronunciation, or indicate a change in language but not a change in voice? Learn about the changes in SSML that provide improved support for Mandarin, Cantonese, Japanese, Hindi, and other world languages. This session also explains multi-language annotation and how to link with PLS.
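
The sketch below illustrates the kind of markup in question. The <lang> element and the "x-pinyin" alphabet value reflect the internationalization work discussed here and should be read as draft-era illustrations, not settled SSML 1.0 syntax.

```python
import xml.etree.ElementTree as ET

# A multilingual prompt: the language changes for the station name, but the
# same voice keeps speaking. Pronunciation alphabet and text are illustrative.
ssml_prompt = """<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.1" xml:lang="en-US"
       xmlns="http://www.w3.org/2001/10/synthesis">
  Your train departs from
  <lang xml:lang="zh-CN">
    <phoneme alphabet="x-pinyin" ph="bei3 jing1">北京</phoneme>
  </lang>
  at nine o'clock.
</speak>"""

ET.fromstring(ssml_prompt.encode("utf-8"))   # confirm the prompt is well-formed XML
print(ssml_prompt)
```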

Attendee Lunch
12:30 p.m - 1:45 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.


More of the world is going mobile and a new generation, the mobile generation, is using their wireless phones for more than just voice communication. The recent introduction of the iPhone is one such example. In this lunch presentation, Beatriz Infante, CEO of VoiceObjects, will introduce you to this mobile generation and show the next generation of applications they expect, not just on the iPhone but on every phone.

C103 – Advances in Natural Language Processing
1:45 p.m - 2:45 p.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - Agero


The demand for natural language has reached an all-time high as directed dialog applications continue to be criticized for being inefficient and not flexible enough. There is little dispute that out-of-grammar handling is generally poor when an active grammar is large. In-grammar accuracy for extensive vocabularies has been achieved by using large amounts of speech data to extract statistical information to represent acoustical units. Likewise, statistical approaches have been applied to advance natural language understanding. Most recently, statistical approaches are being applied to voice interface design with the goal of improving user experience. This session reveals some exciting advances in natural language that will affect the future of the user experience.

Creating More Natural Language Interfaces Using Robust Parsing
Krishna Govindarajan, Speech Science Global Discipline Leader, Professional Services - Nuance Communications, Inc.

For current state-of-the-art speech recognition systems, in-grammar accuracy is quite good, especially for directed-dialog systems. However, due to the variability of how callers respond, a portion of utterances are not covered by the grammar, i.e., they are out-of-grammar (OOG). OOGs affect the “perceived” accuracy of the system and are one of the primary items addressed during tuning. This presentation discusses “near OOGs,” “far OOGs,” and related concepts.
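
One simplified way to picture the near/far distinction is to score utterances against the phrases a grammar covers, as in the sketch below. The distance measure and thresholds are illustrative assumptions, not Nuance's definitions.

```python
import difflib

grammar_phrases = {"pay my bill", "check my balance", "speak to an agent"}

def classify(utterance: str, near_threshold: float = 0.7) -> str:
    """Label an utterance as in-grammar, near-OOG, or far-OOG (toy heuristic)."""
    if utterance in grammar_phrases:
        return "in-grammar"
    best = max(difflib.SequenceMatcher(None, utterance, p).ratio()
               for p in grammar_phrases)
    return "near-OOG" if best >= near_threshold else "far-OOG"

for utt in ["check my balance", "check balance please", "where is my package"]:
    print(f"{utt!r:28s} -> {classify(utt)}")
```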

No Data Like More Data: Experimental Voice User Interface in Action
Roberto Pieraccini, Speech and Natural Language Technology Expert
Jonathan Bloom, Senior User Interface Manager - Nuance Communications, Inc.

Today we are extending the data exploitation paradigm to voice user interface (VUI) design. Statistics and machine-learning sciences are now complementing the art of designing the best prompts and interaction strategies with the goal of optimizing automation and improving user experience. Using a few case studies, this presentation shows how to “experimentally” choose among competing VUI designs without disrupting the user experience while optimizing global indicators of performance.
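
A minimal sketch of the kind of controlled comparison described: traffic is split between two candidate prompt designs and their task-completion rates are compared. The call counts are made up; the test is a standard two-proportion z-test.

```python
import math

completed_a, calls_a = 412, 500   # design A: completions / calls (invented counts)
completed_b, calls_b = 451, 500   # design B

p_a, p_b = completed_a / calls_a, completed_b / calls_b
p_pool = (completed_a + completed_b) / (calls_a + calls_b)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / calls_a + 1 / calls_b))
z = (p_b - p_a) / se

print(f"A: {p_a:.1%}  B: {p_b:.1%}  z = {z:.2f}")
# |z| > 1.96 would indicate a difference at roughly the 95% confidence level.
```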

C104 – Speech-to-Speech Translation
3:00 p.m - 4:00 p.m
MODERATOR: K.W.'Bill' Scholz, President - NewSpeech LLC


Recent innovative integration of recognition, translation, and synthesis technology has led to the realization of fully automatic speech-to-speech translation. This session explores the latest techniques for implementing automated language translation and considers the technology behind the integration: how to manage out-of-grammar responses, the effects of using robust parsing versus SLMs, and the incorporation of an open source speech analytics solution called Unstructured Information Management Architecture (UIMA).

Speech-to-Speech Infrastructure Based on UIMA
Jan Kleindienst, Manager, Conversational Interactions and Architect - IBM

This presentation shows a distributed infrastructure for integration of third-party recognition, translation, and synthesis technologies into speech-to-speech system combinations. The infrastructure is built over the Unstructured Information Management Architecture (UIMA), an open-source framework for speech analytics. The Web infrastructure has successfully been used for the remote automatic evaluation of speech-to-speech systems on a pan-European scale.
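
UIMA itself is a Java framework, so the sketch below is only a toy analogue of the pipeline idea: independent recognition, translation, and synthesis components adding annotations to a shared artifact as it flows through the system. All component behavior here is stubbed.

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:                        # loosely analogous to a shared analysis structure
    audio: bytes
    annotations: dict = field(default_factory=dict)

def recognize(a: Artifact) -> Artifact:
    a.annotations["transcript"] = "where is the train station"   # stub ASR
    return a

def translate(a: Artifact) -> Artifact:
    a.annotations["translation"] = "où est la gare"              # stub MT
    return a

def synthesize(a: Artifact) -> Artifact:
    a.annotations["tts_audio"] = b"<synthesized audio bytes>"    # stub TTS
    return a

pipeline = [recognize, translate, synthesize]
result = Artifact(audio=b"<captured audio>")
for stage in pipeline:
    result = stage(result)
print(result.annotations["translation"])
```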

Integrating Language Translation Software with Speech Recognition
Hannah Grap, Marketing Communications Manager - Language Weaver, Inc.

As automated language translation technology moves to statistically based computational methods, the timing is right to integrate language translation and speech recognition technologies. Case study examples and demos of existing integrated solutions will give the audience an overview of how to leverage speech applications across languages.

C105 – Voice Search
4:15 p.m - 5:00 p.m
MODERATOR: Dr. Thomas Schalk, Vice President, Voice Technology - Agero


Voice search is perhaps the hottest topic in recent speech deployments. Analogous to searching the Web with text, voice search can encompass a number of services, including directory search and searches for specific information, such as news or sports scores. What are the requirements for achieving effective dialogs when searching by voice? How does dynamic content, such as location-based ads, fit into the voice-user interface? What other analogies are there between voice searching and Web searching? This session is a must for those interested in learning about the trends in voice search.

Optimizing Software Architecture for Voice Search
Leo Chiu, Chief Technology Officer - Apptera

Voice search is very hard to do well when you consider the millions of different accents, behaviors, and speech patterns a software program would have to decipher. What is the best way to architect the solution so that it has the best chance of providing an effective consumer experience? What are the business considerations for making it work in the real world? In this presentation you will hear thoughts and learnings from the edge of the “voice search” frontier.

Data Mining for Voice Search
Charles Galles, Principal Member, Technical Staff - AT&T

Voice search topics and Web content change all the time. How can an architect prepare the recognizer to recognize fundamentally new words and topics? With all of the activity on the Internet, are there any useful data sources for recognizer training? This presentation will explore how the Web and other data sources may be leveraged to keep your voice search solution current.
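
As a rough sketch of the idea, the snippet below scans a small block of fresh text for frequent terms missing from a current recognition vocabulary, yielding candidates for lexicon and language-model updates. The text and vocabulary are invented.

```python
import re
from collections import Counter

current_vocab = {"weather", "sports", "movie", "restaurant"}

fresh_text = """
New smartphone apps dominate headlines as the smartphone market grows;
analysts debate smartphone pricing and podcast audiences keep expanding.
"""

tokens = re.findall(r"[a-z]+", fresh_text.lower())
candidates = Counter(t for t in tokens if t not in current_vocab and len(t) > 3)

# The most frequent unknown terms are the first candidates for vocabulary updates.
print(candidates.most_common(5))
```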

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m

TRACK D: SPEECH TO GROW YOUR BUSINESS
Majestic Room (6th floor)
Speech Technology at Google
(Broadway Ballroom)

9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google


Hear about Google’s vision for speech technology. Following months of development and speculation, Google recently released its first speech application, GOOG411. Mike Cohen will describe Google’s experience with GOOG411, discuss Google’s general philosophy and approach to speech services, and review some of the lessons learned thus far. 

D101 – Speech in the Mainstream: Top Trends
10:15 a.m - 11:15 a.m
MODERATOR: Tim Moynihan, Senior Analyst and Project Leader - Opus Research
Daniel Hong, Senior Director, Product Marketing Strategy - [24]7
Dr. William Meisel, President - TMA Associates


The maturation of speech recognition technology is leading to new business opportunities in a consolidating market. Where are customer wins occurring? What are the top trends and drivers in the speech industry? And what factors will influence the speech industry in coming years? Daniel Hong maps where the speech industry is right now and where it is headed. Bill Meisel discusses how disruptive trends are driving the way people communicate with each other and with automated systems and suggests an approach to navigating these turbulent times.

D102 – Using Analytics to Understand Your Customer
11:30 a.m - 12:30 p.m
MODERATOR: Dr. Judith Markowitz, President - J. Markowitz Consultants


Analytics can take many forms within an enterprise. Two that involve speech processing are showcased in this session. One approach delves into the spoken content of interactions between customers and call center agents. The second examines the paths customers follow across communications channels (IVR, agent, Web, etc.) as they interact with an enterprise. Each approach extracts information that delivers important business intelligence to the enterprise.

Utilizing Speech Analytics to Improve Quality Assurance Processes
Tom Harker, Chief Technology Officer - Calibrus

In a call center environment where quality assurance is a must, there are many challenges. Usually there is a trade-off between efficiency, productivity, and cost. This case study shows how utilizing speech analytics for quality assurance has lowered costs while increasing efficiency and productivity.

Improving Your Bottom Line by Understanding Customer Behavior
Scott Witter, Vice President, U.S. Wealth Management & Business - Hartford Life

This presentation shows how The Hartford Insurance Property and Casualty used speech analytics to identify customer behavior, understand what data defines that behavior, and determine what the information means to the bottom line. The case study illustrates that when you closely examine customer experience across multiple touch points, you begin to understand the true benefits of each channel and whether your customers, as well as your business, are achieving the expected success.

Attendee Lunch
12:30 p.m - 1:45 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.


More of the world is going mobile and a new generation, the mobile generation, is using their wireless phones for more than just voice communication. The recent introduction of the iPhone is one such example. In this lunch presentation, Beatriz Infante, CEO of VoiceObjects, will introduce you to this mobile generation and show the next generation of applications they expect, not just on the iPhone but on every phone.

D103 – Speech to Increase Revenue & Decrease Costs
1:45 p.m - 2:45 p.m
MODERATOR: Rob Marchand, Senior Director, Product Management, Genesys Telecommunications - Alcatel Lucent


Speech applications are being used to increase revenue and decrease costs by revolutionizing business processes and customer interactions. In this session, hear lessons learned from customers and industry leaders who have pioneered the implementation and deployment of successful speech applications. Learn how you can improve customer service and save money at the same time from developers who have successfully improved the bottom line within their organizations.

DIRECTV: Look Who’s Talking
Michael J. Uhlenkamp, Call Center Technology Manager - DIRECTV

Which IVR solution is the right choice? For DIRECTV, it isn’t a single technology that provides the answer. Using the right mix of natural language, ASR, and DTMF has allowed DIRECTV to simplify its self-care functionality, improve IVR utilization, and positively impact customer satisfaction. Hear how implementing natural language has been an effective strategy, and why ASR and DTMF still play an integral role in providing best-in-class service by the nation’s leading satellite television provider.

Natural Language Call Routing Tips & Strategies
Dorothy A. Verkade, Head of Speech Innovations - Aetna

Aetna is in the final phase of implementing its second “next-generation” Aetna Voice Advantage, a state-of-the-art speech portal using natural language call routing and a suite of self-service features. Aetna will share key insights and experiences, from setting the strategy through the implementation. Where is the value to the enterprise and the satisfaction for the caller? How do callers respond to “Tell me why you are calling today”? You’ll learn 10 key tips for designing a natural language call routing approach.

D104 – Speech Enables Self-Service
3:00 p.m - 4:00 p.m
MODERATOR: Dr. Nava A Shaked, CEO - Brit Business Technologies Ltd (BBT)
Richard Grant, Chief Technology Officer - Ordercatcher Inc.
Chester Anderson, Vice President, Business Development - Ordercatcher Inc.
Alexandros Papanikolaou, Sales Manager - Village Roadshow Greece


Hear how to improve customer service by enabling customers to use phones and cell phones to place orders with automated speech systems instead of waiting in line to purchase tickets or place orders. A fast-food company and a cinema chain explain how automated speech systems that save money and improve customer satisfaction were successfully implemented and deployed. Hear how such problems as menu navigation, recognition of non-English words, real-time menu updates, and peak call processing were overcome.

D105 – Speech Drives CRM
4:15 p.m - 5:00 p.m
MODERATOR: James Barnett, Director - Alcatel Lucent
Christian J. Pereira, Director Business Development - D+S communication center management GmbH
Brian Gebert, Director of Corporate Sales - Shunra Software Ltd.
Jangwoo Shin, Chief Technology Officer - WebForPhone


By speaking on a telephone, users can retrieve and update data on accounts, contacts, opportunities, and calendar applications. Learn how to overcome the difficult problems of CRM applications, including recognition of custom vocabulary and database searches. These industry experts will present demonstrations and share key learnings.

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m

TRACK E: TOOLS & ENVIRONMENTS
Wintergarden Room (6th floor)
Speech Technology at Google
(Broadway Ballroom)

9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google


Hear about Google’s vision for speech technology. Following months of development and speculation, Google recently released its first speech application, GOOG411. Mike Cohen will describe Google’s experience with GOOG411, discuss Google’s general philosophy and approach to speech services, and review some of the lessons learned thus far. 

E101 – Open Source Development Environments
10:15 a.m - 11:15 a.m
MODERATOR: Dr. Moshe Yudkowsky, President - Disaggregate Corporation
Phil Shinn PhD, Call Center Engineering - Morgan Stanley
Ken Osowski, Vice President, Product Management - Pactolus


In this technical session about open source development environments, Phil Shinn will introduce and demonstrate an open-source speech application design toolkit, the VUID Toolbox, which consists of custom Visio stencils, Visual Basic macros and Python scripts that make designing and testing speech apps fun! Ken Osowski will analyze and compare the scalability, subscriber feature flexibility, multi-service integration potential, and other key service enablement characteristics of leading and emerging open source telecom technologies and discuss the relative usability/complexity of various dominant open source technologies.

E102 – Windows Vista Development Environment
11:30 a.m - 12:30 p.m
MODERATOR: Steve Chirokas, Executive Director, Marketing - VoltDelta


Windows Vista supports speech interfaces to many of its applications. In this session, demonstrations will show attendees how to use Visual Studio to develop SALT and IVR applications for Microsoft Office Communications Server. This session will also demonstrate and discuss the Speaky Media Center for controlling a Windows Vista-based media center.

Developing Speech-Enabled Applications

This presentation shows how to create an IVR using Microsoft Office Communications Server 2007 Speech Server and Windows Workflow Foundation. You will also learn the difference between the SALT and VoiceXML development environments.

Speaky Media Center: A Voice-Based Solution to Interact with PCs
Fabrizio Giacomelli, CEO - Mediavoice

Mediavoice has developed Speaky Media Center, which is based on a Vista-compliant remote control enhanced with push-to-talk voice capabilities and with ASR and TTS technology. Speaky uses a user-friendly interface with dynamic TTS-based feedback to let users interact by voice with content such as the EPG TV guide, telephony, photos, videos, music, weather, and horoscopes.

Attendee Lunch
12:30 p.m - 1:45 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.


More of the world is going mobile and a new generation, the mobile generation, is using their wireless phones for more than just voice communication. The recent introduction of the iPhone is one such example. In this lunch presentation, Beatriz Infante, CEO of VoiceObjects, will introduce you to this mobile generation and show the next generation of applications they expect, not just on the iPhone but on every phone.

E103 – New Language Specifications
1:45 p.m - 2:45 p.m
MODERATOR: Emmett Coin, Speech Scientist - ejTalk


This session reviews two emerging languages. The W3C State Chart XML (SCXML) will be a fundamental part of VoiceXML 3.0, as well as a stand-alone control language. The VoiceXML Forum’s Data Logging Specification will describe a format for log files created by speech applications and used by log report generators and database management systems.

Specifying Speech Workflow Applications Using SCXML
James Barnett, Director - Alcatel Lucent

SCXML is a flow control language based on Harel State Charts. It is being developed by the W3C for use with VoiceXML 3, but it can be used in a wide variety of workflow applications. This presentation provides an overview of SCXML, along with pointers to open-source implementations and a discussion of future plans for the language.
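
To give a flavor of the language, here is a minimal SCXML document together with a toy interpreter in Python. Real SCXML (and real engines) supports much more, including nested and parallel states, a datamodel, and executable content; the call flow and event names below are invented.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2005/07/scxml}"

SCXML_DOC = """
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="main_menu">
  <state id="main_menu">
    <transition event="caller.balance" target="read_balance"/>
    <transition event="caller.agent"   target="goodbye"/>
  </state>
  <state id="read_balance">
    <transition event="done" target="goodbye"/>
  </state>
  <final id="goodbye"/>
</scxml>
"""

root = ET.fromstring(SCXML_DOC)
transitions = {
    (s.get("id"), t.get("event")): t.get("target")
    for s in root.iter(NS + "state")
    for t in s.findall(NS + "transition")
}

state = root.get("initial")
for event in ["caller.balance", "done"]:          # simulated recognition results
    state = transitions.get((state, event), state)
    print(f"event {event!r} -> state {state!r}")
```

In a VoiceXML setting, the events would typically come from recognition results rather than a hard-coded list.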

A Uniform Data-Logging Specification
Mr. David L Thomson, Director - AT&T Labs

The VoiceXML Forum Tools Committee is developing a specification for capturing runtime data from speech systems. This data is useful for service analysis and tuning. The specification will improve compatibility across vendors. The presentation reviews the specification, which is available in draft form, and offers implementation tips.
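
Because the specification is still in draft, the sketch below is only a hypothetical stand-in showing the kind of runtime data such logs capture; the field names are invented for illustration and are not taken from the Forum's document.

```python
import json
import time

def log_recognition_event(state: str, utterance: str, confidence: float) -> str:
    """Serialize one recognition event as a structured log record (illustrative fields)."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "dialog_state": state,
        "utterance": utterance,
        "confidence": confidence,
    }
    return json.dumps(record)

print(log_recognition_event("main_menu", "pay my bill", 0.82))
```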

E104 – Which Tools Are Right for You?
3:00 p.m - 4:00 p.m
MODERATOR: Mr. David L Thomson, Director - AT&T Labs
John Fuentes, Principal Solutions Architect - Intervoice
Matt Whipple, Principal Consultant, Self-Service Solutions - Avaya
Dr. Moshe Yudkowsky, President - Disaggregate Corporation


With the high cost of developing speech applications, businesses are turning to development tools to decrease the time and effort required. This session discusses the types of development tools, identifies criteria for useful development tools, and suggests some development tool characteristics that should be avoided. The speakers will also identify missing tool functionality, recommend strategies for tool interoperability, and characterize desirable tool user interfaces.

E105 – Techniques for Reusability
4:15 p.m - 5:00 p.m
MODERATOR: Dr. Moshe Yudkowsky, President - Disaggregate Corporation
Tim Barnes, CEO - OpenMethods
Jerry Carter, Director, Speech Architecture & Standards - Nuance Communications, Inc.
Rob Marchand, Senior Director, Product Management, Genesys Telecommunications - Alcatel Lucent


The expense of developing speech application software has caused enterprises to look at ways to decrease development costs. This panel explores ways to reuse existing code and offers suggestions about how to construct code to improve its reusability. The panel will also discuss the problems and benefits of reusable grammars, subdialogs, packaged applications, and other strategies for reusability.

Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m



