SpeechTEK 2007 - Advance Program
August 18-20, 2008 • New York Marriott Marquis • New York, NY
|
|
TRACK A: MEETING BUSINESS GOALS WITH SPEECH
|
|
Soho (7th Floor)
|
Speech Technology at Google
(Broadway Ballroom)
9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google
 |
Hear about Google’s vision for speech technology. Following months of
development and speculation, Google recently released its first speech
application, GOOG411. Mike Cohen will describe Google’s experience with
GOOG411, discuss Google’s general philosophy and approach to speech
services, and review some of the lessons learned thus far.
|
|
A101 – Speech & Self-Service Strategy
10:15 a.m - 11:15 a.m
MODERATOR: Ron Owens, Director, Multimedia Applications PSO - Nortel Networks
|
Speech-enabled applications in the call center make a myriad of self-service options
available to the end user. However, the idea of “if we build it, they
will come” has proven false for many organizations deploying speech.
Why are some speech applications well-tolerated and some avoided at all
costs? What are the factors that cause users to abandon automated
systems in favor of live agents? Experts in this session consider
speech technology as a part of an overall self-service strategy. Learn
techniques for strategic planning, data collection, and analysis that
will help create self-service applications that end users actually want
to use.
|
How to Increase Self-Service Containment Without Sacrificing Customer Satisfaction
Nancy Gardner, Senior Analyst - Convergys
POWERPOINT SLIDESHOW
Want to know why callers are abandoning automated systems? Ask them. At the
main transfer points, callers are asked to state the reason for their
call. By matching what callers told us to the self-service options they
chose, we discovered key application performance issues that led to
changes in design, verbiage, and the introduction of “supercharged”
grammars.
|
Organic Growth Through Speech: Cross-Selling & Up-Selling
Lizanne Kaiser, Customer Experience Designer - Genesys Telecommunications Laboratories
POWERPOINT SLIDESHOW
How do you grow customer relationships when so many calls are automated?
How do you convert service into sales without annoying customers?
Explore best practices for promoting organic growth and customer
loyalty using speech-automated cross-selling and up-selling. Learn
specific techniques for designing timely and relevant offers.
|
Defining a Telephony Self-Service Strategy
Tony Lorentzen, Vice President, Consulting Services - Viecore
POWERPOINT SLIDESHOW
This session looks at defining a self-service strategy from a holistic
perspective: externally from the consumer’s perspective and internally
from the business and technical perspective. Learn how to find the
pitfalls in the design of a self-service strategy, how to meet the
objectives of consumers and call center business and technical teams,
and how to use technology to meet the objectives of self-service.
|
|
A102 – Beyond Usability: How Good Is Your Speech Application?
11:30 a.m - 12:30 p.m
MODERATOR: Phillip Hunter, Vice President, Interaction Design - SpeechCycle
|
Usability is widely recognized as a measure of the quality of a voice user
interface, and usability testing is a must-have in all VUI design
projects. But does usability tell the whole story? These experts agree
that excellent speech applications are more than just easy to use. In
this session, hear cutting-edge ideas about what to measure beyond
usability and how it can improve your speech application.
|
Beyond Usability: It Ain’t the Only Outcome that Matters!
Melanie Polkosky, Human Factors Psychologist & Consultant - IBM
POWERPOINT SLIDESHOW
You’ve heard it over and over again, you’ve tested for it, you’ve thought
about it, you’ve designed your application to get it. But when is usability
not enough? This session focuses on usability plus other outcomes you
need to consider when you’re designing your next application.
|
Beyond Usability: How Good Is Your Speech Application?
Silke Witt-Ehsani, Vice President, VUI Design Center - TuVox
POWERPOINT SLIDESHOW
This presentation offers an overview of best practices for a) how to define
speech application success criteria; b) how to instrument a speech
application so that the desired numbers can be measured; and c) how
success criteria influence the application design. Examples will be
shown using several case studies in which different success criteria
have greatly influenced the final application.
|
|
Attendee Lunch
12:30 p.m - 1:45 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.
 |
More of the world is going mobile and a new generation, the mobile
generation, is using their wireless phones for more than just voice
communication. The recent introduction of the iPhone is one such
example. In this lunch presentation, Beatriz Infante, CEO of
VoiceObjects, will introduce you to this mobile generation and show the
next generation of applications they expect, not just on the iPhone but
on every phone.
|
|
A103 – Success Criteria for the Speech Customer Experience
1:45 p.m - 2:45 p.m
MODERATOR: Lizanne Kaiser, Customer Experience Designer - Genesys Telecommunications Laboratories
|
How do you know if your speech application is living up to your objectives?
Is the application meeting the goals you set when you started the
project? You’ll only know the answer to these crucial questions if you
establish success criteria, tied to specific metrics, before the
project begins. In this session, learn how to develop rigorous,
meaningful criteria that will allow ongoing evaluation and improvement
of your speech applications.
|
Success Criteria for the Speech Customer Experience
Carrie Nelson, Speech Solutions Team Technical Lead - Nortel Networks
POWERPOINT SLIDESHOW
What defines a successful speech application? The answer may involve many
different elements. Some are measurable analytics, and other aspects
are more qualitative, such as caller satisfaction and customer
perception. Further, success criteria definitions are not the same for
every application. The key challenge is to clearly identify early on
the business goals from the customer perspective and use them to drive
the definition of success metrics.
|
Measuring Speech Applications from a Caller Perspective and a Business Perspective: Four Dimensions of Success
Scott Taylor, General Manager, Business Consulting - Nuance
PRESENTATION (PDF)
In this session we’ll examine key dimensions of success for speech
applications: effectiveness, efficiency, utility, and attractiveness.
We’ll examine some of the successful methods employed by customers for
measuring these dimensions, including both data-based measurement and
experiential measurement through direct customer feedback.
We’ll also review strategies for migrating from “the old metrics” to
the new metrics.
|
|
A104 – New Business Models for Speech
3:00 p.m - 4:00 p.m
MODERATOR: Gregory Simsar, Vice President, Speech Services - Syntellect, Inc.
|
In years past, the decision to deploy speech was all about cost
reduction—companies used speech applications to offload tasks from
more-expensive live agents. Many organizations are realizing that this
simplistic model does not always work as advertised, and that speech
can do more than just reduce costs. Experts in this session detail new
ideas for maximizing the value of speech applications and using speech
for more than cost savings.
|
Innovate or Saturate: Applying the Web Model of Innovation to Speech
John Amein, Senior Vice President, Strategic Partnerships - Voxeo
POWERPOINT SLIDESHOW
To reach its full potential, speech must enable more than higher
automation rates in traditional IVR applications. Triggered by maturing
standards and a broadening audience of developers, a new movement of
creative speech development is emerging as a significant market
segment. Learn how the Web model of innovation has been applied to
speech applications.
|
Role of Speech Recognition in Free Directory Assistance
John Roswech, Senior Vice President of Sales - Jingle Networks, Inc.
With 411 fees rising to $2 or more per call, 1-800-FREE411’s ad-supported
free directory assistance has saved millions of consumers millions of
dollars in needless charges. With higher success rates and lower costs
than before, speech recognition is critical to 1-800-FREE411’s caller
experience, making free 411 an exciting new media opportunity.
|
|
A105 – Simulating the Personal Touch
4:15 p.m - 5:00 p.m
MODERATOR: John Roswech, Senior Vice President of Sales - Jingle Networks, Inc.
Debbie Harris, Vice President - Ayalogic
Albert Kooiman, Group Product Manager, Unified Communications - Microsoft
Brad Schorer, Senior VP Marketing & Business Development - VoltDelta
|
Sixty percent of calls fail to achieve productive results. Incessant routing
by automated systems keeps callers longing for the good old days of
talking to human agents. How can we make good use of automation without
losing the personal touch that’s so important to customers? In this
session, panelists consider all customer communications as one flow,
fusing contact with live agents with automated processes. Attendees
will learn from the panelists’ real-world experiences about how
customer service organizations are using new technologies to bridge the
human-automation divide.
|
|
Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m
|
|
TRACK B: VUI FOR VUI DESIGNERS
|
|
Empire (7th Floor)
|
Speech Technology at Google
(Broadway Ballroom)
9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google
|
|
B101 – Whose VUI Is It, Anyway? User Versus Business Requirements
10:15 a.m - 11:15 a.m
MODERATOR: Jenni McKienzie, Senior Business Solutions Advisor - Travelocity
|
A voice user interface is a balancing act between the goals of the
business and those of the end users. These goals are often in
conflict—businesses want to push more calls to self-service, users want
total access to live agents—often to the detriment of the success of
the application. When should user requirements win out? In what cases
are business requirements more important? The experts in this session
provide the knowledge you need to answer these questions.
|
Customers Request the Darndest Things: 10 Challenges for VUI Designers
Eduardo Olvera, Senior User Interface Designer - Nuance
POWERPOINT SLIDESHOW
Business owners have business goals, objectives, and requirements. Designers
bring experience and advocate user needs throughout the design process.
So how can we create outstanding experiences when objectives may seem
to clash or customers have preconceptions about “how the system should
work”? Explore some common challenges, understand the real issues
behind resistance, and discover how to focus instead on successful
systems.
|
Successfully Combining User & Business Goals
Erin Smith, Senior VUI Designer - Convergys
POWERPOINT SLIDESHOW
By the time an application has the go-ahead from executives, requirements
are driven by the business and not the caller. Learn how to find out
who the caller really is and how to take several steps back to design
for the true caller, so your application is actually used and liked.
Business requirements are important, but it’s essential to find the
right balance.
|
|
B102 – Usability Surveys: Practical Techniques
11:30 a.m - 12:30 p.m
MODERATOR: Susan L. Hura, Principal - SpeechUsability
Peter Leppik, CEO - Vocal Laboratories, Inc.
POWERPOINT SLIDESHOW
|
Surveys are an important method of getting opinion feedback from users of
speech applications. At best, surveys provide quantifiable data that
clarifies user opinions, but many do-it-yourself surveys do not achieve
this result. In this session, you will learn how to craft surveys that
deliver reliable, accurate data to improve the performance of your
speech application. Attendees will gain a basic understanding of survey
theory, methods, techniques, and analysis.
|
|
Attendee Lunch
12:30 p.m - 1:45 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.
|
|
B103 – Controlling Prompts for Maximum Usability
1:45 p.m - 2:45 p.m
MODERATOR: Erin Smith, Senior VUI Designer - Convergys
Tom Houwing, Director - voiceandvision B.V.
POWERPOINT SLIDESHOW
|
Prompts are at the heart of any VUI design. The embodiment of the sound and
feel of the application, prompts convey both affective and
informational content. In a very real sense, the usability of a speech
application is largely determined by the quality of its prompts.
Crafting effective prompts is a creative and scientific endeavor,
requiring a diverse skill set. This expert VUI designer outlines an
approach for writing, recording, coaching, and processing prompts to
ensure the highest quality possible.
|
|
B104 – You Be the Expert! Speech and the End-to-End Customer Experience
3:00 p.m - 4:00 p.m
MODERATOR: Melanie Polkosky, Human Factors Psychologist & Consultant - IBM
Lizanne Kaiser, Customer Experience Designer - Genesys Telecommunications Laboratories
POWERPOINT SLIDESHOW
|
Come share your experience! Audience members will actively participate in
this session, sharing insights and anecdotes on the do’s and don’ts of
how to use speech automation to create a better end-to-end customer
experience. End users don’t evaluate speech automation in
isolation—they view it as part of an integrated customer service chain.
So in designing the optimal VUI, it’s important to take into account
what might happen before, during, and after the automated speech
interaction in order to create a seamless customer experience.
|
|
B105 – Communication Strategies for Speech Projects
4:15 p.m - 5:00 p.m
MODERATOR: Judi Halperin, Speech Engineer, Contact Center Practice, Self Service Solutions - Avaya
|
Speech projects always involve multiple contributors, often with diverse
backgrounds and differing levels of understanding of project goals and
speech technology itself. The voice user interface designer often sits
squarely in the middle of a group of project sponsors, developers, call
center and telephony managers, and others who have a stake in the
success of a speech project. In this session, experts suggest effective
techniques for facilitating communication both within the team
delivering the speech application and between the team and project
sponsors.
|
Does Your Customer Know What You Are Doing?
Maria Aretoulaki, Head, Speech Design - Vicorp
POWERPOINT SLIDESHOW
This presentation stresses the importance of incremental and modular
descriptions of system functionality for targeted and phased reviews
and testing. This strategy ensures clarity, consistency, and
maintainability beyond the project lifetime and eliminates the need for
changes mid-project, thus both managing customer expectations and
protecting the service provider from ad-hoc requests.
|
The Habits of Highly Effective Speech Development Teams: What You Don’t Know Might Be Hurting Your Projects
Melanie Polkosky, Human Factors Psychologist & Consultant - IBM
POWERPOINT SLIDESHOW
Teaming is an essential, complicated, and stressful aspect of technology
development. This session focuses on what makes a team function well,
the most common teaming problems in speech projects, and ideas for
troubleshooting to make your team highly effective!
|
|
Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m
|
|
TRACK C: ADVANCED SPEECH TECHNOLOGY SYMPOSIUM
|
|
Shubert (6th Floor)
|
Speech Technology at Google
(Broadway Ballroom)
9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google
|
|
C101 – Advances in Speech Recognition Processing
10:15 a.m - 11:15 a.m
MODERATOR: Thomas Schalk, Vice President, Voice Technology - ATX
|
Advances and improvement in core speech recognition technology are difficult to
demonstrate, since accuracy is strongly dependent on application,
particular speakers, background noise, and other variables. Beyond
accuracy, speech recognition technology can be improved by better
handling of complex or “natural” dialogs. Audio channels and speech
platforms are important components of today’s speech applications. In
this session, speakers explore the advances in core speech technology,
audio channel processing, and speech platform integration and go behind
the scenes of Vista to expose interesting aspects of the integration of
speech technology.
|
Speech Technology in Vista
Fil Alleva, General Manager, Speech - Microsoft
Windows Speech Recognition (WSR) in Vista is a practical solution for
speech-enabled access to Windows-based PCs for users who find keyboard
and mouse interfaces to be less productive than they would like. The
technology behind WSR includes automated personalization, the Microsoft
Speech Recognizer, SAPI 5.3, the accessibility framework, the text
services framework, and Windows Desktop Search all being employed to
deliver the Windows Speech user experience.
|
Speech Processing for DSR Versus NSR
Veeru Ramaswamy, Chief Technology Officer - Vianix
POWERPOINT SLIDESHOW
There are two methods for compressing and transmitting digital speech for
server-based automatic speech recognition. Distributed Speech
Recognition (DSR) schemes gained popularity in the late 1990s due to
limited data channel bandwidth availability. The evolution of higher
bandwidth channels and advances in voice compression now allow Network
Speech Recognition (NSR) applications to achieve the speech recognition
accuracy of DSR in similar bandwidth and provide additional benefits.
This presentation compares voice-based NSR with feature-based DSR
recognition schemes.
|
|
C102 – Advances in Text-to-Speech Processing
11:30 a.m - 12:30 p.m
MODERATOR: Thomas Schalk, Vice President, Voice Technology - ATX
|
Text-to-speech synthesis is getting better and more flexible, and is now used
globally in a wide spectrum of speech applications. Advances in standards have
improved text-to-speech quality. The Speech Synthesis Markup Language
(SSML) provides a standard way to control speech synthesis and text
processing parameters. The Pronunciation Lexicon Specification (PLS) is
designed to enable interoperable specification of pronunciation
information. This session reviews some much-needed clarifications about
how text in multiple languages should be annotated and describes work
being done to link SSML and PLS more seamlessly.
|
Applying the Pronunciation Lexicon Specification to ASR & TTS
Patrizio Bergallo, Senior System Architect - Loquendo
POWERPOINT SLIDESHOW
Many speech applications demonstrate the need to define the pronunciation of
certain words (for instance proper names, locations, etc.) or to expand
acronyms/abbreviations, both for ASR and TTS usage. This presentation
describes the W3C PLS (Pronunciation Lexicon Specification) that
defines lexicon documents to be referenced by SRGS grammars and SSML
prompts.
|
The Internationalization of the W3C Speech Synthesis Markup Language
Daniel Burnett, Speech Standards Lead Engineer - Nuance
POWERPOINT SLIDESHOW
In SSML, how do you mark tones, or use pinyin for pronunciation, or
indicate a change in language but not a change in voice? Learn about
the changes in SSML that provide improved support for Mandarin,
Cantonese, Japanese, Hindi, and other world languages. This session
also explains multi-language annotation and how to link with PLS.
|
|
Attendee Lunch
12:30 p.m - 1:45 p.m
Beatriz Infante, President & Chief Executive Officer - VoiceObjects, Inc.
|
|
C103 – Advances in Natural Language Processing
1:45 p.m - 2:45 p.m
MODERATOR: Thomas Schalk, Vice President, Voice Technology - ATX
|
The demand for natural language has reached an all-time high as directed
dialog applications continue to be criticized for being inefficient and
not flexible enough. There is little dispute that out-of-grammar
handling is generally poor when an active grammar is large. In-grammar
accuracy for extensive vocabularies has been achieved by using large
amounts of speech data to extract statistical information to represent
acoustical units. Likewise, statistical approaches have been applied to
advance natural language understanding. Most recently, statistical
approaches are being applied to voice interface design with the goal of
improving user experience. This session reveals some exciting advances
in natural language that will affect the future of the user experience.
|
Creating More Natural Language Interfaces Using Robust Parsing
Krishna Govindarajan, Speech Science Global Discipline Leader, Professio - Nuance
For the current state-of-the-art speech recognition systems, the in-grammar
accuracy is quite good, especially for directed-dialog systems.
However, due to the variability of how callers respond, a portion of
the utterances are not covered by the grammar, i.e., they are
out-of-grammar (OOG). OOGs affect the “perceived” accuracy of the
system, and are one of the primary items addressed during tuning. This
presentation discusses the concepts of “near OOGs,” “far OOGs,” and
related concepts.
|
No Data Like More Data: Experimental Voice User Interface in Action
Roberto Pieraccini, Chief Technology Officer - SpeechCycle
Jonathan Bloom, Senior Interaction Designer - SpeechCycle
POWERPOINT SLIDESHOW
Today we are extending the data exploitation paradigm to voice user interface
(VUI) design. Statistics and machine-learning sciences are now
complementing the art of designing the best prompts and interaction
strategies with the goal of optimizing automation and improving user
experience. Using a few case studies, this presentation shows how to
“experimentally” choose among competing VUI designs without disrupting
the user experience while optimizing global indicators of performance.
|
|
C104 – Speech-to-Speech Translation
3:00 p.m - 4:00 p.m
MODERATOR: Bill Scholz, President - NewSpeech LLC
|
Recent innovative integration of recognition and synthesis technology has led
to the realization of fully automatic speech-to-speech translation.
This session explores the latest techniques for implementing automated
language translation and considers the technology behind the
integration: how to manage out-of-grammar responses, the effects of
using robust parsing versus SLMs, and incorporating an open-source
speech analytics solution called Unstructured Information Management
Architecture (UIMA).
|
Speech-to-Speech Infrastructure Based on UIMA
Jan Kleindienst, Manager, Conversational Interactions and Architect - IBM
POWERPOINT SLIDESHOW
This presentation shows a distributed infrastructure for integration of
third-party recognition, translation, and synthesis technologies into
speech-to-speech system combinations. The infrastructure is built over
the Unstructured Information Management Architecture (UIMA), an
open-source framework for speech analytics. The Web infrastructure has
successfully been used for the remote automatic evaluation of
speech-to-speech systems on a pan-European scale.
|
Integrating Language Translation Software with Speech Recognition
Hannah Grap, Marketing Communications Manager - Language Weaver, Inc.
POWERPOINT SLIDESHOW
As automated language translation technology moves to statistically based
computational methods, the timing is right to integrate language
translation and speech recognition technologies. Case study examples
and demos of existing integrated solutions will give the audience an
overview of how to leverage speech applications across languages.
|
|
C105 – Voice Search
4:15 p.m - 5:00 p.m
MODERATOR: Thomas Schalk, Vice President, Voice Technology - ATX
|
Voice search is perhaps the hottest topic in recent speech deployments.
Analogous to searching the Web with text, voice search can encompass a
number of services, including directory search and searches for
specific information, such as news or sports scores. What are the
requirements for achieving effective dialogs when searching by voice?
How does dynamic content, such as location-based ads, fit into the
voice-user interface? What other analogies are there between voice
searching and Web searching? This session is a must for those
interested in learning about the trends in voice search.
|
Optimizing Software Architecture for Voice Search
Leo Chiu, Chief Technology Officer - Apptera
POWERPOINT SLIDESHOW
Voice search is very hard to do well when you consider the millions of
different accents, behaviors, and speech patterns a software program
would have to decipher. What is the best way to architect the solution
so that it has the best chance of providing an effective consumer
experience? What are the business considerations for making it work in
the real world? In this presentation you will hear thoughts and
learnings from the edge of the “voice search” frontier.
|
Data Mining for Voice Search
Charles Galles, Multimedia Applications Speech Analyst - Nortel Networks
Voice search topics and Web content change all the time. How can an architect
prepare the recognizer to recognize fundamentally new words and topics?
With all of the activity on the Internet, are there any useful data
sources for recognizer training? This presentation will explore how the
Web and other data sources may be leveraged to keep your voice search
solution current.
|
|
Exhibit Hall Grand Opening & Welcome Reception
5:00 p.m - 7:00 p.m
|
|
TRACK D: SPEECH TO GROW YOUR BUSINESS
|
|
Majestic Room (6th Floor)
|
Speech Technology at Google
(Broadway Ballroom)
9:00 a.m - 10:00 a.m
Michael Cohen, Manager, Speech Technology Group - Google
|
|
D101 – Speech in the Mainstream: Top Trends
10:15 a.m - 11:15 a.m
MODERATOR: Tim Moynihan, Vice President, Global Marketing & Sales Support - Envox Worldwide
Daniel Hong, Lead Analyst - Datamonitor
Bill Meis