Speech specs -- ADTmag

Speech specs

By John K. Waters
December 1, 2002

''Historically, speech has been complicated to implement largely because the standards had not been developed to actually write speech applications,'' said Sunil Soares, director of product management at IBM's Pervasive Computing Division. ''Over the past three years, that has begun to change. You can think of voice today as being where the Web was in 1994, when we had static Web pages and PCs. We didn't know what to do with all of the technology and how to implement it.''

The emergence of a new specification (Speech Application Language Tags, or SALT) and the maturation of an older one (VoiceXML) are beginning to provide a sense of stability in the speech industry.

Voice Extensible Markup Language (VoiceXML) was written by the VoiceXML Forum, which contributed it to the World Wide Web Consortium (W3C) standards body. VoiceXML has been around for about two-and-a-half years now, and there are more than 600 vendors and service providers who currently adhere to that particular standard for development.

The W3C defines VoiceXML as a markup language ''designed for creating audio dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations. Its major goal is to bring the advantages of Web-based development and content delivery to interactive voice response applications.''
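To make that definition concrete, here is a minimal sketch of what such a dialog might look like, generated from a short Python script. The function name build_dialog, the field name and the prompt wording are hypothetical choices made for illustration and do not come from the W3C text or from any vendor quoted here. The sketch assumes the VoiceXML 2.0 element names (vxml, form, field, prompt, filled) and the built-in ''digits'' field type, which lets a caller either speak the digits or key them in on a touch-tone pad.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by VoiceXML 2.0 documents.
VXML_NS = "http://www.w3.org/2001/vxml"


def build_dialog(field_name, question, confirmation):
    """Return a one-question VoiceXML dialog as a string (illustrative only)."""
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form")

    # A field collects one piece of caller input; the built-in "digits"
    # type accepts spoken digits as well as DTMF key presses.
    field = ET.SubElement(form, "field", {"name": field_name, "type": "digits"})

    # The prompt is played to the caller, typically as synthesized speech.
    prompt = ET.SubElement(field, "prompt")
    prompt.text = question

    # The filled block runs once the field has received a value.
    filled = ET.SubElement(field, "filled")
    reply = ET.SubElement(filled, "prompt")
    reply.text = confirmation

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_dialog("account",
                       "Please say or key in your account number.",
                       "Thank you."))
```

Running the script prints a small VoiceXML document that a voice browser could, in principle, fetch from a Web server the same way a desktop browser fetches an HTML page, which is the parallel with Web development that the W3C definition draws.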

SpeechWorks was one of the earliest companies to embrace VoiceXML. The company's flagship product line, OpenSpeech, is a speech-recognition solution optimized for VoiceXML. ''We were the first company to introduce a line of products built from the ground up to support VoiceXML,'' said Steve Chambers, chief marketing officer at SpeechWorks. ''For us, it has been good because everyone wants a standard. It delivers investment protection.''

Chambers expects most of the speech applications appearing in the near term to be VoiceXML-based, primarily because the standard has been around for a while. But another speech standard, SALT, has received a lot of support from some very big players.

SALT was created by the SALT Forum, a group of technology companies working together to accelerate the development of speech technologies in telephony and so-called multimodal systems. The founding members of the group are SpeechWorks, Intel, Cisco Systems, Philips, Comverse and Microsoft. Formed in October 2001, the SALT Forum now claims more than 50 member organizations; it released the 1.0 version of SALT earlier this year.

According to James Mastan, director of marketing for Microsoft's .NET Speech Technologies, the SALT spec defines a set of lightweight tags as extensions to commonly used Web-based programming languages. ''The idea,'' Mastan said, ''was not to reinvent the wheel, but to take advantage of the existing Web infrastructure and standards, and to simply add some lightweight standards that allow developers to add speech to their Web applications in an integrated fashion.''

Basically, the SALT tags allow developers to add speech interfaces to Web content and applications using familiar tools and techniques. In ''multimodal'' applications, the tags can be added to support speech input and output, either as standalone events or jointly with other interface options, such as speaking while pointing to the screen with a stylus, Mastan said. In telephony applications, the tags provide a programming interface to manage the speech-recognition and text-to-speech resources needed to conduct interactive dialogs with the caller through a speech-only interface.

The SALT specification is designed to work equally well on traditional computers, handheld devices, home electronics, telematics devices (such as in-car navigation systems) and mobile phones.

''What's really going to matter here from an app development perspective is the types of tools available to application developers to enable them to build these multimodal applications,'' said Peter Gavalakis, marketing manager at Intel, ''not the SALT tags in and of themselves. But you need some standard or at least an open specification that an industry ecosystem can develop around.''

SALT-based offerings are already coming down the product pipeline. In May, Microsoft announced the beta release of its .NET Speech SDK, a Web developer tool that the Redmond software maker billed as the first product based on the SALT spec. Philips is reportedly building a SALT-based browser and a telephony platform for SALT. HeyAnita, a speech hosting company, is developing a SALT-based browser for its hosted speech platform. Carnegie Mellon University is developing an open-source SALT browser, which the university expects to be available by the end of the year. Kirusa, a company that is heavily involved in the multimodal application area, is focusing on building multimodal wireless apps around SALT.

Microsoft's Mastan believes that both SALT and VoiceXML will be around for a while, adding that there is some discussion among standards bodies about convergence of the two in the future.

Microsoft's entrance into this market has received mixed reviews, but is generally considered a good thing.

''Microsoft threw a monkey wrench in the gears with SALT,'' said Meta Group analyst Earl Perkins. ''But it's had both a positive and negative effect. It drew attention to a growing market, because Microsoft never enters a market unless they realize there's money to be made. But on the other hand, they introduced another standard, so there may be a bit of a delay while vendors sort out how they're going to support both of them.''

See the following related stories:
Giving applications a voice, by John K. Waters
Talking speech tech, by John K. Waters
Multiple modes, by John K. Waters

About the Author

John K. Waters is a freelance writer based in Silicon Valley. He can be reached at [email protected].
