Microsoft sets sail with SALT-based speech apps

Microsoft announced last year that it intended to provide developers with tools to add speech capabilities to their applications. Last week, the Redmond, Wash.-based software giant showed off two sample applications built with its Speech Development Kit (SDK). Targeted at the retail and financial sectors, the sample apps are designed to show developers best practices for incorporating speech functionality into existing Web applications.

Released last May, Microsoft's SDK was the first application based on the Speech Application Language Tags (SALT) specification. SALT was developed by the SALT Forum, a group of technology companies that joined forces in October 2001 to accelerate the development of speech technologies in telephony and so-called multimodal systems. The group released the 1.0 version of SALT earlier this year.

According to James Mastan, director of marketing for Microsoft's .NET Speech Technologies, the SALT spec defines a set of lightweight tags as extensions to commonly used Web-based programming languages. "The idea," Mastan said, "was not to re-invent the wheel, but to take advantage of the existing Web infrastructure and standards, and to simply add some lightweight standards that allow developers to add speech to their Web applications in an integrated fashion."

In "multimodal" applications, these tags can be added to support speech input and output, either as standalone events or jointly with other interface options, such as speaking while pointing to the screen with a stylus, Mastan said. In telephony applications, the tags provide a programming interface to manage the speech recognition and text-to-speech resources needed to conduct interactive dialogs with the caller through a speech-only interface.
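
As a rough illustration of what Mastan describes, the Python sketch below assembles a small Web form fragment in which SALT tags sit alongside an ordinary HTML text box. The element names (prompt, listen, grammar, bind), the bind attributes and the namespace URI are recalled from the SALT 1.0 specification and should be checked against it. The form, the IDs, the prompt wording and the cities.grxml grammar file are hypothetical, and nothing here is taken from Microsoft's sample applications.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed from the SALT 1.0 specification; verify before use.
SALT_NS = "http://www.saltforum.org/2002/SALT"
ET.register_namespace("salt", SALT_NS)


def salt(tag: str) -> str:
    """Return the namespace-qualified name for a SALT element."""
    return f"{{{SALT_NS}}}{tag}"


def build_city_field() -> ET.Element:
    """Build a text input plus the SALT tags that speech-enable it."""
    form = ET.Element("form", {"id": "travel"})

    # Ordinary HTML input; it still works with a keyboard or stylus.
    ET.SubElement(form, "input", {"type": "text", "id": "txtCity"})

    # Speech output: the question spoken to the user.
    prompt = ET.SubElement(form, salt("prompt"), {"id": "askCity"})
    prompt.text = "Which city are you flying to?"

    # Speech input: a recognizer tied to a grammar (hypothetical file name).
    listen = ET.SubElement(form, salt("listen"), {"id": "recoCity"})
    ET.SubElement(listen, salt("grammar"), {"src": "cities.grxml"})

    # Copy the recognized value into the text box.
    ET.SubElement(
        listen,
        salt("bind"),
        {"targetelement": "txtCity", "value": "//city"},
    )
    return form


def render() -> str:
    """Serialize the fragment as markup text."""
    return ET.tostring(build_city_field(), encoding="unicode")


if __name__ == "__main__":
    print(render())
```

The point of the sketch is the layering: the text box remains plain HTML, and the speech behavior is added by a handful of extra tags around it rather than by a separate application.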

Vertigo Software, a provider of software development and consulting services, used Microsoft's SALT-based SDK to build the two sample applications. The first is the ASP.NET Commerce Starter Kit. Based on the IBuySpy Store sample, it demonstrates how an existing Web-based e-commerce store can be speech-enabled. According to Vertigo reps, speech-enabling the ASP.NET Commerce Starter Kit with the Microsoft Speech SDK lets users order an item by product number, browse a store catalog, hear product descriptions and add products to their shopping cart -- all by voice.

The second sample app is the Fitch & Mather Stocks (FMStocks) Web application, an online stock brokerage that allows customers to manage a stock portfolio by telephone. According to company representatives, users of FMStocks can obtain quotes on stock prices, buy and sell stock, and review their portfolios.

"By following these sample applications, developers are able to speech-enable their current Web applications based on Visual Studio .NET, [even if they have] limited-to-zero experience with speech technology," Vertigo CEO Scott Stanfield said in a statement. "With the SALT-based Speech SDK integrated in Visual Studio, developers now have an incredible new interface option that can augment existing, traditional systems."

Where will speech technology find success? According to Brian Strachman, senior analyst at In-Stat/MDR, although speech technology installations are currently found most often in call centers, the advent of standards such as the SALT specs suggests that opportunities are growing for speech-savvy developers -- especially in mobile computing environments.

"[T]he call-center market, although it's been fairly profitable, is just the tip of the iceberg," Strachman said. "There are a lot of call centers out there that could still benefit from speech, and I think there are going to be lots of other markets where speech recognition will be. ... [S]peech-enabling a PDA is another great application for SALT. Many people use their PDAs now only for basic contact management, scheduling [and] that sort of thing. If you get more complex than that, you run into limitations with the interface in dealing with either a small keyboard or a touch screen. SALT has the ability to make PDAs much more useful for mobility applications, and to really bring the power of the Internet to the PDA. And that's the beauty of speech -- it's the most natural way to communicate.

"SALT also provides a programming model that is familiar to millions of developers," Strachman added. "So in terms of training a development ecosystem to work with the specification, which is really what creates an industry, there is a large base of talent out there already."

Microsoft's SALT-based SDK is currently in beta 2. The company recently integrated it into its Visual Studio .NET development environment. The company is billing the combined tools as a "faster, easier and more economical way for Web developers to leverage their existing Web development knowledge and skills and incorporate speech functionality into Web applications."

Developers can download the sample applications from Microsoft's Web site at http://www.microsoft.com/speech/techinfo/sampleapplications/. The applications come with white papers designed to guide developers in speech-enabling their Web applications.

About the Author

John K. Waters is a freelance writer based in Silicon Valley.