News

Voice Vendors Gather at Biannual SpeechTEK Event

The top speech technology vendors gathered in San Francisco for SpeechTEK West, the biannual industry conclave (February 21-23). The industry was out in force for the show, with virtually all of the top vendors making announcements.

Market leader IBM announced that it will provide Teges Corporation, a developer of Web-based information systems for intensive care units, with WebSphere software that lets doctors making their rounds enter or access patient information by speech, keyboard, or handwriting (an interaction style known as "multimodal") on handheld, slate, or tablet PCs.

Teges, based in Coral Gables, FL, built its i-Rounds solution on IBM's WebSphere Everyplace Multimodal Environment software, which uses the X+V markup language. X+V combines three World Wide Web Consortium (W3C) standards: XHTML, VoiceXML, and XML Events.

"If you look at speech, it has traditionally been very hard to develop," says Igor Jablokov. "You practically needed a PhD to create, tune, and deploy these applications. But X+V is a kind of unification of HTML and VoiceXML, which are the two standards that the W3C has for both voice and visual interaction. With it, we married the visual Web to the speech Web. And that's how you get these multimodal applications."

Big Blue reps demoed the new system at the show; it is currently up and running at IBM's Austin Lab.

For more information, go to: www.ibm.com/pvc/multimodal and www.teges.com.

San Diego-based LumenVox announced a new version of its speech recognition engine at the show. Version 5.5 of the SRE introduces lattice-based confidence scores, as well as improved barge-in and end-of-speech (EOS) detection, designed to sense when a caller has begun speaking, finished speaking, or paused.
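
LumenVox has not published the internals of its detector, but the general idea behind end-of-speech detection can be sketched in a few lines of Python: watch the energy of incoming audio frames and declare the utterance finished once a stretch of trailing silence follows detected speech. The frame size, threshold, and timeout below are illustrative stand-ins, not LumenVox parameters.

    import struct

    # Illustrative values only; not LumenVox's actual parameters.
    FRAME_MS = 20          # length of each analysis frame
    SILENCE_RMS = 500      # energy below this counts as silence (16-bit PCM)
    EOS_SILENCE_MS = 800   # this much trailing silence ends the utterance

    def frame_energy(frame):
        """Root-mean-square energy of one frame of 16-bit mono PCM audio."""
        count = len(frame) // 2
        samples = struct.unpack("<%dh" % count, frame[:count * 2])
        return (sum(s * s for s in samples) / max(count, 1)) ** 0.5

    def detect_end_of_speech(frames):
        """Return True once enough trailing silence follows detected speech."""
        heard_speech = False
        silent_ms = 0
        for frame in frames:
            if frame_energy(frame) >= SILENCE_RMS:
                heard_speech = True        # the caller has begun speaking
                silent_ms = 0
            elif heard_speech:
                silent_ms += FRAME_MS
                if silent_ms >= EOS_SILENCE_MS:
                    return True            # the caller appears to be done
        return False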

The new SRE is designed to let developers build their own grammars on the fly, filter out-of-vocabulary words, and modify engine and grammar performance at run time, the company says. Built-in grammars handle single digits, currency, natural numbers, and dates and times.
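
As a rough illustration of what building grammars "on the fly" means, the sketch below assembles a small ABNF-style (SRGS) grammar from a run-time list of departments; the commented-out loading calls at the end use hypothetical names, not LumenVox's actual API.

    def build_transfer_grammar(departments):
        """Assemble a tiny ABNF (SRGS) grammar at run time from a list.

        Anything outside this vocabulary simply fails to match, which is
        one way out-of-vocabulary input gets filtered.
        """
        alternatives = " | ".join(departments)
        return (
            "#ABNF 1.0;\n"
            "language en-US;\n"
            "root $transfer;\n"
            "$transfer = transfer me to (" + alternatives + ");\n"
        )

    grammar = build_transfer_grammar(["sales", "support", "billing"])
    print(grammar)

    # In a deployment the grammar text would then be handed to the engine
    # through whatever loading call it exposes, for example (hypothetical
    # names, not the vendor's real API):
    #   client.load_grammar("transfer", grammar)
    #   result = client.recognize(audio, grammar="transfer")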

LumenVox also unveiled a new version of its LV Speech Tuner (LVST), a GUI-based maintenance tool for tuning, transcribing, and testing speech-driven applications built on a variety of ASR platforms. The company also announced that it has added a Visual Basic ActiveX EXE interface to its Speech Driven Information System, which will let developers program their application logic in Visual Basic.

For more information, go to: www.LumenVox.com.

Avaya Inc., a Basking Ridge, NJ-based business communications software company, demoed new speech self-service solutions at the show, including the latest release of Avaya Interactive Response, a speech self-service platform that supports voice standards and the latest speech engine technologies. Standards supported by the product include VoiceXML 2.0, which gives businesses greater flexibility in how they build, deploy, and manage speech applications; the Q.SIG protocol, which lets businesses use speech to route information across international, multi-vendor networks; and Media Resource Control Protocol (MRCP), which simplifies the integration of self-service platforms and speech technologies.

For more information, go to: www.avaya.com.

Ai-Logix debuted a new developer's platform for its WordALERT real-time keyword-spotting product. WordALERT combines Ai-Logix's SmartWORKS components with speech recognition board products from Natural Speech Communication Ltd. (NSC) to detect spoken keywords for call recording, quality assurance monitoring, and security applications.

The Somerset, NJ-based company's new developer platform bundles a set of tools for building keyword-spotting applications. Developers use the toolset to build, test, and deploy applications in real time and off-line with their own development tools. Standard features include up to 16 simultaneous keyword-spotting channels (depending on vocabulary size), four channels of analog interfaces, four channels for playing and recording voice, dynamic grammar allocation, multiple and custom language support, and four hours of technical support. Higher-capacity boards that handle a larger number of channels in a single box are also available.
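
The plumbing that ties those channels to the recognition boards is proprietary, but the overall shape of a keyword-spotting application is easy to sketch: run each channel's recognition hypotheses through a small matcher and raise an alert when a watched word clears a confidence bar. Everything below is a conceptual stand-in, not the SmartWORKS or NSC API.

    # Watched vocabulary and matching logic only; purely illustrative.
    # In a real deployment the hypotheses would arrive from the
    # recognition boards in real time, one stream per monitored call.
    ALERT_WORDS = {"refund", "cancel", "supervisor"}

    def spot_keywords(hypotheses, alert_words=ALERT_WORDS, min_confidence=0.6):
        """Scan (word, confidence, channel) hypotheses and yield alerts."""
        for word, confidence, channel in hypotheses:
            if word.lower() in alert_words and confidence >= min_confidence:
                yield {"channel": channel, "word": word, "confidence": confidence}

    # Example: hypotheses as they might arrive from two monitored calls.
    stream = [("hello", 0.91, 1), ("refund", 0.84, 1), ("supervisor", 0.72, 2)]
    for alert in spot_keywords(stream):
        print("ALERT on channel %(channel)d: heard '%(word)s'" % alert)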

For more information, go to: www.ai-logix.com.

On the sartorial side, Voice Partners launched a new line of speech-tech clothing at the show. The Palo Alto, CA-based user-experience design consultancy bills itself as "the SWAT team of speech, bringing together the leaders in linguistics, voice user interface design, deployment and usability." Basically, Voice Partners conceives and designs automated services that users can talk to.

Voice Partners' new clothing line, called VUI Wear (for "voice user interface"), consists of caps, T-shirts, tank tops, boxer shorts, camisoles, and thong underwear adorned with logos and slogans inspired by the workaday lives of its employees. Among the more memorable examples at the show: "I've Got a Full-on Raging Persona," "I'm Sorry, I Didn't Understand," and an homage to the "Got Milk?" ad campaign, "Got VUI?"

The line also includes coffee mugs, stickers, license plate frames, tote bags, and a teddy bear. The VUI line is available from the company's online store at: www.cafepress.com/voicepartners.