Search Engines Look for Answers Inside the Enterprise

Talking Points

  • Searches within the enterprise are often not just about finding information,
    but mining it, reusing it and “finding out what the information byproducts
    are,” one expert says.
  • In the age of software consolidation, companies are struggling to move beyond
    the old practice of aiming search products at individual silos and toward
    solutions that search across the enterprise.
  • One cutting-edge search capability is the ability to include rich media—the
    growing archives of audio and video content—which companies increasingly
    must contend with.

Thanks to Google, enterprise users expect the same search experience within their organizations that they get on the wide-open Internet: quick results and lots of relevant hits.

However, enterprise search is far more complex and difficult than Internet searches for a number of reasons. For one thing, points out Susan Feldman, an IDC analyst, the business models are entirely different. Searches within the enterprise are often not about just finding information, but mining it, reusing it and generally “finding out what the information byproducts are,” Feldman says. That’s seldom the goal of an Internet search.

She sees search products falling within a much broader category she calls information access platforms, of which search is only an element. “It’s [about] what you do with information…categorizing, tagging, extracting relationships and entities, adding metadata—all of those things enrich search.” To include all of that, IDC labels the category “content access tools.”

Another challenge is that search engines in the enterprise need to touch a wide range of file types and documents, both structured and unstructured. The vast majority of information in the enterprise is not neatly stored in a database.

Search is also made difficult because enterprises lack page ranking systems, so they must use other methods to determine how closely a result matches a request. Google’s PageRank algorithm uses links to help determine relevancy, but few documents within companies are linked to other documents. (See related story, “Simulating Google’s secret sauce.”)
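To see why sparse linking matters, consider a minimal sketch of link-based ranking. It is a simplified textbook form of PageRank computed by power iteration, not Google's production system, and the page names and link graphs are hypothetical. On the linked "web" graph the scores separate the pages; on the unlinked "intranet" set every document ends up with the same score, so links give no help with relevance.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical data: a small linked Web site and an unlinked document set.
web = {
    "home": ["products", "support"],
    "products": ["home"],
    "support": ["home", "products"],
}
intranet = {"memo.doc": [], "slides.ppt": [], "thread.txt": []}

web_rank = pagerank(web)
intranet_rank = pagerank(intranet)
```

With no links to count, an enterprise engine has to lean on other signals, such as the text itself and its metadata.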

Further, the Web’s system of HTML pages lends itself well to searches: Internet search engines often take you deep into a Web site if that’s where your search phrase lies. Within the enterprise, however, a word or phrase might fall within hundreds of text pages, reams of PowerPoint slides or a long discussion thread. Usually, the user must download the entire file before he or she can figure out whether it’s the needed document.

Evolution comes from intelligent design
Enterprise search, while still in its infancy, is evolving because many enterprises are moving away from aiming search products at individual silos to solutions that search across the enterprise, says Philip Russom, an analyst at Forrester Research.

One solution, called a federated search, presents to the user results culled from multiple search engines as a single result. Federated search, Russom says, is a hot topic in companies these days.
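The idea can be sketched in a few lines. This is an illustration only, not any vendor's method: the two engines, their document names and their scores are hypothetical, and min-max rescaling is just one simple way to make scores from different engines comparable before merging them.

```python
def normalize(results):
    """Rescale one engine's raw scores to the 0..1 range (min-max)."""
    if not results:
        return []
    scores = [score for _, score in results]
    low, high = min(scores), max(scores)
    span = high - low
    return [
        (doc, 1.0 if span == 0 else (score - low) / span)
        for doc, score in results
    ]


def federated_search(query, engines):
    """Query every engine and merge the hits into one ranked list."""
    merged = {}
    for engine in engines:
        for doc, score in normalize(engine(query)):
            # A document found by several engines keeps its best score.
            merged[doc] = max(score, merged.get(doc, 0.0))
    # Highest score first; ties broken by document name for stable output.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical engines that score on different scales (0-100 and 0-1).
def intranet_engine(query):
    return [("hr/policy.doc", 87.0), ("hr/faq.html", 40.0)]


def docs_engine(query):
    return [("manual.pdf", 0.9), ("hr/faq.html", 0.6), ("notes.txt", 0.3)]


results = federated_search("vacation policy", [intranet_engine, docs_engine])
for doc, score in results:
    print(f"{score:.2f}  {doc}")
```

Min-max rescaling always pushes each engine's weakest hit to zero, so a real deployment would likely need a more careful way to reconcile scores.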

The most sophisticated solutions are offered by pure-play search companies, including Verity, Autonomy, Convera and Fast Search & Transfer ASA, that are focused on enterprise search solutions. The big platform vendors (IBM, Oracle and Microsoft) market products that include search capabilities as a feature. Each of the three is working on bigger and better enterprise search solutions.

According to IDC’s Feldman, companies such as Verity, Autonomy, Fast Search and Endeca Technologies have been rapidly adding features to their basic platforms either through internal development or acquisition. Companies like these, which have been in the search business for years, have “a deep understanding of the multiple ways that [companies] need to use information,” Feldman says.

When it works as it’s supposed to, an enterprise search solution provides quick returns. Search consultant Avi Rappoport cites an oil company client with offices worldwide and many departmental intranets. The firm discovered its departments were bidding against each other for jobs, and Rappoport helped the client see that a properly configured enterprise search product could prevent that. “For them, there was an immediate ROI,” Rappoport says. “They didn’t bid against each other anymore, didn’t prepare bids twice and didn’t drive up prices.”

More often, the ROI can be tougher to calculate, she concedes. Search is very much a part of the overall enterprise architecture, and results depend not just on the engine, but on the quality of data, its structure and its organization. “To solve real problems and to come up with something with an ROI, you need to be sure a search tool is addressing an actual enterprise problem,” Rappoport says.

According to John McPherson, software group manager of the Institute for Search & Text Analysis at IBM’s lab, enterprise search used to be driven by a line of business or an app. More than ever, he says, “people are seeing that search is fundamental to improving the return on investment…to helping regular employees do their jobs better.” The kinds of data and applications that people want to search are growing correspondingly diverse, he says. (See related story, “Content still reigns.”)

Keep on googling
Most end users are familiar with Google, and that has helped smooth the entry of its Google Search Appliance into the enterprise market.

When data integration company Informatica recently redesigned and enhanced its Web sites, familiarity was part of the reason the company selected Google Mini as its enterprise search appliance. According to senior manager of Web marketing Tiffany Trevers, “we asked users what they wanted and many said, ‘Make it work like Google.’”

Informatica runs seven Web sites, six of them in languages other than English, for its more than 800 employees and 2,100 customers. Because almost half of Informatica’s visitors are international, the fact that Google can work out of the box with a variety of languages made all the difference.

Site traffic analysis indicated customers would sometimes search and then leave the site—possibly because they weren’t finding what they wanted. With the new search tool, Trevers says: “We can already see that we’re keeping people on the site longer with the improved search.”

Informatica is now considering Google Mini for additional sites, such as its internal sales portal.

Age of software consolidation
It’s common for companies to have many search engines—one for each department, says Forrester’s Russom. However, in the age of software consolidation, “companies are struggling to get beyond this old practice,” Russom says. “They’re trying to apply search on a broader basis.”

National Instruments, which makes computer tools for scientists and engineers, was running four different search products when it decided to consolidate them for efficiency late last year. The company has between 500,000 and 1 million documents in various formats, databases and file types on its public Web site, according to search and community manager Jeff Watts.

The company shifted all its search functions for its intranet and extranet to Fast Search’s Enterprise Search Platform. The company is working on making internal file-based content, such as Word files and PDFs, accessible under Fast Search as well, according to Watts.

Fast Search takes the multitude of structured and unstructured data it encounters and creates its own structured database, or what Watts terms “a database on steroids.” Because the Fast Search database can incorporate content from many sources, including intranet content, technical documentation and discussion forums, the company uses the Fast Search index to help customers find answers to technical questions through the extranet. “A lot of people are able to resolve their questions before they call us,” Watts says. That produces significant savings, because the company’s front-line tech support personnel are well-paid engineers.

Faster response times and greater relevance have also significantly reduced the time customer support spends per customer, driving down costs further, Watts says.

Mining for gold
At Hewlett-Packard, senior business analyst Randy Collica uses SAS Enterprise Miner and SAS Text Miner for customer analytics projects, such as plumbing lists used for external marketing campaigns, and analyzing customer buying patterns.

Once he knows the business need, whether it’s boosting revenue in a business area or cross-selling to a different set of customers, Collica uses the miner products to manipulate the data. With Enterprise Miner, he can search and build predictive models of customer risk and loyalty to address specific business needs.

For example, a project might search out which prospects would be most likely to buy certain products, based on other customers with similar profiles, and list only those prospects to receive catalogs.

Collica claims a substantial ROI for each of the projects he works on. His group of seven analysts churns out “roughly a dozen data mining projects a year…each of those, on average, gives 200 to 500 times ROI,” he says.

Richness equals goodness
One cutting-edge search capability is the ability to include rich media—the growing archives of audio and video content that companies increasingly must contend with. Technologies translate spoken content into text, then perform a text search, or use voice recognition for search comparisons.
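The first approach, turning speech into text and then searching the text, can be sketched roughly as follows. The sketch assumes a speech recognition step (not shown) has already produced time-stamped transcript segments. The clip names, timings and transcripts are hypothetical, and the simple inverted index stands in for whatever a commercial product does internally.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(segments):
    """Map each spoken word to the (clip, start time) pairs where it occurs.

    segments is an iterable of (clip_id, start_seconds, transcript_text).
    """
    index = defaultdict(set)
    for clip_id, start, text in segments:
        for word in tokenize(text):
            index[word].add((clip_id, start))
    return index


def search(index, query):
    """Return the segments whose transcript contains every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical transcript segments from a speech-to-text step.
segments = [
    ("news_0600.mpg", 0, "Oil prices rose sharply today"),
    ("news_0600.mpg", 30, "The company announced a new product"),
    ("news_1800.mpg", 15, "Analysts expect oil prices to fall"),
]
index = build_index(segments)
matches = search(index, "oil prices")
```

Because each hit carries a clip name and a start time, a user can jump straight to the moment a phrase was spoken instead of watching the whole recording.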

At Video Monitoring Services, a company that monitors the media market in the U.S. for clients, CIO Gerry Louw is using Autonomy’s IDOL Server to run complex internal searches on a range of file types including multimedia. From 17 offices, the company monitors roughly 1,000 TV news outlets for clients, continually feeding a synopsis of content into a centralized Autonomy repository. There, employees apply some 28,000 defined searches, many of them very granular, in real time against the Autonomy database.

“You literally can search anything,” Louw says. “Video, speech-to-text analysis audio, textual, any kind of unstructured doc you can imagine, from e-mail to blogs to documents.” To search video, Autonomy first converts the video’s audio into text.

VMS is also using Autonomy’s Virage Video Archiver, a content management solution for storing and managing rich media content such as video and audio. “Very few of the products we looked at had flexibility with audio and video [files],” Louw says.

Performance is important, since the company’s 400 or so service representatives need to quickly perform complex queries as soon as a story airs, then deliver information to clients. “We’ve reduced our processing time by 66 percent, which is important in our market,” Louw says.

Louw says VMS has yet to use what he considers Autonomy’s strongest feature: conceptual search. Once implemented, that will allow VMS to search on concepts in addition to actual words, allowing users to get results “when they don’t know [exactly] what they’re searching for,” Louw says.

The reduced processing time leads Louw to estimate an ROI of more than 100 percent over the three years he’s been running the product, despite a steady increase in content.

Tipping the scales
Forrester’s Russom says scalability is the primary feature to consider when choosing a search tool. Many companies start their enterprise search experience in a limited way, Russom says, so a good enterprise search tool should have the capacity to grow to allow thousands of users to search millions of documents.

Buyers should be aware that installing and maintaining a search solution will be a project, search consultant Rappoport warns: “Search is not a turnkey thing; it takes a fair amount of tuning.” She suggests installing a product, then monitoring what users are searching for and how they’re using the tool. Finally, Rappoport says, consider a team to maintain the product that includes a non-tech rep such as a librarian or an editor who is focused on content.

Sidebar: Content still reigns
Sidebar: Simulating Google’s secret sauce
