Internet content management: Untangling the 'Net

The number of pages on Internet sites is growing at a blistering pace. Forrester Research, a Cambridge, Mass.-based research firm, estimates that business sites will eventually average about 15,000 pages, plus several times that number of scripts, banners, applets and so on. Such growth forces companies to manage their Web sites and content effectively in order to compete in the world of electronic commerce.

In the past, many businesses built Web sites by appointing a Webmaster, who then created a few static HTML pages that customers visited for information. Users depended on information technology (I/T) departments to create what was on the site. And the small amount of content that existed was typically managed by one or two people.

This situation has changed drastically. With the introduction of tools such as Microsoft's FrontPage, users can now create complex sets of static and dynamic Web pages. Instead of one or two Java programmers controlling a business's Web content, there are hundreds or even thousands of employees globally who represent the company in cyberspace. Unfortunately, this means that many organizations have no idea what content is on their Web sites. Even businesses that outsource development of their Web content may not know what material internal employees are posting to the Web. Furthermore, companies that do business globally may have material posted in different countries and in several languages.

"As enterprise Web sites grow and mature, the need for content management systems becomes more acute," said Nikos Drakos, an analyst at the Wakefield, Mass., offices of Gartner Group. "The manual procedures and processes that work well for hundreds of Web pages don't scale well to thousands or tens of thousands of pages, especially as updates increase in frequency."

Content management to the rescue

The ability to manage and monitor who sees what, when, where and how is at the heart of content management. According to Alexis dePlanque, program director of advanced information management strategies at Meta Group, Stamford, Conn., there are four key elements of content management: content creation, content submission/assembly, content storage/management and content publishing.

Content creation involves using tools, such as Microsoft's FrontPage or Netscape's Navigator Gold Editor, to create HTML and XML pages, as well as dynamic pages using tools like Java, ActiveX and so on. Some issues associated with content creation include deciding who manages the creation process, coordinating the creation process with I/T, managing access rights, "costing" the creation process and deciding what content creation tools to use.

Content submission/assembly involves managing content that comes in multiple formats from multiple sources. This typically includes a workflow-enabled edit/approval process, security management, and digital signature and audit capabilities. Organizations must also consider the following when formulating a content submission/assembly process:

  • Primary requirements for content submission
  • Whether or not contributors work directly against a repository
  • Workflow process
  • Collaborative creation
  • Role of the Webmaster
  • Length of the document creation life cycle
  • Implementation of the submission/assembly process

Content storage and management is a key application requirement for both content creation and publishing. Content structure, application integration, formatting standards and performance are just some of the factors that influence decisions in this stage. Other factors include:

  • Where content is stored
  • How content is stored, accessed and retrieved
  • How distributed global content is managed
  • How backups, mirrors, security and other safety issues are addressed
  • Whether to store content as documents or components
  • How documents/components are integrated with applications (for example, using dynamic publishing)
  • Storage costs

Content publishing includes both static and dynamic publishing, as well as some emerging areas, such as personalized services, registration, push technology and so on. Other issues include:

  • Dynamic content assembly
  • Separate creation and management processes for static vs. dynamic content
  • Publishing policies
  • Click-stream analysis (for example, correlating changes in Web activity with content changes)
  • Integration of Web interaction with other tracking systems, such as call center and sales force automation software
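
As a rough illustration of the click-stream item above, the sketch below compares a page's average daily hits before and after a content change. The hit counts, the dates and the function name hits_before_and_after are invented for the example; a real site would pull the counts from its Web server logs, and a simple before/after average is only one of many ways to look for a correlation.

```python
from datetime import date
from statistics import mean


def hits_before_and_after(daily_hits, change_day):
    """Return (mean hits before change_day, mean hits on or after it)."""
    before = [hits for day, hits in daily_hits.items() if day < change_day]
    after = [hits for day, hits in daily_hits.items() if day >= change_day]
    if not before or not after:
        raise ValueError("need data on both sides of the change date")
    return mean(before), mean(after)


# Hypothetical daily hit counts for one page (not real data).
daily_hits = {
    date(1999, 3, 1): 100,
    date(1999, 3, 2): 120,
    date(1999, 3, 3): 110,
    date(1999, 3, 4): 150,  # content was changed on this day
    date(1999, 3, 5): 170,
}

before, after = hits_before_and_after(daily_hits, date(1999, 3, 4))
print(f"average hits before change: {before}, after change: {after}")
```

A jump in the average does not prove the new content caused it, but it tells a site manager where to look first.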

The number and complexity of issues surrounding each element of content management will continue to grow. As companies go global, Forrester Research predicts, they will tailor Web services in the same manner that they tailor marketing literature to specific cultures. "This will drive early adopters to better manage the relationship with individuals through personalization and affinity programs that foster close ties to the brand and site," predicts Forrester in a report about global content management and commerce. Content creation and update tools, along with personalization systems, can be used to address local and regional technology and services.

What's in a name?

I recently looked at three very different content management tools: Raveler from Platinum Technology Inc., Oakbrook Terrace, Ill.; StoryServer 4 from Vignette Corp., Austin, Texas; and TeamSite from Interwoven Inc., Sunnyvale, Calif. Each tool provides various flavors of the four key elements of content management.

However, each software vendor has a slightly different name for content management. Platinum refers to it as "Web infrastructure management," while Vignette uses the term "Internet relationship management." Interwoven uses the description "enterprise Web production." After reading the marketing literature, it seems as if the only thing these three vendors agree on is that their tools do something on the Web.

Introduced in July 1998, Platinum's Raveler is a fairly new product. According to Ted Collins, senior vice president of Platinum's Internet Solutions Group, "Raveler is designed to aid in Web content and context control as part of Web infrastructure management." Platinum divides control tasks into managing teams and their materials, content with its attendant structure, and sites as the context for the content. As a concept, Raveler addresses content management in a four-step cycle: creation, deployment, production (including access monitoring) and measurement.

Raveler provides content development and management teams with tools to "version" content in a repository that has a controlled access and change mechanism. Teams can be set up with pre-configured publication processes (workflows) that provide consistent deployment procedures. Automatic backup/recovery tools are included for safety, and Raveler provides teams with role-based, content handling authority. Site audit trail features are also included to ensure workflow and standards compliance.

The tool has three parts: the client software, the Raveler server and a Web server. The Raveler and Web servers can be combined on a single machine. Raveler's key piece, however, is the content warehouse, which stores all content as well as data about the content. Multiple versions of content can be stored, and changes to content can be correlated with Web activity on a particular page or set of pages. Hits are monitored in real time from Web logs, and graphic displays are available. Data from multiple mirrored sites can also be stored, controlled and tracked. Workflow management and version control features help prevent accidental content overwrites or deletions.
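
To make the idea of versioned storage that guards against accidental overwrites more concrete, here is a minimal sketch. It is not Raveler's actual mechanism, which the vendor material does not describe at this level; the ContentWarehouse class, its methods and the StaleVersionError exception are all hypothetical names chosen for illustration. The sketch keeps every saved version and refuses a save that was based on an out-of-date copy.

```python
class StaleVersionError(Exception):
    """Raised when a save is based on a version that is no longer the latest."""


class ContentWarehouse:
    """Hypothetical in-memory store that keeps every version of each page."""

    def __init__(self):
        self._versions = {}  # path -> list of content strings, oldest first

    def latest_version(self, path):
        # Version numbers start at 1; 0 means the path has never been saved.
        return len(self._versions.get(path, []))

    def save(self, path, content, base_version):
        current = self.latest_version(path)
        if base_version != current:
            raise StaleVersionError(
                f"{path}: edit was based on version {base_version}, "
                f"but the latest is {current}"
            )
        self._versions.setdefault(path, []).append(content)
        return current + 1

    def read(self, path, version=None):
        versions = self._versions[path]
        return versions[-1] if version is None else versions[version - 1]


warehouse = ContentWarehouse()
warehouse.save("/index.html", "Welcome", base_version=0)
warehouse.save("/index.html", "Welcome back", base_version=1)

try:
    # A second author still working from version 1 cannot silently overwrite.
    warehouse.save("/index.html", "Hello", base_version=1)
except StaleVersionError as error:
    print("rejected:", error)
```

Because earlier versions are never discarded, read("/index.html", 1) still returns the original text after later saves.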

According to Platinum, Raveler can protect intellectual property by automatically versioning, distributing, testing and sharing work with authorized co-workers from their desktops. It also gives managers a comprehensive view of Web projects, enabling them to define workflow processes, monitor progress, and control the transfer of information to and from sites. I/T staff can dynamically view performance and usage information, as well as consolidate data for trend analysis and reporting.

Raveler runs on Windows NT and supports Web sites hosted on Windows NT and some Unix flavors. It can also support authoring tools on Windows 95, 98 and NT. Netscape, Microsoft and Apache Web servers are supported.

Vignette's StoryServer

Vignette, founded in December 1995, was an early entrant in the content management category, according to Gartner Group. To transform content management into "Internet relationship management," Vignette divides the functionality of StoryServer into four key areas: life-cycle personalization services, open profiling services, advanced content management and its business center.

As with Raveler, StoryServer's life-cycle personalization and open profiling services allow site managers to observe visitor behavior. According to Vignette, in addition to visitor activity/site change correlations, life-cycle personalization services let online businesses adapt their site's presentation, navigation and content to a visitor's environment.

This is accomplished through the use of three agents: the presentation agent, which ensures that visitors are shown content most appropriate to their environment and location; the matching agent, which provides a means of developing and deploying personalized visitor experiences without capturing information that could be seen as private; and the recommendation agent, which provides directed personalization services for frequent visitors by suggesting content based on their preferences vs. those of similar visitors.

Open profiling services consist of a content catalog, visitor registry and observation manager. The content catalog manages the descriptions, groupings and interrelationships of a site's content. The visitor registry is a kind of data mart that stores customer preferences, while the observation manager captures data that lets the site customize itself to each visitor.

StoryServer's content management services are similar to those found in Raveler. The tool supports standard content formats, including XML, and provides an open interface to third-party Web applications and products. It also provides static and dynamic versioning, as well as "views" or groups of content assets. StoryServer also includes workflow features, such as notifications and audit trails for editorial control.

The tool's business center is a kind of decision support service consisting of report, keyword and profile managers. The report manager is a high-level analysis and reporting tool that provides decision support data about customer preferences, demographics and content/product popularity. The keyword manager is a front-end tool that describes the structure of the content delivered on the site. The profile manager has a secure graphical front-end for building and managing the structure of the visitor registry.

Gartner Group recommends StoryServer to sites that have content-intensive dynamic pages that are updated frequently. For those sites that require team management of six or more people, StoryServer provides good workflow and development/deployment services.

The tool is available on Windows NT and Solaris. It also supports authoring tools on Windows 95, 98 and NT. Netscape, Microsoft and Apache Web servers are supported.

Interwoven's TeamSite

With roles ranging from author to master user, TeamSite concentrates mostly on the production and publishing of Web content, and less on site management. One interesting feature is "visual differencing," which lets users select two different versions of content and view them side by side.
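
For readers who have not seen a side-by-side comparison of two content versions, the sketch below produces one with Python's standard difflib module. It is only an illustration of the general idea and says nothing about how TeamSite implements visual differencing; the sample page text, the labels and the side_by_side helper are made up for the example.

```python
import difflib


def side_by_side(old_text, new_text, old_label="v1", new_label="v2"):
    """Return an HTML table showing two versions of a page next to each other."""
    differ = difflib.HtmlDiff(wrapcolumn=60)
    return differ.make_table(
        old_text.splitlines(),
        new_text.splitlines(),
        fromdesc=old_label,
        todesc=new_label,
    )


# Two hypothetical versions of the same page.
old_page = "Welcome to our store\nWidget price: $10\nCall us today"
new_page = (
    "Welcome to our store\nWidget price: $12\nCall us today\n"
    "Now shipping worldwide"
)

table = side_by_side(old_page, new_page)
with open("diff.html", "w", encoding="utf-8") as handle:
    handle.write(table)  # open diff.html in a browser to see both columns
```

The table places the old version on the left and the new version on the right, with changed, added and deleted text wrapped in marker spans that a style sheet can color.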

The workflow function features a standard assign/edit-approve/reject system. Editors can assign work to authors using the notification feature; they can then be notified when the work is completed, as well as when the content is published on the Web. TeamSite also supports concurrency and various types of locking to manage and control source documents and code. Content is moved to a staging area each time it is changed or receives approval to be published.
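
An assign/edit-approve/reject cycle of this kind can be modeled as a small state machine. The sketch below is a generic illustration under assumed names (State, TRANSITIONS, WorkItem and the action strings are all hypothetical), not a description of TeamSite's internals.

```python
from enum import Enum


class State(Enum):
    ASSIGNED = "assigned"
    SUBMITTED = "submitted"
    APPROVED = "approved"
    REJECTED = "rejected"


# Allowed moves: (current state, action) -> next state.
TRANSITIONS = {
    (State.ASSIGNED, "submit"): State.SUBMITTED,
    (State.SUBMITTED, "approve"): State.APPROVED,
    (State.SUBMITTED, "reject"): State.REJECTED,
    (State.REJECTED, "submit"): State.SUBMITTED,  # author reworks and resubmits
}


class WorkItem:
    """One piece of content moving through the editorial workflow."""

    def __init__(self, path, author):
        self.path = path
        self.author = author
        self.state = State.ASSIGNED
        self.history = [State.ASSIGNED]  # simple audit trail

    def apply(self, action):
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action} while {self.state.value}")
        self.state = TRANSITIONS[key]
        self.history.append(self.state)
        return self.state


item = WorkItem("/products/index.html", "author1")
for action in ("submit", "reject", "submit", "approve"):
    item.apply(action)
print([state.value for state in item.history])
```

Keeping the allowed transitions in one table makes the editorial policy easy to review, and the history list doubles as a basic audit trail.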

Administrative tools include audit trail reports and the ability to view the full revision history of any file. Event logging tracks the development of files, user activity and development of the site as a whole. Parameter-driven deployment is secured through the use of TeamSite's OpenDeploy mechanism, while handshakes and error checks prevent deployment problems.

One notable difference between TeamSite and other tools is that Interwoven chose to develop its own data store, rather than use a commercial database, in order to get a smaller and faster store. According to the Seybold Group, Media, Pa., a downside to Interwoven using its own object store is that it does not yet behave like a robust database. Also, TeamSite does not dynamically build editions on the fly the way some database-driven products do.

TeamSite is available on Windows NT and some Unix flavors. It can also support authoring tools on Windows 95, 98 and NT. According to Interwoven, TeamSite is compatible with any available Web server.

First, do no harm

Some tool features are controversial. Many organizations are understandably nervous about tracking and recording data that may be deemed private. A popular example is a "visitor observation" capability that tracks where Web "surfers" have been and who they are, along with their preferences, demographics and sometimes even security-related data such as passwords.

When implemented properly, however, these tools can have a powerful impact on an organization's ability to manage and control its Internet image. According to Forrester Research, site managers face perpetual growth and are searching for tools that help them develop and maintain unified Internet strategies and architectures. A good site management strategy addresses organizational growth, visitor demographics, and the increasing number and types of business transactions executed on a Web site. It also forms the basis for understanding and agreement on standards, methods, tools, techniques, deliverables and the overall direction of Web-based systems development in an organization.

Tools that manage Web site growth properly will continue to evolve, with vendors leapfrogging each other with new and ever more useful features. Forrester suggests picking products that have a site focus, that allow you to plug in tools that manage content, and that have references to prove their ability to handle the growing load. This is sound advice, given the fact that a business never gets a second chance to make a first impression, even in cyberspace.