In-Depth

Software Agents Go To The Movies

Long touted as a promising technology, software agents have seemingly vast potential for electronic commerce. The possibilities, and the basic design considerations for Internet Information Extraction (IE), can be illustrated by a look at a familiar, simple and yet surprisingly rich domain: motion picture theaters, films and showings.

In fact, the newly emerging eXtensible Markup Language (XML) may be a useful technology for enabling Internet IE agents targeted at all kinds of online listings.

The term "agent" has become a popular and exciting buzzword. Given the richness and diversity of information sources, as well as the number of commercial opportunities on the Internet, agents would seem to be especially valuable for Web interaction. Yet agents that help users find their way intelligently through this data maze are comparatively rare.

What exactly is an agent? An agent is a program that a person or organization (the "client") designates to perform a specified task. Once the agent knows what its client wants it to do, it proceeds to carry out the task without the need for constant human intervention. It can run unattended for long periods of time (a week, a month, perhaps many months), until its task is done or its client recalls it. It can visit and communicate with a variety of information sources, and it can tailor its answer to a client's requirements.

To get a view on how agents, XML and the Web may work together in years to come, it may be worthwhile to look at how a personal agent can help navigate something many of us use on a regular basis: movie theater schedules. While people can access movie information from different sources, dealing with existing information sources and the servers that provide human access to these sources can be difficult, time-consuming or practically impossible. Agents can bridge the gap between what people want to ask, and what existing Web and phone systems support. This gap exists not only in movies, but in almost any practical domain - travel planning, stock trading and so on.

Information about movies and their showings is currently available by phone, on the Web and, of course, in daily and weekend newspapers. But agent technology allows for much more sophisticated searches. It also allows for interaction between the movie-going public, movie producers and critics, and the theaters that exhibit these movies.

So, what do people want to know about movies? The obvious questions are "What movies are currently playing or will be playing next weekend?" and "Where and when can I see my movie?"

In fact, there are many other questions. Perhaps what I would like to ask is something like this:

  • "Find the earliest showing of the 'Blair Witch Project.'" (Or the latest showing, or showings before noon, or showings within a specified time band.)

  • "I only want theaters with ample or handicapped parking spaces. Which are they?"

  • "What theaters are within a certain distance by road (or an estimated driving time less than 30 minutes) from where I am right now?"

  • "I like short movies. What's playing with a duration of less than two hours?"

  • "Find only theaters with stadium seating and large screens."

  • "I am looking for a multiplex showing two movies, one suitable for adults and the other suitable for children under 10."

  • "I'd like to see 'Phantom Menace,' but only if there's handicapped parking and stadium seating."

  • "I can't recall either the title or the actors of a film I read about in an intriguing review. I don't know if the film is still playing anywhere, or whether it has moved to video. Also, I am not sure if the film will be classified as foreign, art, comedy or drama. I remember that the film was French, and can give a brief (two or three sentence) description of the story line. What film is this? Where can I see it or rent it?"

But what if no movie showing satisfies all of a client's requirements? In that case, the client may wish to indicate which criteria are absolute and which ones may be relaxed; for example, the client may prefer a movie starting before noon, but may accept a starting time up to 1:30 p.m. if all of the other criteria are satisfied. Or the client may want to "weight" the criteria, indicating which ones are more important; for example, handicapped parking and an early showing time are twice as important as a large screen and stadium seating.

The customer may also want to specify all criteria in advance, and only be notified if one or more showings satisfy all the absolute criteria. Or they may prefer to be given a set of alternatives, none of which satisfy all the criteria, but all of which "come close." Finally, the customer may want to know one movie that satisfies their requirements, or a ranked list of movies that satisfy or approximate their requirements.
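The distinction between absolute and weighted criteria can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Showing` and `Criterion` classes, the field names, the theater names and the scoring rule (weighted fraction of criteria met) are all hypothetical, chosen to mirror the example above in which handicapped parking and an early start count twice as much as a large screen and stadium seating, and 1:30 p.m. is the latest acceptable start.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Showing:
    film: str
    theater: str
    start: int                 # minutes after midnight
    handicapped_parking: bool
    stadium_seating: bool
    large_screen: bool


@dataclass
class Criterion:
    name: str
    test: Callable[[Showing], bool]
    weight: float = 1.0
    absolute: bool = False     # an absolute criterion may never be relaxed


def score(showing: Showing, criteria: List[Criterion]) -> Optional[float]:
    """Weighted fraction of criteria met, or None if an absolute one fails."""
    total = sum(c.weight for c in criteria)
    earned = 0.0
    for c in criteria:
        if c.test(showing):
            earned += c.weight
        elif c.absolute:
            return None
    return earned / total if total else 0.0


def rank(showings, criteria):
    """Best-first list of (score, showing), dropping absolute failures."""
    scored = [(score(s, criteria), s) for s in showings]
    kept = [pair for pair in scored if pair[0] is not None]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


# Hypothetical client request and hypothetical showings.
criteria = [
    Criterion("starts by 1:30 p.m.", lambda s: s.start <= 13 * 60 + 30,
              weight=0.0, absolute=True),
    Criterion("starts before noon", lambda s: s.start < 12 * 60, weight=2.0),
    Criterion("handicapped parking", lambda s: s.handicapped_parking, weight=2.0),
    Criterion("large screen", lambda s: s.large_screen, weight=1.0),
    Criterion("stadium seating", lambda s: s.stadium_seating, weight=1.0),
]
showings = [
    Showing("Film A", "Rialto", 11 * 60, True, False, False),
    Showing("Film A", "Plaza", 13 * 60, True, True, False),
    Showing("Film A", "Odeon", 14 * 60, True, True, True),
]
ranked = rank(showings, criteria)
```

Here the 2 p.m. showing is dropped outright because it violates the absolute limit, while the other two come back as a ranked list of alternatives that "come close." A client who wants a single answer would simply take the first entry.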

All of the above requirements (except perhaps the last one) deal with actual movie showings, and the theaters at which the movies are actually being exhibited. But another possibility is a client who wants to study the larger set of movies that are "coming" - movies that may or may not yet be scheduled. Indeed, it may be the case that a decision has not yet been made whether a given movie will be shown at all, or whether it will go into general distribution or merely be shown at an "art" theater in the region. By studying the set of forthcoming movies, the customer may not only be able to determine which movies they are likely to want to see, but also cast a vote for bringing a desired film to the local multiplex. A personal request to the manager of a local theater, or even to the management of the chain to which the local theater belongs, may be ineffective. But tens, hundreds and perhaps even thousands of votes on the Web may be quite influential.

A personal agent can not only respond to a rich variety of requests, but also respond to questions whose answers will not be available for a long time. An agent can wait indefinitely - hours, days, weeks, months and so on - until a desired condition is satisfied, and then alert its user. For example, a user may want to be informed when a foreign film receiving little publicity is shown in their viewing area in a theater with handicapped parking - a condition that may not occur for months (if at all). If the film is never shown locally, the user may then want to be informed when the film becomes available on video, TV or on the Internet - an event that may not be advertised or even mentioned in the local newspaper, or may be mentioned only on a day when they have not read the paper. A human could easily miss these events, but a tireless agent is unlikely to.

Agents: They remember

Agents differ from conventional information sources in another crucial respect: They remember. Your movie agent may remember your previous preferences and tailor its responses to your tastes. It may even make unsolicited suggestions based on either your past requests or selections, or the selections of other users who have tended in the past to exhibit similar tastes. (In principle, the "server" that provides access to an information source could save the preferences or "profiles" of all its clients. But it would not be very realistic for a server to save the profiles of thousands of clients, especially if the profiles change frequently. It would be even less realistic for thousands of Web servers to each save the profiles of thousands of clients!)

An agent can interact with its customer to determine requirements and preferences. It can then move to multiple Web server sites, and interact with each server. Of course, a server can also interact directly with each client, as many Web servers do.

However, an agent can interact with a client at the client's local site to determine requirements, then move to the site of an information source to interact locally with its server. If the interactions are complex, this can greatly reduce the amount of Internet traffic and speed up the interactions themselves.

More importantly, the interactive and logical capabilities of the agent can exceed those of some or all of the servers with which it must interact. For example, a user may specify a logical combination of criteria and constraints, such as the examples given above. While the information source ("Moviefone," for example) may contain much of the needed information, its server may not provide the user with the ability to ask such questions. Users must then ask a series of questions and put the results together themselves. An agent, by contrast, can issue a query to "Moviefone," evaluate the information according to its client's requirements, issue another query to "Moviefone" on the basis of this evaluation, and continue querying until the desired results are achieved.

Furthermore, a client's request may require information from a number of information sources. With an agent, there is no need for a client to "surf" the Web and piece the data into one logical whole. An agent can move from one information site to another, accumulating the information its client needs. Moreover, the calculation it performs at one site may determine which site it visits next; for example, it may determine at a film producer site that Film A may be of interest to its customer. It will then visit film magazine and film critic sites to confirm that hypothesis, then visit a film distributor site to determine if the film is available in the customer's area, and finally visit theater sites to determine if showing times, parking and so on meet the customer's criteria.

Alternatively, the agent can generate "clones," or copies of itself, to visit multiple sites simultaneously, speeding up the process of data collection and logical calculation. The clones can then communicate with their parent to combine their partial results into a final report for the client.
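The clone-and-combine pattern can be approximated on a single machine with a thread pool. The sketch below is not a mobile-agent system: the `SITES` table is a hypothetical in-memory stand-in for remote servers, and `visit` and `gather` are illustrative names. It shows only the shape of the idea, with one worker per site and a parent that merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote sites; a real agent would query servers.
SITES = {
    "theater-a": [{"film": "Film A", "time": "11:00"}],
    "theater-b": [{"film": "Film B", "time": "12:30"},
                  {"film": "Film A", "time": "19:00"}],
    "theater-c": [],
}


def visit(site, film):
    """One clone's work: collect the matching showings at a single site."""
    return [dict(row, site=site) for row in SITES[site] if row["film"] == film]


def gather(film, sites):
    """Parent agent: dispatch one clone per site, then merge partial results."""
    with ThreadPoolExecutor(max_workers=len(sites) or 1) as pool:
        partials = pool.map(lambda site: visit(site, film), sites)
        return [row for partial in partials for row in partial]


report = gather("Film A", list(SITES))
```

Because `pool.map` returns results in the order the sites were given, the merged report is deterministic even though the visits run concurrently.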

The capabilities described here do not require remarkably "intelligent" agents. On the contrary, most of the required capabilities are already found in existing software, such as database systems, information retrieval systems or data mining systems.

Angle on XML

What is needed, and in most cases does not yet exist, is for pages on the Web, or pages created by information servers, to be "marked up" in such a way that they describe their own content. For example, if a page describes the dates and times of movies showing at a number of theaters, there must be a standard way of saying (to external programs such as agents), "This is the theater name, this is the movie name, this is the date on which the movie is being shown at the given theater, this is the time of showing, etc." Moreover, all of these items must be represented in standard ways; for example, is the date represented as month/day/year, day/month/year or year/day/month?

In fact, XML already provides a way to specify the names of the data items contained in a page, and how they are organized together. What is needed is for the members of the XML community to agree on standards for representing generic types of data such as names, dates and times. Then the participants in any given domain, such as the film domain, could come together to build on these generic "data types," agreeing on how to name, represent and organize the data specific to their applications. This does not require "rocket science," only agreement on what the customers want to know, and how to represent it.

Also, it should be noted that Web pages represented in a "markup" language such as XML can be flexible both in the information they provide and in the information agents search for. One theater may advertise its ample parking and stadium seating. Another theater not possessing those attributes can simply omit them from its Web page. An agent looking for theaters with ample parking will not crash when it encounters a site that does not mention parking. It may simply omit the given theater from its list of candidates, or tell the customer that the theater has the desired film at an ideal time, but comes up short on the criterion of parking. More generally, different kinds of sites may provide different kinds of information; for example, a theater site, a film producer site, a film critic site and a generic movie site such as "Moviefone" may each provide quite different information (and a different user interface). But the customer's agent can gather data from all of these sites, provided that they each organize, name and represent their data according to the common, agreed-upon conventions.
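To make the point about omitted attributes concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`listings`, `theater`, `parking`, `seating`, `showing`), the theater names and the dates are invented for illustration; no agreed-upon movie vocabulary is implied.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the second theater simply omits parking and seating.
LISTING = """
<listings>
  <theater name="Rialto">
    <parking>ample</parking>
    <seating>stadium</seating>
    <showing film="Film A" date="1999-08-14" time="11:00"/>
  </theater>
  <theater name="Plaza">
    <showing film="Film A" date="1999-08-14" time="10:30"/>
  </theater>
</listings>
""".strip()


def candidates(xml_text, film, need_parking="ample"):
    """Split theaters showing `film` into full matches and near misses."""
    matches, near_misses = [], []
    for theater in ET.fromstring(xml_text).iter("theater"):
        times = [s.get("time") for s in theater.iter("showing")
                 if s.get("film") == film]
        if not times:
            continue
        # findtext returns the default when the element is absent, so a
        # site that never mentions parking does not raise an error.
        parking = theater.findtext("parking", default=None)
        entry = (theater.get("name"), times)
        (matches if parking == need_parking else near_misses).append(entry)
    return matches, near_misses


matches, near_misses = candidates(LISTING, "Film A")
```

The agent can then report the full matches, or tell the customer that a near miss has the desired film at a good time but comes up short on parking.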

Finding sources of knowledge

Ultimately, one or more evaluation criteria operate in some ill-understood mechanism so that a potential patron becomes aware of "content of interest." Although emulating the logic that a human uses to express likes and dislikes would be difficult for an automated personal agent, alerting a potential filmgoer to possible films of interest is actually a solved agent activity. In one approach, called collaborative filtering, a person fills out an online questionnaire, and a profile is generated and persists in a closed system. Agents representing a person's preferences can then look for other agents in the same system that represent similar interests and ask them for recommendations. Another kind of filtering, called content-based filtering, allows an agent to learn preferences by asking the individual to rate material presented to them.
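A toy version of collaborative filtering fits in a few lines. The sketch below assumes hypothetical rating profiles on a 1-to-5 scale and uses cosine similarity over the films two people have both rated; real systems use many more profiles and more careful similarity measures, so this shows only the shape of the approach.

```python
from math import sqrt

# Hypothetical profiles: film -> rating from 1 (disliked) to 5 (loved).
PROFILES = {
    "ann":  {"Film A": 5, "Film B": 4, "Film C": 1},
    "bob":  {"Film A": 5, "Film B": 5, "Film D": 4},
    "cara": {"Film A": 1, "Film C": 5, "Film E": 5},
}


def similarity(p, q):
    """Cosine similarity over the films both profiles have rated."""
    shared = set(p) & set(q)
    if not shared:
        return 0.0
    dot = sum(p[f] * q[f] for f in shared)
    norm = (sqrt(sum(p[f] ** 2 for f in shared))
            * sqrt(sum(q[f] ** 2 for f in shared)))
    return dot / norm


def recommend(user, profiles, min_rating=4):
    """Suggest unseen films that the most similar other profile rated highly."""
    mine = profiles[user]
    others = [(similarity(mine, prof), name)
              for name, prof in profiles.items() if name != user]
    if not others:
        return []
    _, nearest = max(others)
    return sorted(film for film, rating in profiles[nearest].items()
                  if rating >= min_rating and film not in mine)


suggestions = recommend("ann", PROFILES)
```

In this made-up data, ann's ratings track bob's far more closely than cara's, so ann's agent passes along the film bob liked that ann has not yet rated.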

Software agent architectures must support processing strategies that enable an agent to find and query knowledge sources, react to (and, where appropriate, fuse) discovered knowledge and make decisions based on the knowledge it has gained. Agents must therefore be either comprehensively knowledgeable in a given domain (rather unlikely), know how to communicate with other agents that are knowledgeable (still not likely), know how to glean information from HTML-like documents in cyberspace (some promising results), or be able to locate sources of structured knowledge (still in the future, but being worked on very hard).

An agent's first difficult task is finding domain-relevant knowledge resources. The emergence of portals can be helpful (to humans at least) in a few domains, but they are not sufficiently generalized. Search engines can be helpful (again, to humans), but as with anything designed for browser consumption, the results must be sieved through some mechanism that turns HTML source code into information usable by an automated process.

Wrappers and meta-content

There is a surprising amount of regularity among Web pages that an agent can exploit. For example, the agent could use a library of "wrapper" algorithms to aid in information extraction from the pages in its itinerary. Wrappers map non-standard page formats into a standard format for the benefit of visiting agents. While wrappers are an interesting idea, they are not generally considered feasible: each wrapper is site-specific, so the approach does not scale to large numbers of sites; sites may change their layout at will, invalidating existing wrappers; and agents cannot examine new sites for which no wrapper exists.
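A hand-written wrapper library might look something like the following sketch. The two page fragments, the site names and the regular expressions are all hypothetical; the point is that each site needs its own extraction rule, and that an agent has no way to handle a site missing from the library.

```python
import re

# Hypothetical page fragments from two sites with different layouts.
PAGE_A = ("<tr><td>Film A</td><td>11:00</td></tr>"
          "<tr><td>Film B</td><td>13:40</td></tr>")
PAGE_B = "<li><b>13.40</b> Film B</li><li><b>19.15</b> Film A</li>"


def wrap_site_a(html):
    """Site A lists films in table rows: title first, then HH:MM."""
    rows = re.findall(r"<tr><td>(.*?)</td><td>(\d{1,2}):(\d{2})</td></tr>", html)
    return [{"film": f, "time": f"{int(h):02d}:{m}"} for f, h, m in rows]


def wrap_site_b(html):
    """Site B uses list items: HH.MM in bold, then the title."""
    rows = re.findall(r"<li><b>(\d{1,2})\.(\d{2})</b>\s*(.*?)</li>", html)
    return [{"film": f, "time": f"{int(h):02d}:{m}"} for h, m, f in rows]


# One wrapper per known site; every new site needs a new entry.
WRAPPERS = {"site-a": wrap_site_a, "site-b": wrap_site_b}


def extract(site, html):
    """Map a site-specific page into the agent's standard record format."""
    wrapper = WRAPPERS.get(site)
    if wrapper is None:
        raise LookupError(f"no wrapper for {site}")
    return wrapper(html)


records_a = extract("site-a", PAGE_A)
records_b = extract("site-b", PAGE_B)
```

Both wrappers produce records in the same shape, but either one breaks silently if its site changes its layout, which is exactly the maintenance burden described above.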

There are several ongoing research efforts to use AI methods for automatic wrapper induction, and some results have been encouraging. However, if the page is not regular or is extremely complex, the feasibility of induction drops significantly.

A far better approach would be for content providers to actually structure the content of their Web pages for access by software other than visual browsers. Indeed, much has been written about the significance of XML or XML-like strategies for tailoring content to the end-user device. Markup languages are beginning to appear that transform content both horizontally (for specific end-user devices) and vertically (for specific applications or industries). Moreover, since XML itself can act as a translator between XML representations, the implication for application design is that it becomes much easier to tailor vertical content for a wide range of devices - financial data (in Financial Product Markup Language, FpML) for a speech output system (using Voice Extensible Markup Language, VoiceXML), or movie data to a handheld device (using Handheld Device Markup Language, HDML).

One critical point for designers and developers to consider is that XML lends some measure of syntactic rigor to content, but not necessarily semantic rigor. XML allows us to dictate how entities like software agents can talk and exchange information, but it does not validate the content of the conversation. Another point is that more work is needed (and is being done in the XML community) to refine document content description and validation beyond that offered in traditional XML DTDs.

For example, movie sites might use a common XML description for show time listings, but use a different separator character between hours and minutes. Further, European sites would tend to express "1:40 p.m." as "13.40." Content providers will ultimately be forced to adhere to unambiguous representation standards, purely as a matter of competitive pressures. Moreover, the semantics of each item must be standardized too; for example, is the show time in hours, quarters of an hour, or hours and minutes? What functions are defined for the given item; for example, comparison to a baseline time or comparison to a range of times? On the other hand, some information sources are inherently unstructured, while some are structured but variable. Agents will need to provide information retrieval capability for unstructured data.
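Until such standards exist, an agent has to normalize show times itself. The sketch below handles only the two conventions mentioned above, a 12-hour clock with an a.m./p.m. suffix and a 24-hour clock with either a colon or a period as separator; the function name and the accepted formats are assumptions made for illustration. Once both forms map to the same value, comparison to a baseline time or to a range of times becomes straightforward.

```python
import re
from datetime import time

# Hours, a ":" or "." separator, minutes, and an optional a.m./p.m. suffix.
_SHOW_TIME = re.compile(
    r"^\s*(\d{1,2})[:.](\d{2})\s*(a\.?m\.?|p\.?m\.?)?\s*$", re.IGNORECASE
)


def parse_show_time(text):
    """Convert strings such as '1:40 p.m.' or '13.40' to a datetime.time."""
    match = _SHOW_TIME.match(text)
    if match is None:
        raise ValueError(f"unrecognized show time: {text!r}")
    hour, minute = int(match.group(1)), int(match.group(2))
    suffix = (match.group(3) or "").replace(".", "").lower()
    if suffix:
        if not 1 <= hour <= 12:
            raise ValueError(f"bad 12-hour time: {text!r}")
        # 12 a.m. is hour 0 and 12 p.m. is hour 12 on the 24-hour clock.
        hour = hour % 12 + (12 if suffix == "pm" else 0)
    return time(hour, minute)  # raises ValueError if out of range


def in_band(show_time, earliest, latest):
    """True if the showing starts within the client's time band."""
    return earliest <= show_time <= latest


american = parse_show_time("1:40 p.m.")
european = parse_show_time("13.40")
```

With both listings reduced to the same `time` value, a check such as `in_band(american, time(12, 0), time(13, 30))` gives the same answer no matter which site supplied the data.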

One design approach would be to create "schema oases," where agents might go to find out what sort of questions could be asked of a given XML domain. The agent would return these questions to its home base and construct an HTML form for the human user. A refinement of this strategy would have the oasis supply both an XML DTD and style sheets appropriate to various devices. The ultimate oasis site would also list the URLs of content providers that are willing to respond to a visiting agent using the given XML schema. Although such a service does not exist, the idea may be commercially viable.

In the scenario shown in Fig. 1 (p. 42), the agent would hold its intermediate and final query results in XML form because it might not be certain what form the output will need to take. Certain activities may take time to complete. Or the user might have commissioned the search from a Web browser but currently be reachable only by wireless. The agent might use a service such as Portico, Roku or Activerse's Dingbot to find and connect to the end user. In the worst case, the agent would E-mail its results back to its client.

Having considered some development implications of content and content structuring, let us consider software agents as a design center. Customers can send out mobile agents to survey this wealth of information. As depicted in Fig. 1, an agent can act as a discovery tool, a data miner and an information integrator. An agent can find films and showings satisfying a customer's needs or it can provide alternatives approximating those needs. It can engage in a dialogue with the customer, allowing them to modify their requirements based on what they learn. Finally, it can persist and apply user preferences later.

Agents can also provide the flexibility to search according to a variety of requirements - returning only the first satisfactory film showing found, or a list of all film showings satisfying the specified requirements - and can interact with the user to modify the requirements as necessary.

An agent can provide output both to a Web site (a vote or a review, for example) and to the person it represents.

The increasing amount of information available to users on the Web, the growing number of devices that can be used to access it, and the growth of e-commerce challenge our current ways of thinking and will lead to new methods for Web design. Using a familiar example of searching for movie listings on the Web, we have considered some of the design challenges and techniques that are appropriate to the problem of finding meaning in cyberspace.