
Records Managers Are Suddenly Sexy

Data management should stand on its own as an umbrella effort driving all of IT’s efforts, because the essential task of IT is data management.

Two major storage players—CA and Hitachi Data Systems—made announcements this week about new products in archive-centric data management. HDS, on Monday, rolled out something called the Hitachi Content Archive Platform, while CA, on Tuesday, announced the acquisition of yet another archive-related company, MDY, located just a stone’s throw away from Islandia, NY, in Fair Lawn, NJ.

With these announcements, the vendors are seeking street credibility by referencing the pedigree of their products—not in IT but in records administration. Records management is finally being recognized as a profession and practice after laboring for years to bring order to paper files, films, microform and fiche, and optical media. There might be a trend afoot to wrap data archiving products in the flag of records-management discipline, seeking sanction from the Association of Records Managers and Administrators (ARMA) and other groups to validate product value propositions.

I wonder how much real value the records management discipline brings to actual electronic data archiving. Beyond the metaphor of object management, the truth is that electronic archiving is a much more daunting task than maintaining a room of file cabinets, trays of microfilm spools, or jukeboxes of optical media. With these items, the records manager can impose a value-add scheme of “addressing” and “tagging.”

With electronic data, however, we are forced to play the cards we are dealt. The paltry amount of metadata flowing with files to their disk drive “containers” is insufficient to adequately describe the contents of the file or its context. It is often difficult to derive from file names and metadata structures information about what application created the data or how many times the data has been accessed since the last time we checked. That is what makes the challenge of data management so daunting. I am not aware of any secret sauce that the records managers bring to resolve these issues—which goes to the soul of what’s wrong with the 60-year-old file systems we use today.
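To make the point concrete, here is a minimal Python sketch of what a typical file system will tell you about a file. The file contents and the `describe` and `demo` helpers are hypothetical, invented for illustration. The fields come from the standard `os.stat` call, and exact values vary by operating system. Note what is absent: nothing records which application created the file, what business record it represents, or how many times it has been read.

```python
import os
import tempfile
import time


def describe(path):
    """Return the metadata a typical file system keeps for one file."""
    st = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": st.st_size,
        "modified": time.ctime(st.st_mtime),
        # Only the most recent access time is kept, not an access count,
        # and many systems update it lazily or not at all.
        "last_accessed": time.ctime(st.st_atime),
        "owner_uid": st.st_uid,
        "mode": oct(st.st_mode),
    }


def demo():
    """Write a hypothetical business record to disk and describe it."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("Q3 invoice for ACME")  # invented content, for illustration
        path = f.name
    try:
        return describe(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    for key, value in demo().items():
        print(f"{key}: {value}")
```

Everything a records manager would want to "tag" (record type, retention class, originating application) has to be inferred or added by some other layer.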

HDS’ latest offering is interesting as the first step in delivering a robust content-addressing-plus-archiving solution for its customers. They have leveraged some of the technology from newcomer Archivas to provide an archive engine with content indexing. A Google-like interface can be used to search against the files you have stored in the platform.

HDS claims that the product architecture embraces the OAIS (Open Archival Information System) model—“a globally accepted standard (ISO) and utilized by records management and archival science communities for defining the requirements of a long-term archive for data preservation.” We could dwell on the meaningfulness—or lack thereof—of this model, but we won’t. OAIS defines the primary functions of any archive, which (according to HDS) are “the ingest of data and metadata, content authenticity, managing metadata and content as an object as well as providing access to the archive with a robust set of full-text index and search for retrieving content from the archive.” Pretty basic stuff, those ISO standards. The point is that HDS wants everyone to feel warm and fuzzy about the “standards compliance” of the platform. Okay, we get it.

For now, the platform itself is a standalone array, which, on the surface at least, makes it similar to EMC’s Centera platform. It also requires Fibre Channel, because, for now, IP network-attached storage modalities are “too insecure.”

Just the Beginning

Hu Yoshida, HDS CTO, says that this is just the beginning of a bigger design that will eventually leverage the company’s flagship product, TagmaStore, providing yet another storage target virtualized and managed by the company’s switched head array. It will eventually become another “storage tier” in HDS’s Tiered Storage architecture.

The differences between this approach and EMC’s Centera are numerous, covered in a competitive matrix released with the product announcement that will likely be posted somewhere on the HDS Web site. Suffice it to say that there are many more built-in bells and whistles, including integral search via Archivas, a non-disruptive maintenance and update capability, and an openness to external players to add value around the platform that is very much lacking in the bunkers of Hopkinton.

In fact, HDS announced an ISV outreach on the same day as the announcement of the platform. The “usual suspects” in the archiving space—e-mail, database and file archive and management people—have all signed on. It is reminiscent of the swarms that used to form around each big iron vendor when Fibre Channel fabrics were first being introduced: the same gaggle of switch, bridge, and software vendors showed up at every party.

The product also claims speeds-and-feeds improvements over Centera that have yet to be validated. An important question that consumers should be asking: can data be un-ingested from this platform any more readily than it can be from Centera if the HDS platform proves to be, as we say in the South, the same sort of dog, but with a different set of fleas? Time will tell, I guess.

I am not trying to be overly critical of the HDS approach. I have a friendly and ongoing debate with Yoshida regarding the right place to put the functionality associated with content addressing. HDS clearly wants the intelligence put on the array. In my view, content-addressable storage should be a software function located on a server.
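For readers unfamiliar with the idea, the following Python sketch shows content addressing done purely in software on top of an ordinary directory. It is an illustration of the general technique only, not a description of how the HDS platform or Centera works internally. The `ContentStore` class and its `put` and `get` methods are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import tempfile
from pathlib import Path


class ContentStore:
    """A toy content-addressable store backed by a plain directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        """Store data and return its address, the hash of the content."""
        address = hashlib.sha256(data).hexdigest()
        path = self.root / address
        if not path.exists():  # identical content is stored only once
            path.write_bytes(data)
        return address

    def get(self, address: str) -> bytes:
        """Fetch data by address and verify it still matches its hash."""
        data = (self.root / address).read_bytes()
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = ContentStore(tmp)
        addr = store.put(b"board minutes, final version")
        print(addr, store.get(addr))
```

Because the address is derived from the content rather than from a location, the directory in this sketch could sit on any vendor's disk, which is the attraction of keeping the function off the array.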

There are merits to both views, but clearly HDS is motivated by a desire to sell hardware and Fibre Channel, and anything that adds value to the business case sells more gear. It must also be galling to sell a customer everything HDS, yet still have to cede a lucrative archive/CAS sale to rival EMC. I have been in many shops where this is evident. Now, HDS can provide a one-stop shop. Many of their customers will be happy, if only because it streamlines their billing.

The problem remains: how do they un-ingest their data from Centera to move it to the new HDS (or anyone else’s) product? There are some tools out there, but the ISVs I’ve talked to who have them are afraid to use them. One fellow told me that he was concerned that EMC would sue anyone who tried to move data off their platform using an unlicensed hook to their archive API. According to this fellow, anyone who tried would be accused of “reverse engineering” EMC’s intellectual property.

CA’s Move

Intellectual property is what CA seems to be all about these days. They have reorganized their storage efforts to pursue three basic goals: information management, recovery management, and resource management. Most of the recent PR coming out of the company has focused on information management—as well it should.

On Tuesday, the company announced the acquisition of privately-held MDY, a records management company. Their FileSurf product discovers files distributed all around the enterprise and provides a means for centrally managing them via policy. The acquisition comes on the heels of a licensing arrangement announced earlier this month with Arkivio for their AUTO-STOR “Information Lifecycle Management” technology and the company’s acquisition last November of iLumin, the e-mail archiving firm.

I will travel to Islandia in a week or so to get a briefing on how the company plans to integrate all of these products and technologies on something other than a “brochure level.” It seems to me, after visiting the Web sites of CA, MDY, and Arkivio, that their respective product functionality claims step on each other in significant ways. Making sense of which bits and pieces from which products will prevail in a CA-branded architectural model is impossible without an in-depth discussion with Anders Lofgren, Senior Vice President, Product Management for CA's Storage Management business unit.

For now, the official press release simply states, “CA will market and support the full suite of MDY products and services, which address every aspect of records management. CA also plans to integrate MDY’s records-management technology with CA Message Manager (its re-branding of iLumin) to offer customers a solution for managing email and other business documents.”

The Bottom Line

Do the HDS or CA announcements significantly expand the solutions for information management available to consumers in the market? As things stand right now, I have my doubts, but there is good news to go with all the fuzzy marketecture: the announcements do confirm that customers are interested in improving their management of information (or else why would the vendors be so interested in filling this need?).

In the case of HDS, their announcement merely validates the case for content addressing, which I believe should ultimately be done in a completely hardware-agnostic manner, and quite possibly as a function of the server operating system. When the HDS solution moves into TagmaStore, we will be part of the way toward this goal, since any array can then be targeted as the repository for the CAS data.

I am also hopeful that HDS can expose a framework model for enabling third-party ISVs to plug their software into an information management model that none of them, by themselves, can hope to deliver. That framework is ultimately more important than all the disks in Denmark.

In this effort, however, CA might actually have an edge. I like the idea popular in Islandia that they are trying to do something meaningful to integrate all of the products in the market that claim to do “information lifecycle management” but instead simply do archive-by-data-type or data movement. As CA already proved in its integration of the many products comprising its BrightStor storage management product line, they can do framework-based integration when they put their minds to it. Maybe they can do something meaningful in the realm of information management.

The other bright spot in all of this activity is that it dispels last year’s myth that Legal would soon be running IT. That was always difficult for me to swallow; it was a corollary to the proposition that compliance would drive IT activities going forward. Data management should stand on its own as an umbrella effort driving all of IT’s efforts: the essential task of IT is data management.

If you read the releases from the vendors, the lawyers are no longer the new force in IT. Rather, it is the records-management professionals, who have largely labored in obscurity as the bottom-feeders of the bureaucracy. I suspect they have always been responsible for making businesses run, but have never received recognition. Now it's their day in the sun: enjoy the tan, but watch out for the burn.

Your comments are welcome.