Getting Started on Data Management

The only way to solve the seemingly intractable problems of storage management is to do so strategically.

Next week in St. Louis I will be completing a series of four presentations, sponsored by Computer Associates, covering storage management. The talks have been well-attended and well-received, and may be extended to more cities in a couple of months. I hope they are, because the feedback from storage managers, IT managers, and CIOs regarding their storage challenges has been illuminating.

For those unable to attend the events, I want to share a story I’ve been telling and the responses from those who joined the conversation in Raleigh, Minneapolis, and Vancouver.

My objective has been to discuss what I call the new storage management. The new storage management is really data management, an umbrella concept covering the placement of data on the right hosting platform and making sure that it has appropriate protection based on the characteristics of the data itself, which are largely inherited from the application that creates the data in the first place and the business process that it serves.

My contention is that unless we start getting serious about data management, understanding our data and allocating storage resources and services to it in an intelligent manner, storage costs will soon become excessive for most companies. In addition, I think that the industry is not going to deliver data management, partly because doing so is not in the best interests of vendors who prefer to sell “stovepipe solutions,” but also because it may be impossible for the hardware folks to do much to clean up the customer’s “junk drawers” of storage. We are mostly left to our own devices to develop a managed data environment.

From what I've heard in conversations with IT managers, CIOs, and others at these presentations, most understand the importance of data management. With budgets under close scrutiny and hiring freezes in place, they are being told to increase service levels even as their spending is cut. Data keeps growing, but IT’s ability to throw more unmanaged storage capacity at the problem, supplemented with more techs and administrators to keep the gear up and running, has diminished.

For the most part, FC SANs haven’t fixed the problem for these managers, nor have consolidation strategies based on bigger arrays or any of the other panaceas offered by their trusted vendors. If anything, the problems have become worse. Now, instead of one junk drawer, they have a cabinet full of junk drawers. Applications are writing data everywhere, though specific locations are masked from view by vendor-supplied virtualization schemes. Backup is a nightmare: critical and non-essential data are mixed together so inextricably that identifying the specific data that needs to be backed up becomes impossible.

A few have started down the management-by-data-type or management-by-archive path. In November, CA bought an e-mail archive vendor, Ilumin. E-mail is one of at least four types of data that most companies have in abundance, the others being databases, workflow data (the output of content management systems), and, of course, “unstructured data,” a tactful, if technically inaccurate, reference to user files. The thinking goes that if we continuously identify data of each type that isn’t being referenced, we can archive it and get it off our expensive disk.

This is another tactical measure for coping with the data explosion. It will help to buy time, but it will not solve the ultimate problem of data management: managing data itself. Over time, archives can become unwieldy and many schemes proffered by the industry today, especially those wedded to dedicated hardware, are patent vendor lock-in strategies.

Moreover, there is no manager of managers overseeing the multiplicity of archives being generated. UK-based BridgeHead Software is working on building such a central manager. They work with a variety of partners, including GridTools for database archiving, and others for e-mail and workflow data, and provide a central metadata repository to help track where everything goes and the policies that place it there. It’s a start.

Combined with an intelligent and cooperative effort by IT and business managers to set limits on monthly storage allocations, adopt chargeback strategies that check excessive use, and provide safety valves (such as the unlimited availability of tape), these archiving measures can buy companies time to implement a real data management strategy. Such a strategy is keyed to an intelligent analysis of each business process and the data it creates. It provides a naming scheme for data and then uses that naming scheme in an automated, service-oriented, policy-based management methodology that is applied when data is first created and captured to media and that continues until the data has finished its useful life.
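For readers who think better in code, here is a minimal sketch of what a naming scheme tied to policy might look like. Everything in it is an assumption made for illustration: the "process.data-type" naming convention, the class names, the platform and protection labels, and the retention periods are hypothetical and do not describe any vendor's product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Policy:
    platform: str        # where data of this class should be hosted
    protection: str      # how data of this class is protected
    retention_days: int  # useful life, counted from the creation date


# Hypothetical policy table keyed by "<business process>.<data type>" names.
POLICIES = {
    "order-entry.database": Policy("primary-disk", "mirror+daily-backup", 7 * 365),
    "corporate.email": Policy("archive-disk", "weekly-backup", 3 * 365),
    "general.user-files": Policy("tape", "none", 365),
}


def data_class(process: str, data_type: str) -> str:
    """Build the name that ties data to the business process it serves."""
    return f"{process}.{data_type}"


def policy_for(name: str) -> Policy:
    """Look up the policy for a named data class (KeyError if unnamed)."""
    return POLICIES[name]


def is_expired(name: str, created: date, today: date) -> bool:
    """True once the data has outlived the retention period of its class."""
    limit = created + timedelta(days=policy_for(name).retention_days)
    return today > limit
```

The point of the sketch is that the name is assigned when the data is created, and every later decision (placement, protection, retirement) is a lookup against that name rather than a guess about what is sitting in the junk drawer.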

We need to buy time for several pragmatic reasons. First, no one has the budget to rip and replace infrastructure and start over. We will need to evolve to the right storage platforms and processes as part of the natural evolution and replacement of our infrastructure. Because we retain storage hardware between three and seven years (depending on the company), it will take at least that long to get all of our infrastructure components purpose-built to support our applications, their data, and our management scheme. We will need to replace our SANs that really aren’t SANs with real IP network-based storage. We will need to adopt an architecture of distributed storage with centralized management. And we will need to encourage the development of the software tools that we lack today for managing resources, services, and policies.

In the final analysis, the only way to solve the seemingly intractable problems of storage management is to do so strategically. Tactical measures will only take us so far.

Your views are welcomed. Write to [email protected].

About the Author

Jon William Toigo is chairman of The Data Management Institute, the CEO of data management consulting and research firm Toigo Partners International, as well as a contributing editor to Enterprise Systems and its Storage Strategies columnist. Mr. Toigo is the author of 14 books, including Disaster Recovery Planning, 3rd Edition, and The Holy Grail of Network Storage Management, both from Prentice Hall.