In-Depth

A Database Archiving Solution Like No Other

Though StorHouse is principally a data warehousing system, its stronger value may be as a database archiving solution.

FileTek is a warhorse of a company. For nearly 20 years, the software development house has steadily improved its data management offerings and made quiet inroads into some of the world’s most prestigious companies. Even so, the company may have escaped your attention, mainly because it has focused on product development and customer support more than on marketing.

I first encountered FileTek when looking for a solution to archive output from the mainframes of a mortgage banking company. FileTek was one of the pioneers of "computer output to laser disc" (COLD) technology in the 70s and 80s and later created technology for data warehousing that was also ahead of the market. It wasn’t that the company was striving to be on the cutting edge; it was simply listening to what its customers were telling it and developing products to meet those needs.

I was thinking about this when the taxi pulled up at FileTek's headquarters in Rockville, MD, just outside the Washington, D.C. beltway. It seemed like old times.

I also wondered about the change of management that had just been announced at the company. William Loomis, who had moved up through the ranks after joining FileTek over a decade ago, had just been tapped as the company’s chief executive officer. At the same time, company co-founder and retiring CEO, William Thompson, went outside the company to bring in a firebrand named Phil Pascarelli to serve as FileTek’s new president.

I later learned that Loomis will serve as the "inside man," managing corporate operations and finances, while Pascarelli will act as the "outside man," tasked with developing new markets and new vendor ecosystems and with steering a growing number of features and functions into the company’s core product: StorHouse.

StorHouse is interesting. While categorized by its developers as a data warehousing system that enables organizations to copy (or cut) subsets of their databases into more manageable chunks for use by data mining and business intelligence tools, it is in fact an extraordinarily versatile and powerful code set that can be applied readily to database archiving. After visiting with the folks at FileTek, I think you are going to be hearing a lot more about this other use.

StorHouse is a fine data warehousing tool and has established itself on a price/performance basis as a worthy competitor to the leaders in that space, including data warehousing appliances such as Netezza. If you need a warehousing solution, by all means take a look at StorHouse.

The Archiving Angle

That said, StorHouse also presents enormous value as a technology for archiving databases. We have written about other products and how useful they can be in carving out rarely referenced data from bloated databases, thereby stalling the rate of new storage investments and improving the efficacy of backup processes. However, I have never seen a product like StorHouse in the database archiving space.

Why? I suppose it is because StorHouse was not originally designed as a database archiving tool. It doesn’t have that "utility software" feel about it. The data extracted and stored in StorHouse has all the bells and whistles of a full-blown relational database management system: the product was designed, after all, to provide an easy-to-use data mart that business people can hammer against with their BI queries.

With this in mind, I can think of about a dozen of my clients who would do well to give StorHouse a look. Some would find it interesting for what it can do to make their unwieldy, never-archived databases more manageable. You simply note the parts of the database that you wish to replicate or cut from the existing database and point StorHouse toward the storage repository where you want the output to go. The product uses a well-honed and well-understood FTP-based data mover to do the job, and the resulting archive or data mart is a read-only database you can query.
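For readers who want to picture the "cut" operation, here is a minimal sketch of the general idea using Python's built-in sqlite3 module. It is not StorHouse's interface, which this column does not describe; the `orders` table, its columns, and the `carve_out` function are hypothetical names chosen only for illustration.

```python
import sqlite3

def carve_out(primary: sqlite3.Connection,
              archive: sqlite3.Connection,
              cutoff: str) -> int:
    """Move rows dated before `cutoff` (ISO date string) from the primary
    database into a separate archive database. Returns the rows moved.

    Hypothetical schema: orders(id, order_date, amount).
    """
    # Read the rarely referenced rows from the primary database.
    rows = primary.execute(
        "SELECT id, order_date, amount FROM orders WHERE order_date < ?",
        (cutoff,),
    ).fetchall()

    # Write them to the archive and commit before touching the primary,
    # so a failure here leaves the source data intact.
    with archive:
        archive.execute(
            "CREATE TABLE IF NOT EXISTS orders "
            "(id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
        )
        archive.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

    # Only now remove the archived rows from the primary database.
    with primary:
        primary.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))

    return len(rows)
```

Run once a month or twice a year with a new cutoff date, a routine like this keeps the primary table small while the archive accumulates the older rows and stays queryable with ordinary SQL.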

If you use StorHouse this way, say, once a month or twice a year to carve out older data, you can still access all of the extracted data sets and run queries across them using standard calls. ODBC is supported on the standards side, and SAP and Oracle are fully supported on the proprietary software side.

Other prospective users might want to take advantage of StorHouse to defer the purchase of expensive data warehousing appliances. Why spend $750,000 on an appliance with its own dedicated storage, when you could deploy StorHouse at a much lower price and use whatever legacy storage you already have?

Several of my clients might be interested in this product because of the way it enables multiple data warehouses from multiple (and disparate) databases to communicate with each other in a normalized manner. By using StorHouse to create the data marts, you also enable StorHouse to readily handle the extension of queries across multiple marts. I can see where this would come in handy following mergers and acquisitions.
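To make the idea of one query spanning several marts concrete, here is a rough sketch, again using Python's sqlite3 module as a stand-in rather than StorHouse itself. The functions `attach_extracts` and `total_by_customer`, the aliases, and the `orders(customer, amount)` layout are all assumptions made for the example.

```python
import sqlite3

def attach_extracts(paths: dict) -> sqlite3.Connection:
    """Open one connection with every extract attached under its alias.

    `paths` maps an alias (a trusted SQL identifier) to a database file.
    """
    conn = sqlite3.connect(":memory:")
    for alias, path in paths.items():
        # The alias is interpolated as an identifier, so it must come
        # from trusted code, never from user input.
        conn.execute(f"ATTACH DATABASE ? AS {alias}", (path,))
    return conn

def total_by_customer(conn: sqlite3.Connection, aliases: list) -> list:
    """Sum order amounts per customer across all attached extracts.

    Assumes each extract holds a hypothetical orders(customer, amount) table.
    """
    union = " UNION ALL ".join(
        f"SELECT customer, amount FROM {alias}.orders" for alias in aliases
    )
    sql = (
        f"SELECT customer, SUM(amount) FROM ({union}) "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(sql).fetchall()
```

The caller sees a single result set even though the rows live in separate extracts, which is the property that matters after a merger, when the marts come from different source systems.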

Loomis and Pascarelli are being handed a best-of-breed product based on a stable code set. Their job now is to increase the visibility of StorHouse in the market, and to make sure that the messaging aligns with what companies understand their needs to be. Data marting and mining still have a following, particularly in retail, oil and gas, telecommunications, and other industries that crunch numbers for a living. However, the database archiving slant for the technology is potentially much more saleable (and lucrative) across a broad range of markets.

Virtually every company is seeking ways to optimize storage infrastructure and data protection investments. Taking data with a low probability of re-reference out of primary storage and placing it on cheaper storage is a common goal in the companies I visit. StorHouse provides a mechanism for doing this in the database world.

FileTek has a daughter company, Clearview, which handles fixed-content archiving. From where I’m sitting, if Loomis and Pascarelli combine the two products under a common manager of managers, they will be well on their way to having a general-purpose, cross-type data-archiving solution.

I'll be watching this story unfold, and you should be, too. Your comments are welcomed: [email protected].