At LinuxWorld Expo: Amazon moving to Linux for data warehousing

A keynote presentation at LinuxWorld Expo in New York last week provided a look inside one of the biggest Linux shops going: Amazon.com, king of all things e-commerce. The Amazon operation, as described by speaker Tom Killalea, relies increasingly on Linux boxes to deliver its online product. Data warehousing software will be next to make the move to Linux at Amazon.

The incursion of Linux is so extensive that, at this point, it is easier to identify what does not run on Linux than to enumerate what does, said Killalea, vice president of infrastructure at Amazon.com.

Killalea outlined the key steps in the company's full-scale move to Linux. In 2000, HTTP server farms made the transition. In 2001, remaining commercial app servers moved to Linux. Put another way, server load balancing was the first to go to Linux, followed by active/standby fault-tolerant clusters and then distributed message queuing systems. DB servers are now getting the Linux treatment, in a migration that began last year and is to be completed early this year.

"We want low-maintenance overhead with interchangeable parts," said Killalea, who added that Amazon is endeavoring to take a new approach to the usual technology tradeoff "between fast, reliable and cheap." You usually have to pick one or two of those traits, he explained, but the effort now is to gain all three; Linux is an enabler in that effort.

Still ahead are the Linux-based data warehouse systems. This is an "interesting case," he quipped. As is appropriate for this mega-site, the data warehouse is significant. Killalea said it was "more than 14 Terabytes to start." Here, throughput is a special area of interest for Amazon, which specializes in personalized site presentation.

"Personalization is hard because you have to optimize for relevance -- it has to be very targeted -- and it has to be [achieved with] low latency," he added. "Prior to this initiative, data warehousing on Linux has been very small scale."

Of course, technology practitioners are not interested in Amazon.com simply because of its Linux usage. The firm has grown into a company with yearly revenue of $3.9 billion through its distinctive approach to applying technology and the Web to business purposes.

Amazon's push, as described by Killalea, is by no means totally reliant on standard or off-the-shelf software. Custom logic servers are a major part of the operation. The company is serving not just consumers, but other merchants as well. For example, NBA.com and Target are among those using Amazon's technology to power their own online stores. "That all runs on Linux," he said. Developers, too, are a focus.

"We're not just a Web site, we are an application," said Killalea, who noted that Amazon's Web Services SDK has "given developers tools to access features of the Amazon.com platform." He estimated that the SDK has been downloaded more than 50,000 times.

About the Author

Jack Vaughan is former Editor-at-Large at Application Development Trends magazine.