Cloud Computing Leaving Relational Databases Behind
- By Joab Jackson
- September 22, 2008
One thing you won't find underlying a cloud computing initiative is a relational
database. And this is no accident: Relational databases are ill-suited for use
within cloud computing environments, argued Geir Magnusson, vice president of
engineering at 10Gen, an on-demand platform service provider.
Magnusson, who also helped write the Apache Geronimo application server software,
spoke at the O'Reilly Web 2.0 conference, being held this week in New York.
"Cloud computing is different kind of technology," he said. "It
is different enough it will change how we do things as developers. We will have
to re-examine how we build things."
During his talk, Magnusson listed a number of new databases created specifically
to work in a cloud computing environment. They include Google's Bigtable, Amazon's
SimpleDB, 10Gen's own Mongo, AppJet's AppJet database and Oracle's open-source
Berkeley DB.
None of these databases, Magnusson pointed out, is relational. (He did
note one exception: Drizzle, a version of MySQL tweaked for Web
environments.)
These databases all have characteristics that make them uniquely suited to
serving cloud computing-styled applications. Most of these databases can be
run in distributed environments -- meaning that they can be spread out over
multiple servers in multiple locations. None of them are transactional in nature.
And they all sacrifice some advanced querying capability for faster performance.
(In many cases, these databases can be queried using object calls, which
programmers are far more comfortable working with anyway, rather than SQL queries.)
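To make the idea of querying through object calls concrete, here is a minimal sketch in Python. The class TinyDocumentStore and its insert and find methods are hypothetical names invented for illustration; they are not the API of Bigtable, SimpleDB, Mongo or any other database named in this article, and the store lives entirely in memory.

```python
from typing import Any, Dict, List


class TinyDocumentStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self) -> None:
        self._docs: List[Dict[str, Any]] = []

    def insert(self, doc: Dict[str, Any]) -> None:
        # Store a copy so later changes by the caller do not alter stored data.
        self._docs.append(dict(doc))

    def find(self, **criteria: Any) -> List[Dict[str, Any]]:
        # Return every document whose fields equal all of the given criteria.
        return [
            dict(d) for d in self._docs
            if all(d.get(k) == v for k, v in criteria.items())
        ]


store = TinyDocumentStore()
store.insert({"name": "ada", "city": "London"})
store.insert({"name": "grace", "city": "New York"})
store.insert({"name": "linus", "city": "New York"})

# Roughly what "SELECT * FROM users WHERE city = 'New York'" expresses in SQL.
new_yorkers = store.find(city="New York")
```

The trade-off described above shows up even at this scale: simple equality lookups are a single method call, but there is no join, no aggregate and no transaction.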
Although very large relational databases, such as those offered by Oracle,
have been implemented in data centers, cloud computing requires a different
kind of setup to operate to its full potential. It requires that the data
be spread across different locations -- hence the name cloud computing.
Executing complex queries across vast geographic distances can slow response
time; moreover, it is difficult to design and maintain an architecture to replicate
relational data across different locations and keep that data in sync if one
location goes down.
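One common way to spread data across servers is to partition it by key, so that each lookup touches a single machine. The Python sketch below illustrates that general technique under stated assumptions; the article does not say which scheme any of the databases above actually uses. The node names and the node_for_key, put and get helpers are hypothetical, and plain dictionaries stand in for remote servers.

```python
import hashlib
from typing import Dict, List


def node_for_key(key: str, nodes: List[str]) -> str:
    # Use a stable hash (not Python's per-process randomized hash()) so that
    # every client maps the same key to the same node.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical node names standing in for servers in different locations.
NODES = ["us-east", "eu-west", "asia-south"]

# One dictionary per node plays the role of that node's local storage.
storage: Dict[str, Dict[str, str]] = {node: {} for node in NODES}


def put(key: str, value: str) -> None:
    storage[node_for_key(key, NODES)][key] = value


def get(key: str) -> str:
    # A lookup by key touches exactly one node; no cross-node join is needed.
    return storage[node_for_key(key, NODES)][key]


put("user:1", "ada")
put("user:2", "grace")
```

A query that must combine records held on different nodes has no such shortcut, which is one reason complex relational queries fit this layout poorly. The sketch also omits replication, so it does not address keeping copies in sync when a location goes down.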
"The scale out of [cloud] architectures have properties that are different
from the ones we work on," Magnusson said. As a result, he added, in cloud
environments "no one is doing relational. Data is being targeted in a
clustered fashion."
Magnusson's view was echoed by another speaker at the Web 2.0 conference, Alex
Iskold of AdaptiveBlue, a consumer-oriented company that offers a browser plug-in
featuring personalized recommendations based on a user's history, using semantic
tags and Web services. The company built the service on Amazon's hosted platform
services, including SimpleDB. Iskold noted that such a service would not scale
up to widespread use if AdaptiveBlue used a relational database for the job.
About the Author
Joab Jackson is the chief technology editor of Government Computing News (GCN.com).