Oracle Big Data Appliance Featuring Hadoop Released -- ADTmag

By John K. Waters
January 12, 2012

Oracle this week officially released the Oracle Big Data Appliance, its new "engineered system" that tightly bundles servers and software into a unified system.

The system combines full rack configurations of Oracle Sun servers with the Cloudera distribution of the Apache Hadoop software framework, the Cloudera Manager admin and management console, and an open source distribution of the R programming language (for statistical computing and graphics).

The system is designed to run on Oracle Linux and includes the community edition of the Oracle NoSQL database as well as Oracle's HotSpot Java Virtual Machine.

Oracle's Executive Vice President of Product Development Thomas Kurian previewed the Big Data Appliance in October at his company's annual OpenWorld conference. Oracle's plan to include a NoSQL database generated a lot of buzz. NoSQL databases, the non-relational, distributed, schema-free, open-source, horizontally scalable systems that emerged around 2009, have been getting attention as the most effective databases for the Web, the cloud, and mobile computing. There are quite a few of them out there: Google, Amazon, Facebook and LinkedIn all have NoSQL databases.

But the Cloudera collaboration to create a system that makes Apache Hadoop work with Oracle's product stack is really the centerpiece of this announcement. The two companies are working together to provide support for the Big Data Appliance, Cloudera's co-founder and CEO Mike Olson said. The combination is "a natural and highly complementary fit," he said in a statement.

Palo Alto, Calif.-based Cloudera is a provider of Hadoop system management tools and support services. Its Hadoop distro, dubbed the Cloudera Distribution Including Apache Hadoop, or CDH, is a data management platform that combines a number of components, including support for the Hive and Pig languages, the Apache ZooKeeper distributed coordination service, the Flume service for collecting and aggregating log and event data, Sqoop for RDBMS integration, the Mahout library of machine learning algorithms and the Oozie server-based workflow engine, among others. CDH is available as a free download.

The Hadoop Framework is an increasingly popular, Java-based, open-source platform for data-intensive distributed computing. In a nutshell, it's a system that can analyze a large amount of data in a small amount of time by spreading the work across many machines. At its core, it combines an implementation of MapReduce, the programming model introduced by Google, with the Hadoop Distributed File System (HDFS). MapReduce is a programming model for processing and generating large data sets. It supports parallel computations over those data sets on clusters of unreliable computers. HDFS is designed to scale to petabytes of storage and to run on top of the file systems of the underlying OS.
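To make the MapReduce model a little more concrete, here is a minimal, single-process sketch of the classic word-count example in Python. It is not Hadoop code and uses no Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration. On a real cluster, the map and reduce steps would run in parallel on many machines and the framework would handle the grouping between them.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data appliance", "big data connectors", "hadoop data"]
    print(word_count(docs))
```

Because each call to map_phase looks at only one record and each call to reduce_phase looks at only one key, the work can be split across machines without the pieces needing to talk to each other, which is what makes the model a good fit for large clusters.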

Oracle is offering its Big Data Appliance in full rack configurations of 18 Sun servers. Each rack will provide 864 gigabytes of main memory, 216 CPU cores, and 10 gigabit-per-second Ethernet data center connectivity, among other features. The system scales via connections of multiple racks linked through the InfiniBand network.
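The rack-level figures above imply per-server numbers that are easy to work out. The short Python sketch below does that arithmetic and also totals memory and cores across several racks. The multi-rack totals assume capacity simply adds up rack by rack, which is an illustrative assumption rather than an Oracle-published figure, and the names (per_server, cluster_totals) are hypothetical.

```python
# Rack-level figures as quoted in the article.
RACK_SERVERS = 18
RACK_MEMORY_GB = 864
RACK_CPU_CORES = 216


def per_server(rack_total, servers=RACK_SERVERS):
    """Divide a rack-level total evenly across the servers in the rack."""
    return rack_total / servers


def cluster_totals(racks):
    """Total memory (GB) and cores for a number of racks.

    Assumes capacity scales linearly with rack count (illustrative only).
    """
    return racks * RACK_MEMORY_GB, racks * RACK_CPU_CORES


if __name__ == "__main__":
    print("Memory per server (GB):", per_server(RACK_MEMORY_GB))
    print("Cores per server:", per_server(RACK_CPU_CORES))
    print("Two racks (GB, cores):", cluster_totals(2))
```

Under those assumptions, each of the 18 servers works out to 48 gigabytes of memory and 12 CPU cores.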

In a related announcement, Oracle released its Big Data Connectors, software designed to allow users to integrate data stored in Hadoop and Oracle NoSQL databases with Oracle Database 11g. The software bundle combines the Oracle Loader for Hadoop, the Oracle Data Integrator Application Adapter for Hadoop, Oracle Connector for HDFS and Oracle Connector R.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.
