VMware Updates Big Data Extensions
VMware Inc. on Friday updated its vSphere Big Data Extensions (BDE), adding support for Hadoop 2, along with other enhancements.
The set of integrated management tools helps users of the vSphere virtualization platform deploy, run and manage Hadoop clusters with the vCenter management tool.
Friday's announcement came exactly two years after VMware announced its related open source project Serengeti to facilitate the running of Apache Hadoop in virtualized environments. BDE is the enterprise version of project Serengeti, bundled with commercial support. The software deploys Hadoop components such as Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive and HBase on the vSphere platform.
With the release of BDE 2.0, the software now supports the latest distributions of Apache Hadoop 2.0. Specifically, new distributions supported include Apache Bigtop 0.7.0, Cloudera CDH5, Hortonworks HDP 2.1, MapR 3.1 and Pivotal PHD 2.0.
Other enhancements include: the Hadoop Template Virtual Machine now uses CentOS 6.4 as its default OS; the Serengeti Management Server now supports IPv6 network addressing; new support for Internationalization Level 1; a central Web UI called the Serengeti Management Server Administration Portal for viewing, managing and troubleshooting Serengeti Services; and improved error handling.
VMware also combines BDE with its vCloud Automation Center to provide an on-premises Hadoop-as-a-Service offering.
"BDE enables customers to run clustered, scale-out Hadoop applications on the vSphere platform, delivering all the benefits of virtualization to Hadoop users," VMware said. "BDE delivers operational simplicity with an easy-to-use interface, improved utilization through compute elasticity, and a scalable and flexible Big Data platform to satisfy changing business requirements."
For the under-the-hood details about how BDE works, VMware provides the following:
BDE is a downloadable virtual appliance integrated as a plugin to vCenter server. BDE requires a vSphere 5.0 or later license and an Enterprise or Enterprise Plus license. The Serengeti virtual appliance runs on top of vSphere and includes two virtual machines: Serengeti Management Server and the Hadoop Template Server. The Serengeti Management Server handles creation of the cluster, including creation and configuration of the virtual machines and assignment of Master node and Slave node roles. Once the cluster is created, the Serengeti Management Server then clones the Hadoop template to create and scale out the cluster. Once this is complete, the Serengeti Management Server starts the Hadoop service. BDE is controlled and monitored through the vCenter server.
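The provisioning sequence described above can be sketched as a small Python model. This is an illustration only, not the Serengeti API: the class names, method names, VM naming scheme and log format are all hypothetical, and the sketch assumes one Master node per cluster. It shows just the ordering the description implies: clone the template for each node, assign roles, then start Hadoop.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    role: str  # "master" or "worker" (the article's Master and Slave node roles)
    hadoop_running: bool = False


@dataclass
class ManagementServer:
    """Hypothetical stand-in for the Serengeti Management Server."""
    template: str = "hadoop-template"  # assumed template name
    log: list = field(default_factory=list)

    def _clone(self, vm_name: str, role: str) -> Node:
        # Model cloning the Hadoop template into a new VM with a role.
        self.log.append(f"clone {self.template} -> {vm_name} ({role})")
        return Node(vm_name, role)

    def create_cluster(self, name: str, workers: int) -> list:
        # Step 1: create the VMs from the template and assign roles.
        nodes = [self._clone(f"{name}-master-0", "master")]
        nodes += [self._clone(f"{name}-worker-{i}", "worker") for i in range(workers)]
        # Step 2: start the Hadoop service only after every VM exists.
        for node in nodes:
            node.hadoop_running = True
            self.log.append(f"start hadoop on {node.name}")
        return nodes


if __name__ == "__main__":
    server = ManagementServer()
    server.create_cluster("demo", workers=2)
    for entry in server.log:
        print(entry)
```

In the real product these steps are driven through vCenter rather than called directly; the model only makes the order of operations explicit.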
vSphere is available for download as a free trial, and pricing is available from VMware.
David Ramel is the editor of Visual Studio Magazine.