Big Data Product Watch 10/31/13: To the Cloud!

Microsoft and Amazon Web Services this week highlighted a Halloween handout of Hadoop product announcements with a common theme of enhancing cloud-based analytics.

  • Microsoft this week announced, after months of beta testing, the general availability of its Windows Azure HDInsight Hadoop distribution. Quentin Clark, head of the company's Data Platform Group, made the announcement from the Strata + Hadoop World conference in New York, where he presented a keynote address reiterating the company's goal to bring Big Data analytics to 1 billion people.

    Windows Azure HDInsight aims to help meet that goal by providing Hadoop-based analytics via integration with familiar business intelligence tools such as the Excel add-in PowerPivot and data exploration/visualization tool Power View.

    "We have put in thousands of engineering hours and tens of thousands of lines of code" to bring Hadoop to Windows in conjunction with partner Hortonworks, Clark said. Hortonworks itself last week announced the general availability of Hortonworks Data Platform (HDP) 2.0, based on the recent release of Hadoop 2. HDP 2.0 for Windows Server will be available next month, Clark said, and Microsoft will support Hadoop 2 "in a future update to HDInsight."

  • Amazon Web Services, a competitor of Microsoft's Windows Azure in the cloud computing space, this week announced its own support of Hadoop 2 as part of improvements to its Elastic MapReduce (EMR) service. EMR lets users distribute Big Data processing across a resizable cluster of Amazon Elastic Compute Cloud (EC2) instances. Other improvements to EMR include support for recent versions of HBase, Hive and Pig, as well as MapR M7 for Hadoop and HBase.

    "Amazon EMR is used in a variety of applications, including log analysis, Web indexing, data warehousing, machine learning, financial analysis, scientific simulation and bioinformatics," according to the AWS announcement.

  • Rackspace this week announced early access to its Cloud Big Data Platform, which has been in customer preview. "This cloud-based Hadoop service allows you to spin up a fully configured and optimized Apache Hadoop cluster in minutes and use popular Hadoop tools like Pig and Hive with the added flexibility and ease of use the cloud offers," the company said.

    Rackspace also announced that its "Managed Big Data Platform" is available on dedicated servers and external storage (the latter through a partnership of Hortonworks and EMC targeting the EMC Isilon device). The company also announced that Hortonworks HDP is available for use on the Rackspace Private Cloud.

  • Cloudera this week announced that its Cloudera Enterprise 5 has been released in beta. The offering bundles the company's Hadoop distribution, CDH, and its Cloudera Manager Hadoop administration tool. The Cloudera Enterprise 5 beta integrates the company's Impala real-time Hadoop query tool and its Cloudera Search tool. It includes ready-made and custom analytic functions for Impala; support for multi-tenant environments; central deployment and management of third-party applications; data protection with Hadoop Distributed File System and HBase snapshots; and easier data migration with Network File System Version 3 support. Cloudera also announced the Cloudera Connect: Cloud partnership program with initial members Verizon Enterprise Solutions, Savvis, SoftLayer and T-Systems.

In other recent Big Data happenings, Red Hat contributed its Hadoop plug-in to the Gluster open source storage community; Revolution Analytics announced its Revolution R Enterprise Big Data analytics platform powered by the R statistical computing programming language; MapR Technologies announced the integration of security and authentication capabilities with its Hadoop distribution; Dataguise enhanced its DG for Hadoop data privacy solution; Pivotal integrated its GemFire XD in-memory transactional database with its Hadoop offering, PivotalHD; and Splunk announced the general availability of its Hunk integrated analytics platform for Hadoop.

About the Author

David Ramel is an editor and writer for Converge360.