Pivotal Open Sources SQL-Based HAWQ Big Data Technology -- ADTmag

By David Ramel
September 30, 2015

Pivotal Software Inc. has open sourced its SQL-based HAWQ analytics engine for Big Data processing. The company contributed HAWQ and MADlib, an associated parallel machine learning library that was already an open source project, to the Apache Software Foundation (ASF).

The move follows the open sourcing in February of Pivotal's Greenplum massively parallel processing data warehouse, which is based on PostgreSQL technology. At the time, the company said it was contributing that and other core components of its Big Data Suite to the community.

Pivotal, which is known for its Cloud Foundry Platform-as-a-Service (PaaS) and was formed in 2013 as a joint "spin-out" of parent EMC Corp. and its VMware Inc. subsidiary, has now followed through on that commitment.

"What this means for Hadoop, its users, and the larger community is that there is now a full-featured, SQL standards-compliant, battle-tested and proven interactive SQL engine purpose built for demanding analytical workloads and business transformation available in open source," Pivotal's Gavin Sherry said in a blog post yesterday.

Apache HAWQ will be able to seamlessly execute parallel MADlib machine-learning algorithms as a native application running on any Hadoop cluster based on Pivotal's Hadoop distribution (Pivotal HD), the Hortonworks Data Platform (HDP) from Hortonworks Inc. or the upcoming ODPi core. The latter, it was announced this week, is an open collaborative project to be stewarded by The Linux Foundation to provide a Hadoop-based common reference platform and set of technologies. It stems from the Open Data Platform announced in February by Pivotal and a host of other Big Data players, a movement subsequently blasted by major Hadoop distributor MapR Technologies Inc.

In addition to partnering up with MapR competitor Hortonworks, Pivotal is also teaming up with Altiscale Inc., which yesterday announced the future availability of HAWQ on its Altiscale Data Cloud.

"SQL on Hadoop has been called 'Hadoop’s killer app' by industry analyst Mike Gualtieri of Forrester," Altiscale said in a blog post. "The combination of a robust, massively parallel processing (MPP) SQL engine with the high-performing and scalable Altiscale Data Cloud ensures that enterprise customers can quickly make the most of their Big Data, accessing vast amounts of business information in Altiscale and applying it to pressing business issues using HAWQ."

Pivotal's Sherry said the decision to contribute the company's technology to the ASF was made in light of a radical transformation of the database industry in the past 10 years, including the mainstreaming of the open source movement and the rise of enterprise mobility and Internet of Things (IoT) workloads. Another contributing factor is that startups and Internet companies are taking a leading role in database research, usurping traditional vendors (several industry pundits characterized Pivotal's move as a direct shot across the bow of traditional database incumbent Oracle Corp.).

The need for tools to work with Hadoop as the inevitable substrate of new-age data warehousing and analytics is "bigger than Pivotal, or Pivotal's customers," Sherry said.

"We feel that by contributing HAWQ and MADlib to the ASF, making them bigger than Pivotal, and continuing to integrate them deeply into the Hadoop ecosystem is a first big step toward building not only a Hadoop native SQL engine, but ultimately an entire Hadoop native, datacenter-class, high-performance analytic database infrastructure," Sherry said.

Apache HAWQ and Apache MADlib will be ASF "incubating" projects, a first step toward being accepted as top-level projects after meeting certain governance criteria.

About the Author

David Ramel is an editor and writer at Converge 360.
