Cloudera Releases SQL-on-Hadoop Tool -- ADTmag

By David Ramel
May 1, 2013

Cloudera Inc. yesterday announced the release of its interactive, real-time query engine for Apache Hadoop.

Called Impala 1.0, the open source project has been in public beta since October 2012. "With Impala, users can query data stored in HDFS and HBase directly," Cloudera said. "The framework supports all standard file and data formats available, so users can choose the format that best suits their use case, including the latest in analytics-focused columnar formats like Parquet, and can promote data sharing and reuse across all computing workloads--from batch to interactive SQL--all from a single dataset."

The company said this approach reduces the latency found in legacy data warehouse environments and does not require proprietary formats or moving datasets, as they exist with Hive, into special systems for analysis. It said Impala SQL is a subset of HiveQL, with some limitations. Impala supports common Hive interfaces such as JDBC and ODBC drivers and Hue Beeswax.

Impala doesn't use MapReduce. Cloudera said the tool is primarily designed for ad-hoc SQL queries for interactive and exploratory analysis on large datasets, while Hive and MapReduce are preferable for extremely long-running batch tasks, such as those in Extract, Transform and Load (ETL) systems. The company said it anticipates Impala also will be used when low latency is needed. It said Complex Event Processing is usually done with stream-processing systems. Impala, instead, "most closely resembles a relational database," the company said.

As an Apache-licensed open source project, the Impala platform's source code is hosted on GitHub and is open to community contributions. Impala can be downloaded via the company's own installation and configuration manager or manually.

About the Author

David Ramel is an editor and writer at Converge 360.
