Cloudera Adds SQL Workload Optimizer to Enterprise Big Data Offering

The latest update to Cloudera Inc.'s enterprise Big Data solution includes a new tool, Cloudera Navigator Optimizer, which helps offload SQL workloads to Hadoop for more efficient analytics processing.

Following a beta program, Cloudera Navigator Optimizer is now a component of the company's enterprise offerings, including Cloudera Enterprise 5.8, which was released today.

The multi-faceted Navigator Optimizer "makes it easy to move the right workloads to Hadoop, and actively manages data to take advantage of Hadoop's benefits," its site says.

When introduced last November, it was described by Cloudera as "a unique, workload optimization tool (available via SaaS) that helps DBAs, data warehouse architects, and data analysts adopt an informed and systematic approach to getting optimal results with Hadoop. In summary, Cloudera Navigator Optimizer profiles and analyzes the SQL text in large, complex SQL workloads so users can gain an in-depth understanding of their workloads, identify queries best-suited for Hadoop and modify them as needed for optimal efficiency on Hadoop -- all via an easy-to-use Web UI."
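To make the idea of profiling SQL text concrete, here is a minimal Python sketch of one common workload-analysis technique: reducing each query to a literal-free "fingerprint" and counting how often each query shape recurs. It is an illustration only and does not represent how Cloudera Navigator Optimizer works internally. The function names `fingerprint` and `profile` and the sample queries are hypothetical.

```python
import re
from collections import Counter


def fingerprint(sql: str) -> str:
    """Normalize a query so variants that differ only in literals match.

    Hypothetical helper for illustration; not part of any Cloudera API.
    """
    s = sql.strip().rstrip(";")
    s = re.sub(r"'(?:[^']|'')*'", "?", s)      # replace string literals
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "?", s)   # replace numeric literals
    s = re.sub(r"\s+", " ", s)                 # collapse whitespace
    return s.lower()


def profile(queries):
    """Count how often each normalized query shape appears in a workload."""
    return Counter(fingerprint(q) for q in queries)


# Made-up sample workload, for demonstration only.
workload = [
    "SELECT name FROM users WHERE id = 42;",
    "select name from users where id = 7",
    "SELECT COUNT(*) FROM orders WHERE status = 'shipped'",
]

for shape, count in profile(workload).most_common():
    print(count, shape)
```

In this sketch the first two queries collapse into a single shape, so the most frequent patterns in a large workload surface first. A real tool would likely parse the SQL properly and weigh further factors beyond frequency.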

Another enhancement noted by Cloudera in its updated enterprise solution is cloud-native, high-performance analytics added to its Impala component (updated to v2.6), a massively parallel processing (MPP) SQL query engine. "Impala now supports Amazon S3 -- joining Hive and Apache Spark -- to enable analysts to get instant insights from all their data, including data stored in cloud environments," the company said.

Cloudera Navigator Optimizer (source: Cloudera)

The SQL development experience has also been improved with a redesigned Hue (Hadoop User Experience), an open source Web GUI SQL editor for interacting with Hadoop. Cloudera said, "Within the open-source Hue interface, SQL developers can easily explore and discover available tables through quickviews; quickly design queries with autocomplete suggestions; and get immediate assistance for debugging queries before they run for efficient troubleshooting. In addition, queries and results can be seamlessly shared with other users and groups, with the ability to directly set access permissions on results for trusted security."

Cloudera Enterprise (source: Cloudera)

Other "what's new" items in Cloudera Enterprise include:

  • Hive jobs can now be assigned to specific YARN resource pools based on Sentry policy (instead of the default "hive" pool).
  • Apache Sentry support for Amazon S3.
  • Role-based access control (Sentry) support for Impala and Hive queries over Amazon S3.
  • Sentry support for Cloudera Search.
  • Hive metadata purge supported in Cloudera Navigator (for larger deployments).
  • New policy support for managed metadata assignment.
  • Navigator SDK support for Navigator Optimizer integration.
  • Debian 8.2 support.
  • Oracle JDK 1.8u74 and 1.8u91 support.
  • Impala queries now 3x faster on Kerberized clusters.
  • Significant performance improvements to Hive metadata replication in Backup and Disaster Recovery (BDR).

The 100 percent open source Cloudera Enterprise 5.8 can be downloaded now. Note that the download points to CDH 5.8.0, with CDH standing for Cloudera Distribution Including Apache Hadoop. Cloudera Enterprise comprises CDH 5.8, Cloudera Manager 5.8 and Cloudera Navigator 2.7.

About the Author

David Ramel is an editor and writer for Converge360.