Apache Advances Open Source Lens Big Data Platform

The Apache Software Foundation (ASF) has advanced the open source Lens project for unified Big Data analytics, providing a single view of multiple tiered data sources.

The project addresses the common problem of data silos: disparate data sources that typically require a lot of plumbing before a single query can run across all of them. A metadata layer abstracts multiple data stores into a single view, giving unified queries an optimal execution environment.

"By providing an online analytical processing (OLAP) model on top of data, Lens seamlessly integrates Apache Hadoop with traditional data warehouses to appear as one," the ASF said in a statement yesterday. "It also provides query history and statistics for queries running in the system along with query life cycle management."

Lens uses a single shared schema server based on the Hive Metastore, which is shared by data pipelines (HCatalog) and analytics applications, the project's site states.

The ASF graduated Lens to a top-level project, meaning "the project's community and products have been well-governed under the ASF's meritocratic process and principles."

Figure: Apache Lens Architecture (source: Apache Software Foundation)

The project includes the following features, the ASF said:
  • OLAP Cube QL, a high-level SQL-like language to query and describe data sets organized in data cubes.
  • A JDBC driver and Java client libraries to issue queries, along with a command-line interface (CLI) for ad hoc queries.
  • Lens application server, a REST server that lets users query data, make schema changes, schedule queries and enforce query quota limits.
  • A driver-based architecture that facilitates plugging in reporting systems such as Hive, columnar data warehouses, Redshift and more.
  • Cost-based engine selection, which makes optimal use of resources by picking the best execution engine for each query based on its cost.
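
To make the driver-based architecture and cost-based engine selection described above more concrete, here is a minimal sketch in Python. It is not the Lens API: the `Driver` class, the `select_driver` function, the engine names and the cost numbers are all hypothetical, chosen only to illustrate the idea of asking each pluggable engine for a cost and running the query on the cheapest one that can handle it.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Driver:
    """Hypothetical stand-in for a pluggable execution engine (not a Lens class)."""
    name: str
    # Returns a cost for the query, or None if this driver cannot run it.
    estimate_cost: Callable[[str], Optional[float]]


def select_driver(query: str, drivers: Sequence[Driver]) -> Driver:
    """Return the driver reporting the lowest cost for the given query."""
    candidates = []
    for driver in drivers:
        cost = driver.estimate_cost(query)
        if cost is not None:  # skip engines that cannot serve this query
            candidates.append((cost, driver))
    if not candidates:
        raise ValueError("no driver can execute this query")
    return min(candidates, key=lambda pair: pair[0])[1]


# Illustrative cost functions only (assumed values, not measured ones).
hive = Driver("hive", lambda q: 100.0)
warehouse = Driver(
    "columnar-warehouse",
    lambda q: 10.0 if "recent_sales" in q else None,
)

chosen = select_driver("select sum(amount) from recent_sales", [hive, warehouse])
print(chosen.name)
```

In this sketch the warehouse is chosen for queries it can serve and the general-purpose engine is the fallback for everything else. A real system would presumably derive its costs from data statistics rather than fixed constants.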

"Incubating Apache Lens has been an amazing experience at the ASF," said project exec Amareshwari Sriramadasu in yesterday's statement. "Apache Lens solves a very critical problem in Big Data analytics space with respect to end users. It enables business users, analysts, data scientists, developers and other users to do complex analysis with ease, without knowing the underlying data layout."

According to the project's GitHub site, Lens has been developed by 19 master contributors since May 2013, led primarily by Sriramadasu, Jaideep Dhok and Rajat Khandelwal, who work at InMobi, which provides a performance-based mobile ad network.

About the Author

David Ramel is an editor and writer for Converge360.