Hazelcast's Parallel Streaming Engine Targets Java/Big Data Programmers
- By John K. Waters
- February 8, 2017
In-memory data grid (IMDG) specialist Hazelcast Inc. yesterday launched a new distributed processing engine for Big Data streams. The open-source, Apache 2-licensed Hazelcast Jet is designed to process data in parallel across nodes, enabling data-intensive applications to operate in near real time.
The new processing engine is built on a one-record-at-a-time architecture, sometimes known as the continuous operator model. This approach processes each incoming record as soon as possible rather than accumulating records into micro-batches. The result is lower latency for applications, the company said.
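To make the latency difference concrete, here is a minimal sketch in plain Java. It does not use Jet's API; the class name, the method names, and the idea of measuring delay in "ticks" (one tick per arriving record) are assumptions made purely for illustration. It counts how many ticks each record waits before being processed under the two models.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Hazelcast Jet.
public class ProcessingModels {

    // Continuous operator model: every record is processed on the tick it arrives.
    public static List<Integer> recordAtATimeWaits(int recordCount) {
        List<Integer> waits = new ArrayList<>();
        for (int tick = 1; tick <= recordCount; tick++) {
            int arrivedAt = tick;
            int processedAt = tick; // no buffering step
            waits.add(processedAt - arrivedAt);
        }
        return waits;
    }

    // Micro-batching: records are buffered until the batch is full or the input ends.
    public static List<Integer> microBatchWaits(int recordCount, int batchSize) {
        List<Integer> waits = new ArrayList<>();
        List<Integer> bufferedArrivals = new ArrayList<>();
        for (int tick = 1; tick <= recordCount; tick++) {
            bufferedArrivals.add(tick);
            boolean batchFull = bufferedArrivals.size() == batchSize;
            boolean inputDone = tick == recordCount;
            if (batchFull || inputDone) {
                // The whole batch is processed now; earlier arrivals have waited longer.
                for (int arrivedAt : bufferedArrivals) {
                    waits.add(tick - arrivedAt);
                }
                bufferedArrivals.clear();
            }
        }
        return waits;
    }

    public static void main(String[] args) {
        System.out.println("record-at-a-time: " + recordAtATimeWaits(7));
        System.out.println("micro-batch of 3: " + microBatchWaits(7, 3));
    }
}
```

With seven records and a batch size of three, the record-at-a-time waits are all zero, while the micro-batch waits are [2, 1, 0, 2, 1, 0, 0]: the first record in each full batch sits in the buffer until the batch fills. A real engine would measure wall-clock time rather than ticks, but the buffering effect is the same.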
Jet's event-based architecture supports low-latency Transaction Processing System (TPS) applications. It uses directed acyclic graphs (DAGs) to model data flow, and includes a high-level java.util.stream API and a low-level Core (DAG) API; the latter allows direct manipulation of vertices representing data source readers, joiners, sorters, aggregators and data sinks. Jet ingests data at high velocity (via socket, file, HDFS or Kafka interfaces) and executes business logic or complex computations on the incoming data.
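For readers unfamiliar with the DAG approach, the following plain-Java sketch shows the general idea of modeling a data flow as vertices and edges. It is not Jet's Core API: the DataflowDag class, its methods, and the vertex names are hypothetical and use only the JDK. The sketch builds a small graph of two source readers feeding a joiner, an aggregator and a sink, then derives an order in which each vertex comes after everything upstream of it (Kahn's algorithm).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the Hazelcast Jet Core (DAG) API.
public class DataflowDag {

    // Vertex name -> names of its downstream vertices, kept in insertion order.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    public DataflowDag vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    public DataflowDag edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: emit vertices whose upstream vertices have all been emitted.
    public List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : edges.keySet()) {
            inDegree.put(v, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String t : targets) {
                inDegree.merge(t, 1, Integer::sum);
            }
        }

        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : edges.get(v)) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) {
                    ready.add(t);
                }
            }
        }
        if (order.size() != edges.size()) {
            throw new IllegalStateException("graph contains a cycle, so it is not a DAG");
        }
        return order;
    }

    public static void main(String[] args) {
        DataflowDag dag = new DataflowDag()
                .edge("kafkaSource", "joiner")
                .edge("fileSource", "joiner")
                .edge("joiner", "aggregator")
                .edge("aggregator", "sink");
        System.out.println(dag.topologicalOrder());
    }
}
```

Running main prints [kafkaSource, fileSource, joiner, aggregator, sink]. The acyclic requirement is what makes such an ordering possible; the sketch rejects a graph with a cycle rather than looping forever.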
The company is marketing Hazelcast Jet to developers of applications that require a near real-time experience, such as sensor updates in Internet of Things (IoT) architectures (think smart-house thermostats and lighting systems), as well as in-store e-commerce systems and social media platforms. Hazelcast CEO Greg Luck described Jet as "a super-fast, low-latency, next-generation DAG Engine for Big Data processing."
"We believe that the Hadoop and Spark ecosystems are too complex to program and to deploy and have set out to bring Hazelcast's legendary simplicity to Big Data," Luck said in a statement. "We have designed it as a general-purpose engine for the intersect of Big Data programmers and Java programmers. But if you are already a Hazelcast user or have data in Hazelcast, it will be the easiest way to solve your Big Data problems."
The company's namesake IMDG, also an open-source product licensed under an Apache license, allows developers to include the grid in their applications. The company also makes a commercially supported version called Hazelcast.
Hazelcast has roots planted deeply in the Java language and platform (its IMDG is written in Java), but the company has broadened its support over the years, thanks in part to the efforts of its open source community, to include clients for several other languages and platforms, including .NET/C++, Node.js, Python, Clojure and Scala.
About the Author
John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.