Talend Touts First Native Spark Support in Big Data Integration Platform
- By David Ramel
- October 5, 2015
Talend today said an update of its Big Data integration platform makes it the first such offering to include native support for Apache Spark and Spark Streaming.
Spark is an open source project whose updated approach to Big Data processing -- especially real-time streaming analytics -- has made it an industry favorite. It has laid claim to being the most popular open source Big Data project and has invited popularity comparisons to Apache Hadoop itself.
Talend said in a statement today that the addition of more than 100 Spark components to Talend 6 increases data processing speed and lets users more easily convert streaming Big Data or Internet of Things (IoT) sensor information to business insights that can be acted upon immediately.
"For existing Talend customers, the conversion of MapReduce jobs (the old way of doing things in Hadoop) to Spark are accomplished at the click of a button and result in immediate 5x performance increase," Talend's Ashley Stirrup said in a recent blog post. "Developer productivity is up 10x when compared to hand coding thanks to an intuitive design interface and prebuilt Spark components with automated Spark code generation. Talend 6 also provides a built-in Lambda architecture that provides a single environment for working with bulk and batch, real-time, streaming and IoT data."
Other enhancements include automated build, test and release processes that support better continuous delivery, an approach in which development, testing, deployment and operations all work together to deliver applications. Also, Talend said, Master Data Management (MDM) REST API and query language support are improved in the update, which helps developers provide an encompassing view of customers within apps. "This means that customer insight gleaned from both traditional and new Big Data sources like Web logs, social media, mobile and SaaS applications can be leveraged in real-time," the company said.
Furthermore, the company said, security is improved through data masking, which helps conceal personal data as it is shared for analytics. In addition, a semantic analyzer provides data-type auto-discovery and Hadoop Distributed File System (HDFS) file profiling to help users better understand their data.
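For readers unfamiliar with the idea, data masking can be illustrated with a minimal Python sketch. This is not Talend's implementation and does not use any Talend component. The function names (`mask_email`, `pseudonymize`, `mask_record`), the record fields and the salt value are all hypothetical, chosen only to show the general technique of hiding personal values while keeping records usable for analytics.

```python
import hashlib


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        # Not a recognizable address: hide every character.
        return "*" * len(email)
    return local[0] + "*" * (len(local) - 1) + "@" + domain


def pseudonymize(value: str, salt: str) -> str:
    """Replace a value with a truncated salted SHA-256 digest.

    Equal inputs map to equal outputs, so masked records can still be
    grouped or joined without exposing the original identifier.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]


def mask_record(record: dict, salt: str) -> dict:
    """Return a copy of the record with the personal fields masked."""
    masked = dict(record)  # leave the caller's record untouched
    masked["email"] = mask_email(record["email"])
    masked["customer_id"] = pseudonymize(record["customer_id"], salt)
    return masked


# Hypothetical record; only the personal fields change.
record = {"customer_id": "C-1001", "email": "alice@example.com", "spend": 42.5}
print(mask_record(record, salt="demo-salt"))
```

In this sketch the non-personal `spend` field passes through unchanged, which is the point of masking: analysts can still aggregate on it without seeing who the customer is.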
"We believe Apache Spark has an opportunity to become the default in-memory engine for high-performance data integration and analytics," Talend quoted 451 Research analyst Matt Aslett as saying. "Building on its existing capabilities for in-Hadoop MapReduce processing with early native Spark and Spark Streaming support, Talend is positioned to capitalize on the demand for real-time analytics."
Talend said its free open source products based on version 6 of its platform are available immediately, while its commercial offerings are available as version 6.0.1 for download and are also being rolled out to subscription customers.
David Ramel is an editor and writer for Converge360.