MapR Adds Some Spark to its Hadoop Distribution -- ADTmag

By David Ramel
April 11, 2014

MapR Technologies Inc. yesterday announced it had added the Apache Spark technology stack to its Hadoop distribution, one of the leading tools in the fast-growing arena of Big Data analytics.

Apache Spark, recently promoted to a top-level project by the open source Apache Software Foundation, is an in-memory, distributed computing framework that improves Big Data analytics processing with Hadoop. Often used as a superior replacement for the original batch-oriented MapReduce technology, it works with newer technologies found in Hadoop 2 such as the YARN resource manager to boost processing of data in the Hadoop Distributed File System (HDFS) or any other Hadoop data store, such as HBase or Cassandra.

MapR has one of the leading Hadoop distributions, fighting with competitors such as Cloudera and Hortonworks to gain market share, secure funding and land big customers in the burgeoning Big Data market. MapR said the in-memory processing technology of Spark provides speed, easier programming and real-time processing capabilities. It's adding the Spark stack to its Hadoop distribution through a partnership with Databricks, a company founded last fall by the creators of Spark, which was originally developed at the University of California, Berkeley.

"We are now the only Hadoop distribution to support the complete Spark stack, including Spark, Spark Streaming (stream processing), Shark (Hive on Spark), MLLib (machine learning) and GraphX (graph processing)," said MapR executive Tomer Shiran in a blog post yesterday.

MapR said Spark provides two main benefits: application performance and developer productivity.

Developers are more productive, the company said, because Spark requires much less code, as little as one-fifth of what's normally needed. The simple programming abstraction it offers also lets developers use multiple languages to design batch, interactive and streaming applications that operate on data collections. Developers can use Java, Scala and Python, with support for R reportedly coming.
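To give a feel for what "operating on data collections" looks like in practice, here is a minimal sketch in plain Python. It is not Spark code and does not come from MapR or Databricks: it uses only the Python standard library to mimic the shape of a chained word-count job, and the function name `word_count` and the sample sentences are made up for illustration. The commented-out lines show roughly how the same chain might read in PySpark, assuming an existing SparkContext named `sc` and a hypothetical input path.

```python
from collections import Counter
from itertools import chain


def word_count(lines):
    """Count words by chaining operations over a collection of lines.

    This runs locally on a plain Python iterable; it only imitates the
    flatMap -> map -> reduceByKey shape of a Spark job (no Spark involved).
    """
    # Split every line into words and flatten the result (similar to flatMap).
    words = chain.from_iterable(line.split() for line in lines)
    # Normalize each word to lowercase (similar to map).
    normalized = (word.lower() for word in words)
    # Sum the occurrences per word (similar to reduceByKey with addition).
    return Counter(normalized)


# Roughly the same pipeline in PySpark, shown for comparison only.
# Not executed here; assumes a SparkContext `sc` and a hypothetical path:
#   sc.textFile("hdfs:///hypothetical/input.txt") \
#     .flatMap(lambda line: line.split()) \
#     .map(lambda word: (word.lower(), 1)) \
#     .reduceByKey(lambda a, b: a + b)

sample = ["Spark adds speed", "spark adds streaming"]
counts = word_count(sample)
print(counts["spark"], counts["adds"], counts["speed"])
```

The whole job is a handful of chained steps over a collection, which is the style MapR credits for the smaller code footprint compared with a hand-written MapReduce program.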

"It has become clear that Apache Spark offers a combination of high-performance, in-memory data processing and multiple computation models that is well suited to serving as the basis of next-generation data processing platforms," MapR quoted 451 Research analyst Matt Aslett as saying. "MapR's support for the complete Spark stack, combined with its partnership with Databricks, should give Hadoop users the confidence to start developing applications to take advantage of Spark's performance and flexibility."

MapR said the addition of Spark, which incorporates the five separate Apache open source projects listed earlier by Shiran, brings the total number of such projects featured in its Hadoop distribution to more than 20.

The company said the addition means its customers can get round-the-clock support for all Spark stack projects. It also said it will work with Databricks on a roadmap for further development and to speed up the pace of innovation. MapR claims to be the only Hadoop vendor offering a monthly release cadence for its distribution.

MapR and Databricks are conducting an April 29 webinar where developers can learn more about the benefits of using Spark in the MapR Hadoop distribution.

About the Author

David Ramel is an editor and writer at Converge 360.
