Big Data Product Watch 8/28/15: Streaming Analytics, High-Performance Computing and More -- ADTmag

By David Ramel
August 28, 2015

IBM, Intel and Hortonworks are among a host of Big Data players making news recently with streaming analytics, high-performance computing and open source options galore.

IBM updated its Open Platform for Apache Hadoop, a free download based on 100 percent Apache Hadoop technologies, on Intel and Power systems. Open source projects receiving updates provided by the platform, in addition to core Hadoop itself, include HBase, Hive, Oozie, Sqoop, Spark and many more. The increasingly popular Spark project is the latest darling of the Hadoop Big Data ecosystem, now available in version 1.4.1 on the IBM platform.
"Spark 1.4 is the first release to package SparkR, an R binding for Spark based on Spark's new DataFrame API," IBM said in a blog post Tuesday. "SparkR gives R users access to Spark's scale-out parallel runtime along with all of Spark's input and output formats. It also supports calling directly into Spark SQL."

The company also updated its BigInsights product, featuring "value-added capabilities" for its open source analytics offerings. With the updated version, "you will see new algorithms, including Decision Trees, Random Forests and Stepwise Compression," IBM said. "These algorithms enable R users to use existing R functions on a Hadoop cluster. Big R has expanded its library of machine algorithms to provide a richer set of classification, regression, factorization, feature extraction and survival analysis capabilities." The BigInsights 4.1 update targets Intel-based platforms.
Speaking of Intel, the giant chipmaker on Tuesday launched Parallel Studio XE 2016, its toolkit for high-performance computing (HPC) and Big Data analytics. Intel said the studio comprises a "suite of compilers, libraries, debugging facilities and analysis tools" on Intel platforms designed to help "software developers design, build, verify and tune code in Fortran, C++, C and Java."
The new tooling includes the Data Analytics Acceleration Library (DAAL), designed to speed up Big Data processing on Hadoop, Spark, R and Matlab. It also includes a vectorization advisor, designed to help developers squeeze the best performance out of modern processors through multithreading and vectorization, the latter of which uses single instruction, multiple data (SIMD) instructions. Intel exec James Reinders detailed all the updates in a Tuesday blog post.

Intel made further Big Data news this week by investing in BlueData, provider of the EPIC software platform leveraging virtualization technologies, and partnering up with the company. "As part of this new collaboration, our product team at BlueData will be working together closely with Intel in areas including Hadoop and Spark, virtualization and container technology, as well as caching and security/encryption," BlueData announced Tuesday. "We'll be optimizing our software on Intel architectures to provide flexible, elastic, high-performance Big Data deployments on-premises."
Hortonworks Inc., commonly recognized as one of the top three commercial Hadoop distributors, is making a sort of Big Data investment itself through the acquisition of Onyara Inc. That company created and contributes heavily to the top-level Apache project NiFi, which "supports powerful and scalable directed graphs of data routing, transformation and system mediation logic."
"The acquisition will make it easy for customers to automate and secure data flows and to collect, conduct and curate real-time business insights and actions derived from data in motion," Hortonworks said on Tuesday. The acquisition resulted in a new Hortonworks offering called DataFlow.

DataFlow addresses analytics problems associated with "data in motion" stemming from what Hortonworks calls the Internet of Anything (IoAT). This data comes from sources such as sensors, machines, geo-location devices, social feeds, Web, clicks, server logs and so on, Hortonworks said. "While the majority of today's solutions are custom-built, loosely secured, difficult to manage and not integrated, Hortonworks DataFlow powered by Apache NiFi will simplify and accelerate the flow of data in motion into HDP for full fidelity analytics," the company said.
Application infrastructure specialist Concurrent Inc. announced Driven 1.3, an updated offering designed to help monitor and manage Hadoop applications. "Driven offers enterprise users -- developers, operations and lines of business -- unprecedented visibility into applications written in Cascading, Scalding, Cascalog, Apache Hive and MapReduce," the company said in a statement Tuesday. "It provides deep operational insights, search, segmentation and visualizations for rapid troubleshooting and performance management."
Driven provides a scalable metadata repository that helps enterprises analyze relevant app metrics such as service-level agreements, key performance indicators and data lineage, the company said. Concurrent said the offering now includes a plug-in agent to work with Apache Hive and MapReduce jobs and tasks, along with improved collaboration and sharing capabilities.
Impetus Technologies announced free versions of its StreamAnalytix platform, comprising open source components such as Apache Storm, Kafka and Hadoop. With out-of-the-box interfaces for Apache Cassandra, Apache Solr and Elasticsearch also available, StreamAnalytix embeds a complex event-processing engine for real-time analytics of streaming data, the company said.

StreamAnalytix, Impetus said, provides for rapid application development via a visual interface that lets coders leverage drag-and-drop operators, visually draw connections, configure messages and alerts, and view performance metrics, with the ability to save them for later analysis.

The free offering "is designed to continuously ingest massive volumes of data," Impetus says on its site. "The high-performance stream processing engine continuously queries, filters, correlates, integrates, enriches and analyzes data to discover exceptions, patterns and trends that are presented through live dashboards."

About the Author

David Ramel is an editor and writer at Converge 360.
