DataStax Does Big Data with Spark, Hadoop Integration

DataStax today introduced an upgrade to its Cassandra NoSQL database offering that sports new Big Data capabilities via integration with Apache Spark and Hadoop.

Announced at this week's Spark Summit 2014, DataStax Enterprise (DSE) 4.5 adds in-memory, real-time analytics through Spark integration. The integration follows a partnership with Databricks, the commercial Spark steward founded by the original creators of Spark.

Spark is an open source cluster computing framework for data analytics that improves on the two-stage, batch-oriented MapReduce paradigm that was a core component of the original Hadoop ecosystem. The Spark processing engine is claimed to run up to 100x faster than MapReduce when working in memory and up to 10x faster when working on disk. It also provides other functionality, such as SQL support, stream processing, machine learning and graph computation.
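For readers unfamiliar with the two-stage model, here is a minimal sketch in plain Python of a word count expressed as a map stage followed by a reduce stage. It is illustrative only: it uses neither the Hadoop nor the Spark API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, chosen for this example.

```python
from collections import defaultdict


def map_phase(lines):
    """Stage 1: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Stage 2: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the full map -> shuffle -> reduce pipeline over the input lines."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    sample = ["spark and hadoop", "spark and cassandra"]
    print(word_count(sample))
```

In a classic MapReduce job, each such pass generally writes its output to disk before the next pass can start. Spark can keep intermediate data in memory across a chain of steps, which is the basis for the speed claims above.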

[Figure: DataStax Enterprise 4.5 (source: DataStax)]

"DataStax Enterprise 4.5 is the only distributed database management system for online applications with both in-memory transactional and analytical capabilities, scalability for any sized workload, enterprise search, comprehensive security, and transparent management services," the company said in a statement.

Another major enhancement in DSE 4.5 is tighter integration with Hadoop. "For the first time, companies can now merge Cassandra data with Hadoop and easily integrate operational and historical data together," DataStax said. "To build on DataStax's commitment, we have forged solid partnerships with Hadoop vendors Cloudera and Hortonworks and DSE has been certified for their platforms."

The company said the new release also features enhanced point-and-click visual management that can be used remotely from any device. A new Performance Service is designed to simplify operations with automated diagnostics and performance tuning.

"This new functionality removes the mystery of how well a cluster is performing by supplying diagnostic information that can easily be queried," DataStax said. "Further, DSE 4.5 automatically scans clusters, visually identifies issues, and provides expert advice on resolving problems."

DataStax sells enhanced commercial enterprise database products based on the open source Apache Cassandra project, and it also provides a free version for download.

About the Author

David Ramel is an editor and writer for Converge360.