
Basho Open Sources Its NoSQL Database Targeting IoT Development

Basho Technologies Inc. today open sourced Riak TS, its NoSQL Big Data database optimized for Internet of Things (IoT) development. The database is also out in a new version sporting standard SQL querying and integration with Apache Spark.

Riak TS 1.3 is specially engineered to accommodate time series data: timestamped data emitted continuously by sources such as remote Internet-connected sensors, machines and systems. The TS in its name stands for time series. Basho's Alexander Sicular explains more about time series data in a blog post published today.

Riak TS is "a distributed NoSQL database architected to aggregate and analyze massive amounts of sequenced, unstructured data generated from the IoT and other time series data sources," the company said in announcing its new database last October.

It's now freely available under an Apache v2 License. "The open source version enables developers to download the software for free and use it in production as well as make contributions to the code and develop applications around Riak TS," the company said in a statement today.

[Figure: One Use Case: Predicting Los Angeles Dodgers Baseball Games by Analyzing Traffic Sensor Data (source: Basho Technologies)]

Along with being open sourced, version 1.3, which moved into general availability today, has received several functional improvements, including standard SQL-based querying despite its "NoSQL" designation. (According to Wikipedia, "NoSQL" originally stood for "non SQL" but has since been reinterpreted as "not only SQL," as many NoSQL databases have added standard SQL querying capabilities.)

"As part of our research, we investigated the different variants of SQL being used by other NoSQL projects," Basho CTO Dave McCrory explained in a blog post today. "In the end, we found that they were all unsuitable. Every single customer that we have spoken to has wanted or preferred standard SQL. Riak TS 1.3 delivers just that, with a shell that offers standard SQL commands. We will ultimately try to support as much standard SQL as possible."
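To give a rough sense of what standard SQL over time series data looks like in general, here is a minimal Python sketch. It uses SQLite from the Python standard library as a stand-in, not Riak TS itself; the table name, the column names and the summarize helper are hypothetical, and nothing here reflects the actual Riak TS shell syntax or client API.

```python
import sqlite3

# SQLite is only a stand-in for a SQL-capable store; this is NOT the Riak TS API.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per sensor reading, keyed by sensor and timestamp.
conn.execute(
    """
    CREATE TABLE sensor_readings (
        sensor_id   TEXT    NOT NULL,
        ts          INTEGER NOT NULL,  -- Unix epoch seconds
        temperature REAL    NOT NULL,
        PRIMARY KEY (sensor_id, ts)
    )
    """
)

# Made-up sample data for illustration only.
rows = [
    ("s1", 1000, 20.0),
    ("s1", 1060, 21.0),
    ("s1", 1120, 22.0),
    ("s2", 1000, 30.0),
    ("s1", 2000, 99.0),  # falls outside the window queried below
]
conn.executemany("INSERT INTO sensor_readings VALUES (?, ?, ?)", rows)


def summarize(conn, sensor_id, start, end):
    """Return (count, average, max minus min) for one sensor in [start, end)."""
    cur = conn.execute(
        """
        SELECT COUNT(*),
               AVG(temperature),
               MAX(temperature) - MIN(temperature)
        FROM sensor_readings
        WHERE sensor_id = ? AND ts >= ? AND ts < ?
        """,
        (sensor_id, start, end),
    )
    return cur.fetchone()


print(summarize(conn, "s1", 1000, 1200))
```

The query combines a time-range filter with aggregate functions and simple arithmetic, the kind of operations the 1.3 release notes mention; how Riak TS expresses them may differ.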

Another new feature is better integration with Apache Spark, the immensely popular technology especially useful for real-time processing of streaming data. It's one of the hottest open source projects in the world and a rising star in the Big Data/Hadoop ecosystem. Basho said a new Spark Connector provides seamless integration with the technology.

"Several NoSQL solutions have support for Spark, however most of them are either bound by a poorly implemented Spark connector or the requirement to run Spark on every cluster node," McCrory said. "Our experience tells us that this is a big disadvantage when there are either large amounts of data analysis that needs to be done or when efficient use of hardware is a requirement. The reason that approach is problematic is that the size of the NoSQL cluster running all of those Spark instances on top is artificially forced to be disproportionately large. This was costing our customers money and was less efficient.

"In contrast, our Spark support offers the ability to run Spark decoupled from Riak TS (and very soon Riak KV). This provides a significant advantage allowing your Spark cluster to be sized independent of your Riak cluster."

Other enhancements to version 1.3 include:

  • Data Aggregators and Arithmetic operations inside Riak TS.
  • Extremely fast write and query performance for time series data.
  • High-performance clients released for Java, Erlang and Python.
  • Riak TS 1.3 EE (Enterprise Edition) now supports Multi-cluster Replication.

"Riak TS is the first enterprise database built specifically for time series data," said CEO Adam Wray. "As enterprises collect IoT time series data from sensors, they will need fast, reliable and scalable read and write performance. Riak TS is purpose-built to store and retrieve time series data with enhanced read and write performance. Extending an open-source version, enhancing ease of use, and adding support for Multi-cluster Replication will make it even easier for companies and community to demonstrate return on their IoT investments."

Riak TS 1.3 supports many programming languages, including Java, Ruby, PHP, Python, Erlang, C# and Node.js. It runs on several platforms, including RHEL/CentOS (6 and 7), Ubuntu LTS (12.04 and 14.04), Debian 7 (development only) and OS X 10.8+ (development only). It can be downloaded here for now, and will be hosted on GitHub in the future.

About the Author

David Ramel is an editor and writer for Converge360.