MapD Open Sources GPU-Powered Database

MapD Technologies Inc., one of a select group of companies that offer GPU-accelerated databases, today announced the open sourcing of its MapD Core database. The company is contributing the project to the open source community and placing its code on GitHub under an Apache 2 license in order to seed a new generation of data applications.

"MapD pioneered the use of graphics processing units (GPUs) to analyze multi-billion-row datasets in milliseconds, orders-of-magnitude faster than traditional CPU-based systems," the company said in a statement. "By open sourcing the MapD Core database and associated visualization libraries, MapD is making the world's fastest analytics platform available to everyone."

Those associated visualization libraries are also available on GitHub.

In a blog post, company founder and CEO Todd Mostak provided more reasoning for the open source move. "We are doing this first and foremost out of our belief in the transformative power of open source software," he said. "Whether in the Hadoop or deep learning ecosystems, open source is driving tremendous innovation that simply has not been possible with proprietary software."

The company also announced a free Community Edition of its software -- provided for non-commercial development and academic use -- that includes the MapD Core database and the MapD Immerse visual analytics client. It also unveiled the MapD Analytics Platform Enterprise Edition, which adds the MapD Core GPU rendering engine along with distributed scale-out, high availability and other capabilities -- such as LDAP and ODBC -- that aren't included in the open source version.

[Figure: The MapD Analytics Ecosystem (source: MapD Technologies)]

"MapD's decision to open source its Core database is significant, as it will further energize an already active GPU analytics community," the company quoted Jim McHugh, NVIDIA general manager of DGX Systems, as saying. "I expect this move by MapD will drive adoption, by enabling hundreds of thousands of GPU developers to experiment with database acceleration and build new GPU-accelerated application solutions."

MapD is generally characterized as a leader in the young GPU-powered database market, along with companies such as Kinetica, BlazingDB, Blazegraph and PG-Strom.

In a 2015 post, SQream Technologies sought to explain the benefits provided by GPUs in database processing:

Because of its structure, the GPU enables a single 'instruction' to be performed on huge chunks of data simultaneously (SIMD, Single Instruction, Multiple Data), compared to a general purpose CPU which typically has a smaller scale implementation of SIMD.

Think of the GPU as a coin press machine, which can punch out 100 coins with one operation from a single sheet of metal, whereas a CPU is a coin press which can punch out 10 coins at a time from a strip of metal. While the CPU might have a faster 'time between punches', it also requires a faster feed rate of metal strips as well. This is the key difference between the GPU and CPU. The GPU is throughput oriented, while the CPU is latency oriented.

The GPU is therefore well suited for operations that perform the same instruction on large amounts of data at once.
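
The coin press analogy can be put into rough numbers with a short Python sketch. The batch sizes (100 and 10 coins per punch) come from the analogy itself; the punch intervals (2 seconds for the GPU press, 1 second for the CPU press) and the function name total_time are assumptions made purely for illustration, not measurements of any real hardware.

```python
import math


def total_time(items, items_per_punch, seconds_per_punch):
    """Seconds to process `items` when each punch handles a fixed-size batch."""
    punches = math.ceil(items / items_per_punch)
    return punches * seconds_per_punch


# Batch sizes are taken from the analogy; the punch intervals are assumed values.
GPU = {"items_per_punch": 100, "seconds_per_punch": 2.0}
CPU = {"items_per_punch": 10, "seconds_per_punch": 1.0}

for n in (5, 1000):
    cpu_s = total_time(n, **CPU)
    gpu_s = total_time(n, **GPU)
    print(f"{n} coins: CPU press {cpu_s:.0f}s, GPU press {gpu_s:.0f}s")
```

Under these assumed intervals, the CPU press finishes a 5-coin job first (1 second versus 2), while the GPU press finishes a 1,000-coin job in 20 seconds against 100 for the CPU press. That is the latency versus throughput trade-off the analogy describes.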

The approach can be tricky, however, as Hacker News readers commented in a post last year.

The company that best overcomes those associated challenges will likely see tremendous market opportunities. The Next Platform last month reported that "There is an arms race in the nascent market for GPU-accelerated databases, and the winner will be the one that can scale to the largest datasets while also providing the most compatibility with industry-standard SQL."

About the Author

David Ramel is an editor and writer for Converge360.