Talend Updates Big Data Sandbox with Docker

Talend Inc., which for years has provided a free sandbox for exploring the world of Big Data, is now powering that offering with Docker container technology.

The company unveiled its sandbox in July 2014 to speed the adoption of large-scale analytics, promising "zero to Big Data without coding in under 10 minutes."

Now the Big Data Sandbox -- which provides a single mechanism to try out technologies such as Apache Hadoop, Apache Spark, machine learning and Talend's own Big Data platform with a free 30-day trial -- is using Docker to package many of those technologies.

"One of the most exciting changes, but possibly the least visible, is our use of Docker for containerization of many of the underlying components," the company said in a blog post yesterday. "With the explosion of enterprise movement into the DevOps space, Docker has become a powerful tool for rapid and reliable provisioning and deployment of services and applications. We at Talend are embracing this movement internally and this Sandbox represents our first comprehensive use of Docker to distribute our own evaluation software platform."

Specifically, to let enterprises try out the technologies in a stand-alone configuration, Docker is used to provide underlying systems including Spark, Cloudera Hadoop, Hortonworks and Kafka.

In a separate blog post, the company explained more about Docker, the industry leader in software containerization. "Docker technology offers developers a way to package their application into a standardized piece of software in a complete filesystem that contains everything needed to run: code, runtime, system tools, system libraries -- anything that can be installed on a server," the company said. "This allows developers to quickly evaluate a variety of ready-to-run Big Data scenarios, tools and platforms within a virtual environment so that they can better understand the end-to-end lifecycle of a Big Data project and how it is likely to perform in their current environment."

Talend said its new sandbox features an intuitive, drag-and-drop visual design environment intended to ease the building of integration workflows through pre-built Big Data use cases. It also features a step-by-step "cookbook" that lets less experienced developers start using Hadoop within minutes.

That cookbook's use case scenarios include:

  • Real-time analysis of data from multiple streaming sources.
  • Real-time, personalized offer recommendations based on customer behavior.
  • Clickstream analysis with the ability to visualize activity on a heat map so companies can more precisely track Web traffic.
  • Monitoring IT operations using Apache weblogs.
  • Extract, transform and load (ETL) offload, to help accelerate the processing of complex workloads.
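
To give a rough sense of what the weblog monitoring scenario involves, the following is a minimal Python sketch, not Talend's implementation and not taken from the cookbook. It assumes log lines in the Apache Common Log Format; the function names (parse_line, summarize) and the sample lines are hypothetical and exist only for illustration. It tallies HTTP status codes and flags the paths that return server errors.

```python
import re
from collections import Counter

# Assumed input: Apache Common Log Format,
# host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count responses per status code and server errors (5xx) per path."""
    status_counts = Counter()
    error_paths = Counter()
    skipped = 0
    for line in lines:
        record = parse_line(line)
        if record is None:
            skipped += 1  # malformed or unsupported line
            continue
        status_counts[record["status"]] += 1
        if record["status"].startswith("5"):
            error_paths[record["path"]] += 1
    return status_counts, error_paths, skipped


# Hypothetical sample data, made up for this illustration.
SAMPLE_LINES = [
    '192.0.2.10 - - [26/Sep/2016:10:00:01 -0400] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.11 - - [26/Sep/2016:10:00:02 -0400] "GET /checkout HTTP/1.1" 500 312',
    '192.0.2.10 - - [26/Sep/2016:10:00:03 -0400] "POST /checkout HTTP/1.1" 500 312',
    '192.0.2.12 - - [26/Sep/2016:10:00:04 -0400] "GET /missing HTTP/1.1" 404 -',
    'not a log line',
]

if __name__ == "__main__":
    statuses, errors, skipped = summarize(SAMPLE_LINES)
    print("Responses by status:", dict(statuses))
    print("Paths with server errors:", dict(errors))
    print("Lines skipped:", skipped)
```

A single-process script like this only suits small files; the sandbox scenario is aimed at the same kind of analysis run at Hadoop scale.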

"Most organizations have a limited pool of skilled developers and therefore find it difficult to grow their Big Data expertise or unlock the benefits of Hadoop," said Talend exec Ashley Stirrup in a news release. "Talend's Big Data Sandbox helps them overcome these challenges by allowing Java developers to quickly become proficient with Hadoop."

The sandbox comes with a 30-day trial of Talend 6 and requires VMware or VirtualBox virtualization, with 20GB of disk space and 8GB of RAM recommended. The company said it will demo the sandbox at the Sept. 26-28 Strata + Hadoop World 2016 conference in New York.

About the Author

David Ramel is an editor and writer for Converge360.