Free SQL-on-Hadoop Videos Highlight Academic Big Data Analytics Project

Through its developerWorks site, IBM is helping spread the word about an academic project that provides free videos on numerous data analytics concepts, such as the recently published "Introducing SQL on Hadoop" presentation.

"Curious about SQL on Apache Hadoop? In less than 5 minutes, this video introduces basic concepts, explains why the industry is investing in various SQL-on-Hadoop efforts, and describes a few sample use cases," says a new posting on the developerWorks site.

The video, presented by IBM's Cynthia Saracco, is part of the Data-X project curated by scientists working out of UC Berkeley and Goethe University in Frankfurt, Germany.

"Data-X is a project to produce a collection of video lectures on very practical and applied data analytics," says the project's Web site, which is hosted on a portal focused on Operational Database Management Systems (ODBMS). "The goal of the project is to invite key experts to each address key aspects of working with data."

Various videos address those key aspects, which include:

  • collect
  • combine
  • store
  • use/compute
  • analyze
  • visualize the derived insights
  • validate findings

For example, along with the aforementioned SQL-on-Hadoop introductory video published Tuesday, videos added just yesterday include:

SQL-on-Hadoop is a skill in growing demand, being met by various vendors and open source projects, such as Apache Hive. The reasons behind its increasing popularity -- as pointed out in the introductory video -- include: it opens up Hadoop-based data to a wider audience; it's a proven, industry-standard query language in wide use; it addresses the shortage of developers skilled in non-SQL Big Data technologies such as MapReduce and Pig; and SQL is supported by many popular tools.
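To make the contrast with hand-coded MapReduce concrete, here is a minimal sketch in Python. It is not taken from the video or from any Hadoop tool: the sample data, table name and function names are hypothetical, and Python's built-in sqlite3 module stands in for a SQL-on-Hadoop engine such as Hive purely to show the shape of the query. The same per-key total is computed twice, once with a single declarative SQL statement and once with explicit map, shuffle and reduce steps.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (page, hits) records.
ROWS = [("home", 3), ("docs", 5), ("home", 2), ("blog", 1), ("docs", 4)]


def totals_with_sql(rows):
    """Declarative version: one GROUP BY query does all the work."""
    # sqlite3 is only a stand-in here for a SQL-on-Hadoop engine.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (page TEXT, hits INTEGER)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT page, SUM(hits) FROM visits GROUP BY page ORDER BY page"
    ).fetchall()
    conn.close()
    return result


def totals_with_map_reduce(rows):
    """Hand-written version: explicit map, shuffle and reduce steps."""
    # Map: emit one (key, value) pair per input record.
    mapped = [(page, hits) for page, hits in rows]
    # Shuffle: group all values that share a key.
    groups = defaultdict(list)
    for page, hits in mapped:
        groups[page].append(hits)
    # Reduce: collapse each group to a single total.
    return sorted((page, sum(values)) for page, values in groups.items())


if __name__ == "__main__":
    print(totals_with_sql(ROWS))
    print(totals_with_map_reduce(ROWS))
```

Both functions return the same totals. The SQL version states what result is wanted, while the second spells out how to compute it, which is the kind of extra effort the video's point about the shortage of MapReduce and Pig skills refers to.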

[Figure: Introduction to SQL on Hadoop (source: Data-X project)]

In addition to the SQL-on-Hadoop series, previously published videos provide an introduction to SQL, practical and applied data analytics and an explanation of blockchain technology.

The videos can serve as a lead-in for developers interested in learning more about Big Data analytics, as the site hosting the project offers many more educational resources, with a focus on object technology and open source.

Dott-Ing. Roberto V. Zicari, Full Professor of Database and Information Systems at Frankfurt University, explains the threefold purpose behind the project's focus in an introductory post:

First, Object Technology is main stream since years now. Both for modeling, UML, and for programming environments, Java, C#. Code generation, MDA, patterns, platforms, objects are all over. This is very different than in the mid 80s early 90s when object modelers and developers were early adopters.

Second, the Open Source community clearly demonstrated the feasibility of their ideas that Open Source is more than simply a number of talented developers around the world cooperating together, but it is also a viable business model. Examples of these are in the operating systems obviously Linux, and more recently in the database market, MySQL.

Third, interesting new markets are shaping up, such as the Embedded market. This could prove to be an ideal place for technologies such as object databases to bridge that gap between all these objects around and databases.

IBM also used the project as a lead-in for developers interested in exploring more of that company's Big Data analytics educational resources, such as information on its own SQL-on-Hadoop implementation, Big SQL. For example, Saracco presents several Big SQL tutorials and presentations on Hadoop Dev.

IBM explained that the Data-X project is curated by Zicari, a Visiting Scholar at UC Berkeley, and Ikhlaq Sidhu, Chief Scientist and Founding Director of the Sutardja Center at UC Berkeley.

About the Author

David Ramel is an editor and writer for Converge360.