Databricks Partnership Enables 'Lakehouse' on Google Cloud

Databricks, the company founded by the original creators of the Apache Spark analytics engine, is partnering with Google Cloud to bring its namesake data engineering platform to another leading cloud provider. With this addition, Databricks is available on Google Cloud, AWS, and Azure.

Databricks users will now be able to create a "lakehouse" (which combines the capabilities of a data lake and a data warehouse) that is capable of data engineering, data science, machine learning (ML), and analytics on Google Cloud's elastic network. Databricks now integrates with Google BigQuery's open platform and leverages the Google Kubernetes Engine (GKE), the companies said in a statement, enabling its users to deploy Databricks in a fully containerized cloud environment for the first time.

With this integrated solution, organizations can "unlock AI-driven insights, enable intelligent decision-making, and ultimately accelerate their digital transformations through data-driven applications," the companies said.

"This is a pivotal milestone that underscores our commitment to enable customer flexibility and choice with a seamless experience across cloud platforms," said Ali Ghodsi, CEO and co-founder of Databricks, in a statement. "We are thrilled to partner with Google Cloud and deliver on our shared vision of a simplified, open, and unified data platform that supports all analytics and AI use-cases that will empower our customers to innovate even faster." 

The integrations between Databricks and Google Cloud include:

  • Tight integration of Databricks with Google Cloud's analytics solutions, which makes it easier to extend "AI-driven insights" across data lakes, data warehouses, and multiple business intelligence tools.
  • Pre-built connectors for integrating Databricks with BigQuery, Google Cloud Storage, Looker and Pub/Sub.
  • Fast and scalable model training with Google Cloud's AI Platform using the data workflows created in Databricks, and simplified deployment of models built in Databricks using AI Platform Prediction.

Both Databricks and Google have long employed strategies with strong support for open source, and with this announcement, they threw a spotlight on "a commitment to open innovation and open source software."

"Under this new partnership, the two companies will continue to support the open source community, encourage open innovation and collaboration, making it easier for joint customers to build on open-source technologies," they said.

Last year, Databricks contributed MLflow, its open source platform for managing the lifecycle of ML models, to the Linux Foundation.

Other vendors, whose partnerships with the two companies form a joint Databricks/Google Cloud ecosystem, have committed to ensuring "seamless integrations" with Databricks on Google Cloud. They include Accenture, Cognizant, Collibra, Confluent, Deloitte, Fishtown Analytics, Fivetran, Immuta, Informatica, Infoworks, Insight, MongoDB, Privacera, Qlik, SADA, SoftServe, Slalom, Tableau, TCS, and Trifacta, among others.

"Businesses with a strong foundation of data and analytics are well-positioned to grow and thrive in the next decade," said Thomas Kurian, CEO at Google Cloud, in a statement. "We're delighted to deliver Databricks' lakehouse for AI and ML-driven analytics on Google Cloud. By combining Databricks' capabilities in data engineering and analytics with Google Cloud's global, secure network—and our expertise in analytics and delivering containerized applications—we can help companies transform their businesses through the power of data."

Databricks was founded by the creators of the Spark research project at UC Berkeley that later became Apache Spark. The company's namesake unified analytics platform is powered by the Spark big-data distributed processing engine. Data science teams use that platform to collaborate with data engineering and lines of business to build data products.

About the Author

John K. Waters is the editor in chief of a number of sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.