Kubeflow 1.0 Machine Learning Toolkit for Kubernetes Goes Live -- ADTmag

Kubeflow 1.0 Machine Learning Toolkit for Kubernetes Goes Live

By John K. Waters
March 4, 2020

The Kubeflow community this week announced the first major release of its open-source machine learning (ML) toolkit for Kubernetes.

With Kubeflow 1.0, the maintainers of the project are "graduating" a core set of stable applications needed to develop, build, train and deploy ML models on Kubernetes efficiently.

The list of graduating apps in this release includes:

Kubeflow's UI, the central dashboard providing quick access to the components deployed in a Kubeflow cluster;
The Jupyter notebook controller, which allows users to create a custom resource Notebook (shared document with live code, equations, visualizations, and narrative text);
TensorFlow Operator (TFJob), a Kubernetes customer resource for running TensorFlow training jobs on Kubernetes;
PyTorch Operator, for distributed training;
kfctl, the Kubeflow command-line interface (CLI) that's used to install and configure Kubeflow for deployment and upgrades;
Profile controller and UI for multiuser management.

"Kubeflow's goal is to make it easy for machine learning (ML) engineers and data scientists to leverage cloud assets (public or on-premise) for ML workloads," said Thea Lamkin, Google's open source strategist for AI/ML, in a blog post. "You can use Kubeflow on any Kubernetes-conformant cluster."

In Kubeflow Community User Survey, the results of which were published last December, the ability to use Jupyter notebooks emerged as a popular feature request among data scientists and ML engineers.

"With Kubeflow 1.0, users can use Jupyter to develop models," Lamkin said. "They can then use Kubeflow tools like fairing (Kubeflow's python SDK) to build containers and create Kubernetes resources to train their models. Once they have a model, they can use KFServing to create and deploy a server for inference."

Distributed training was another popular feature request. Kubeflow 1.0 provides Kubernetes custom resources that make distributed training with TensorFlow and PyTorch simple.

Since it was open sourced at Kubecon USA in 2017, the Kubeflow Project has grown "beyond our wildest expectations," Lamkin said, with the support of hundreds of contributors and 30 participating organizations, including Microsoft, Google, IBM, Cisco, Intel, and LinkedIn, among others.

The project evolved from an effort to open source the way Google ran its TensorFlow ML library internally, based on a pipeline called TensorFlow Extended. "It began as just a simpler way to run TensorFlow jobs on Kubernetes," the website explains, "but has since expanded to be a multi-architecture, multi-cloud framework for running entire machine learning pipelines."

"Ultimately, we want to have a set of simple manifests that give you an easy to use ML stack anywhere Kubernetes is already running, and that can self-configure based on the cluster it deploys into," the site states.

"The Kubeflow 1.0 release is a significant milestone, as it positions Kubeflow to be a viable ML Enterprise platform," said Jeff Fogarty, data science engineer at U.S. Bank. "Kubeflow 1.0 delivers material productivity enhancements for ML researchers."

The community has several more applications under development, which are planned for point updates of Kubeflow 1.0, including:

Pipelines (beta) for defining complex ML workflows
Metadata (beta) for tracking datasets, jobs, and models,
Katib (beta) for hyper-parameter tuning
Distributed operators for other frameworks like xgboost

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].

Featured

AppTrends

Email Address*Country*

Please type the letters/numbers you see above.

Upcoming Training Events

0 AM

VSLive! 4-Day Hands-On Training Seminar: Hands-on with Blazor
May 5-8, 2025

Cybersecurity & Ransomware Live! VirtCon 2025
May 13-15, 2025

VSLive! 4-Hour In-Depth Workshop: Deep Dive into ASP.NET Core Razor Pages
May 29, 2025

VSLive! 3-Day Hands-On Training Seminar: Master Modern JavaScript: Unlock the Full Potential of Your Code
June 2-4, 2025

VSLive! 2-Day Hands-On Training Seminar: Asynchronous and Parallel Programming in C#
June 24-25, 2025

4-Hour Hands-on Workshop: MCP Demystified
June 30, 2025

VSLive! 4-Day Hands-On Training Seminar: Immersive .NET Full Stack Training: 4-Day Hands-On Experience
July 15-18, 2025

VSLive! 4-Hour In-Depth Workshop: Immersive .NET Full Stack Training: C# Interfaces: Effective Usage while Avoiding Pitfalls
July 29, 2025

Visual Studio Live! @ Microsoft HQ
August 4-8, 2025

4-Hour VSLive! Workshop: Testability in .NET
August 27, 2025

Visual Studio Live! San Diego
September 8-12, 2025

Live! 360 2-Day Hands-On Seminar: Swimming in the Lakes of Microsoft Fabric and AI – A Hands-on Experience
September 18-19, 2025

VSLive! 2-Day Hands-On Training Seminar: Hands-On with .NET Web Development in 2025
October 7-8, 2025

Live! 360 Orlando
November 16-21, 2025

Artificial Intelligence Live! Orlando
November 16-21, 2025

Cloud & Containers Live! Orlando
November 16-21, 2025

Cybersecurity & Ransomware Live! Orlando
November 16-21, 2025

Data Platform Live! Orlando
November 16-21, 2025

Visual Studio Live! Orlando
November 16-21, 2025

VSLive! 4-Day Hands-On Training Seminar: Immersive .NET Full Stack Training: 4-Day Hands-On Experience
December 16-19, 2025

Visual Studio Live! Las Vegas
March 16-20, 2026

Free White Papers

More Tech Library