Kubeflow 1.0 Machine Learning Toolkit for Kubernetes Goes Live -- ADTmag

By John K. Waters
March 4, 2020

The Kubeflow community this week announced the first major release of its open-source machine learning (ML) toolkit for Kubernetes.

With Kubeflow 1.0, the maintainers of the project are "graduating" a core set of stable applications needed to develop, build, train and deploy ML models on Kubernetes efficiently.

The list of graduating apps in this release includes:

Kubeflow's UI, the central dashboard providing quick access to the components deployed in a Kubeflow cluster;
The Jupyter notebook controller, which allows users to create a custom resource Notebook (shared document with live code, equations, visualizations, and narrative text);
TensorFlow Operator (TFJob), a Kubernetes custom resource for running TensorFlow training jobs on Kubernetes;
PyTorch Operator, for distributed training;
kfctl, the Kubeflow command-line interface (CLI) that's used to install and configure Kubeflow for deployment and upgrades;
Profile controller and UI for multiuser management.

"Kubeflow's goal is to make it easy for machine learning (ML) engineers and data scientists to leverage cloud assets (public or on-premise) for ML workloads," said Thea Lamkin, Google's open source strategist for AI/ML, in a blog post. "You can use Kubeflow on any Kubernetes-conformant cluster."

In the Kubeflow Community User Survey, the results of which were published last December, the ability to use Jupyter notebooks emerged as a popular feature request among data scientists and ML engineers.

"With Kubeflow 1.0, users can use Jupyter to develop models," Lamkin said. "They can then use Kubeflow tools like fairing (Kubeflow's python SDK) to build containers and create Kubernetes resources to train their models. Once they have a model, they can use KFServing to create and deploy a server for inference."
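The final step Lamkin describes, deploying a server for inference, comes down to creating one more Kubernetes resource. The following is a minimal sketch, not an official example: it assumes the `serving.kubeflow.org/v1alpha2` InferenceService schema that KFServing used around the time of this release, and the service name, namespace, and storage URI are hypothetical placeholders.

```python
import json


def make_inference_service(name, storage_uri, namespace="kubeflow"):
    """Build a KFServing InferenceService manifest as a plain dict.

    Assumption: the v1alpha2 schema, in which a TensorFlow model is
    served from spec.default.predictor.tensorflow.storageUri.
    """
    if not storage_uri:
        raise ValueError("storage_uri must point at an exported model")
    return {
        "apiVersion": "serving.kubeflow.org/v1alpha2",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "default": {
                "predictor": {
                    # Hypothetical model location; replace with a real bucket or PVC path.
                    "tensorflow": {"storageUri": storage_uri}
                }
            }
        },
    }


# Hypothetical name and model path, for illustration only.
service = make_inference_service("my-model", "gs://example-bucket/models/my-model")
print(json.dumps(service, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` on a cluster where KFServing is installed.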

Distributed training was another popular feature request. Kubeflow 1.0 provides Kubernetes custom resources that make distributed training with TensorFlow and PyTorch simple.
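To give a sense of what such a custom resource looks like, here is a minimal sketch that builds a TFJob manifest in Python. It assumes the `kubeflow.org/v1` TFJob schema, in which replicas are declared under `spec.tfReplicaSpecs` and the training container is named `tensorflow`; the job name, namespace, and image are hypothetical placeholders, not values from the release.

```python
import json


def make_tfjob(name, image, workers=2, namespace="kubeflow"):
    """Build a TFJob manifest with a single group of Worker replicas."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    worker_spec = {
        "replicas": workers,          # number of worker pods to launch
        "restartPolicy": "OnFailure",  # restart a worker if its container fails
        "template": {
            "spec": {
                "containers": [
                    # The TensorFlow operator expects the container name "tensorflow".
                    {"name": "tensorflow", "image": image}
                ]
            }
        },
    }
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"tfReplicaSpecs": {"Worker": worker_spec}},
    }


# Hypothetical job name and image, for illustration only.
manifest = make_tfjob("mnist-train", "example.com/mnist-train:latest", workers=2)
print(json.dumps(manifest, indent=2))
```

Scaling the job out is then a matter of changing `workers`; the operator, not the user, is responsible for creating the pods and wiring them together.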

Since it was open sourced at KubeCon USA in 2017, the Kubeflow Project has grown "beyond our wildest expectations," Lamkin said, with the support of hundreds of contributors and 30 participating organizations, including Microsoft, Google, IBM, Cisco, Intel, and LinkedIn, among others.

The project evolved from an effort to open source the way Google ran its TensorFlow ML library internally, based on a pipeline called TensorFlow Extended. "It began as just a simpler way to run TensorFlow jobs on Kubernetes," the website explains, "but has since expanded to be a multi-architecture, multi-cloud framework for running entire machine learning pipelines."

"Ultimately, we want to have a set of simple manifests that give you an easy to use ML stack anywhere Kubernetes is already running, and that can self-configure based on the cluster it deploys into," the site states.

"The Kubeflow 1.0 release is a significant milestone, as it positions Kubeflow to be a viable ML Enterprise platform," said Jeff Fogarty, data science engineer at U.S. Bank. "Kubeflow 1.0 delivers material productivity enhancements for ML researchers."

The community has several more applications under development, which are planned for point updates of Kubeflow 1.0, including:

Pipelines (beta), for defining complex ML workflows;
Metadata (beta), for tracking datasets, jobs, and models;
Katib (beta), for hyperparameter tuning;
Distributed operators for other frameworks, such as XGBoost.

About the Author

John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.
