KubeCon + CloudNativeCon Europe 2024

AI & ML

Sponsored Keynote: Operating AI Services on Cloud Native Technologies

This video discusses Oracle's journey in the cloud-native and open-source ecosystem, highlighting their contributions to projects like Linux, OpenJDK, and Kubernetes. It also explores Oracle's approach to democratizing AI by leveraging CPU-based solutions and providing free Arm-based credits to CNCF projects, enabling wider access and innovation in the AI space.

From Insanity to Ingenuity: Seven Practical Tips for Navigating the AI Storm in DBaaS Evolution

This panel discussion explores the opportunities and challenges of incorporating AI technology into the cloud-native ecosystem. The panelists share practical insights and strategies for navigating the evolving landscape of AI, emphasizing the importance of open-source standards, governance, and community involvement in shaping the future of this transformative technology.

Sponsored Keynote: Cloud-Native x AI: Unleashing the Intelligent Era with Continuous Open Source...

The talk explores the challenges faced by cloud-native AI, including low GPU/NPU utilization, frequent training cluster failures, and deployment complexities. Huawei Cloud's solutions, such as Kuad, Volcano, and a proposed Serv AI platform, aim to address these issues by improving resource utilization, scaling, and performance for large-scale AI workloads.

Keynote: The Cloud Native News Show: AI Breakthroughs Revealed

The video showcases the growing partnership between cloud-native technologies and artificial intelligence, highlighting the latest developments, challenges, and future prospects in this space. Experts from various cloud-native and AI initiatives discuss the integration of AI into the cloud-native ecosystem, the formation of a new AI working group within the CNCF, and the potential for collaboration between the CNCF and the LF AI and Data Foundation.

Keynote Panel Discussion: Optimizing Performance and Sustainability for AI

The panel discussion explores ways to optimize performance and sustainability for AI workloads on Kubernetes. Key topics include abstracting Kubernetes for data scientists, utilizing specialized compute like Arm-based processors, and techniques to improve GPU utilization through workload scheduling and data pre-processing.

Keynote: Welcome + Opening Remarks - Priyanka Sharma, Executive Director, Cloud Native Computing...

This keynote highlights the growth and impact of the cloud-native ecosystem, particularly in the field of AI. The speaker showcases how cloud-native technologies like Kubernetes can enable the seamless transition from AI prototypes to production-ready applications, fostering collaboration between developers, platform engineers, and AI experts to drive sustainable innovation.

Keynote: Accelerating AI Workloads with GPUs in Kubernetes - Kevin Klues & Sanjay Chatterjee

This talk covers the challenges and solutions for accelerating AI workloads with GPUs in Kubernetes, including GPU sharing techniques, the new Dynamic Resource Allocation (DRA) feature, and the need for advanced scheduling capabilities like topology-aware placement, fault tolerance, and multi-dimensional optimization to enable scalable GPU-powered AI on Kubernetes.

Sponsored Keynote: Build an Open Source Platform for AI/ML - Jorge Palma, Principal PM Lead

The video presents KAITO, an open-source operator that simplifies and automates the deployment and usage of large language models in Kubernetes. KAITO provides a workspace CRD that bootstraps the necessary steps, partners with a node provisioner to provision infrastructure, and creates an inference endpoint for easy integration with applications.

Keynote: Platform Building Blocks: How to Build ML Infrastructure with CNCF Projects

This talk discusses how Bloomberg has leveraged CNCF projects to build a scalable and efficient machine learning infrastructure that powers their AI-driven features in the Bloomberg Terminal. It highlights the use of projects like Kubernetes, Istio, Calico, and KEDA to manage notebook environments, distributed model training, and model serving at Bloomberg's scale.

Strategies for Efficient LLM Deployments in Any Cluster -Angel M De Miguel Meana & Francisco Cabrera

This talk discusses strategies for efficient deployment of large language models (LLMs) in Kubernetes clusters, including model selection, quantization, and the use of small language models. It also covers techniques for managing model storage and distribution, GPU utilization, and scalable deployment architectures to optimize for energy efficiency and cost.

WasmEdge, portable and lightweight runtime for AI/LLM workloads | Project Lightning Talk

WasmEdge is a portable and lightweight runtime for AI/LLM workloads, designed to address the challenges of deploying AI applications across diverse hardware and software environments. By leveraging WebAssembly (Wasm) and the WASI (WebAssembly System Interface) standard, WasmEdge enables developers to create AI applications that can be easily deployed and run on a wide range of devices, from edge to cloud, without the need for complex hardware-specific configuration.

Lightning Talk: AI and Kubernetes: Achieving Together in the Next Decade - Steven Zou, VMware

The talk explores the synergistic relationship between Kubernetes and AI, highlighting how they can work together to drive technological innovations and mutual success in the next decade. The speaker outlines various ways in which Kubernetes can support and accelerate AI development and deployment, while also discussing how AI can enhance the Kubernetes platform, making it smarter, more elastic, and more reliable.

Accessibility

Cloud Native Perspectives: Understanding and Advocating for Accessibility in Tech

This panel discussion explores the challenges and experiences of individuals with diverse disabilities in the tech industry. The panelists share their personal journeys, the barriers they face, and the solutions they advocate for to improve accessibility and inclusion in the cloud native ecosystem.

Inaccessible "Accessibility" - Real-life Stories | TAG Lightning Talk

The video highlights the real-life challenges faced by deaf and hard-of-hearing individuals in the workplace, particularly during the interview process, where promised accessibility accommodations often fall short. It emphasizes the need for comprehensive training, flexible communication methods, and open dialogue to create truly inclusive and accessible environments.

Architecture

Why Kubernetes Is Inappropriate for Platforms, and How to Make It Better

This talk explores the limitations of using Kubernetes for building platforms and proposes a new framework, KCP, that addresses the challenges of multi-tenancy, multi-region, and multi-cloud deployments. The speakers demonstrate a platform-as-a-service example built with KCP, showcasing how it provides a more seamless user experience and better aligns with the needs of platform owners, service providers, and end-users.

Kubernetes SIG Architecture Intro and Updates - John Belamaric, Google & Davanum Srinivas, AWS

The video provides an introduction to the Kubernetes Special Interest Group (SIG) Architecture and its role in maintaining the technical aspects of the Kubernetes project. The presentation covers the group's responsibilities, such as managing API design, code organization, enhancements, and conformance testing, as well as their efforts to ensure extensibility and a smooth upgrade process for the Kubernetes ecosystem.

10 Years of Kubernetes Patterns Evolution - Bilgin Ibryam, Diagrid & Roland Huss, Red Hat

This talk explores the evolution of Kubernetes patterns over the past 10 years, covering foundational, structural, behavioral, configuration, and security patterns, as well as more advanced patterns like the Controller and Operator patterns. The speakers provide a comprehensive overview of these patterns and how they can be applied to build robust, scalable, and secure containerized applications on Kubernetes.

Chaos Engineering

Fire in the Cloud: Bringing Managed Services Under the Ambit of Cloud-Native Chaos Engineering

The video discusses the challenges of using managed services in the cloud and how chaos engineering can be applied to mitigate these challenges. The presenters introduce Litmus, an open-source chaos engineering platform, and discuss its features, roadmap, and ways for the community to get involved.

Cultural Shifts: Fostering a Chaos First Mindset in Platform Engineering

This talk explores the concept of a 'chaos first' mindset in platform engineering, where chaos engineering is proactively embraced to identify and address weaknesses in complex cloud-native systems. The speakers discuss the core principles of chaos engineering, how it can be integrated into the various components of platform engineering, and demonstrate a practical implementation using tools like Backstage and Litmus.

Community

The Journey of Organizing an Kubernetes Community Days Events – Our Odyssey Unveiled

The presentation explores the journey of organizing a Kubernetes Community Days (KCD) event, highlighting the demanding yet rewarding experience. The speakers share insights on getting involved, managing pre-event, event-day, and post-event activities, as well as the impact and metrics of their successful KCD event in Brazil.

Compute

Distributed AI: Using NATS.Io to Seamlessly Connect AI Applications... Tomasz Pietrek & Jeremy Saenz

This talk provides an overview of the NATS.io distributed messaging system and how it can be used to seamlessly connect AI applications across cloud, edge, and on-premises environments. The presenters demonstrate NATS.io's capabilities, including service discovery, load balancing, metrics collection, and real-time object detection using a Jetson Nano device, highlighting the flexibility and ease of use of the NATS.io platform.

CRI-O Odyssey: Exploring New Frontiers in Container Runtimes - Julien Ropé & Sohan Kunkerkar

The presentation explores the recent advancements in CRI-O, a container runtime, including its integration with confidential containers, support for WebAssembly, and upcoming features like split image file system and rootless containers. The speakers highlight the growing community involvement, new releases, and future roadmap for CRI-O, showcasing its evolution as a robust container runtime for Kubernetes.

Istio Project Update - Mitch Connors, Aviatrix & Zhonghu Xu, Huawei

The presentation provides an update on the Istio project, covering its project health, recent releases, and upcoming features. The speakers discuss the deprecation of the Istio operator and the introduction of Ambient, a new service mesh capability that allows the use of arbitrary proxies alongside Istio.

The Chain of Trust: Towards SLSA L3 with Tekton Trusted Artifacts - Jerop Kipruto & Andrea Frittoli

The presentation discusses the importance of maintaining a chain of trust in the software development lifecycle and how Tekton, a cloud-native CI/CD system, can help achieve Supply Chain Levels for Software Artifacts (SLSA) Level 3 security requirements. It introduces Tekton Trusted Artifacts, a feature that allows for secure handling and verification of artifacts throughout the build process.

Leveraging OCI 1.1 for Enhanced SBOM Integration and Vulnerability Scanning in Harbor

This talk discusses how to leverage the OCI 1.1 specification to enhance SBOM (Software Bill of Materials) integration and vulnerability scanning in the Harbor container registry. It covers the integration of SBOM generation and scanning capabilities in Harbor, using the Trivy security scanner, to provide comprehensive visibility and efficient vulnerability management.

eBPF’s Abilities and Limitations: The Truth - Liz Rice & John Fastabend, Isovalent

This talk explores the capabilities and limitations of eBPF, a powerful kernel technology that allows running custom programs to modify the kernel's behavior. The presenters demonstrate that eBPF can be used for complex tasks like implementing the Game of Life, challenging common perceptions about its limitations.

KubeDDR! a Dance of Predictive Scoring with MLOps, Step by Step - Leigh Capili & Annie Talvasto

This talk explores the application of machine learning and computer vision techniques to the game of Dance Dance Revolution (DDR), demonstrating how predictive scoring can be achieved using optical character recognition (OCR) and cloud-native MLOps practices. The presenters discuss the challenges of operationalizing machine learning models, highlighting the importance of bridging the gap between data science and production-ready deployments.

Precision Matters: Scheduling GPU Workloads on Kubernetes - Amit Kumar & Gaurav Kumar, Uber

This talk discusses the challenges and solutions involved in scheduling GPU workloads on Kubernetes at Uber. The speakers cover topics such as supporting heterogeneous clusters, managing multiple GPU SKUs, and addressing GPU resource fragmentation, as well as common pitfalls encountered and future work planned.

Squeeze Your K8s: How We Adopt Time-Series Forecasting Models in FinOps... Irvin Lim & Nicholas Kwan

The presentation discusses how the company Shopee adopted time-series forecasting models to optimize resource utilization and cost efficiency in their Kubernetes infrastructure. The speakers share their approach to using short-term and long-term forecasting techniques to improve scheduling, resource allocation, and cluster autoscaling, ultimately reducing costs and improving overall resource efficiency.

Maximizing GPU Utilization Over Multi-Cluster: Challenges and Solutions for Cloud-Native AI Platform

This video discusses the challenges and solutions for maximizing GPU utilization in a multi-cluster, cloud-native AI platform. The presenters propose leveraging the Kubernetes-based Karmada and Volcano projects to provide a unified management and scheduling solution for distributed GPU resources across different data centers and cloud providers.

Kubernetes Is FINALLY Removing in-Tree Cloud Providers - Bridget Kromhout & Chris Privitere

Kubernetes is finally removing the in-tree cloud provider code, which has been a complex and long-standing issue. This talk provides an overview of cloud providers, the migration process, and actionable takeaways for the community to contribute to SIG Cloud Provider and help shape the future of Kubernetes cloud integration.

Search at Shopify: Highly Available Platform for Data Resilience and Compliance - Leila Vayghan

This talk discusses the search infrastructure at Shopify, a highly available platform designed for data resilience and compliance. The speaker delves into the use of Elasticsearch and Kafka to power the search functionality, highlighting the challenges of achieving high availability, scalability, and global accessibility for Shopify's diverse merchant and buyer base.

Increasing GPU Utilisation on K8s Clusters Dedicated for AI/ML Workloads

This talk discusses strategies for increasing GPU utilization on Kubernetes clusters dedicated to AI/ML workloads. It covers the use of open-source tools like Paddle, Volcano, and Armada to efficiently manage and schedule GPU resources across large-scale infrastructures with thousands of GPUs.

Object Storage on Kubernetes? Completed It with Provider Ceph. - Conor Nolan & Richard Kovacs

This talk explores the development of a Crossplane provider, called Provider Ceph, to manage S3 buckets across multiple Ceph clusters from a single Kubernetes cluster. The presenters discuss the challenges they faced, the benefits of using Crossplane, and the performance and scalability considerations they encountered when handling a large number of buckets.

Kubernetes as a Data Platform - Robert Hodges, Edith Puclla, Clayton Coleman

This video discusses the increasing adoption of Kubernetes as a platform for running data-intensive workloads, including databases, analytics, and machine learning. The panelists highlight the evolution of Kubernetes support for stateful applications, the convergence of databases and Kubernetes, and the opportunities for Kubernetes to enable more efficient and scalable data processing and AI/ML workflows.

Mastering GPU Management in Kubernetes Using the Operator Pattern- Shiva Krishna Merla & Kevin Klues

The presentation discusses the challenges of managing GPUs in Kubernetes and how the GPU Operator pattern can address these challenges. The GPU Operator provides a unified API to configure and manage the lifecycle of GPU-related components, including the NVIDIA driver, container runtime, device plugin, and monitoring software.

KubeEdge: Extending Kubernetes to the Edge with Real-World Industry Use Case

KubeEdge is an open-source project that extends Kubernetes to the edge, enabling seamless management of edge devices and processing of edge data. The presentation covers the architecture, key features, and real-world industry use cases of KubeEdge, showcasing its ability to address the challenges of edge computing and drive innovation across various sectors.

Production-Ready AI Platform on Kubernetes - Yuan Tang, Red Hat

This talk provides an overview of building a production-ready AI platform on Kubernetes, covering key aspects such as scalability, reliability, flexibility, and reference implementations using tools like Kubeflow, Argo, and KServe. The speaker also introduces their recently published book on distributed machine learning patterns and encourages the audience to explore the resources mentioned.

SIG Windows Retrospective and Windows Image Building Deep Dive - Claudiu Belu, Cloudbase Solutions

The talk provides a deep dive into the process of building Windows images for Kubernetes, including the challenges of managing different Windows versions and the use of multi-stage Dockerfiles and Docker buildx to build Windows images on Linux nodes. The speaker also discusses the benefits of using Nano Server-based images and tools like Crane that can be used to optimize the image building process.

Hacking Kubernetes to Migrate in-Tree Volumes to CSI - Baptiste Girard-Carrabin & Antoine Gaillard

The presenters discuss their journey in migrating Kubernetes in-tree volumes to the Container Storage Interface (CSI) at scale, highlighting the challenges they faced and the unconventional solutions they implemented to ensure a seamless migration without disrupting their production environment.

Special Purpose Operating Systems: The Next Step in OS Evolution or One-Trick Ponies?

This panel discussion explores the evolving landscape of special purpose operating systems, highlighting their unique capabilities, challenges, and potential future directions. The panelists share insights on the diverse approaches taken by projects like Unikraft, Talos, Kairos, Bottlerocket, and Flatcar, showcasing the specialized nature of these operating systems and the ongoing efforts to address emerging requirements, such as support for specialized hardware and seamless configuration management.

Three’s a Crowd: How to Achieve 2-Node HA at the Edge and Make Your CFO Smile

This talk presents a novel approach to achieving 2-node high availability at the edge, addressing the challenges of traditional Kubernetes deployments. The proposed architecture leverages PostgreSQL logical replication and a custom liveness agent to provide a simple, scalable, and cost-effective solution for running critical applications at the edge.

Swap Smart, Save Big - Jing Yan & Antti Kervinen, Intel

This talk presents a comprehensive approach to leveraging swap in Kubernetes clusters to optimize resource utilization without compromising performance. The speakers discuss various strategies for determining what to swap, how much to swap, where to swap, and when to swap, as well as techniques for isolating the effects of swapping on individual workloads.

Sidecar Containers in Kubernetes: Past, Present, and Future - Matei David, Buoyant & Mike Beaumont

This talk discusses the evolution of sidecar containers in Kubernetes, from their initial introduction in 2015 to the recent introduction of a native sidecar feature in Kubernetes 1.29. The speakers explore the various approaches to deploying sidecar proxies, including sidecar, host proxy, and ambient modes, highlighting the trade-offs and considerations for each approach.

This Won’t Hurt a Bit: Taking Kubernetes to Thousands of Dental Offices Simply and Securely

This talk discusses how Dentsply Sirona, a healthcare company, leveraged Kubernetes and edge computing to transform dental care by providing a simple and secure solution for managing thousands of edge devices deployed in dental offices. The presenters highlight the key challenges they faced, such as onboarding, security, and maintenance, and how they addressed them using Palette Edge, a platform that enables remote management and automation of edge Kubernetes clusters.

Make Your Cluster Fly: Embed a Multi-Node Kubernetes Cluster Inside an Aircraft Using Puppet & Flux

This presentation showcases the challenges of deploying and managing a multi-node Kubernetes cluster inside an aircraft, where connectivity and physical access are limited. The speakers discuss their approach using Puppet and Flux to automate the bootstrapping, image caching, and deployment of services, ensuring a resilient and independent cluster operation.

Introducing ClusterInventory and ClusterFeature API - Eduardo Arango Gutierrez & Ryan Zhang

The presenters introduce the ClusterInventory and ClusterFeature APIs, which aim to provide a common interface for representing and managing multi-cluster environments. These APIs are being developed in collaboration with the SIG Multicluster community to address the challenges of cluster management, application deployment, and observability in complex, multi-cluster Kubernetes setups.

Confidential Containers for GPU Compute: Incorporating LLMs in a Lift-and-Shift Strategy for AI

This talk presents a comprehensive approach to incorporating large language models (LLMs) into a lift-and-shift strategy for AI, leveraging confidential containers and the Kata runtime to enable secure and isolated GPU compute. The speaker discusses the need for stronger security boundaries in container environments, the evolution of sandbox technologies, and the specific features and reference architecture of Kata containers that enable GPU passthrough and confidential computing capabilities.

Unleashing the Power of DRA (Dynamic Resource Allocation) for Just-in-Time GPU Slicing

The talk discusses the use of Dynamic Resource Allocation (DRA) for improving GPU utilization on Kubernetes clusters, highlighting the challenges of using multi-instance GPUs and the benefits of DRA in addressing these challenges. The presenters also introduce InstaSlice, a solution that aims to simplify the user experience and migration process for adopting DRA.

Unlocking the TAG-Runtime Magic: Where Cloud-Native, Workloads, and AI Join Forces!

TAG-Runtime, a collaborative effort within the CNCF ecosystem, brings together diverse projects and working groups to unlock the full potential of cloud-native, AI, and workload management. By fostering cross-project collaboration, driving standards, and facilitating end-user engagement, TAG-Runtime aims to create a cohesive and efficient runtime environment for the cloud-native landscape.

Kubernetes Under the Hood: The Benefits of Container Focused OS- Mathieu Tortuyaux & Timothée Ravier

This talk provides an in-depth exploration of container-focused operating systems, such as Flatcar Container Linux and Fedora CoreOS, and how they are designed to optimize the deployment and management of containerized applications, particularly in the context of Kubernetes. The presentation covers the key concepts of container-focused OSes, including immutability, automated updates, and provisioning using tools like Ignition and Butane, as well as the benefits they offer for reliable and secure container orchestration.

Keeping the Bricks Flowing: The LEGO Group's Approach to Platform Engineering for Manufacturing

The LEGO Group's approach to platform engineering for manufacturing involves leveraging cloud-native tools and operators to provide a self-service, cloud-like experience for their digital teams, while also embracing chaos engineering to ensure the reliability and resilience of their critical manufacturing services.

Sponsored Keynote: A Cloud Native Overture to Enterprise End User Adoption

This talk explores how Kubernetes, the leading container orchestration platform, can also be leveraged to manage virtual machines (VMs), providing a unified platform for diverse workloads. The speaker highlights the benefits of running VMs on Kubernetes, including the ability to leverage the rich CNCF ecosystem and reduce cognitive complexity by consolidating different compute environments.

Keynote Panel Discussion: Revolutionizing Cloud Native Architectures with WebAssembly

This keynote panel discussion explores how WebAssembly, a lightweight and fast runtime, can be leveraged in cloud-native architectures to complement traditional container-based approaches. The speakers highlight the benefits of using WebAssembly, such as improved scalability, density, and cross-platform compatibility, and introduce the open-source project SpinKube, which aims to simplify the development, deployment, and operation of WebAssembly workloads on Kubernetes.

How to Deploy an AI-Optimized K8s Cluster with Kubespray - Kay Yan & Mohamed Zaian

This talk provides a comprehensive overview of Kubespray, a Kubernetes deployment tool that enables the deployment of production-ready Kubernetes clusters across various cloud environments and on-premises infrastructure. The presenters discuss Kubespray's features, best practices, and the integration of AI-optimized components, such as GPU support and advanced scheduling capabilities, to cater to the unique requirements of AI workloads on Kubernetes.

Revolutionizing the Control Plane Management: Introducing Kubernetes Hosted Control Planes

This panel discusses the concept of hosted control planes, which involves running Kubernetes control planes as pods in a separate management cluster. The panelists share their experiences, challenges, and best practices around implementing and managing hosted control planes, highlighting the benefits of this approach in terms of scalability, security, and disaster recovery.

Enable GPU-Acceleration Without Worrying About Managing Device Drivers - Christopher Desiniotis

The talk discusses methods for managing GPU device drivers in Kubernetes, including the use of containerized driver images and an operator-based approach to handle heterogeneous cluster configurations and driver upgrades. The speaker presents solutions to address the challenges of maintaining driver availability, performing controlled upgrades, and automating the process at scale.

The Art of Kubernetes Add-on Validation: Secure Strategies for the Modern Developer Platform

This talk presents strategies for securely validating Kubernetes add-ons, including static code analysis, Helm chart validation, container image validation, policy enforcement, and progressive delivery using ring deployments. The speaker demonstrates how these techniques can be integrated into a CI/CD pipeline to ensure the reliability and security of the modern developer platform.

Accelerators(FPGA/GPU) Chaining to Efficiently Handle Large AI/ML Workloads in K8s

This talk presents a novel approach to efficiently handle large AI/ML workloads in Kubernetes by leveraging accelerators (FPGAs/GPUs) through a chaining mechanism. The authors describe their custom resource model and operator-based architecture that enables composable and heterogeneous acceleration, addressing the challenges of current acceleration methods in Kubernetes.

DRAcon: Demystifying Dynamic Resource Allocation - from Myths to Facts - Kevin Klues & Patrick Ohly

This talk discusses the evolution of Dynamic Resource Allocation (DRA) in Kubernetes, from its initial approach to the current structured parameters model. It highlights the key features, challenges, and future plans for integrating DRA into the Kubernetes ecosystem.

AI, Edge, and Storage Walk Into a Mongolian Mine - Reza Jelveh, SoftSage Solutions OU

The talk discusses the challenges faced in deploying an AI-powered edge computing system for earthquake detection in a remote Mongolian mining site, including storage and compute limitations, as well as the need for performance-based design and the use of various open-source tools and techniques to overcome these challenges.

Self-Hosted LLMs on Kubernetes: A Practical Guide - Hema Veeradhi & Aakanksha Duggal, Red Hat

This talk provides a practical guide to self-hosting large language models (LLMs) on Kubernetes. It covers the different levels of LLM utilization, from prompt engineering to fine-tuning and building custom models, and demonstrates a containerized LLM application for speech-to-text translation, highlighting the benefits of self-hosting for privacy and control.

Tutorial: Cloud Native WebAssembly and How to Use It - Brooks Townsend & Michael Yuan

This tutorial provided an introduction to Cloud Native WebAssembly, including its context, key projects like wasmCloud and WasmEdge, and hands-on demonstrations of building and deploying WebAssembly components using standard interfaces and tooling. Attendees learned how WebAssembly enables portable, secure, and performant applications that can run across diverse environments, from the cloud to the edge.

Breaking the Rules of Operator Development for AI at the Far Edge

The presentation explores breaking the traditional rules of Kubernetes operator development to address the unique challenges of deploying AI models at the far edge, such as resource constraints, scalability, and lifecycle management. The speakers propose innovative strategies, including combining operators, decoupling model data from the operand, and leveraging local edge repositories, to make operators more efficient and autonomous in the far edge context.

Cloud Native Desktops in Action - Thomas Fricke, Freelancer

This talk explores the development of secure, cloud-native desktop solutions, driven by the need for increased security in government and critical infrastructure. The speaker discusses two projects, openDesk and VNClagoon, that aim to provide comprehensive, open-source desktop environments integrated with Kubernetes and focused on security, compliance, and sovereignty.

Gen AI at the Edge: How Cloud Native Technologies Enable the Next Wave of Intelligent Applications

The panel discussion explores the synergy between cloud-based generative AI and edge computing, highlighting the challenges and opportunities in deploying these technologies at the edge. The speakers share insights on balancing model performance, resource optimization, data privacy, and security, as well as potential use cases that leverage the unique advantages of edge computing to enhance generative AI applications.

Navigating the Processing Unit Landscape in Kubernetes for AI Use Cases

This talk explores the various processing units available for AI and ML workloads in Kubernetes, including CPUs, GPUs, and TPUs. The speakers provide an overview of how these different hardware accelerators work and how they can be leveraged within a Kubernetes cluster to optimize performance and efficiency for AI/ML applications.

Sponsored Session: American Airlines Increases Velocity by Leveraging K8s at Scale and Autonomous...

This video discusses how American Airlines has leveraged Intel's Kubernetes optimization service, Granulate, to achieve significant cost savings and performance improvements in their data lake and Kubernetes environments. The presentation highlights Intel's broader efforts to provide early access to its hardware and software through its Developer Cloud, as well as its security initiatives with Intel Trust Authority.

Container Builds at Scale with Buildpacks | Project Lightning Talk

The talk introduces the use of Cloud Native Buildpacks, a tool that transforms source code into container images without using a Dockerfile. The benefits of using Buildpacks, such as efficient caching and rebasing, are discussed, along with a demonstration of how to get started with the 'pack' command-line tool.

Metal in the sand - summary of the Metal3's sandbox progress and future goals | Project Lightning...

Metal3 (pronounced "metal cubed") is an open-source tool for provisioning and managing bare-metal hosts with Kubernetes. The project has made significant progress during the sandbox phase, adding new features, streamlining release processes, and expanding its community, and is now seeking incubation within the CNCF.

Lightning Talk: Locking the Monster: Strategies to Isolate Resource Big Eaters - Peter Pan, DaoCloud

This lightning talk discusses strategies for isolating resource-intensive containers, known as 'resource big eaters,' to prevent them from disrupting the overall system performance. The speaker covers various techniques, such as configuring container-level resource limits, leveraging system-level controls, and monitoring resource usage to proactively address issues before they escalate.

A little bit of pixie dust: Evolving the on-premise experience with Tinkerbell | Project Lightnin...

This video introduces Tinkerbell, a bare metal provisioning engine that is a CNCF Sandbox project, and discusses how it can be used to evolve the on-premise experience by providing a cloud-native approach to managing bare metal infrastructure. The speaker highlights Tinkerbell's capabilities, such as supporting various operating systems, integrating with Cloud-Init, and offering a Cluster API provider, and encourages the audience to get involved in the project to help drive further innovation in the bare metal space.

Emerging Technologies from TAG-Runtime Working Groups | TAG Lightning Talk

The video discusses the emerging technologies being developed by the TAG-Runtime Working Groups, including advancements in WebAssembly and the work being done on accelerator device integration in the container ecosystem. The presentation covers the key initiatives and outcomes of these working groups, highlighting their open and collaborative nature, and invites the audience to engage with the ongoing efforts.

Lightning Talk: Decoding and Taming the Costs of Serving Large Language Models - Yuan Chen, NVIDIA

This lightning talk discusses the high costs associated with serving large language models, including the expenses of model inference, model serving, and handling large user volumes. The speaker proposes that efficient workload scheduling algorithms, model optimization techniques, and hardware performance improvements could help address these challenges and enable scalable, cloud-native solutions for emerging AI workloads.

Sharing Is Caring: GPU Sharing and CDI in Device Plugins - Christopher Desiniotis & David Porter

The presentation covers the integration of devices, such as GPUs, in Kubernetes, including the device plugin framework, the Container Device Interface (CDI) for declarative device access, and various GPU sharing strategies like time slicing, CUDA Multi-Process Service (MPS), and Multi-Instance GPU (MIG). The speakers discuss the trade-offs and use cases for each approach, as well as future plans for more native device management in Kubernetes.

Containers

Is Your Image Really Distroless? - Laurent Goderre, Docker

What’s New in Containerd 2.0? - Wei Fu, Microsoft & Akhil Mohan, VMware by Broadcom

The presentation discusses the new features and improvements in containerd 2.0, including the sandbox API, transfer service, and runtime changes. The talk highlights the focus on extensibility, performance, and quality improvements in the upcoming release.

Databases

Unleash the Power of etcd: What Can an E[Xtensible]-etcd Bring? - Siyuan Zhang & Bogdan Kanivets

This talk explores the potential of extending the backend of etcd, the popular distributed key-value store, to address the diverse use cases and evolving requirements of modern systems. The presenters discuss their efforts to implement and benchmark alternative backends, such as SQLite and Badger, and share insights into the performance trade-offs and resource consumption characteristics of these approaches.

We Tested and Compared 6 Database Operators. The Results are In!

This talk provides a comprehensive overview of six different database operators for PostgreSQL and MySQL, highlighting their key features, strengths, and weaknesses based on the presenters' real-world experience in managing these operators in production. The presenters offer insights on the ease of installation, configuration, high availability, observability, and backup/restore capabilities of each operator, making this a valuable resource for DevOps teams and database administrators interested in leveraging database operators in a Kubernetes environment.

Scaling Heights: Mastering Postgres Database Vertical Scalability... Gabriele Bartolini & Gari Singh

The presentation discusses techniques for vertically scaling PostgreSQL databases on Kubernetes, including the use of CloudNativePG, an open-source operator that simplifies the deployment and management of PostgreSQL clusters. The presenters demonstrate how to leverage Kubernetes features like storage classes, tablespaces, and volume snapshots to optimize PostgreSQL performance and availability.

Etcd 3.6 and Beyond - Wenjia Zhang, Marek Siarkowicz & Siyuan Zhang, Benjamin Wang

The talk covers the recent updates and future roadmap of the etcd project, including improvements to the underlying components (bbolt and raft), support for downgrade, and a new sub-project governance framework. The maintainers also discuss opportunities for community contributions in areas like downgrade, performance testing, and test infrastructure.

Vitess: Introduction, New Features and the Vinted User Story

Vitess is a cloud-native, distributed database built around MySQL, offering features like sharding, high availability, and online schema changes. The presentation covers Vitess' architecture, new features, and the adoption story of Vinted, a major Vitess user.

Strimzi: Toward a ZooKeeper-less Future | Project Lightning Talk

Strimzi, a CNCF incubating project, focuses on running Apache Kafka on Kubernetes in a native way. The talk highlights the project's efforts to remove ZooKeeper from Apache Kafka, simplifying deployment and operations, while providing various architectural options for Kafka clusters.

Developer Experience

Project Carvel: Composable Tools for Application Management

Project Carvel provides composable tools for managing application deployment and configuration, including ytt for YAML templating, kbld for image management, and kapp-controller for declarative application deployment. These tools enable reliable, repeatable, and customizable application management workflows across different environments.

Giving and Receiving Professional Feedback

This talk provides practical guidance on giving and receiving professional feedback, emphasizing the importance of empathy, clarity, and maintaining a constructive dialogue. Key strategies discussed include the feedback sandwich approach, considering power dynamics, and fostering an open and supportive environment for constructive criticism.

When They Go High, We Go Low – Hooking Libc Calls to Debug Kubernetes Apps - Tal Zwick, MetalBear

This talk discusses mirrord, a Rust-based developer tool that enables local development of cloud-native applications by transparently virtualizing their interactions with the Kubernetes cluster. The key technical details involve using dynamic library injection and function hooking to intercept and redirect libc calls, allowing the application to run locally while accessing remote resources.

Shift-Left: Past, Present, and Future of Validation in CI... Alexander Zielenski & Stefan Schimanski

The talk discusses the evolution of validation in Kubernetes, from the early days of kubectl create to the current state of client-side validation using tools like kubectl-validate. It highlights the challenges faced in achieving instant feedback on manifest validation and the efforts made to improve the expressiveness of Kubernetes schemas through features like CEL and validation ratcheting.
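
To make the idea of validation ratcheting concrete, here is a minimal sketch in Python. It is not Kubernetes code: the `VALIDATORS` table, the field names, and the `validate` function are hypothetical and exist only for illustration. The assumption is that ratcheting means a field that fails validation is tolerated on update as long as its value is unchanged from the stored object.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical per-field validators; field names and rules are illustrative only.
VALIDATORS: Dict[str, Callable[[object], bool]] = {
    "replicas": lambda v: isinstance(v, int) and v >= 1,
    "name": lambda v: isinstance(v, str) and v.islower(),
}


def validate(new: dict, old: Optional[dict] = None) -> List[str]:
    """Return the names of fields in `new` that fail validation.

    When `old` is given (an update), a failing field is skipped if its
    value is identical to the stored one: that is the ratcheting rule
    assumed in this sketch.
    """
    errors: List[str] = []
    for field, is_valid in VALIDATORS.items():
        if field not in new:
            continue
        value = new[field]
        if is_valid(value):
            continue
        if old is not None and field in old and old[field] == value:
            continue  # unchanged invalid value: tolerated, not reported
        errors.append(field)
    return errors
```

On create, `validate({"name": "Web", "replicas": 2})` reports `name`. On an update that leaves `name` untouched and only changes `replicas` to a valid value, nothing is reported, so an object that predates a stricter rule can still be modified.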

CNCF Governing Board Town Hall

The CNCF Governing Board Town Hall provides insights into the governance structure, responsibilities, and key focus areas of the CNCF. The discussion covers the challenges of sustaining open-source projects, diversifying the contributor base, and aligning the foundation's services with the evolving needs of the cloud-native ecosystem.

Open Policy Agent (OPA) Intro & Deep Dive - Anders Eknert, Styra & Xander Grzywinski, Microsoft

This talk provides an introduction to Open Policy Agent (OPA), a general-purpose policy engine that allows organizations to decouple policy from application logic and manage policies as code. It also covers updates to the OPA project, including the upcoming 1.0 release, and an in-depth look at the OPA Gatekeeper project, a policy controller for Kubernetes that leverages OPA.

Tackling Configuration Management at Scale with Flux, CUE and OCI at... Alec Hothan & Stefan Prodan

The talk discusses how Cisco tackles configuration management at scale using Flux, CUE, and OCI. The presenters propose improvements to the current setup, including desired state consolidation using OCI artifacts and simplifying variable management with the CUE language and the Timoni tool.

Create Cloud Native Agents and Extensions for LLMs - Xiaowei Hu, Second State

This talk presents a novel approach to creating cloud-native agents and extensions for large language models (LLMs) using WebAssembly (Wasm) and the WasmEdge runtime. The speakers demonstrate how this technology enables the deployment of lightweight, portable, and GPU-accelerated LLM applications that can run seamlessly across different hardware and software environments.

Savoir Faire: Cloud Native Technical Leadership

This panel discussion explores the qualities and characteristics of successful cloud native technical leaders, highlighting the importance of communication, collaboration, and a customer-centric mindset. The speakers share their personal journeys, strategies for balancing open-source and day-job responsibilities, and insights on how to effectively advocate for and incentivize open-source contributions within an organization.

Keynote: A 10-year Detour: The Future of Application Delivery in a Containerized World

The speaker discusses the evolution of application delivery in a containerized world, highlighting the need to approach software development as a manufacturing process that integrates the application and its deployment pipeline. They also emphasize the relevance of the platform community in the era of AI applications, which require a similar approach to traditional software delivery.

Rapid IDP Capability Development and Automated Testing at Autodesk - Jesse Sanford & Greg Haynes

This talk discusses Autodesk's journey in developing a rapid and automated Internal Developer Platform (IDP) using open-source tools like Argo CD, Backstage, and Crossplane. The presenters share the challenges they faced, the solutions they implemented, and the open-source tools they created to enable a cohesive development, CI, and CD experience across different teams and technologies within their organization.

Building Container Images the Modern Way - Adrian Mouat, Chainguard

This talk discusses modern approaches to building container images, including tools like Bazel, Dagger, and Apko, which aim to create minimal, reproducible, and secure container images. The speaker highlights the strengths and weaknesses of each tool, recommending multi-stage Docker builds with distroless base images as a simple and effective solution for most use cases.

Keynote Panel Discussion: Unity in Diversity: A Decade of Inclusive Growth in the Cloud Native...

This panel discussion explores the diverse contributions that extend beyond code in the cloud native community, highlighting the vital roles of local ambassadors, glossary translations, and community events in making the cloud native experience truly accessible for everyone globally. The panelists share their experiences and insights on building inclusive and supportive communities, addressing barriers for individuals with disabilities, and driving sustainability initiatives like the KubeTrain project.

Document Your Career Path with SIG Docs! - Rey Lejano, Red Hat & Divya Mohan, SUSE

This talk provides an overview of how to document your career path by contributing to the Kubernetes documentation through the Kubernetes Special Interest Group for Documentation (SIG Docs). The speakers, Rey Lejano and Divya Mohan, share their personal journeys and insights on finding your niche, getting started the right way, and positioning yourself within the open-source ecosystem.

XRegistry - Looking Beyond CloudEvents - Klaus Deissner, SAP

The talk introduces XRegistry, a new initiative from the CNCF Serverless Working Group, which aims to provide a standardized way to discover, describe, and manage cloud event definitions and associated metadata. The speaker discusses the core concepts of XRegistry, including its hierarchical structure, API, and support for various schema formats, and highlights potential future developments such as event-driven notifications about registry changes.

What's New in gRPC? - Kevin Nilson, Google

The talk provides an overview of the recent developments and new features in gRPC, including improvements in developer tooling, support for Kubernetes Gateway API, stateful session affinity, custom backend metrics for load balancing, and integration with OpenTelemetry. The talk also highlights the team's efforts to enhance the gRPC ecosystem, such as investing in Rust support and expanding the documentation and content on YouTube.

What's New in Operator Framework? - Jonathan Berkhahn, IBM & Varsha Narsing, Red Hat

This talk provides an overview of the latest developments in the Operator Framework, focusing on the Operator SDK and Operator Lifecycle Manager (OLM). The presenters discuss improvements in the Java Operator SDK, the planned architecture changes for OLM v1, and the integration with the kapp-controller project to simplify operator deployment and management.

End User TAB Town Hall - Moderated by Taylor Dolezal, Cloud Native Computing Foundation

This video discusses the goals and vision of the End User Technical Advisory Board (TAB) at the Cloud Native Computing Foundation (CNCF). The TAB aims to provide end-user guidance, reference architectures, and a platform for collaboration to help navigate the complex CNCF landscape and address the challenges faced by organizations in adopting cloud-native technologies.

Label Wrangling: How to Manage What You Barely Know Exists! - Jeremy Mechouche & Carl Castanier

This talk presents a comprehensive approach to managing labels in cloud-native environments. The speakers discuss the importance of labels, common use cases, the challenges of inconsistent labeling, and a proposed solution involving label extraction, knowledge-based homogenization, and proactive label enforcement through admission controllers.
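
The homogenization and enforcement steps mentioned above can be sketched as follows. This is an illustrative Python example, not the speakers' implementation: the `ALIASES` table, the `REQUIRED` keys, and both function names are hypothetical, and a real admission controller would receive an admission request from the Kubernetes API server rather than a plain dictionary.

```python
from typing import Dict, List, Tuple

# Hypothetical alias table mapping inconsistent label keys to one canonical key.
ALIASES: Dict[str, str] = {
    "env": "environment",
    "Environment": "environment",
    "team-name": "team",
    "owner": "team",
}

# Hypothetical policy: every workload must carry these canonical labels.
REQUIRED: Tuple[str, ...] = ("environment", "team")


def homogenize(labels: Dict[str, str]) -> Dict[str, str]:
    """Rewrite known aliases to canonical keys and normalize values."""
    normalized: Dict[str, str] = {}
    for key, value in labels.items():
        canonical = ALIASES.get(key, key)
        # Keep the first value seen if two aliases map to the same key.
        normalized.setdefault(canonical, value.strip().lower())
    return normalized


def admit(labels: Dict[str, str]) -> Tuple[bool, List[str]]:
    """Admission-style check: allow only if all required labels are present."""
    normalized = homogenize(labels)
    missing = [key for key in REQUIRED if key not in normalized]
    return (not missing, missing)
```

For example, `admit({"env": "prod"})` is rejected with `team` reported as missing, while `admit({"Environment": "dev", "team-name": "core"})` is allowed because both aliases resolve to required keys.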

What's New with Kubectl and Kustomize … and How You Can Help! - Eddie Zaneski & Maciej Szulik

This talk provides an overview of the latest developments in Kubectl and Kustomize, the CLI tools for Kubernetes. The presenters discuss new features, performance improvements, and plans for future enhancements, while also encouraging the audience to get involved in the development of these tools.

How to Stabilize a GenAI-First, Modern Data LakeHouse: Provision 20,000 Ephemeral Data Lakes/Year

This talk presents how LinkedIn stabilized and scaled its GenAI-first, modern data lakehouse by provisioning 20,000 ephemeral data lakes per year. It describes the challenges faced, the approach taken, and the results achieved, including improved developer productivity and faster iteration through the use of Groundhog Day, an ephemeral data lakehouse platform built on Kubernetes.

The Business Benefits of Cloud Native

The panel discussion explores the business benefits of cloud native computing, highlighting the importance of aligning technical goals with business objectives. The speakers emphasize the need for effective communication between technical and non-technical stakeholders to ensure a successful cloud native transformation.

Unlocking New Platform Experiences with Open Interfaces- Thomas Vitale & Mauricio "Salaboy" Salatino

This presentation explores the challenges faced when building platforms and distributed systems, and showcases a range of open-source tools and patterns that can be used to address these challenges, such as Backstage, Cloud Native Buildpacks, Dapr, and Knative Serving. The presenters demonstrate how these tools can help streamline the developer onboarding process, manage state and events in a distributed system, and provide a smooth operational experience for running applications in production.

Navigating the Depth of App Delivery Through Memes - Lian Li, Independent & Thomas Schuetz, WhizUs

The talk explores the depth and breadth of the CNCF's Technical Advisory Group (TAG) App Delivery, highlighting its efforts to bring together projects, users, and vendors to discuss standards, best practices, and emerging topics in the realm of application delivery. The presenters engage the audience in a lighthearted and interactive manner, using memes and humor to showcase the community's work, challenges, and future directions.

Container Image Workflows at Scale with Buildpacks - Juan Bustamante & Aidan Delaney

This talk introduces Cloud Native Buildpacks, a tool for transforming source code into production-ready container images. The speakers demonstrate how Buildpacks can be integrated into various CI/CD workflows and discuss the benefits of using Buildpacks to produce secure, reproducible, and multi-architecture container images.

Kubernetes Controllers in Rust: Fast, Safe, Sane - Matei David, Buoyant

The talk discusses the benefits of using Rust for building Kubernetes controllers, focusing on its performance, safety, and sanity. The speaker also introduces the kubert library, a Rust-based framework for building read-heavy controllers that provides a more opinionated and application-centric approach compared to the generic Kubernetes controller runtimes.

From Bash Scripts to Kubeflow and GitOps: Our Journey to Operationalizing ML at Scale

This presentation describes the journey of DHL's data and analytics team in operationalizing machine learning at scale, from Bash scripts to adopting Kubeflow as their main platform. The talk covers the challenges faced in their legacy setup, the transition to Kubeflow, and the benefits they have experienced in terms of developer empowerment, reliable experimentation-to-production pipelines, and the formalization of the MLOps role.

Contributing to Kubernetes in Its Second Decade - SIG ContribEx Style!

Kubernetes, an open-source container orchestration system, is entering its second decade, and the Special Interest Group on Contributor Experience (SIG ContribEx) is leading the charge to improve the contributor experience. This talk provides an overview of the Kubernetes community structure, the evolution of contribution processes, and practical guidance for new contributors to get involved and navigate the complex yet rewarding journey of contributing to this influential project.

Rebuilding Your Cloud Native Community: Lessons Learned from Stardew Valley - Imma Valls

This presentation shares the lessons learned from rebuilding the Cloud Native community in Barcelona, drawing parallels to the community-building aspects of the video game Stardew Valley. The speaker highlights the importance of having a diverse team of organizers, developing a local 'recipe' for meetups, and planning ahead to maintain momentum in the community.

Towards Great Documentation: Behind a CNCF-Led Docs Audit - Jorge Lainfiesta, Rootly

This talk explores the importance of great documentation for open-source projects, drawing insights from a CNCF-led documentation audit of the Backstage project. The speaker discusses the key elements of effective documentation, including understanding user personas, structuring content for different needs, and ensuring cohesion across the project's website, repository, and communication channels.

Product Market Misfit: Adventures in User Empathy - Mitch Connors, Aviatrix

The talk discusses the importance of user empathy in product development, using the example of Atari's failed E.T. video game and the speaker's own experiences with the Istio open-source project. The speaker emphasizes the need to understand and empathize with users' perspectives, challenges, and priorities in order to create successful products that meet their needs.

Kubernetes Maintainers Read Mean Comments - Tim Hockin & Davanum Srinivas

This talk explores the challenges and frustrations faced by the Kubernetes maintainers when dealing with user feedback and bug reports. The presenters share examples of mean comments and discuss effective ways for users to engage with the project and contribute to its development.

The IaC Evolution - on Open Source & Everything Else

The panel discusses the evolution of infrastructure as code (IaC), including the changes in popular tools like Terraform and the emergence of new approaches like OpenTofu. They also explore the challenges of managing the complexity and fragmentation in the IaC ecosystem, the need for better abstractions and platforms, and the potential impact of AI on infrastructure automation.

Reimagining Knative: A Case Study on How Designers Shape Better Documentation

The presentation discusses a case study on how designers can shape better documentation, using the example of reimagining the documentation for Knative, an automation layer for Kubernetes. The presenters highlight the importance of user-centered design and the use of card sorting, a UX research method, to improve the information architecture and navigation of the Knative documentation.

Why Is This so HARD? Conveying the Business Value of Open Source - Bob Killen, Google

The talk discusses the challenges in conveying the business value of open-source contributions to organizations, emphasizing the importance of data enablement through consistent labeling and milestones, as well as effective communication strategies to different stakeholders, such as leadership, product managers, and end-users.

Elevating Cloud Native Education - Langdon White & Anwesha Saha, Boston University

The presenters discuss the challenges of incorporating cloud-native development and cloud computing education into undergraduate university programs, and their efforts to create a collaborative working group within the CNCF to develop practical, academically-focused cloud education content.

The State of Backstage in 2024 - Ben Lambert & Patrik Oldsberg, Spotify

Backstage, an open-source developer portal framework, has seen significant growth and updates in the past year. The talk covers project updates, governance changes, and core framework improvements, including a new backend system and plans for dynamic feature deployment and declarative integration.

Feature Management Improv: Reduce Risk, Conquer Compliance, and Perfect Previews with OpenFeature

This talk explores the benefits of feature management with the open-source OpenFeature project, including reducing risk, improving compliance, and enabling feature previews. The speaker demonstrates dynamic feature flag evaluation using contextual data and discusses key considerations around managing feature flag complexity, such as client-side vs. server-side evaluation and sticky user assignments.
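
As a rough illustration of context-based evaluation with sticky user assignment, the sketch below hashes a flag key together with a user's targeting key so that the same user always lands in the same rollout bucket. It deliberately does not use the OpenFeature SDK: `bucket`, `evaluate`, the `plan` attribute, and the `internal` rule are all hypothetical names chosen for this example.

```python
import hashlib
from typing import Any, Dict


def bucket(flag_key: str, targeting_key: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_key}:{targeting_key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def evaluate(flag_key: str, context: Dict[str, Any], rollout_percent: int,
             default: bool = False) -> bool:
    """Evaluate a boolean flag from contextual data.

    The assignment is sticky because it depends only on the flag key and
    the targeting key, not on time or on which server handles the request.
    """
    targeting_key = context.get("targeting_key")
    if targeting_key is None:
        return default  # no stable identity: fall back to the default value
    if context.get("plan") == "internal":
        return True  # hypothetical contextual rule: internal users always see it
    return bucket(flag_key, str(targeting_key)) < rollout_percent
```

Because the bucket is derived from a hash rather than a random draw, raising `rollout_percent` from 10 to 20 keeps every user who already had the feature and only adds new ones. The same function can run on a server or in a client, which is one of the trade-offs the summary mentions.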

Simplified Inner and Outer Cloud Native Developer Loops - Oleg Šelajev, AtomicJar & Alice Gibbons

This talk presents a simplified approach to the inner and outer cloud-native developer loops, showcasing how tools like Dapr, Testcontainers, and OpenFeature can enhance developer productivity and enable a seamless transition from local development to production deployment. The speakers demonstrate how these tools work together to provide a consistent API-driven experience, allowing developers to focus on building applications without getting bogged down by infrastructure-level concerns.

Building AI-Ready Platforms -Symphony for Developer and Platform Engineer -Thomas Vitale & Lize Raes

This talk explores how to build AI-ready platforms that support the integration of large language models and vector databases, enabling developers to create applications powered by generative AI. The platform provides a developer-friendly experience, abstracting away the complexity of setting up and managing the underlying infrastructure, while ensuring observability, scalability, and secure deployment of these AI-driven applications.

Build Your Contributor Base

This video discusses strategies for building a strong contributor base for open-source projects, covering topics such as accessibility, responsiveness, contributor ladders, recruitment activities, and contributor recognition. The presenters provide practical advice and share experiences from their work within the CNCF TAG Contributor Strategy group.

Flux and the Wider Ecosystem Planning BoF

The video discusses the current state of the Flux project, including its maintenance efforts, community involvement, and plans for the future. It also introduces the wider Flux ecosystem and encourages audience participation in various areas such as infrastructure-as-code, Flux GUI, and documentation improvements.

Supercharging Argo CD’s Manifest Generation Capabilities

This talk discusses the evolution of Argo CD's manifest generation capabilities, including the introduction of Config Management Plugins (CMP) and the improvements made in CMP V2. It also highlights the security challenges faced and the planned enhancements to address them, as well as a call for community contributions to help implement the identified improvements.

Chart Your Course Like a Champion - Andrew Block & Karena Angell, Joe Julian, Scott Rigby

This session provides a comprehensive overview of the Helm chart structure, lifecycle, and advanced features such as library charts, named templates, and testing. The presenters also discuss the benefits of using OCI registries for Helm chart storage and distribution, as well as upcoming developments in Helm 4.

AI-Assisted Runbooks - Instigating Precision and Efficiency in Kubernetes Operations

This presentation explores the use of AI-assisted runbooks to enhance precision and efficiency in Kubernetes operations. The speaker demonstrates a framework for AI governance, a troubleshooting use case, and a low-code platform that integrates AI capabilities to streamline issue resolution and documentation in the Kubernetes ecosystem.

Developers Demand UX for K8s! - Máirín Duffy & Conor Cowman

The talk discusses the results of a user research study conducted by Máirín Duffy and Conor Cowman to understand the challenges faced by developers and platform engineers when working with Kubernetes. The study identified several pain points, such as debugging networking issues, YAML configuration challenges, and the need for better security analysis tools, and provides recommendations for improving the user experience of Kubernetes.

Client-side Feature Flagging with OpenFeature | Project Lightning Talk

This presentation introduces client-side feature flagging with OpenFeature, an open-source, vendor-agnostic feature flagging solution. It highlights the key benefits of feature flags, the OpenFeature project, and the newly released web SDK, which enables consistent API usage across server-side and client-side feature flagging use cases.

Porter: Project vs Maintainer - A Race Against the Execution | Project Lightning Talk

Porter is a project that takes the last 10 years of DevOps decisions and bundles them into a cloud-native application bundle (CNAB) artifact. The talk covers how Porter creates and publishes these CNAB bundles, including the use of Dockerfiles, mixins, and the CNAB specification beyond just Docker.

A Quick Look at the TAG App Delivery | TAG Lightning Talk

The video provides an overview of the TAG (Technical Advisory Group) App Delivery, a CNCF initiative focused on delivering cloud-native applications. The presentation covers the TAG's charter, working groups, and upcoming events, encouraging attendees to engage with the community and learn more about the group's activities.

Dapr: APIs for building secure and reliable microservices | Project Lightning Talk

Dapr is a distributed application runtime that simplifies the development of secure and reliable microservices. It provides a set of APIs that abstract away the underlying infrastructure, enabling developers to focus on their business logic while Dapr handles the complexities of service orchestration, observability, and resiliency.

Accelerate Your Modernization Journey with Konveyor! | Project Lightning Talk

Konveyor is an open-source project that aims to assist organizations in onboarding their traditional workloads into Kubernetes, providing insights, guidance, and automation for the migration and modernization process. The project's latest release includes a new analysis engine that leverages the Language Server Protocol to analyze applications in various programming languages, and future releases will introduce platform awareness, asset generation, and integration with generative AI to further streamline the modernization journey.

The Carvel Way: Packaging APIs stitching together sharp Unix-like tools | Project Lightning Talk

This talk explores Carvel, a set of composable tools for managing configuration and artifacts, and how it can help organizations overcome challenges in packaging and relocating applications, managing resources, and versioning configurations. It covers specific use cases and features of Carvel that make it well suited to managing complex, distributed applications.

SlimToolkit: Improving Developer Experience with Containers: Making it Easy to Understand, Optimi...

SlimToolkit is a tool that helps developers create minimal and production-ready container images by analyzing the container and the application inside, generating security profiles, and providing a seamless debugging experience with a sidecar-based approach. The tool offers an easy-to-use command-line interface and interactive prompt mode, making it a valuable asset for developers working with containers.

Kubernetes-style APIs for SaaS-like Control Planes with kcp | Project Lightning Talk

This talk explores the use of Kubernetes-style APIs to build SaaS-like control planes with the kcp (Kubernetes Control Plane) project. It highlights how kcp provides logical clusters, a way to offer Kubernetes APIs as a commodity, and how it decouples the platform operator and service provider roles, enabling scalable and versioned API management.

How Crossplane is Accelerating Your Cloud Native Control Plane Journey | Project Lightning Talk

Crossplane is a cloud-native solution for provisioning cloud resources, offering self-service capabilities to users. This talk showcases recent improvements to the Crossplane CLI, including features like 'init' for easy project setup, 'render' for previewing resource deployments, 'validate' for offline schema validation, and 'trace' for debugging resource issues.

Lightning Talk: Minecraft Meets Kubernetes: Crafting Future Developers... Jenny Bartz & Enrico Bartz

This talk explores how the presenters combined Minecraft and Kubernetes to create an engaging and accessible learning environment for children, empowering them to explore programming, coding, and computer science concepts. The presenters share their experience of using HobbyFarm, a Kubernetes-native platform, to deliver self-learning tutorials and support children's independent exploration of technology.

Lightning Talk: How Did We Get Here? Why You Need Platform Engineering - Ettie Eyre, Ovo Energy

This lightning talk discusses the need for platform engineering in organizations like Ovo Energy, where teams are responsible for managing their own cloud infrastructure, leading to a significant maintenance burden. The speaker presents Ovo's approach to addressing this challenge by building a centralized, multi-tenant platform that handles the auxiliary services, allowing teams to focus on their core value-adding work.

Lightning Talk: Rust-Based Magic: Streamlined and Secure - Christian Hüning, BWI GmbH

This talk explores the use of Rust, a systems programming language, in the implementation of the Linkerd service mesh. The speaker highlights Rust's safety features and how they can help prevent runtime issues like concurrency race conditions and uninitialized variables, which are commonly encountered in Go-based control plane implementations.

Building a Tool to Debug Minimal Container Images in K8s Docker... - Kyle Quest & Saiyam Pathak

This talk presents a tool that simplifies the debugging experience for minimal container images across different runtimes, including Docker, Kubernetes, and containerd. The tool leverages techniques like namespaces and sidecar containers to provide a consistent developer experience, enabling seamless access to the target container's file system and application.

Lightning Talk: Language Inclusivity in Tech: A Call to Action - Ali Dowair, Independent

This talk highlights the importance of language inclusivity in the tech industry, particularly in the cloud native community. The speaker proposes three initiatives - Kubernetes Community Meetups, Kubernetes Documentation Localization, and the Cloud Native Glossary - as ways to bridge language barriers and foster a more diverse and inclusive community.

Emissary-Ingress: Present and Future - Flynn, Buoyant

The video discusses the past, present, and future of Emissary, an open-source, cloud-native, developer-centric, and self-service API Gateway. It highlights the project's history, the current state of the codebase, and the planned improvements to enhance the user experience and community involvement.

GitOps

GitOps Continuous Delivery at Scale with Flux - Stefan Prodan

The presentation discusses the scaling challenges faced by organizations using GitOps with Flux and the strategies to address them, including source optimization, vertical scaling, and horizontal scaling through sharding. The speaker also outlines the roadmap for Flux, including plans for API stabilization, new features, and efforts to make the project more sustainable through a multi-vendor, multi-individual maintenance model.
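As a rough illustration of horizontal scaling through sharding, the sketch below assigns objects to a fixed set of controller shards with a stable hash, so that each controller instance reconciles a disjoint subset. The shard names, the object names, and the `assign_shard` and `partition` helpers are all hypothetical; this is a minimal model of the idea, not Flux's own assignment mechanism.

```python
import hashlib
from collections import defaultdict

# Hypothetical shard names; a real deployment would choose its own.
SHARDS = ["shard1", "shard2", "shard3"]


def assign_shard(object_name, shards=SHARDS):
    """Map an object name to a shard using a stable, process-independent hash."""
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def partition(object_names, shards=SHARDS):
    """Group object names by the shard that would reconcile them."""
    groups = defaultdict(list)
    for name in object_names:
        groups[assign_shard(name, shards)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical object names, used only to show the grouping.
    print(partition(["team-a/app", "team-b/app", "team-c/app", "infra/dns"]))
```

A stable hash keeps an object on the same shard across restarts; changing the number of shards reshuffles most assignments, which is one reason a real system might prefer explicit assignment instead.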

Enhancing Reliability Through Multi-Cluster Deployment: Leveraging the Power of Karmada

Karmada is a powerful open-source project that enables seamless management of multi-cluster deployments, providing features such as cross-cluster scheduling, service discovery, and fault tolerance. This session provides an in-depth overview of Karmada's architecture, core concepts, and the growing Karmada community, highlighting its potential to revolutionize the way organizations handle complex multi-cloud and hybrid cloud environments.

Tutorial: Progressive Delivery with Argo Rollouts

This tutorial provides a hands-on introduction to progressive delivery using Argo Rollouts, a Kubernetes deployment controller that enables gradual and controlled rollouts of applications. The presentation covers the evolution of software delivery practices, the concepts of GitOps and progressive delivery, and demonstrates how Argo Rollouts can be used to implement Blue-Green and Canary deployments with automated analysis and rollback capabilities.
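The control flow behind a canary rollout with automated analysis can be modelled in a few lines. The sketch below is only a simplified model: Argo Rollouts itself is configured declaratively, and the `run_canary` function, the traffic weights, and the error-rate threshold here are hypothetical names and values chosen for illustration.

```python
def run_canary(steps, error_rate_at, max_error_rate=0.05):
    """Walk through canary traffic weights, aborting on a failed analysis.

    steps: increasing traffic percentages sent to the new version.
    error_rate_at: callable returning the observed error rate at a weight.
    Returns (outcome, final_weight_for_new_version).
    """
    for weight in steps:
        if error_rate_at(weight) > max_error_rate:
            # Analysis failed: send all traffic back to the stable version.
            return ("rolled_back", 0)
    # Every step passed analysis: the new version takes all traffic.
    return ("promoted", 100)


if __name__ == "__main__":
    healthy = lambda weight: 0.01
    degrades_under_load = lambda weight: 0.01 if weight < 50 else 0.20
    print(run_canary([10, 25, 50, 100], healthy))
    print(run_canary([10, 25, 50, 100], degrades_under_load))
```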

Effective centric CD pipeline: Toward PipeCD v1.0 | Project Lightning Talk

This talk introduces PipeCD, a CNCF Sandbox project that provides a consistent deployment and operation experience for any application platform. The speaker highlights PipeCD's key features, including multi-provider and multi-tenancy support, secure deployment without SSH access, and its ability to handle large-scale organizations with thousands of applications.

Flux: Secure and Scalable GitOps | Project Lightning Talk

Flux is a secure and scalable GitOps project that provides declarative, Kubernetes-native continuous delivery. It includes Flagger, a sub-project that enables advanced deployment strategies like canary releases and traffic mirroring, with support for popular cloud-native tools and metrics providers.

Growing Better Together: Argo's Community-Driven Development | Project Lightning Talk

The Argo project, comprising four distinct tools for workflow management, event-driven automation, GitOps deployment, and progressive delivery, has gained significant adoption and community support. The presentation highlights Argo's community-driven development, strong user satisfaction metrics, and upcoming features, emphasizing the project's commitment to addressing the evolving needs of its users.

Governance

Defining the Next Decade of Cloud Native: AMA with the CNCF CTO and TOC Moderated by Chris Aniszczyk

The video provides an in-depth look into the governance and structure of the Cloud Native Computing Foundation (CNCF), including the roles of the Technical Oversight Committee (TOC), Technical Advisory Groups (TAGs), and End User Community. The discussion highlights the Foundation's efforts to evolve and adapt to the rapidly changing cloud native ecosystem, as well as the challenges and opportunities ahead in the next decade of cloud native development.

Kubernetes Steering Committee: Genesis, Bootstrap, Now & Future - Nabarun Pal & Paco Xu

The Kubernetes Steering Committee provides governance and oversight for the Kubernetes project, addressing non-technical decisions, community group operations, financial planning, and project values. The presentation covers the committee's recent work on streamlining the annual reporting process, introducing new roles like sub-project leads, and reflecting on the project's unexpected growth and future challenges in the next decade.

HPC

Enabling Coordinated Checkpointing for Distributed HPC Applicati... Radostin Stoyanov & Adrian Reber

This presentation discusses the challenges of enabling coordinated checkpointing for distributed HPC applications and the research work done to extend the CRIU checkpoint/restore tool to support distributed applications. The presenters demonstrate the integration of CRIU with container runtimes and orchestration platforms like Kubernetes, and discuss the use cases and future work for this technology.

Keynote

Keynote: Cloud Native in its Next Decade - Davanum Srinivas & Lin Sun

The keynote presentation provides an insightful overview of the evolution of the cloud-native ecosystem over the past decade, highlighting the growth and maturity of the CNCF community. It also explores the potential directions for the next decade, focusing on emerging trends such as AI, sustainability, security, and edge computing, as well as the role of simplicity, consolidation, and developer-centric innovations in shaping the future of cloud-native technologies.

Keynote: Closing Remarks

The keynote speaker expresses gratitude to the co-chair for her valuable contributions and acknowledges the upcoming CNCF community events around the world, including KubeCon + CloudNativeCon in various locations, highlighting the vibrant and diverse ecosystem of the cloud native community.

Keynote: Cloud Native Hacks Winner Announcement

This video covers the keynote address at the inaugural Cloud Native Hacks hackathon, organized by the United Nations and the Cloud Native Computing Foundation. The keynote highlights the importance of using technology, particularly open-source and digital public goods, to address pressing global challenges aligned with the UN's Sustainable Development Goals, and announces the winners of the hackathon.

Keynote: Success Not Guaranteed - Bob Wise, CEO, Heroku

The keynote speaker, Bob Wise, CEO of Heroku, recounts the early days of Kubernetes and the challenges faced in its adoption, highlighting the importance of open governance and community collaboration in the project's success. He also announces Heroku's plans to rebase on Kubernetes, completing the circle of influence between the two cloud-native technologies.

Keynote: Opening Remarks

The speaker reflects on the 10-year journey of Kubernetes, celebrating its growth from a small open-source project to one of the largest and most active open-source communities. The talk also highlights the various initiatives and programs the CNCF has undertaken to support and grow the Kubernetes ecosystem, including the launch of a new education ambassador program and the expansion of the Kubernetes training partner program.

Keynote: 🇫🇷 Hip, Hip, Beret! No Cap, Just Cloud Native Facts - Taylor Dolezal

The keynote explores the role of the Cloud Native Computing Foundation (CNCF) in shaping the diverse and rapidly evolving cloud native ecosystem. The speaker emphasizes the CNCF's ability to provide a lens that focuses and clarifies the complex landscape, making it accessible and understandable for the community.

Keynote: Opening Remarks

The keynote speaker expresses gratitude to the end-user community for their invaluable contributions and dedication, which have shaped the cloud-native journey and pushed the boundaries of what is possible. The speaker then introduces the next speaker, Taylor Dolezal, who will take the audience on an accelerating journey through the cloud-native universe, celebrating innovation and collaboration and offering a glimpse into the future of cloud computing.

Keynote: Closing Remarks

The keynote address concludes with a summary of the event's activities, including the Security Hub, Wellness Lounge, and open space sessions. Attendees are encouraged to explore the various offerings and look forward to the final day of the conference.

Keynote: Awards Ceremony

The video discusses the importance of the end-user community within the Cloud Native Computing Foundation (CNCF) and the newly formed End-User Technical Advisory Board (TAB). The video also announces the winners of the top end-user awards, highlighting the exceptional contributions of organizations like Expedia, Shopify, and CERN in adopting and contributing to CNCF projects.

Keynote: Closing Remarks

The keynote speaker expresses gratitude to the speakers and organizers of the KubeCon + CloudNativeCon Paris event. They highlight the upcoming networking and entertainment opportunities, as well as educational sessions on Kubernetes and Cloud Native sign language, the AI Hub, and acknowledgments of sponsors and the program committee.

CNCF Project Lightning Talks Welcome & Opening - Jorge Castro, CNCF

The video presents an overview of the CNCF Project Lightning Talks at the KubeCon conference. The speaker, Jorge Castro, introduces the format of the lightning talks, encourages attendees to explore the various CNCF projects, and provides information about the conference logistics and resources available to the attendees.

CNCF Project Lightning Talks Closing - Jorge Castro, CNCF

This video features the closing remarks of the CNCF Project Lightning Talks, where the speaker encourages attendees to provide feedback on the event and invites them to attend the upcoming KubeCon lightning talks. The video emphasizes the value of the lightning talk format and the dedication of attendees who plan to attend a full day of lightning talks at KubeCon.

Kubernetes

Kubernetes Data Protection WG Deep Dive - Dave Smith-Uchida, Veeam

This presentation provides a deep dive into the Kubernetes Data Protection Working Group, covering key updates, the motivation for establishing the group, and ongoing projects. It highlights the group's efforts to address the limitations of Kubernetes for data protection operations and the various building blocks being developed to support backup and restore workflows.

Networking

At the Intersection of Cilium CNI and Service Mesh - Who Has the Right of Way? - Christine Kim

This talk explores the intersection of Cilium, a Container Network Interface (CNI) project, and service mesh architectures. It delves into the roles and responsibilities of CNIs and service meshes, highlighting how Cilium leverages Envoy as a service mesh component within its architecture.

SIG-Multicluster Intro and Deep Dive - Jeremy Olmsted-Thompson, Google & Stephen Kitt, Red Hat

This talk provides an overview of the SIG-Multicluster group, its focus on building APIs to enable multi-cluster Kubernetes deployments, and the various building blocks they have developed, including the About API, Multi-Cluster Services API, and the upcoming Cluster Inventory API. The presenters emphasize the importance of community involvement to help shape the group's direction and address real-world use cases.

High Performance Multi Regions Messaging with Nats - Cyril Becker & Vincent Bernaud, XBTO

The presentation discusses the high-performance multi-region messaging architecture implemented by XBTO, a crypto trading firm, using the NATS messaging system. The key aspects covered include the use of NATS super clusters, leaf nodes, and custom monitoring tools to achieve scalability, reliability, and observability across their global infrastructure.

Assessing Service Mesh's Net Worth: A Pragmatic Onboarding - Adrien Gillard, Decathlon

The presentation discusses Decathlon's assessment of service mesh and their decision to instead expand their existing API gateway solution to address their users' needs. The key takeaway is that organizations should focus on addressing their specific requirements rather than blindly adopting new technologies like service mesh.

Cilium ClusterMesh in Action: Strengthening Security Across Distributed Kubernetes Clusters

This presentation discusses the implementation of Cilium ClusterMesh to strengthen security across distributed Kubernetes clusters at a large Brazilian credit union. The key challenges faced include managing network policies, automating ClusterMesh configuration, and addressing performance issues related to the service mesh sidecar architecture.

CNI: Recap and Update - Casey Callendrello, Isovalent & Tomofumi Hayashi, Red Hat

This video provides a comprehensive update on the Container Networking Interface (CNI) project, covering the release of version 1.0 in 2021, the upcoming 1.1 version with new features, and a discussion on the future roadmap, including potential additions like drop-in directories, metadata support, and dynamic reconfiguration. The presenters emphasize the importance of community feedback and involvement in shaping the direction of the CNI specification.

Tutorial: From CNI Zero to CNI Hero: A Kubernetes Networking Tutorial Using CNI

The video provides a comprehensive tutorial on Kubernetes networking using the Container Network Interface (CNI). It covers the basics of CNI, including its configuration and plugin development, and demonstrates how to install and debug a CNI plugin, as well as how to create a custom CNI plugin using a simple bash script.

Gateway API: Beyond GA - Mattia Lavacca, Kong; Surya Seetharaman, Nick Young, Lior Lieberman

The Gateway API team presented updates on their 1.1 release, including new features like GRPCRoute support and policy attachment improvements. They also discussed their efforts to improve the user experience through migration tools and discoverability features, as well as their plans for a new CLI tool, gwctl.

Tutorial: Configuring Your Service Mesh with Gateway API - Mike Morris, Microsoft & Flynn, Buoyant

This tutorial provides a comprehensive introduction to configuring a service mesh using the Gateway API. The presenters demonstrate how to set up traffic routing, canary deployments, and timeouts using the Gateway API with both Linkerd and Istio service meshes.

Keep Calm and Load Balance on KIND - Antonio Ojea & Benjamin Elder, Google

The presentation discusses the development of a load balancer solution for the kind (Kubernetes in Docker) project, which aims to provide a simple and efficient way to test Kubernetes changes locally. The solution, called Cloud Provider KIND, mimics the behavior of a cloud provider's load balancer to enable testing of advanced Kubernetes networking features like terminating endpoints and rolling updates without disruption.

Network Policy: The Future of Network Policy with AdminNetworkPolicy

This talk provides an overview of the Network Policy API, the new Admin Network Policy API, and the Baseline Admin Network Policy API. It also introduces the Policy Assistant, a command-line tool that helps users validate and troubleshoot network policies before applying them to a production cluster.

Ingress-Nginx and 2024 Plans - Marco Ebert, Giant Swarm & James Strong, Isovalent

The video discusses the challenges and plans for the Ingress-Nginx project, including the migration from Lua to JavaScript, the implementation of Gateway API, and the improvements in the Helm chart testing and release process. The presenters also highlight the various ways the community can get involved in the project, beyond just code contributions.

Connecting Millions of Containers Spanning Dozens of Clusters

This talk presents the challenges of networking and service discovery in a large-scale, multi-cloud, and multi-cluster Kubernetes environment. The speakers discuss the limitations of Kubernetes' native service discovery mechanisms and how they built a custom solution to address their specific requirements, highlighting the trade-offs and complexities involved in such an approach.

Reducing Cross-Zone Egress at Spotify with Custom gRPC Load Balancing

This talk discusses how Spotify reduced cross-zone egress in their backend communication by implementing a custom gRPC load balancing algorithm. The algorithm considers factors like expected latency, error rates, and zone locality to optimize for low latency and cost, resulting in a significant reduction in cross-zone traffic.
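To make the idea of weighing latency, error rate, and zone locality concrete, here is a minimal scoring sketch. It is not Spotify's algorithm: the `Backend` fields, the `score` and `pick_backend` helpers, and both penalty constants are assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    zone: str
    expected_latency_ms: float
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


def score(backend, client_zone, cross_zone_penalty_ms=5.0, error_penalty_ms=100.0):
    """Return a cost for sending a request to this backend; lower is better.

    Both penalty values are illustrative assumptions, not measured figures.
    """
    cost = backend.expected_latency_ms + backend.error_rate * error_penalty_ms
    if backend.zone != client_zone:
        cost += cross_zone_penalty_ms
    return cost


def pick_backend(backends, client_zone):
    """Choose the backend with the lowest cost for a client in client_zone."""
    return min(backends, key=lambda b: score(b, client_zone))


if __name__ == "__main__":
    backends = [
        Backend("local", "zone-1", expected_latency_ms=10.0, error_rate=0.0),
        Backend("remote", "zone-2", expected_latency_ms=8.0, error_rate=0.0),
    ]
    print(pick_backend(backends, "zone-1").name)
```

With these assumed penalties, a same-zone backend wins unless it is meaningfully slower or failing, which is the trade-off between locality and latency that the summary describes.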

Linkerd Project Update: VM Support, Ingress, Security on the Edge, and Rust

The video presentation provides an update on the Linkerd project, including new features such as VM support, ingress, security on the edge, and the use of Rust in the project. The presenter discusses the design philosophy of Linkerd, its unique architectural components, and the project's roadmap for the coming year.

CoreDNS Plugins: A Deep Dive - John Belamaric, Google & Yong Tang, Ivanti

CoreDNS is a powerful, plugin-based DNS server that provides a flexible and extensible solution for service discovery and DNS management in cloud-native environments. This talk explores the core concepts of CoreDNS, its plugin architecture, and demonstrates how to develop a custom plugin to address specific use cases, highlighting the simplicity and ease of extending the platform.

Simplifying Multi-Cluster and Multi-Cloud Deployments with Cilium - Liz Rice, Isovalent

This presentation discusses how Cilium's Cluster Mesh feature can simplify the deployment of services across multiple Kubernetes clusters, even in different cloud environments. It highlights the ease of use, flexibility, and advanced capabilities of Cluster Mesh, such as topology-aware routing, failover, and network policy enforcement across clusters.

How We Are Moving from GitOps to Kubernetes Resource Model in 5G Core

This presentation discusses the transition from GitOps to a Kubernetes Resource Model in the 5G core network. The speakers describe their journey in moving from a static, paper-map-like configuration management approach to a more dynamic, abstracted, and Kubernetes-native solution, leveraging tools like SDC and custom operators to address challenges around reconciliation, in-band orchestration, and configuration complexity.

Persistence Pays Off: The Path to Session Persistence in Gateway API

The session persistence feature in the Gateway API aims to ensure consistent routing of user requests to the same backend server throughout their interactions. The presentation covers the design and implementation of this feature, highlighting the key API concepts, use cases, and the path to its integration into the Gateway API.
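The core behaviour, routing a returning client to the backend it was first given, can be sketched as follows. This is an in-memory illustration only: the `SessionPersistentRouter` class and its session identifiers are hypothetical and do not reflect the Gateway API's actual fields or any implementation's internals.

```python
import itertools
import uuid


class SessionPersistentRouter:
    """Pin each session to the backend chosen on its first request."""

    def __init__(self, backends):
        self._round_robin = itertools.cycle(list(backends))
        self._sessions = {}  # session id -> backend

    def route(self, session_id=None):
        """Return (session_id, backend), reusing the backend for known sessions."""
        if session_id in self._sessions:
            return session_id, self._sessions[session_id]
        # Unknown or missing session: pick the next backend and remember it.
        new_id = session_id or uuid.uuid4().hex
        backend = next(self._round_robin)
        self._sessions[new_id] = backend
        return new_id, backend


if __name__ == "__main__":
    router = SessionPersistentRouter(["backend-a", "backend-b"])
    sid, first = router.route()
    _, second = router.route(sid)
    print(first == second)
```

In practice the session identifier would travel in something like a cookie or header, and the mapping would need expiry; both are left out to keep the sketch short.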

Comparing Sidecar-Less Service Mesh from Cilium and Istio - Christian Posta, Solo.io

The video discusses the architectural differences between the sidecar-less service mesh approaches of Cilium and Istio, highlighting their control plane, data plane, and mutual authentication implementations. It also covers the observability and traffic management capabilities of these two open-source service mesh projects.

How Many Network Policies Can I Create? - Nadia Pinaeva, Red Hat & Shaun Crampton, Tigera

This talk presents a framework for scale testing network policies in Kubernetes, addressing the challenge of determining the maximum number of network policies that can be created. The framework, developed by Nadia Pinaeva and Shaun Crampton, uses a network policy scale profile to simplify and standardize the testing process, and provides tools for analyzing existing network policy configurations and approximating the scale limits for different profiles.

The gRPCRoute to Success: Idiomatically Routing and Balancing...- Richard Belleville & Arko Dasgupta

The presentation discusses the new GRPCRoute resource in the Gateway API, which provides improved routing and load balancing capabilities for gRPC traffic in Kubernetes. It highlights the benefits of GRPCRoute, including method-level matching and integration with service meshes through the GAMMA initiative, and outlines future feature plans such as gRPC-Web transcoding and payload-based routing.

A Cilium Introduction: Back to Bee-Sics - Nico Vibert & Dan Finneran, Isovalent

Nico Vibert and Dan Finneran from Isovalent provide an introduction to Cilium, a cloud-native networking and security solution built on eBPF. They explain the challenges of Kubernetes networking and how Cilium addresses them, including features like network policies, observability, and load balancing.

Future of Service Mesh is Sidecar-less with Istio Ambient Mesh | Project Lightning Talk

The presentation discusses the future of service mesh, focusing on Istio's Ambient Mesh, a sidecar-less approach that simplifies operations and reduces costs. The key aspects highlighted include the zero-trust tunnel (ztunnel), the waypoint proxy, and a comparison of cost and resource utilization between no mesh, Ambient Mesh, and a traditional sidecar-based service mesh.

K8gb: Reliable Global Service Load Balancing without vendor lock-in | Project Lightning Talk

K8gb is an open-source, Kubernetes-native project that provides reliable global service load balancing without vendor lock-in. The project abstracts the complexity of multi-cluster, geographically-distributed traffic management through a single GSLB CRD, offering features like failover, round-robin, and geo-proximity load balancing strategies, as well as continuous application health monitoring.

What is TAG Network? | TAG Lightning Talk

The video provides an overview of the TAG Network, a group within the Cloud Native Computing Foundation (CNCF) that aims to enable widespread and successful development, deployment, and operations of cloud-native networks. It also introduces the Service Mesh Working Group, which is a part of TAG Network and focuses on addressing challenges and developing solutions in the service mesh ecosystem.

Lightning Talk: Expand Your Kubernetes Horizons with Multiple Service CIDRs: A Game-... Antonio Ojea

This talk introduces a new feature in Kubernetes 1.29 that allows for the dynamic expansion of the Service CIDR range. This feature addresses the common problem of IP address exhaustion within a Kubernetes cluster and provides a flexible solution for managing and scaling the IP address space as the cluster grows.
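The arithmetic behind Service IP exhaustion is easy to check with Python's standard `ipaddress` module. The CIDR values and helper names below are hypothetical examples, and the count is a raw address total that ignores any addresses a cluster reserves.

```python
import ipaddress


def total_addresses(cidrs):
    """Sum the raw address count across a list of non-overlapping CIDR ranges."""
    return sum(ipaddress.ip_network(c).num_addresses for c in cidrs)


def overlaps_existing(existing, candidate):
    """Report whether a candidate range collides with any existing range."""
    cand = ipaddress.ip_network(candidate)
    return any(ipaddress.ip_network(e).overlaps(cand) for e in existing)


if __name__ == "__main__":
    existing = ["10.96.0.0/24"]  # hypothetical original Service range
    extra = "10.97.0.0/22"       # hypothetical additional range
    print(total_addresses(existing))            # 256
    print(overlaps_existing(existing, extra))   # False
    print(total_addresses(existing + [extra]))  # 1280
```

A /24 range holds only 256 addresses, so a cluster that outgrows it benefits from adding a second range rather than being rebuilt with a larger one.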

Cilium: Connecting, Observing, and Securing Service Mesh and Beyond with eBPF - Panel

This video provides an overview of the Cilium project, which consists of Cilium (networking), Hubble (observability), and Tetragon (runtime security and observability). The video covers recent updates, including the 1.15 release of Cilium, the 1.0 release of Tetragon, and the use of Cilium in various customer use cases, as well as the community's efforts to improve features like mutual authentication.

Observability

Prometheus Update from the Maintainers - Bryan Boreham, Grafana Labs & Simon Pasquier, Red Hat

The video provides an update on the Prometheus monitoring system, covering recent developments, upcoming features, and the community involvement. The speakers discuss the project's history, growth, and plans for a major version 3 release, focusing on improvements to the data model, support for open standards, and user experience enhancements.

Keynote: Graduated Project Updates

The video provides updates on various graduated CNCF projects, including Cilium, CloudEvents, Envoy, Falco, Fluent Bit, Flux, Harbor, Istio, Kubernetes, Linkerd, Open Policy Agent, Rook, and Vitess. The updates cover new features, performance improvements, security enhancements, and community involvement.

Observable Feature Rollouts with OpenTelemetry and OpenFeature - Daniel Dyla & Michael Beemer

This talk explores the use of OpenTelemetry and OpenFeature to enable safe and observable feature rollouts. It demonstrates how to leverage these open-source tools to control the impact radius of a feature change, monitor its effects, and automate rollbacks if necessary, ultimately leading to more reliable and data-driven feature deployments.
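One common way to control the impact radius of a feature change is a deterministic percentage rollout keyed on a user identifier. The sketch below shows that technique in plain Python; it does not use the OpenFeature SDK, and the `in_rollout` helper, the flag key, and the user identifiers are hypothetical.

```python
import hashlib


def in_rollout(flag_key, targeting_key, percentage):
    """Decide deterministically whether a user falls inside a rollout.

    The same (flag_key, targeting_key) pair always lands in the same bucket,
    so raising the percentage only ever adds users to the rollout.
    """
    digest = hashlib.sha256(f"{flag_key}:{targeting_key}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < percentage


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]
    enabled = sum(in_rollout("new-checkout", u, 10) for u in users)
    print(f"{enabled} of {len(users)} users see the feature at 10%")
```

Rolling back is then a matter of setting the percentage to zero, and emitting the evaluation result alongside telemetry makes it possible to compare behaviour for users inside and outside the rollout.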

Fluent Bit v3: Unified Layer for Logs, Metrics and Traces - Eduardo Silva, Calyptia

Fluent Bit v3 introduces a unified layer for handling logs, metrics, and traces, with new features like processors, SQL processing, and customizable OpenTelemetry output. The presentation highlights Fluent Bit's evolution, its adoption and usage across various environments, and the key improvements in the latest version.

How KubeVirt Improves Performance with CI-Driven Benchmarking, and You Can Too

The presentation discusses how KubeVirt, a Kubernetes-based virtualization platform, uses CI-driven benchmarking to measure and improve its performance and scalability. The speakers share insights on monitoring the control plane as a shared resource, analyzing performance metrics, and providing a framework for other developers to benchmark their own Kubernetes-based applications.

Prompt: Help Me Debug a Cluster! - Anusha Ragunathan & Lili Wan, Intuit Inc

The video discusses Intuit's platform engineering challenges in managing a large Kubernetes-based infrastructure, including alert fatigue, and how they have addressed these challenges using cluster golden signals and AI-powered platform debugging. The key takeaways are the streamlining of incident detection and remediation through the use of these techniques.

SIG API Machinery Maintainers (Two Tracks) - Abu Kashem, Red Hat & Mike Spreitzer, IBM

The video covers the API Priority and Fairness (APF) feature in Kubernetes, which regulates the load on the API server to protect it from clients and clients from each other. The speakers discuss the configuration and implementation details of APF, including request classification, dispatching, and handling of priority inversion and watch storms.

Tutorial: Simpler Than Making a Fraisier Cake: Building, Running, and Observing Your First eBPF...

This tutorial provides a practical introduction to eBPF, a powerful tool for kernel-level instrumentation and observability. Participants learn how to build, run, and observe their first eBPF program, starting with a basic example and progressing to more advanced use cases, including network monitoring and out-of-memory exception detection.

Thanos’ Infinity Stones and How You Can Operate Them! - Saswata Mukherjee & Daniel Mohr, Red Hat

This talk explores innovative ways to leverage Thanos, a distributed and scalable monitoring system based on Prometheus, to derive more value from metric data beyond traditional use cases. It covers topics such as leveraging customer telemetry, performing advanced analytics, optimizing single-cluster monitoring, enabling multi-cluster observability, and automating Thanos deployment through an operator, all while highlighting the importance of the Thanos community.

Kubernetes Chronicles: Stories of Triumph in Kubernetes Support - Roman Doroschevici & Sian Meoli

This talk provides insights into the challenges and best practices for supporting Kubernetes in production environments. The presenters share real-life case studies, highlighting common issues with networking, node stability, control plane, and third-party integrations, and offer strategies for effective troubleshooting and mitigation.

SIG Instrumentation Introduction and Deep Dive - Han Kang, Google & Damien Grisonnet, Red Hat

This talk provides a comprehensive overview of SIG Instrumentation in Kubernetes, covering its charter, sub-projects, and future plans. The speakers delve into the key observability components, including logs, metrics, and tracing, and highlight the ongoing efforts to improve the overall observability experience for Kubernetes users.

Observability TAG: A Review and the Rise of Gen-AI Observability

The Observability Technical Advisory Group (TAG) is working to grow the ecosystem of open-source observability, identify gaps, share knowledge, and support projects. The group is exploring the use of large language models (LLMs) for observability, including leveraging LLMs for root cause analysis, anomaly detection, and query standardization.

Leverage Contextual and Structured Logging in K8s for Enhanced Monitoring

This talk discusses the recent developments in Kubernetes' structured and contextual logging, which enable enhanced monitoring and observability of Kubernetes clusters. The speaker demonstrates how to leverage these logging capabilities, along with open-source tools like OpenTelemetry and Grafana, to set up a multi-cluster monitoring solution.

Tutorial: Exploring the Power of Distributed Tracing with OpenTelemetry on Kubernetes

This tutorial provides a comprehensive introduction to using OpenTelemetry for distributed tracing on Kubernetes, covering topics such as tracing concepts, auto-instrumentation, manual instrumentation, sampling, span metrics, and the OpenTelemetry Transformation Language for data transformation. The presenters demonstrate how to set up and configure the necessary components to collect and visualize tracing data from a polyglot application running on Kubernetes.

Fink on Kubernetes: Efficient Management of Massive Alert Streams for Astronomical Objects Identific

The talk presents the Fink project, which is a real-time alert broker for astronomical events, leveraging Kubernetes and cloud technologies to efficiently manage and process massive streams of data from observatories. The speakers discuss the architectural choices, the challenges of maintaining and scaling the system, and the community-driven approach to classification and analysis of the astronomical data.

Prometheus and OpenTelemetry: Better Together - Adriana Villela & Reese Lee

This talk explores the interoperability between the Prometheus and OpenTelemetry projects, highlighting how they can work together to improve observability. The speakers discuss the components of the OpenTelemetry Collector, the Prometheus remote write exporter, and the OpenTelemetry Target Allocator, providing a practical demonstration of how these tools can be used to collect and process Prometheus metrics in a Kubernetes environment.

Zonal Outage Operational Stories - Jyoti Ranjan Mahapatra & Shyam Jeedigunta, Amazon Web Services

The presentation covers the operational challenges and strategies employed by the Amazon EKS team to ensure the reliability of Kubernetes clusters during zonal outages and partial failures. It highlights the use of the Swiss Cheese model, redundancy, and failure domains to understand and mitigate complex failure scenarios in distributed systems.

From RUM to Front-End Observability with OpenTelemetry - Purvi Kanal, Honeycomb

This talk discusses the current state of front-end observability, introduces OpenTelemetry as a vendor-agnostic observability framework, and demonstrates how to leverage automatic and manual instrumentation to gain deeper insights into web application performance and user interactions.

Intelligent Observability: The Foundation for Operating Smarter in the Age of AI - Alolita Sharma

The presentation discusses the evolving observability landscape in the age of AI-powered applications. It highlights the key aspects of integrating observability with the model training process, understanding inference pipelines, and considering performance and resource consumption for running AI models.

What not to do when you're updating Istio in a critical environm... David de Lucca & Guilherme Saijo

The video discusses the challenges and lessons learned from updating Istio, a service mesh application, in a critical production environment. The presenters share their experiences, including the importance of gradual rollouts, observability, and the need for a customized approach to ensure a smooth and safe update process.

Troubleshooting Hidden Performance and Costs in Network Traffic Across Multiple AZs with eBPF

This talk explores how eBPF can be used to gain visibility into hidden performance and cost implications of network traffic across multiple Availability Zones (AZs) in a Kubernetes environment. The speakers demonstrate how eBPF-based observability can help identify and address issues related to service complexity, cost, and reliability in a resilient, cloud-native architecture.

From UI to Storage: Unraveling the Magic of Thanos Query Processing

This talk unravels the magic of Thanos query processing, exploring how data flows from the user interface to the storage layer. It highlights the various Thanos components, their roles, and the optimizations implemented in the custom Prometheus query engine to handle distributed queries efficiently.

OpenTelemetry: Project Updates, Next Steps, and AMA

OpenTelemetry, a project aimed at unifying tracing, logging, and metrics, provides updates on its progress, including the stabilization of logging SDKs, the work on semantic conventions, and the addition of client-side instrumentation and profiling capabilities. The project has also applied for graduation from the CNCF, signaling its maturity and stability for long-term adoption.

Building Confidence in Kubernetes Controllers: Lessons Learned from Using E2e-Framework

This talk discusses the use of the E2E-Framework for building confidence in Kubernetes controllers. The speakers from Datadog and Crossplane share their experiences and lessons learned in adopting the framework, highlighting the benefits, challenges, and future improvements they aim to implement.

Distributed Tracing with Jaeger and OpenTelemetry - Pavol Loffay, Red Hat & Jonah Kowall, Aiven

This talk introduces distributed tracing with Jaeger and OpenTelemetry, covering the basics of instrumentation, data collection, and analysis. The speakers also discuss upcoming changes in Jaeger v2, including integration with the OpenTelemetry Collector and support for new storage backends like ClickHouse, as well as opportunities for community involvement in the project.

Keptn: A Deep Dive - Giovanni Liva & Anna Reale, Dynatrace

This deep dive explores how Keptn, a cloud-native application delivery and automation platform, addresses the complexities of Kubernetes deployments, application health monitoring, and artifact promotion across environments. The presentation highlights Keptn's capabilities in providing observability, defining service-level objectives, and automating deployment and promotion, which improves the signal-to-noise ratio and streamlines application lifecycle management.

Disintegrated Telemetry: The Pains of Monitoring Asynchronous Workflows - Johannes Tax, Grafana Labs

The talk discusses the challenges of monitoring asynchronous workflows, particularly the disintegration caused by temporal decoupling, and the solutions offered by OpenTelemetry semantic conventions. The speaker highlights the need for standardization and collaboration to address pain points such as the lack of established best practices, complex traces, and issues with sampling.

Monitoring Kubernetes and Cloud Spend with OpenCost | Project Lightning Talk

OpenCost, a CNCF Sandbox project, is an open-source Kubernetes and cloud cost monitoring solution that provides a specification and implementation for cost allocation and carbon footprint tracking. The project has recently introduced new features, including a plugin architecture for integrating external cost data sources and the ability to measure the carbon footprint of workloads, making it a comprehensive platform for cost and sustainability monitoring in cloud-native environments.

The State of K8sGPT: Your Troubleshooting Companion | Project Lightning Talk

K8sGPT is a new project that aims to make troubleshooting Kubernetes easier by using AI and machine learning. The project has grown rapidly, with over 5,000 GitHub stars, and is now part of the CNCF sandbox, with plans to add more features and integrations to further streamline the Kubernetes troubleshooting process.

OpenTelemetry for OSS! | Project Lightning Talk

The talk explores the importance of observability for open-source software (OSS) libraries, highlighting the challenges of centralized instrumentation and the benefits of library authors maintaining their own observability solutions. The presenters advocate for observability to become a best practice in software development, enabling users to better understand and configure their applications.

Lightning Talk: Help! My Envoy Sidecar Is Consuming 8GBs of Memory! - Krzysztof Słonka, Kong

This talk discusses strategies to reduce Envoy sidecar memory consumption, including features such as auto-reachable services, delta xDS, and on-demand xDS. The speaker provides examples of how these features can be implemented in service mesh solutions such as Kuma, Istio, and Consul.

From Configurations to Conclusions: Lessons from Fine-Tuning Open Telemetry’s... - Vijay & Aishwarya

The talk discusses the lessons learned from fine-tuning the OpenTelemetry Collector for tracing at eBay, covering their approach to mass adoption, pipeline performance tuning, scaling, and storage optimization. The presenters also introduce their custom Exemplar-based tail sampling technique to efficiently sample and retain important traces.

Cortex Intro: Multi-Tenant Scalable Prometheus - Ben Ye & Friedrich Gonzalez

Cortex is a horizontally scalable, highly available, multi-tenant long-term storage solution for Prometheus that addresses the scalability challenges of Prometheus. The presentation covers the key features of Cortex, such as its ability to handle high cardinality, multi-tenancy, and fast querying, as well as new features and enhancements introduced in recent releases.

Dealing with eBPF’s Observability Data Deluge - Anna Kapuścińska, Isovalent

The talk discusses the use of eBPF for observability, highlighting its advantages such as no instrumentation, complete visibility, and reliability. It also addresses the challenges of dealing with the data deluge and the importance of context and correlation in building effective observability tools using eBPF.

Crossplane Observability & Traceability for Effective Multi-Cloud... - Katharina Sick & Viktor Farcic

This talk demonstrates how Crossplane can be used to enable observability and traceability in a multi-cloud environment, empowering developers to deploy and monitor their applications with ease. By integrating observability tools like Dynatrace directly into Crossplane compositions, the speakers show how platform engineers can provide developers with transparent insights into the health and performance of their services, fostering trust and confidence in the underlying platform.

Operations

Kubernetes Infra SIG: Intro and Updates - Mahamed Ali, Cisco & Benjamin Elder, Google

The Kubernetes Infrastructure SIG (SIG K8s Infra) manages the infrastructure that supports the Kubernetes project, including the container image registry, CI/CD system, and release bucket. The presentation covers the group's priorities for 2023, including migrating remaining infrastructure to the CNCF, improving observability and cost optimization, and updating the CI/CD system and release process.

Future of Intelligent Cluster Ops: LLM-Azing Kubernetes Controllers - Rajas Kakodkar & Amine Hilaly

This talk explores the use of large language models (LLMs) to enhance Kubernetes cluster operations, including automating tasks like cluster upgrades, vulnerability scanning, and chaos simulation. The presenters discuss the challenges of relying solely on LLMs for critical infrastructure management and propose a framework for incorporating LLMs as part of a responsible AI approach, where human oversight and fine-tuning of the models are essential.

State of Platform Maturity in the Norwegian Public Sector - Hans Kristian Flaatten

The talk provides an overview of the state of platform maturity in the Norwegian public sector. It highlights the journey of the Norwegian government's adoption of cloud-native technologies, the formation of a community around platform engineering, and the results of a survey that assessed the maturity of platform engineering practices across various government agencies and state-owned enterprises.

Planning for Maturity: SIG Release's Revamp for a More Stable Kubernetes

The talk discusses the evolution of the Kubernetes SIG Release team, highlighting their efforts to streamline the release process, enhance documentation requirements, and introduce new tools and projects to improve the overall Kubernetes ecosystem. The presentation emphasizes the team's focus on planning for the future maturity of Kubernetes, with a particular emphasis on improving security, artifact validation, and package management.

Agent-Based Design for Automating Large-Scale K8s Operations - Karan MV, GitHub

This talk presents an agent-based design approach for automating large-scale Kubernetes operations at GitHub. The speaker discusses how GitHub's paved path, a comprehensive suite of automated tools and processes, enables efficient deployment, hosting, and management of microservices, and how a custom agent-based solution helps reliably and efficiently operate their Kubernetes infrastructure at scale.

Performance Engineering

Lessons Learned from Let's Profile - Frederic Branczyk, Polar Signals

The presentation discusses lessons learned from the 'Let's Profile' series, where the speaker and their team analyze open-source projects to identify and optimize performance bottlenecks. The speaker emphasizes the importance of understanding resource usage and efficient hardware utilization as the computing industry faces the end of Moore's Law.

Beyond Default: Harnessing CPU Affinity for Enhanced Performance Across Your Workload Portfolio

This talk presents a novel approach to harnessing CPU affinity for enhanced performance across a diverse workload portfolio. By leveraging a highly configurable resource policy for assigning CPUs and containers, the speakers demonstrate how data locality, cache hits, CPU frequency, and workload behavior can be optimized to achieve significant performance gains.

Rust

The Rustvolution: How Rust Is the Future of Cloud Native - Flynn, Buoyant

The presentation explores how Rust, a statically-typed, systems programming language, is poised to become the future of cloud-native development. The speaker highlights Rust's focus on safety, performance, and concurrency, making it a compelling alternative to traditional languages like C and C++ that have long dominated the industry.

Scaling

Scaling Service Mesh: Self Service Beyond 300 Clusters - Sumit Mathur & Sushanth Kamath A, Intuit

The presentation discusses Intuit's journey in scaling their service mesh beyond 300 clusters, focusing on self-service capabilities and the challenges of synchronizing configurations between their API Gateway and Service Mesh. The speakers introduce Naavik, a custom component that abstracts and distributes service-level configurations across multiple clusters, and describe how they unified the self-service experience for both API Gateway and Service Mesh.

IoT and WebSockets in K8s: Operating and Scaling an EV Charging Station Network - Saadi Myftija

The talk explores the challenges of operating and scaling an EV charging station network that utilizes WebSockets for real-time communication. The speaker discusses solutions such as optimizing the WebSocket connection flow, implementing blue-green deployments to reduce disruptions, and adopting an event-driven architecture to decouple services and prevent load spike propagation.

TikTok’s Edge Symphony: Scaling Beyond Boundaries with Multi-Cluster Controllers - Naveen Mogulla

The talk discusses TikTok's approach to scaling cluster operators beyond single-cluster boundaries using multi-cluster controllers. It highlights the challenges faced in managing a large number of edge clusters with limited resources and how the team addressed these issues by developing a centralized multi-cluster operator that can efficiently manage and deploy platform features across all clusters.

Deploying with Confidence: Lessons Learned Navigating Deployments of a 100-Strong Development Team

The presentation discusses the journey of a 100-strong development team at Grafana Labs to improve their deployment practices, including separating feature toggles from bug fixes, enabling developers to deploy with confidence, and implementing a rolling release channel system. The key lessons learned include starting with the 'why' behind the changes, making it easy to test and learn about cloud deployments, and ensuring bidirectional knowledge sharing with developers.

Scaling up Without Slowing Down: Accelerating Pod Start Time

This presentation discusses techniques to accelerate pod start times, including on-demand image loading, peer-to-peer (P2P) image distribution, and pre-fetching of required data. The solutions demonstrated, such as using the overlay block device (OverlayBD) project and integrating P2P with OverlayBD, aim to significantly reduce the time required to start pods, especially for large container images and machine learning workloads.

How Spotify Re-Created Our Entire Backend Without Skipping a Beat

Spotify rebuilt its entire backend infrastructure without disrupting user experience, by implementing a flexible networking architecture, optimizing workloads, and developing a safe migration strategy that minimizes developer interaction. The presentation covers Spotify's approach to managing a complex Kubernetes ecosystem at scale, including differentiated cluster offerings, governance, and cost control.

Scaling New Heights with KEDA: Performance, Extensions, and Beyond - Jorge Turrado & Zbynek Roubalik

This talk explores the capabilities of KEDA, an open-source, Kubernetes-based event-driven autoscaler, including its performance, extensions, and future roadmap. The presenters discuss KEDA's ability to scale applications based on external metrics, its support for scaling jobs and HTTP traffic, and upcoming features like predictive scaling and storage scaling.

To Infinity and Beyond: Seamless Autoscaling with in-Place Resource Resize for Kubernetes Pods

This presentation explores an Alpha feature in Kubernetes called in-place resource resizing, which enables pods to modify resource limits without recreation or restart. The feature aims to address the challenges of resource management and autoscaling in Kubernetes clusters, providing a more seamless and efficient approach to scaling resources as needed.

Intro + Deep Dive: Kubernetes SIG Scalability - Wojciech Tyczyński & Shyam Jeedigunta

The video provides an overview of the Kubernetes SIG Scalability, its focus areas, and the tools and processes used to ensure Kubernetes scalability. The presenters discuss the definition of scalability, the importance of measuring and improving scalability, and the ongoing efforts to push the limits and ensure reliability at scale.

SIG Autoscaling Updates and Feature Highlights - Jonathan Innis, AWS & Maciek Pytel, Google

The presenters provide updates on the Kubernetes Autoscaling Special Interest Group (SIG), including efforts to align the Cluster Autoscaler and Karpenter projects, feature highlights, and future plans. The talk covers improvements in performance, scheduling, and disruption handling across the two projects.

Is There Room for Improving Kubernetes’ HPA?

The presentation explores the challenges and limitations of Kubernetes' Horizontal Pod Autoscaler (HPA) when dealing with microservice architectures. The authors propose potential improvements, such as incorporating a more comprehensive understanding of the microservice graph and using compound scaling signals, to address the observed issues and enhance the performance of autoscaling in complex distributed systems.
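For context, the baseline rule that the HPA applies to each workload in isolation can be sketched as follows. This is a minimal illustration of the formula in the Kubernetes documentation (desired replicas = ceil(current replicas × current metric / target metric), with a default tolerance of 0.1), not the controller's actual code; the function and parameter names are assumptions chosen for readability.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Per-workload HPA scaling rule (simplified sketch).

    Ignores stabilization windows, min/max replica bounds, and
    not-yet-ready pods, all of which the real controller handles.
    """
    ratio = current_metric / target_metric
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# 4 replicas at 90% average CPU against a 60% target scale out to 6.
print(desired_replicas(4, 90, 60))
# A 5% deviation is inside the tolerance band, so nothing changes.
print(desired_replicas(4, 63, 60))
```

Because the rule only sees one workload's own metric, it has no notion of upstream or downstream services, which is the gap the microservice-graph and compound-signal proposals aim to close.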

The Party Must Go on - Resume Pods After Spot Instance Shut Down - Muvaffak Onuş, QA Wolf

The talk discusses the challenges of resuming pods after spot instance shutdown in the context of a QA testing platform. The speaker shares the technical solutions and lessons learned in using Checkpoint/Restore in Userspace (CRIU) to address issues like PID collisions, file descriptor management, and runtime dependencies.

CRD Vs Dedicated etcd as Storage Backend : Lessons from Taming High Churn Clusters

This talk explores the trade-offs between using CRDs (Custom Resource Definitions) and a dedicated etcd store as the storage backend for Cilium, an open-source networking, observability, and security project for Kubernetes. The speakers discuss the scalability challenges faced by Cilium, including managing high churn rates of pods, services, and network policies, and present the solutions they have developed to address these issues, such as the introduction of Cilium Endpoint Slices.

Fantastic Ordinals and How to Avoid Them: Auto-Scaling Challenges in a Cloud Database

This talk discusses the challenges of auto-scaling in a cloud database like ClickHouse, focusing on the problems with vertical scaling and the approach of 'make before break' to address them. The speaker also covers the limitations of Kubernetes' built-in features and the custom solutions developed to enable flexible and non-disruptive scaling of stateful services.

Seamless Multi-Cloud Kubernetes: A Practical Guide - Justin Santa Barbara & Ciprian Hacman

This talk provides a practical guide to implementing seamless multi-cloud Kubernetes deployments. It covers the challenges and strategies for managing multiple Kubernetes clusters across different cloud providers, focusing on application portability, cluster management, and minimizing the complexity of cross-cluster communication.

Bloomberg's Journey to a Multi-Cluster Workflow Orchestration Platform - Yao Lin & Reinhard Tartler

The presentation discusses Bloomberg's journey to a multi-cluster workflow orchestration platform, highlighting the challenges of cross-data center resiliency and static resource management, and the use of Kubernetes and databases to address them. The solution involves using Kine to expose a Kubernetes-compatible API server backed by a relational database, allowing for consistent deployment of static resources across multiple clusters.

Super Reliable Cloud Native Data Processing Using Apache Spark and Cloud Shuffle Manager

This talk presents a cloud-native data processing solution using Apache Spark and a Cloud Shuffle Manager (CSM) to improve reliability and cost-efficiency. The CSM decouples compute and storage, storing shuffle data in cloud storage, enabling the use of cost-effective spot VMs and dynamic executor allocation without the risk of data loss.

Building a Large Scale Multi-Cloud Multi-Region SaaS Platform with Kubernetes Controllers

The presentation discusses how Elastic has redesigned its cloud platform to leverage Kubernetes controllers and operators to manage a large-scale, multi-cloud, multi-region SaaS platform. The speaker shares insights into the architectural decisions, such as treating Kubernetes clusters as disposable resources and implementing a custom controller pattern to decouple the desired state from the Kubernetes API server.

Architecting Resilience: Lessons from Managing 7K+ Kubernetes Clusters at Scale

The presentation discusses the challenges and lessons learned from managing over 7,000 Kubernetes clusters at scale, including issues like data center failures, omission of multi-redundancy, conflicting jobs, and missing GSLB health check configurations. The speakers share their new architectural approach, which focuses on making it easier for developers to deploy multi-zone clusters and optimize network performance to reduce latency.

A Decade of High-Volume Data and APIs: The Evolution of SIG-Apps - Maciej Szulik & Janet Kuo

This presentation covers the evolution of the Kubernetes SIG-Apps over the past decade, highlighting the significant changes and challenges faced in the development of workload controllers, batch processing, and support for diverse application workloads. The talk provides insights into the community's efforts to promote API stability, improve performance and scalability, and engage with the broader AI/ML community to address emerging use cases.

Cloud-Agnostic Approach to Bin-Packing Pods in Managed Kubernetes in AWS, GCP and Azure

This talk discusses a cloud-agnostic approach to bin-packing pods in managed Kubernetes across AWS, GCP, and Azure. The key solution involves leveraging the Kubernetes scheduler's most-allocated scoring policy to improve node utilization and achieve 10-15% cost savings for the presenters' ClickHouse Cloud platform.
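The idea behind most-allocated scoring can be sketched in a few lines. This is a simplified illustration, not the scheduler's code: each resource is scored by how full the node would be after placing the pod, and the node score is the weighted average. The function name, the dictionary layout, and the sample numbers are assumptions made for this example.

```python
def most_allocated_score(requested, allocatable, weights=None):
    """Score a node from 0 to 100; fuller nodes score higher (sketch).

    `requested` is what would be requested on the node after placing the
    pod; `allocatable` is the node's allocatable capacity per resource.
    """
    weights = weights or {name: 1 for name in allocatable}
    total, weight_sum = 0.0, 0
    for name, capacity in allocatable.items():
        weight = weights.get(name, 1)
        # Clamp so an over-requested resource never scores above 100.
        used = min(requested.get(name, 0), capacity)
        score = 0.0 if capacity == 0 else used * 100 / capacity
        total += score * weight
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


node = {"cpu_millis": 4000, "memory_gib": 16}
busy = most_allocated_score({"cpu_millis": 3000, "memory_gib": 8}, node)
idle = most_allocated_score({"cpu_millis": 1000, "memory_gib": 4}, node)
print(busy, idle)  # the busier node wins, so pods pack onto fewer nodes
```

Preferring the fuller node is what lets lightly used nodes drain and be removed by the autoscaler, which is where the cost savings come from.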

Scheduling

SIG-Scheduling Intro & Deep Dive - Wei Huang, Apple & Kante Yin, DaoCloud

This session provides an overview of the Kubernetes scheduler framework, including recent updates and sub-project developments. The presenters discuss features like pod scheduling readiness, minDomains in pod topology spread, and the decoupling of the taint manager from the node lifecycle controller, as well as the progress made in projects like Kueue and the scheduler plugin ecosystem.

Cloud Native Batch Computing with Volcano: Updates and Future - William Wang & Mengxuan Li

This talk presents the latest updates and future roadmap of the Volcano open-source project, a cloud-native batch computing framework. The key highlights include new features such as job flow management, load-aware scheduling, and GPU sharing, as well as plans to enhance the system's performance, cost-efficiency, and support for diverse hardware accelerators in AI and big data workloads.

Kubernetes SIG Node Intro and Deep Dive - Dixita Narang & Dawn Chen, Matthias Bertschy, Peter Hunt

This video provides an overview of the Kubernetes SIG Node group, highlighting their work on various node-related features and future plans. The presenters discuss the critical components of the node, the progress made on several Kubernetes Enhancement Proposals (KEPs), and the group's future direction in supporting workload-centric features and addressing challenges related to AI/ML inference workloads.

WG-Batch Updates: What’s New and What Is Next? - Michał Woźniak, Google & Yuki Iwai, CyberAgent, Inc

The presentation covers the recent updates and future plans of the Kubernetes Batch Working Group, which aims to improve support for batch workloads in the Kubernetes ecosystem. Key features discussed include pod failure policy, job replacement policy, job success policy, JobSet, and the Kueue job scheduler, among others, which enhance the capabilities of the core Kubernetes Job API and address the needs of diverse batch-oriented applications.

Trimaran: Load-Aware Scheduling for Power Efficiency and Performance Stability

This video presents Trimaran, a set of three Kubernetes scheduler plugins that aim to improve power efficiency and performance stability and to support limit-aware scheduling. The plugins leverage load monitoring data to optimize pod placement based on target utilization, load variation, and risk of limit overcommitment.
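To make the target-utilization idea concrete, here is one possible scoring shape: a piecewise-linear function that rises as a node approaches the target utilization and falls off beyond it. It is a hypothetical sketch for illustration and should not be read as the plugins' exact algorithm; the function name and the 40% default target are assumptions.

```python
def target_utilization_score(predicted_util_pct, target_pct=40.0):
    """Score a node from 0 to 100, peaking at the target utilization.

    `predicted_util_pct` is the node's measured utilization plus the
    incoming pod's expected load, as a percentage of capacity.
    """
    u, x = predicted_util_pct, target_pct
    if u > 100:
        return 0.0  # the pod would overload the node
    if u <= x:
        # Below target: packing more onto the node raises the score.
        return (100 - x) * u / x + x
    # Above target: penalize in proportion to the overshoot.
    return x * (100 - u) / (100 - x)


for util in (20, 40, 70, 110):
    print(util, target_utilization_score(util))
```

A scheduler using a function like this fills nodes up to the target and then spreads further load, leaving headroom for bursts.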

CNCF BSI-WG Intro - Klaus Ma, Nvidia & Alexander Scammon, G-Research

The CNCF Batch Working Group aims to align different batch scheduling projects, starting with education and understanding the existing options. The group plans to broaden the discussion from just batch schedulers to a more holistic system view of batch processing in Kubernetes.

Empowering Efficiency: PEAKS - Orchestrating Power-Aware Kubernetes Scheduling

The presentation discusses the PEAKS project, a Kubernetes scheduler plugin that aims to optimize the aggregate power consumption of a cluster by using a machine learning-based approach to predict the most suitable node for pod placement. The solution is demonstrated through various use cases, including pod deployment, scaling, migration, and cluster autoscaling, showcasing the potential energy savings achievable with PEAKS.

Advanced Resource Management for Running AI/ML Workloads with Kueue - Michał Woźniak, Yuki Iwai

The video presents Kueue, a job-level scheduler for running AI/ML workloads on Kubernetes. Kueue focuses on advanced resource management capabilities, such as supporting all-or-nothing semantics, team-level quotas, and integrating with various job frameworks.
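The combination of all-or-nothing admission and team-level quotas can be illustrated with a small sketch. The names here (`TeamQuota`, `try_admit`) are hypothetical and are not Kueue's API; the example tracks only CPU and only shows the admission decision.

```python
from dataclasses import dataclass


@dataclass
class TeamQuota:
    """CPU quota for one team, in whole cores (hypothetical model)."""
    cpu_limit: int
    cpu_used: int = 0


def try_admit(quota, pod_cpu_requests):
    """Admit a job only if every one of its pods fits in the quota.

    Either the whole job is admitted and its usage is recorded, or
    nothing is reserved and the job keeps waiting in the queue.
    """
    needed = sum(pod_cpu_requests)
    if quota.cpu_used + needed > quota.cpu_limit:
        return False
    quota.cpu_used += needed
    return True


team = TeamQuota(cpu_limit=8)
print(try_admit(team, [2, 2, 2]))  # fits: 6 of 8 cores now in use
print(try_admit(team, [2, 2]))     # needs 4, only 2 free: rejected whole
print(team.cpu_used)               # unchanged by the rejected job
```

Rejecting the job as a unit avoids the situation where some pods of a distributed training job start and hold resources while the rest can never be scheduled.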

Advanced Multi Cluster Scheduling with Open Cluster Management | Project Lightning Talk

Open Cluster Management, a CNCF Sandbox project, provides vendor-neutral APIs for managing multiple Kubernetes clusters, enabling advanced scheduling and workload placement capabilities. The presentation explores the key features of Open Cluster Management, including cluster inventory, workload definition, and cluster set management, showcasing how it can help organizations effectively manage and orchestrate their distributed Kubernetes environments.

Security

Keeping Kubernetes Safe: The Lowdown on Locked Namespaces - Marco De Benedictis, ControlPlane

The talk discusses the importance of Kubernetes namespaces in ensuring the security of a cluster, highlighting how misconfiguration can lead to various security vulnerabilities. The speaker proposes several mitigation strategies, including the use of least privilege, immutable labels, and policy engines, to enhance the security of Kubernetes deployments.

Kubernetes Security Blind Spot: Misconfigured System Pods - Shaul Ben Hai, Palo Alto Networks

The presentation explores the security challenges posed by misconfigured system pods in Kubernetes, a critical component responsible for maintaining the cluster's functionality. It delves into a real-world case study demonstrating how a combination of default misconfigurations can lead to a privilege escalation attack that compromises the entire Kubernetes cluster.

IAM Confused: Analyzing 8 Identity Breach Incidents - Maya Levine, Sysdig

This presentation provides a comprehensive analysis of 8 identity breach incidents, highlighting the challenges and vulnerabilities in identity management, particularly in the cloud environment. The speaker emphasizes the importance of implementing least-permissive access, robust identity hygiene, network segmentation, and cloud detection and response capabilities to mitigate the evolving threats in identity security.

You Shall Not Pass! Unless You Are GUAC Verified - Parth Patel, Kusari & Dejan Bosanac, Red Hat

This talk explores the use of tools like GUAC and Trustification to address software supply chain security challenges, such as managing software bills of materials (SBOMs), vulnerability data, and build provenance. The presenters demonstrate how these tools can be integrated with OPA Gatekeeper to automate policy-based decisions for container deployments in Kubernetes, ensuring secure and verified software components.

It's Not Just About SBOMs: Perspectives on Cloud Native Supply Chain Security

This panel discussion explores perspectives on cloud native supply chain security, going beyond just Software Bill of Materials (SBOMs) to address the broader challenges and solutions in securing the software development lifecycle. The panelists discuss the complexity of the problem, the need for a holistic approach, and the importance of building trust and transparency throughout the supply chain.

Living off the Land Techniques in Managed Kubernetes Clusters - Ronen Shustin & Shay Berkovich

This presentation discusses living-off-the-land (LOL) techniques in managed Kubernetes clusters, highlighting the evolving nature of attack vectors and the need to update security definitions and baselines. The speakers demonstrate various LOL techniques, such as persistence through the Node Problem Detector, data collection via Fluent Bit, and privilege escalation using Azure Kubernetes Service's identity management, emphasizing the importance of addressing these issues in the Kubernetes ecosystem.

Lessons Learned from Generating 100M SBOMs: Google’s Approach to SBOM Compliance

This talk discusses Google's journey in generating over 100 million software bills of materials (SBOMs) in response to the U.S. Executive Order on Improving the Nation's Cybersecurity. The presenters share insights on the technical challenges and best practices they encountered, including the importance of using build tools for accurate and complete SBOMs, ensuring the trustworthiness of SBOM data, and the limitations of SBOMs for Software-as-a-Service (SaaS) products.

Kubernetes MLSec: Securing AI in Space - Francesco Beltramini & James Callaghan, ControlPlane

The talk explores the security challenges in the machine learning (ML) lifecycle, with a focus on securing AI systems in the space industry. The presenters discuss threat modeling and security controls to mitigate risks across different stages of the ML lifecycle, including data ingestion, model training, and model deployment.

The Leading Edge of AuthN and AuthZ by Keycloak - Takashi Norimatsu & Thomas Darimont

This talk explores the latest security features in Keycloak, including support for passwordless authentication using passkeys and the OAuth 2.1 authorization framework. The second part demonstrates how Keycloak can be integrated with Open Policy Agent to provide flexible and customizable authorization policies, allowing developers to manage access control logic as code.

Cryptographically Signed Swag: Cert-Manager’s Stamped Certificates

This talk discusses the cryptographically signed swag created by the Cert-Manager team, including physical certificates with QR codes that allow users to download and use the certificates. The team also discusses the history of Cert-Manager, its upcoming graduation from the CNCF incubator, and the technical details behind the certificate issuance process.

SIG Security Update: Growing Together

The video provides an update on the Kubernetes SIG Security community, highlighting its focus on improving Kubernetes security through collaborative efforts across different SIGs. The presentation covers the work of various SIG Security sub-projects, such as security tooling, self-assessments, audits, and documentation, and emphasizes the inclusive and community-driven approach of the SIG.

From Chaos to Control: Cloud Native Governance with Kyverno!

This presentation introduces Kyverno, an open-source policy engine that integrates natively with Kubernetes, and showcases how it can be used to enforce cloud-native governance. The speakers also discuss the limitations of Kyverno's Kubernetes-centric design and present a new tool, Kyverno JSON, which aims to provide a more portable and flexible policy management solution.

Next-Level Security: Implementing MTLS in Istio Multi-Cluster Environments Using SPIRE

This talk presents a comprehensive approach to implementing mutual TLS (mTLS) in Istio multi-cluster environments using SPIRE (the SPIFFE Runtime Environment). The speaker demonstrates how SPIRE's agent-server architecture and workload attestation process can be integrated with Istio to manage certificates and identities across multiple Kubernetes clusters, enabling secure communication between applications deployed in different trust domains.

No 'Soup' for You! Enforcing Network Policies for Host Processes via eBPF - Vinay Kulkarni, eBay

This talk presents a novel approach to efficiently enforce network policies for host processes using eBPF, a powerful in-kernel programming framework. The speaker demonstrates how eBPF can be leveraged to identify host processes and assign network identities, enabling secure communication between host processes and containerized applications.

SBOMs That You Can Trust - the Good, the Bad, and the Ugly - Miguel Martinez & Daniel Liszka

The presentation discusses the importance of trustworthy Software Bills of Materials (SBOMs) and how the open-source project Chainloop can help organizations collect, secure, and distribute supply chain metadata in a trusted manner. The speakers demonstrate how Chainloop can enforce SBOM generation, store SBOMs and verify their integrity, and integrate with various tools to provide comprehensive supply chain visibility and security.

How I Met Your Software – an Image’s Sitcom of Consuming and Securing Software in Cloud Native!

This talk explores the challenges and solutions in securely consuming software, particularly in the cloud-native ecosystem. It highlights the importance of robust software supply chain security, the pros and cons of different distribution models, and various tools and practices to enhance container and application security.

Operating a Production TUF Repository - Kairo De Araujo, TestifySec & Fredrik Skogman, Github

The presentation discusses how to operate a secure production TUF (The Update Framework) repository, including the use of threshold-based signing, hardware security modules, and tools like Repository Service for TUF (RSTUF) and tuf-on-ci. The speakers also provide a detailed example of how a software company integrated TUF and RSTUF to secure their internal and external artifact distribution processes.

Brewing the Kubernetes Storm Center: Open Source Threat Intelligence for the Cloud Native Ecosystem

This talk proposes the creation of the 'Kubernetes Storm Center', an open-source platform for collecting and sharing threat intelligence within the cloud-native ecosystem. The speakers outline a four-step approach involving threat modeling, instrumentation, event tracing, and data dissemination, with the goal of empowering the community to collectively defend against emerging threats.

VEXinating Your Container Images: The European Way - Dina Truxius & Jose Antonio Carmona Fombella

This talk discusses how to use SBOMs together with the VEX and CSAF formats, and the open-source tooling around them, to improve the security of container images. The speakers provide practical demonstrations of identifying and managing vulnerabilities in container-based applications.

Misconfigurations in Helm Charts: How Far Are We from Automated Detection and Mitigation?

This talk discusses the challenges of detecting and mitigating misconfigurations in Helm charts, a popular tool for deploying applications in Kubernetes. The presenters describe an automated pipeline they developed to identify and fix misconfigurations, as well as the limitations of existing tools and the need for standardization in this space.

Why Barricade the Door if the Window Is Open? Making Sense of Kubernetes Initial Access Vectors

The presentation explores various initial access vectors to Kubernetes clusters, including issues in the control plane (such as API access and configuration mismanagement) and the data plane (such as vulnerable service execution and malicious images). The speaker proposes a framework to help attendees understand and address these risks, with a focus on detection and protection methods.

Bringing SPIFFE to Linkerd for Mesh Expansion - Zahari Dichev, Buoyant

The talk discusses how Linkerd, a service mesh, expanded its identity system to allow external workloads to be part of the mesh, using the SPIFFE (Secure Production Identity Framework for Everyone) standard and SPIRE as the reference implementation. The key focus is on the importance of workload identity and security guarantees, and how Linkerd leverages SPIFFE to provide the same level of identity, encryption, and authorization for workloads outside the Kubernetes cluster.

Memory Armor for SPIRE: Fortifying SPIRE with Confidential Containers (CoCo)

This talk explores how the open-source projects SPIRE and Confidential Containers (CoCo) can be combined to enhance the security of SPIRE's identity management server by leveraging confidential computing technologies. The presenters discuss potential improvements, such as adding plugin support for confidential data stores and enhancing the attestation process for SPIRE agents and workloads.

Kubernetes Policy Time Machine: Where to Next? - Jim Bugwadia, Nirmata & Andy Suderman, Fairwinds

The video discusses the evolution of Kubernetes policy management, including the introduction of pod security policies, their deprecation, and the emergence of validating admission policies. It also compares the different policy options available and highlights the role of dynamic admission controllers and the policy working group in shaping the future of Kubernetes policy management.

Cloud Native Security: Cell-Based Architecture & K8s - Rostyslav Myronenko & Shweta Vohra

This talk presents the cell-based architecture (CBA), a decentralized architecture pattern that addresses the challenges faced by companies transitioning from monolithic to microservices-based architectures. The speakers discuss the business drivers, technical implementation, and lessons learned from adopting CBA at Booking.com, highlighting its benefits in improving security, compliance, scalability, and developer flexibility.

Securing 900 Kubernetes Clusters Without PSP - Mercedes-Benz' Journey to ValidatingAdmissionPolicies

In this talk, the presenters from Mercedes-Benz Tech Innovation discuss their journey in securing 900 Kubernetes clusters without using Pod Security Policies (PSP). They explore the challenges they faced with various policy enforcement solutions, including Kube Policy and Open Policy Agent, and ultimately settle on using Validating Admission Policies, a new feature introduced in Kubernetes 1.26, to achieve lightning-fast policy enforcement.

Navigating the Software Supply Chain Defense Landscape - Marina Moore & Aditya Sirish A Yelgundhalli

This talk discusses the complex landscape of software supply chain security tools and provides a framework for navigating and mapping these tools to specific security requirements. The speakers highlight the need for a collaborative effort to create open-source tools that can simplify the process of securing the software supply chain end-to-end.

Stop Leaking Kubernetes Service Information via DNS! - John Belamaric, Google & Yong Tang, Ivanti

The video discusses the potential leakage of Kubernetes service information through DNS and proposes solutions to address this issue, including the use of the Corefile plugin, pods verified mode, and per-tenant DNS services with mutating webhooks and network policies.

Fortifying AI Security in Kubernetes with Confidential Containers (CoCo)

This talk presents a solution for fortifying AI security in Kubernetes using Confidential Containers (CoCo), which leverages confidential computing to provide memory encryption and remote attestation to protect sensitive data and models from privileged entities. The talk demonstrates how CoCo can be integrated with the KServe inference platform to enable secure model serving while preserving confidentiality and integrity.

Keep Hackers Out of Your Cluster with These 5 Simp... Christophe Tafani-Dereeper & Frederic Baguelin

This presentation provides a comprehensive overview of threat modeling and security best practices for securing Kubernetes clusters in a cloud environment. The speakers discuss common attack vectors, real-world threat intelligence, and practical steps to harden Kubernetes workloads and the control plane, emphasizing the importance of a layered security approach and leveraging open-source tools for runtime threat detection and attack path identification.

Federated IAM for Kubernetes with OpenFGA - Jonathan Whitaker, Okta

This talk discusses the use of Federated Identity and Access Management (IAM) for Kubernetes using OpenFGA, an open-source authorization engine. It demonstrates how OpenFGA can be integrated with identity providers like Keycloak to manage complex access control policies, including temporal and conditional access, across different components of the Kubernetes ecosystem.

Running PCI-DSS Certified Kubernetes Workloads in the Public Cloud

The talk explores the journey of running PCI-DSS certified Kubernetes workloads in the public cloud, highlighting the use of CNCF and open-source projects to meet the security and compliance requirements. The presenters share their practical experience and strategies for navigating the complexities of PCI-DSS certification in a cloud-native environment.

Securing Connections: Defending Telco Workloads in the Cloud Era - Barun Acharya, Accuknox

The talk discusses how to secure connections and defend telco workloads in the cloud era. It focuses on understanding attack vectors for 5G and telco workloads and on orchestrating security against them using tools like KubeArmor and the Nimbus project, which provides an intent-driven approach to security.

The Hard Life of Securing a Particle Accelerator - Antonio Nappi & Sebastian Lopienski, CERN

The talk presents the challenges and solutions in securing a particle accelerator at CERN, focusing on the implementation of a centralized single sign-on service using the open-source identity and access management solution, Keycloak. It highlights the benefits of moving the infrastructure to Kubernetes, which has improved the reliability, scalability, and maintainability of the system.

OAuth2 Token Exchange for Microservice API Security - Ahmet Soormally & Letz Yaara, Tyk

This talk explores the complexities of identity propagation and API security in a microservices environment, and presents a solution using the relatively unknown OAuth2 token exchange specification. The speakers demonstrate how this approach can enable user impersonation, internal-to-internal token exchange, and external-to-internal token exchange, providing a more secure and scalable way to manage identity across a distributed system.

Safety or Usability: Why Not Both? Towards Referential Auth in K8s - Rob Scott, Google & Mo Khan

The presentation discusses the challenges of secure and usable authorization in Kubernetes, particularly around Ingress controllers and cross-namespace references. The speakers propose a novel referential authorization approach that aims to provide fine-grained access control without compromising usability.

Playing Defense: The Reactive Cloud Native Security Battle - Ayse Kaya, Slim.AI

The video discusses the challenges faced by organizations in the cloud-native security landscape, including the reactive approach to vulnerability management, the complexity of the software supply chain, and the lack of trust and collaboration between software producers and consumers. The speaker highlights the need for a more proactive and transparent approach to security, with the goal of building trust and enabling effective collaboration across the software supply chain.

Securing the Supply Chain with Sigstore Artifacts Signatures at Scale

The video discusses how Yahoo's security team implemented a secure supply chain using Sigstore's artifact signatures at scale. The key highlights include using Sigstore's keyless signing approach to improve the security and reduce the operational overhead of their existing image signing system.

SLSA and FRSCA: Beyond Snacks and Soda! - Christopher Hanson, RX-M, llc.

The presentation covers the SLSA (Supply-chain Levels for Software Artifacts) framework, which aims to provide transparency and trust in the software supply chain. It also introduces FRSCA (pronounced "Fresca"), a reference implementation of a secure software factory, and demonstrates how it can be used to achieve higher levels of SLSA compliance through the use of tools like SPIRE and Vault.

Enabling the Software Supply Chain Ecosystem with Notary Project

The Notary Project, which maintains the Notation tool, enables the software supply chain ecosystem by providing a plugin framework for signing and verifying software artifacts. The presentation showcases integrations with partners like Venafi and Carvana, demonstrating how Notation can be extended to support various enterprise code signing solutions and verification workflows within Kubernetes clusters.

Project Harbor, All the Year Around, and What Comes Next - Vadim Bauer & Yan Wang

This talk provides an overview of the latest developments and future plans for Project Harbor, a CNCF-graduated container registry. The speakers discuss new features such as the Harbor CLI, Harbor Satellite for edge deployments, support for OCI 1.0 artifacts, and enhanced scanning capabilities for security and compliance.

Policy as Code: A Game-Changer for Stack Security - Raz Cohen, Permit.io

This talk explores how 'policy as code' can revolutionize stack security by decoupling authorization and access control policies from application code. The speaker discusses various access control models, the importance of authorization, and how tools like Open Policy Agent and Permit.io's OPAL can enable scalable, version-controlled, and deployable authorization policies.

TAG Security Highlights - Marina Moore, New York University; Michael Lieberman

The TAG Security Highlights session provides an overview of the work and initiatives of the TAG Security group within the Cloud Native Computing Foundation (CNCF). The session covers recent efforts such as the relaunch of the Security Pals program, the ongoing work on software supply chain security best practices, and various other working groups and projects the group is involved in, while also encouraging audience participation and inviting attendees to get involved in the community.

Dapr in 2024: Deployments Beyond Sidecars, Distributed Scheduler API and App-Level Zero Trust

This talk provides an overview of the Dapr project, its goals, and the upcoming features planned for 2024, including deployments beyond sidecars, a distributed scheduler API, and enhanced application-level zero trust security. The speaker also engages the audience to gather feedback on the community's interest in various Dapr-related topics, such as multi-cloud architectures, workflows as code, and cross-cluster communications.

I'll Let Myself In: Kubernetes Privilege Escalation Tactics - Andrew Martin & Iain Smart

This talk explores the challenges of securing Kubernetes clusters, highlighting the importance of offensive security in identifying and addressing vulnerabilities. The speakers demonstrate various privilege escalation tactics, emphasizing the need for organizations to proactively address the risks of compromised cluster admin access and the potential for long-term persistence within their infrastructure.

An Acronym Free Introduction to Software Supply Chain Security - Joshua Lock, Verizon

This talk provides a comprehensive introduction to software supply chain security, highlighting the importance of secure practices throughout the software development lifecycle. The speaker covers key principles such as maintaining good hygiene, understanding dependencies, minimizing attack surfaces, and ensuring consistent and repeatable build processes to enhance the overall security of software supply chains.

Choose Your Own Adventure: The Struggle for Security - Whitney Lee, VMware & Viktor Farcic, Upbound

In this talk, the presenters guide the audience through a series of security-related choices for a Kubernetes-based application, demonstrating how to implement policies, runtime security, secrets management, and secure pod-to-pod communication using various CNCF tools. The session culminates in a live voting process where the audience selects the solutions to be implemented, showcasing the flexibility and power of the chosen technologies.

Dungeons and Deployments V2: The Clusters of Chaos

In this interactive talk, the presenters explore the challenges of securing a Kubernetes cluster through a role-playing game where they embody different characters. The talk covers topics such as authentication, authorization, secrets management, and configuration management, highlighting the importance of a holistic approach to cluster security.

Backstage's new auth system - avoiding foot-guns and config overload | Project Lightning Talk

This talk presents Backstage's new authentication system, which aims to make installations secure by default and improve service-to-service authentication and user token management. The new design introduces self-signed service tokens, secure identity proofs, and a default-off policy to ensure plugins are protected, while also providing a future-proof API design to handle new authentication types without breaking existing code.

Kyverno Top 10: Automate Kubernetes Security With Policy as Code | Project Lightning Talk

Kyverno, a policy engine for Kubernetes, has evolved to address a wide range of use cases beyond just Kubernetes, providing features like automated security enforcement, supply chain security, and multi-tenancy management. The talk covers the top 10 features of Kyverno, highlighting its community engagement, adoption by large companies, and the growing Kyverno ecosystem of related projects.

Achieving Balance Between Security and Performance in Falco | Project Lightning Talk

This lightning talk discusses the challenges of achieving a balance between security and performance in the Falco open-source security solution for threat detection. The presenter highlights the need to address user demands for more detections, better performance, and secure installation, while also optimizing the system-level implementation and maintaining compatibility across various hardware architectures and kernel versions.

OpenFGA: The Cloud Native way to implement Fine Grained Authorization | Project Lightning Talk

OpenFGA is a cloud-native authorization system that implements a relationship-based access control model, allowing developers to define complex authorization policies and manage permissions for their applications. The talk highlights how OpenFGA is being used by various companies and open-source projects to simplify and enhance their authorization management, showcasing its flexibility and adoption in the community.

Want to secure K8s clusters? Think Paralus. | Project Lightning Talk

Paralus is a CNCF Sandbox project that provides zero-trust access management for Kubernetes clusters, offering just-in-time access control, integration with existing SSO providers, and a proxy architecture that simplifies cluster onboarding. The project has seen nine releases since its initial launch in 2022 and is actively maintained, with recent improvements including support for cosign and cluster health checks.

Enforceable Software Supply Chain Policies and Attestations... Alan Chung Ma & Santiago Torres-Arias

This presentation discusses the use of the in-toto framework to create enforceable software supply chain policies and attestations. It covers the key concepts of in-toto, including attestations, layouts, and policies, and demonstrates how these can be used to verify the integrity of a software supply chain, even in the face of potential attacks.

Falco: A Grand Promenade Through Cloud Native Runtime Security - Panel

Falco, a cloud-native runtime security tool, has undergone significant advancements and is now a graduated project within the CNCF. The presentation highlights Falco's improvements in detection, rules management, performance, and its growing plugin ecosystem, as well as the community's roadmap for further enhancing the project's maturity and stability.

Serverless

Unikernels in K8s: Performance and Isolation for Serverless Computing with Knative

This talk presents a novel approach to integrating unikernels into Kubernetes and Knative, a serverless framework, in order to achieve low latency and strong isolation for serverless computing. The work involves developing a container runtime called urunc that treats unikernels as processes, allowing them to be seamlessly integrated with the existing container ecosystem.

Knative Functions Deep-Dive: Why You Should Use Knative Functions For Your Next Microservi...

Knative Functions is a powerful tool that simplifies the deployment and management of serverless functions on Kubernetes. The presentation showcases the easy-to-use CLI, the flexibility of the function runtime, and the integration with Knative Serving for autoscaling and traffic management, making it an attractive choice for building and running microservices on Kubernetes.

Leveling up Wasm Support in Kubernetes - Matt Butcher, Fermyon

The talk discusses the use of WebAssembly in Kubernetes, including the introduction of Spin and SpinKube, which provide a developer-friendly framework for building and running serverless applications with WebAssembly. The presenters highlight the advantages of WebAssembly, such as its security, cross-platform capabilities, and fast startup times, making it a compelling choice for cloud computing workloads.

Maximizing Go's Capabilities with the WebAssembly System Interface

This presentation explores the integration of Go (Golang) and WebAssembly (Wasm), focusing on the WebAssembly System Interface (WASI) and its capabilities. It covers the history, architecture, and use cases of WASI in Go applications, highlighting the potential for building lightweight, secure, and portable plugins using Wasm.

Faster, Safer, Serverless - Empowering Apache Spark Standalone Cluster on Kubernetes - Huichao Zhao

This presentation discusses a novel approach to running Apache Spark standalone clusters on Kubernetes, aiming to provide faster, safer, and more serverless-like experiences for users. The key highlights include reducing startup times, enhancing security and isolation, and enabling long-running Spark clusters to support various workloads, including machine learning frameworks like Ray.

Fast and Efficient Log Processing with Wasm and eBPF - Michael Yuan, Second State

This talk explores the use of WebAssembly (Wasm) and eBPF for efficient and lightweight log processing, as well as their potential applications in cloud-native computing and AI/ML workloads. The speaker discusses how Wasm can provide a more granular and secure isolation compared to containers, and how it can be integrated with eBPF for streamlined deployment and data collection.

OCI as a Standard for ML Artifact Storage and Retrieval - Peyman Norouzi & Eric Koepfle, Bloomberg

The presenters discuss the challenges of managing machine learning models and propose a solution based on the Open Container Initiative (OCI) standard for artifact storage and retrieval. They describe how OCI can provide a foundation for a model registry that addresses scalability, security, and discoverability while integrating with their existing data science platform.

Cloud-Native LLM Deployments Made Easy Using LangChain - Ezequiel Lanza & Arun Gupta, Intel

This talk discusses how to deploy large language models (LLMs) in a cloud-native environment using LangChain, a framework that provides a unified way to interact with various LLM models. The speakers cover the key steps involved, including model selection, API integration, containerization, and deployment on a Kubernetes cluster, highlighting the benefits of this approach for addressing the diverse needs of different business units within an organization.

Empowering Developers with Easy, Scalable Stream Processing Technologies on Kubernetes

This talk provides an overview of Numaflow, a Kubernetes-native stream data processing platform developed by Intuit. Numaflow aims to address the challenges of traditional stream processing systems by offering a lightweight, language-agnostic, and auto-scaling solution that is tightly integrated with Kubernetes.

Crossplane Intro and Deep Dive - the Cloud Native Control Plane Framework

Crossplane is a cloud-native control plane framework that allows users to manage and compose resources across multiple cloud providers. The talk covers the latest advancements in Crossplane, including the introduction of composition functions, environment configs, and server-side apply, which make it easier to use and extend the platform.

How to Choose the Best Kubernetes AI Edge Deployment Patterns for Your Use Case

The presentation discusses the challenges and solutions for deploying AI workloads at the edge using Kubernetes. It covers the definition of edge, the key characteristics of edge computing, and the specific challenges faced in Telco use cases, such as manageability, latency, security, and AI/ML model deployment. The proposed solution leverages Kubernetes and GitOps principles to address these challenges, enabling flexible and automated deployment of AI models across the edge-cloud continuum.

Dragonfly V2.2.0 - Intro, Updates, Model Distribution in AI Inference and Data Distribution in Serve

Dragonfly is an open-source project that provides image acceleration and file distribution using peer-to-peer technology, serving as a best practice and standard solution in cloud-native architectures. The talk covers Dragonfly's features, milestones, and its applications in AI inference and data distribution, highlighting the project's readiness for graduation and its future focus on AI-related use cases.

How to Save Millions Over Years Using KEDA? - Solene Butruille, BlackRock

This talk discusses how BlackRock used Kubernetes Event-Driven Autoscaling (KEDA) to efficiently manage their Aladdin Compute application, which allows users to start sessions and execute code. The key benefit was the ability to automatically scale idle sessions down to zero, reducing resource costs while still providing a good user experience, because sessions can be restarted quickly.

CloudEvents - Don't Call Us, We'll Call You | Project Lightning Talk

CloudEvents is a vendor-neutral specification for describing event data in a common way, enabling interoperability across services, platforms, and clouds. By providing a standard set of metadata and protocol bindings, CloudEvents simplifies the development and integration of event-driven applications, allowing developers to focus on the business logic rather than the underlying event handling mechanisms.

wasmCloud: Declarative WebAssembly Orchestration for Cloud Native Applications | Project Lightnin...

The wasmCloud project is a Sandbox project in the CNCF that focuses on declarative WebAssembly orchestration for cloud-native applications. The project provides a platform and tools for running WebAssembly components as the unit of compute, enabling cross-platform and cloud-agnostic deployments with integrated observability and standards-based integration.

Service Mesh

What’s New in Kuma: Advanced Service Mesh Capabilities | Project Lightning Talk

This talk introduces the latest features in Kuma, an open-source service mesh that provides security, observability, and advanced routing capabilities for microservices. The presentation covers auto-reachable services, mesh load balancing strategy, target ref policies, and the transition from standalone mode to federation, highlighting the performance and flexibility improvements in Kuma's service mesh capabilities.

Storage

Kubernetes SIG Storage: Intro & Deep Dive - Xing Yang, VMware & Jan Šafránek, Red Hat

The video provides an overview of the Kubernetes SIG Storage group, including their recent accomplishments, ongoing work, and future plans. The presenters discuss features such as volume attribute classes, volume group snapshots, and CSI migration, as well as how to get involved in the SIG's efforts.

Accelerating Kubernetes Data Intensive APPs with Cloud Native Local Storage

The presentation discusses the challenges and solutions for accelerating Kubernetes data-intensive applications using cloud-native local storage. The proposed solution, called HwameiStor, is a Kubernetes-native storage solution that unifies and manages local disks to provide high-performance, high-availability, and enterprise-level data management features for various use cases, including middleware, machine learning, virtualization, and edge computing.

Rook: Intro and Deep Dive with Ceph

This talk provides an introduction to the Rook project, which aims to make storage available to Kubernetes applications and manage it in a native way. It also covers the project's current state, new features, real-life examples, and the challenges of application disaster recovery and day-two operations in a Rook and Ceph environment.

Cloud Native Storage: The CNCF Storage TAG Projects, Technology & Landscape

This talk provides an overview of the CNCF Storage Technical Advisory Group (TAG), its role, and the various cloud-native storage projects it is working on. The talk covers the key attributes of cloud-native storage, the different storage patterns and architectures, and the ongoing efforts to address challenges in running data-intensive workloads on Kubernetes.

Longhorn: Intro, Deep Dive and Q&A - David Ko, SUSE

Longhorn is a highly available, reliable, and performant distributed block storage system based on Kubernetes. The presentation covers Longhorn's architecture, features, performance improvements, and upcoming roadmap, highlighting its suitability for various use cases and its growing adoption in the Kubernetes ecosystem.

CNCF Storage TAG and the Cloud Native Storage Landscape | TAG Lightning Talk

The CNCF Storage TAG (Technical Advisory Group) provides a community for the cloud native storage space, where users can learn, get advice, and find out about projects in the ecosystem. The TAG focuses on educating end-users, reviewing projects, engaging with the community, and providing subject matter expertise in the field of cloud native storage.

Sustainability

Saving the Planet One Cluster at a Time: Operationalising Sustainability in Kubernetes

This talk explores strategies for operationalizing sustainability in Kubernetes, focusing on measuring and reducing carbon emissions. The presenters discuss embodied and operational emissions, propose techniques like time-shifting and follow-the-sun approaches, and demonstrate the impact of optimizing node utilization on energy consumption and emissions.

CASPIAN: A Carbon-Optimized Multi-Cluster Job Scheduler - Tayebeh Bahreini & Asser Tantawi

CASPIAN is a carbon-optimized multi-cluster job scheduler that aims to minimize the carbon footprint of long-running machine learning jobs by scheduling them at the right time and place based on the carbon intensity of the underlying energy mix. The system leverages open-source components like MCAD and KubeStellar to manage job queuing, dispatching, and execution across multiple clusters, while the CASPIAN scheduler makes decisions to optimize for both carbon emissions and job completion time.

Unlock Energy Consumption in the Cloud with eBPF - Leonard Pahlke

The talk explores the importance of understanding energy consumption in cloud computing, highlighting the need for software engineers to be aware of the resource usage of the applications they develop. It discusses the various layers involved, from hardware to cloud infrastructure, and how tools like eBPF can be leveraged to provide more transparency and enable better decision-making around energy efficiency.

Keynote: Building IT Green: A Journey of Platforms, Data, and Developer Empowerment at Deutsche Bahn

The keynote explores Deutsche Bahn's journey in building a sustainable digital infrastructure, focusing on the empowerment of developers through platforms, data, and tools that enable them to measure and optimize the environmental impact of their applications. The presentation highlights the importance of starting small, fostering a grassroots movement, and leveraging developer motivation to drive the transition towards green IT.

Cloud Native Sustainability Efforts in the Community - TAG Environmental Sustainability

This talk provides an overview of cloud native sustainability efforts in the community, focusing on the work of TAG (Technical Advisory Group) Environmental Sustainability. It highlights the group's mission, structure, working groups, and ongoing projects aimed at improving the environmental sustainability of cloud-native and open-source technologies.

Keynote: Innovating Responsibly: How to Navigate Sustainability in the Era of Kubernetes

This talk explores how cloud providers and cloud consumers can work together to build and operate sustainable cloud infrastructure. The speakers discuss key metrics like power usage effectiveness (PUE) and water usage effectiveness (WUE), as well as practical steps like optimizing compute resources, implementing intelligent autoscaling, and collaborating across the cloud ecosystem.

Heating Pools with Cloud Power: A New Wave in Green Computing - Saiyam Pathak & Mark Bjornsgaard

This talk explores a novel approach to heating swimming pools using the heat generated by data centers, a sustainable solution that aims to address the growing energy demands of the data center industry. The presenters, Saiyam Pathak from Civo and Mark Bjornsgaard from Deep Green, discuss the technological and economic aspects of this innovative concept, highlighting its potential to reduce carbon emissions and provide cost-effective heating for various facilities.

The Data Pipelines Behind Forest Carbon Credits – Why Pachama Uses Flyte to Orchestrate Workflows

The video discusses the data pipelines behind forest carbon credits and why Pachama, a company focused on making the carbon credit market more transparent, chose to use the open-source workflow orchestration system Flyte. The talk covers Pachama's use cases for Flyte, including resource allocation, cost optimization, caching, and dependency isolation, as well as common pitfalls when getting started with Flyte.

Is Serverless Powerfully Powerless? - Jose Gomez-Selles & Kevin Dubois, Red Hat

The presenters explore the power consumption of serverless and non-serverless deployments using the Kepler tool, which provides granular power consumption metrics. They find that the power usage depends on the workload and that serverless can provide some power savings, but the results vary depending on the specific use case.

Tutorial: Cloud Native Sustainable LLM Inference in Action

This tutorial covers the development of cloud-native sustainable large language model (LLM) inference, including techniques like continuous batching, KV cache optimization, and model quantization to improve efficiency and reduce energy consumption. The presentation also demonstrates a chatbot application powered by LLMs and the use of the Kepler project to measure and visualize the energy usage of these models.

Sustainable Computing: Measuring Application Energy Consumption in Kubernetes Environments with K...

This talk presents the latest updates to Kepler, a CNCF and COSI project that measures the energy consumption of processes, containers, and pods in Kubernetes environments. The talk covers new features such as support for GPU virtualization, the use of pre-trained power models to estimate power consumption, and the introduction of a new project called SusQL (Sustainability Queries for AI Applications) that aims to bridge the gap between high-level energy metrics and application-level energy consumption reporting.

Lightning Talk: Debunking Myths About Environmental Sustainab... Niki Manoledaki & Kristina Devochko

This lightning talk aims to debunk common myths about environmental sustainability in the cloud-native space. The speakers address misconceptions around carbon offsets, cloud sustainability, and the use of cost as a proxy for sustainability, while also highlighting the risk of greenwashing and the importance of transparency, accountability, and community engagement.

Telco

Testing K8s Cluster and VNFs in Telco Staging Environments - Hiromu Asahina & Kentaro Ogawa

This presentation demonstrates a Kubernetes-based approach to testing 5G systems and virtualized network functions (VNFs) in a telco staging environment. The presenters discuss the challenges of integrating legacy systems, managing test scenarios, and automating the deployment and upgrade processes using GitOps principles.