Jun 13, 2025

Why I Stopped Managing Kubernetes the Traditional Way

Discover why traditional Kubernetes ops fall short at scale, and how internal platforms like CORD make it sustainable, scalable, and developer-friendly.

MeetUp

Author

Boudhayan Ghosh, Technical Content Writer

Editor’s Note: This blog is adapted from a talk by Bharath Nallapeta, Senior Software Engineer at Marant. In this session, he reflects on the evolution of Kubernetes operations and the growing need for internal developer platforms. Drawing from his experience building CORD, an open source multi-cluster platform, Bharath shares why platform engineering is not just a scaling strategy—it is a response to operational chaos, and a way to make Kubernetes sustainable for the long run.

My name is Bharath Nallapeta, and I work as a Senior Software Engineer at Marant. I sit in the Open Source Program Office, a place where we do more than just build with open source—we help shape it. Our job is to work across internal infrastructure, community projects, and enterprise systems, trying to keep all of it coherent.

Over the last few years, I have been deeply involved in Kubernetes operations. What started as a technical challenge turned into a larger question: how do we make this sustainable for teams? This is not a technical breakdown. This is a reflection on the shifts I have seen and why I believe the future of Kubernetes is not about more tools, but about better design.

What Red Hat Taught Me About Thinking Long-Term

When I first joined Marant, I thought open source meant sharing code. That was true, but it was only part of the picture.

Red Hat does something clever. They take upstream open source projects and turn them into products. What changes is the structure around them: packaging, security layers, guarantees, and support. This split, upstream for innovation and downstream for stability, is what gives their model power.

I began to see the same pattern in other companies, too. Microsoft's open-sourcing parts of GitHub Copilot was not merely an act of goodwill. It was a strategic decision. Developers trust what they can see. Momentum lives in the open.

That mindset shaped how we work at Marant. If we build something useful, we do not just polish it for internal use. We put it out there, upstream first. Feedback comes faster. Adoption becomes organic. And trust builds itself.

Kubernetes Fixed a Pain, Then Uncovered a Pattern

Kubernetes gave us a shared language for deployments. Suddenly, what worked in one environment worked everywhere. The pain of "it runs on my machine" started to disappear.

But solving that problem revealed another one.

As companies embraced Kubernetes, they also opened the door to a flood of tooling. Different clusters for different teams. Different policies for different clouds. Layer after layer of configurations, most of them disconnected.

At some point, we stopped asking whether something could be done in Kubernetes and started asking how many moving parts we were willing to juggle. That is when I realised that the next problem to solve was not at the level of the cluster—it was at the level of the system.

Internal Platforms Are Not a Luxury Anymore

I have heard people say that platform engineering is just DevOps with a facelift. That has not been my experience.

DevOps was about breaking down walls between development and operations. It worked. But only up to a point.

When you are running ten Kubernetes clusters across multiple regions, you need something more repeatable than handoffs and scripts. You need a system that knows how to scale with you. That is what internal platforms do. They let teams ask for environments without waiting. They give operations teams clarity without compromise.

I often describe it this way: the platform should feel invisible until something breaks. Then it should become very visible, very fast.

The Industry Is Scaling Clusters Without a Map

A recent survey showed that more than 75 percent of companies are running multiple Kubernetes clusters. That is not surprising. What is surprising is that fewer than a third have any kind of platform engineering practice in place.

The result is technical debt that grows in silence.

At first, things work. But then clusters multiply. Teams start solving the same problems in different ways. Security, cost tracking, and logging—each takes on a slightly different shape. Eventually, no one has the full picture.

This is not an engineering failure. It is an architecture gap. And the fix is not another tool. It is a decision to invest in common ground.

That is why we built CORD.

Building CORD: A Platform That Gets Out of the Way

CORD is our internal Kubernetes platform. It is open source, designed for enterprise use, and focused on three ideas: make cluster provisioning easy, make state management reliable, and make observability automatic.

Provisioning
A developer should not need to think about cloud APIs or base images. They should be able to say what they want—a cluster with three nodes, GPU enabled, running a specific stack—and get it, fast.
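
To make the idea concrete, here is a minimal sketch of what a declarative cluster request could look like. The `ClusterRequest` class, its fields, and `to_manifest` are hypothetical names chosen for illustration; they are not CORD's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterRequest:
    """Hypothetical declarative request; not CORD's real schema."""
    name: str
    worker_nodes: int = 3
    gpu: bool = False
    stack: tuple[str, ...] = ()

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the request is acceptable."""
        problems = []
        if not self.name:
            problems.append("name must not be empty")
        if self.worker_nodes < 1:
            problems.append("worker_nodes must be at least 1")
        return problems


def to_manifest(req: ClusterRequest) -> dict:
    """Render the request as a plain dict that a platform API could consume."""
    problems = req.validate()
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "name": req.name,
        "workers": {"count": req.worker_nodes, "gpu": req.gpu},
        "stack": list(req.stack),
    }


# The developer states intent; cloud APIs and base images stay out of sight.
request = ClusterRequest(
    name="ml-dev",
    worker_nodes=3,
    gpu=True,
    stack=("cert-manager", "ingress-nginx"),
)
manifest = to_manifest(request)
```

The point of the sketch is the shape of the interaction: the request describes what is wanted, and everything provider-specific would live behind whatever consumes the manifest.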

State Management
Most teams end up writing scripts to install the same tools—cert-manager, ingress controllers, logging agents—on every cluster. With CORD, we write the config once. It propagates across everything.
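
One way to picture "write the config once" is a set of profiles matched to clusters by label, with a planner that works out what each cluster is missing. The sketch below assumes that model; the profile format, the `plan_installs` function, and the cluster names are invented for illustration and do not describe how CORD or Pelto work internally.

```python
def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """A profile applies when every selector key/value appears in the cluster labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def plan_installs(profiles: list[dict], clusters: dict[str, dict]) -> dict[str, list[str]]:
    """For each cluster, list the add-ons it should have but does not have yet."""
    plan = {}
    for name, cluster in clusters.items():
        desired = set()
        for profile in profiles:
            if matches(profile["selector"], cluster["labels"]):
                desired.update(profile["addons"])
        plan[name] = sorted(desired - cluster["installed"])
    return plan


# Hypothetical shared config: an empty selector matches every cluster.
profiles = [
    {"selector": {}, "addons": ["cert-manager", "logging-agent"]},
    {"selector": {"tier": "edge"}, "addons": ["ingress-controller"]},
]

# Hypothetical fleet state.
clusters = {
    "prod-eu": {"labels": {"tier": "edge"}, "installed": {"cert-manager"}},
    "dev": {"labels": {"tier": "internal"}, "installed": set()},
}

plan = plan_installs(profiles, clusters)
# plan["prod-eu"] == ["ingress-controller", "logging-agent"]
# plan["dev"] == ["cert-manager", "logging-agent"]
```

Adding a new cluster with the right labels would pick up the same add-ons without anyone writing another install script, which is the property the paragraph above describes.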

Observability
Everyone wants visibility, but no one wants to maintain a separate stack for every environment. We built observability into the platform using open tools: Grafana, Prometheus, VictoriaMetrics, OpenTelemetry, and OpenCost. Metrics are real-time. Costs are traceable. Logs are linkable.

The Stack That Powers It

CORD is not built on proprietary abstractions. It is powered by components we trust:

k0s, a lightweight Kubernetes distribution, handles lean deployment.
Cluster API (CAPI) abstracts cloud provisioning behind a uniform interface.
Pelto lets us manage state across environments without rewriting logic.
VictoriaMetrics gives us efficient time-series storage.
KOF, our wrapper around OpenCost, extends cost insights across clusters.

Each piece is modular. We can swap parts out as our needs evolve. That flexibility is what makes the system future-proof.
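
Swappable parts usually come down to coding against a small interface rather than a specific product. The following is a rough sketch of that idea, assuming a metrics backend as the replaceable piece; `MetricsStore`, `InMemoryStore`, and `Platform` are hypothetical names and are not taken from CORD's codebase.

```python
from typing import Optional, Protocol


class MetricsStore(Protocol):
    """The only contract the platform depends on (hypothetical)."""

    def write(self, series: str, value: float) -> None: ...

    def latest(self, series: str) -> Optional[float]: ...


class InMemoryStore:
    """Stand-in backend; a real one would talk to a time-series database."""

    def __init__(self) -> None:
        self._data: dict[str, list[float]] = {}

    def write(self, series: str, value: float) -> None:
        self._data.setdefault(series, []).append(value)

    def latest(self, series: str) -> Optional[float]:
        values = self._data.get(series)
        return values[-1] if values else None


class Platform:
    """Depends on the MetricsStore contract, not on a concrete backend."""

    def __init__(self, metrics: MetricsStore) -> None:
        self.metrics = metrics

    def record_node_count(self, cluster: str, count: int) -> None:
        self.metrics.write(f"{cluster}/nodes", float(count))


platform = Platform(InMemoryStore())
platform.record_node_count("dev", 3)
```

Replacing the backend would mean providing another class with the same two methods; `Platform` itself would not change.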

A Shared App Store for Infrastructure

One of my favourite features in CORD is the catalogue. It is a collection of preconfigured tools, such as ArgoCD, External Secrets, and Dagger, that teams can install with a single Helm command.

It started with five services. Now we have dozens.

This small piece of UX turned out to be incredibly powerful. It helped teams move faster without opening tickets. And it gave us a way to share best practices without writing long documents.

Run It, Break It, Learn from It

CORD is open source. You can find it on GitHub, run it on a laptop using k0s, and add a Raspberry Pi as a worker node if you want to test things for real. Everything is transparent. Feedback is welcome.

We did not build it to replace Kubernetes. We built it to make Kubernetes livable.

We are still improving it. But we have already seen what a difference it makes when platform thinking is applied early, not late.

If you are running clusters and feeling the strain, this might be the step you have been waiting to take.
