May 7, 2026

The AI-Native Enterprise Evolution | Saurabh Sahu

Explore Saurabh Sahu’s insights on AI-native enterprise, AI gateways, model governance, agentic SDLC, and workspace.build for scalable AI adoption from thegeekconf mini 2026.

Author

Saurabh Sahu, Chief Technology Officer (CTO)

Subject Matter Expert

Apoorva Sahu, Director


Editor's Note: This blog post is adapted from a talk delivered at thegeekconf mini 2026 by Saurabh Sahu, Chief Technology Officer at GeekyAnts. With hands-on experience building AI systems for enterprise teams, Saurabh walks through three real decisions his team made: building an AI gateway for model governance, shifting to an agentic software development life cycle, and developing workspace.build, a tool that runs agents on cloud virtual machines so developers can work across multiple projects without waiting.

I am Saurabh Sahu, CTO at GeekyAnts. I am here to share some of the experiments and R&D we are doing at GeekyAnts, and how our thinking around AI has changed as a company.

The Incident That Changed How We Think About AI Tools

A few weeks back, on a Friday morning at 11 AM, our engineers were shipping code to production using an AI agentic coding tool we had rolled out to all developers. Per company policy I cannot name it, but that is what it was. All of a sudden, the entire engineering team got logged out of the tool.

The cause was a policy change rolling out on the vendor's side. We started getting a flood of emails. That is when we realised how comfortable we had become with a single tool. We had built entire workflows around it, built agent tools on top of it, and were shipping code through it. In a single day, productivity dropped to zero.

That day we learned something: a single-vendor AI setup is a single point of failure. You cannot rely on one vendor. If that vendor changes their policy or doubles their price, you are at risk.

Control, Safety, and Configuration

We identified three needs that came out of that day.

First, control over the models. If users access Opus 4.7 without restriction, costs go through the roof. We needed an admin panel where we could control which teams get access to which models and at what cost.

Second, better safety and guardrails. Anyone could upload financial data to an external LLM and it would go out. We needed the ability to block requests that carry sensitive data.
Third, a common configuration layer for agents, tools, and MCPs so that all developers work from a shared foundation.
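The second need, blocking requests that carry sensitive data, can be sketched as a simple pattern check before a request leaves the gateway. The patterns and function name below are illustrative assumptions, not our production guardrail rules:

```python
import re

# Illustrative patterns for sensitive data; real guardrails would be far broader.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{16}\b"),                         # 16-digit card-like numbers
    re.compile(r"(?i)\b(api[_-]?key|secret)\s*[:=]"),  # inline credentials
]

def is_blocked(prompt: str) -> bool:
    """Return True if the prompt matches any sensitive-data pattern."""
    return any(p.search(prompt) for p in SENSITIVE_PATTERNS)

print(is_blocked("Summarise Q3 revenue, card 4111111111111111"))  # True
print(is_blocked("Refactor this React component"))                # False
```

In a real gateway this check would sit in the request path, so a match returns an error to the client instead of forwarding the prompt to an external LLM.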

Building the AI Gateway

Our CEO, Kumar Pratik, who is here today, called a two-day hackathon. Together with solution architects, we built an AI gateway. Every request, whether from Cursor or Claude, goes through this gateway. It gives us a layer in between to control the models, configure local LLMs, configure agents, and set guardrail rules to block certain requests.

Before this, with Claude Code, we had no visibility into how people used it. No logs. The gateway solved that. It was built on open source: what would have taken a month to build from scratch, we assembled on top of an open-source project called LiteLLM. If you want to build something similar for your organisation, that is a good place to start.
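As a rough sketch of what "every request goes through the gateway" means in practice: clients speak an OpenAI-compatible API to an internal endpoint, and the gateway maps a model alias to a real model, logs the call, and applies policy. The URL, alias, and metadata field here are hypothetical illustrations, not LiteLLM's actual schema:

```python
import json

# Hypothetical internal gateway endpoint; a placeholder, not our real setup.
GATEWAY_URL = "https://ai-gateway.internal/v1/chat/completions"

def build_request(team: str, model_alias: str, prompt: str) -> dict:
    """Shape an OpenAI-compatible request the gateway can log, route, and police."""
    return {
        "model": model_alias,                 # gateway maps the alias to a real model
        "messages": [{"role": "user", "content": prompt}],
        "metadata": {"team": team},           # lets the gateway enforce per-team access
    }

req = build_request("payments", "team-default", "Refactor the invoice parser")
print(json.dumps(req, indent=2))
```

Because the client only ever sees the alias, the organisation can swap the underlying vendor or model behind it without touching any developer tooling, which is exactly the single-point-of-failure fix described above.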

This gave us governance over our AI usage. That was learning number one.

Agentic Software Development Life Cycle

The second learning was about how we build software using these models. We moved from a software development life cycle to an AI software development life cycle.

We built several agents on our Claude Code setup. The BRD agent takes questions from users, writes user stories, and covers all edge cases. A human must approve before it moves forward. Once approved, it breaks user stories into smaller executable tasks and writes them to a markdown file. The next agent picks up that file and starts implementing the features and writing the code.
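The hand-off between the BRD agent and the implementation agent is just a markdown file. A toy sketch of that step, with an invented story list and file name:

```python
from pathlib import Path

def write_task_file(stories: list[str], path: str = "tasks.md") -> str:
    """Break approved user stories into a checklist the next agent can pick up."""
    lines = ["# Tasks", ""]
    for i, story in enumerate(stories, 1):
        lines.append(f"- [ ] {i}. {story}")
    content = "\n".join(lines) + "\n"
    Path(path).write_text(content)
    return content

print(write_task_file(["Add login form", "Validate email on signup"]))
```

Plain markdown as the interface is a deliberately boring choice: humans can read and amend the task list at the approval checkpoint, and any agent or tool can parse it.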

We also built a review agent that gives suggestions on pull requests and identifies improvement areas. Finally, a test agent connects to the Chrome MCP, runs all the written test cases, and gives the developer feedback on whether the tests pass or fail. Once approved, the code gets merged.

Each command is an agent. Every step requires human approval, so the developer stays in control. This setup is tool-agnostic: it works with Cursor rules, Windsurf, or Claude Code.

What This Means for Developers

As AI agents take on more of the work, the expectations on developers grow. Developers now need to step into the role of a product owner. They need to level up their communication skills and talk to clients. A single developer now does a lot more because the agent handles a lot of the execution. That was a clear learning for us.

R&D: workspace.build

The next problem we wanted to solve was waiting. With the current agentic setup, a developer sits at their laptop and waits five to ten minutes for an agent to finish before they can approve and move to the next step. We wanted to remove that wait.

The idea was to run agents on a cloud virtual machine. The developer kicks off the agent, closes their laptop, goes home, and does whatever they want. They come back to a dashboard that shows progress, a video recording of what ran on the virtual machine, and logs. They can then approve or reject and decide whether to merge. Running multiple agents in parallel on multiple projects becomes possible, which raises productivity.
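The scheduling idea behind this can be sketched as a simple job board: kick off runs on remote machines, let them park at "awaiting approval", and read their state from a dashboard. All names here are hypothetical illustrations, not the real workspace.build API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentRun:
    project: str
    status: str = "running"          # running -> awaiting_approval -> merged/rejected
    log: list[str] = field(default_factory=list)

class Workspace:
    """Toy in-memory stand-in for agents running on cloud VMs."""
    def __init__(self):
        self.runs: dict[str, AgentRun] = {}

    def kick_off(self, project: str) -> None:
        self.runs[project] = AgentRun(project)

    def finish(self, project: str) -> None:
        run = self.runs[project]
        run.status = "awaiting_approval"
        run.log.append("agent finished; waiting for human review")

    def dashboard(self) -> dict[str, str]:
        return {p: r.status for p, r in self.runs.items()}

ws = Workspace()
for project in ("storefront", "billing", "mobile-app"):   # agents in parallel
    ws.kick_off(project)
ws.finish("billing")
print(ws.dashboard())
# {'storefront': 'running', 'billing': 'awaiting_approval', 'mobile-app': 'running'}
```

The point of the sketch is the shape of the state machine: because runs live on remote machines rather than the developer's laptop, the laptop can close without killing anything, and several projects can sit in different states at once.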

This R&D project is called workspace.build. We are still working on it internally and will start rolling it out to users on a request basis over the coming months.

Deployment from workspace.build

We also wanted to solve deployment from within workspace.build. Once the agent work is done, you can deploy to any cluster, whether on-premise or on any cloud, by bringing your own cluster. The approach is GitOps-driven. You push a commit instead of SSHing into a server. Rollbacks are easier, and anyone can configure them.

From Incident to Ecosystem

Our ecosystem at GeekyAnts sits on three pillars. First, the AI gateway built on LiteLLM controls the model layer and maintains governance. Second, an agentic software development life cycle where agents handle the steps and humans approve at each checkpoint. Third, workspace.build, where agents run on virtual machines and ship code autonomously.

Our CEO, Pratik, is here if you want to connect offline. Suresh, our keynote speaker, is also here. This is a high-level bird's-eye view of what we are building and how we use AI at GeekyAnts.

Q&A

  • On tool agnosticism and vendor dependency

The agents we built were on Claude Code. The skills were already there. We made the setup tool-agnostic so we can configure it on Cursor, Windsurf, and other tools. The goal is not to rely on any single tool.

  • On managing multiple vendor costs

Question: You said you do not want to rely on one vendor, but using multiple vendors means paying for multiple. How do you manage that?

The AI gateway helps here. We are still configuring it to route to the right models. We have better governance and control over which models are open to which users. The organisation chooses the models, not the individual user.

  • On routing tasks to the right model

Question: If you have five LLMs running, how do you decide which task goes to which model?

We are still figuring that out. We are experimenting with Qwen and local LLM setups to find what gives the best output at the right cost. Right now, we have Sonnet open for developers. The direction is to use a lower-level model for routine tasks and reserve the more powerful models for tasks that need them. We want to control that routing through the AI gateway.
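The routing direction described here, cheap models for routine work and expensive ones only when needed, can be sketched as a tier lookup inside the gateway. The tier names and model aliases are illustrative, not our actual configuration:

```python
# Illustrative model tiers; real routing would live in the AI gateway.
MODEL_TIERS = {
    "routine": "local-qwen",     # boilerplate, renames, simple CRUD
    "standard": "sonnet",        # everyday feature work
    "complex": "opus",           # architecture, tricky legacy refactors
}

def route(task_complexity: str) -> str:
    """Pick the cheapest model tier judged able to handle the task."""
    return MODEL_TIERS.get(task_complexity, MODEL_TIERS["standard"])

print(route("routine"))   # local-qwen
print(route("complex"))   # opus
print(route("unknown"))   # falls back to sonnet
```

The hard part, which we are still experimenting with, is the classifier that decides `task_complexity` in the first place; the lookup itself is trivial once that signal exists.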

  • On migrating legacy enterprise software

Question: What is your take on using AI agents to migrate enterprise-level legacy systems, for example, from Oracle to SQL Server?

That needs to be broken down into smaller steps. Moving an enterprise stack is a huge task. We did this with an enterprise company in the US around 2022, and it took around two years. We did the frontend first, then cloud migration, and kept breaking it into smaller chunks. It is a long and debatable topic. Large enterprise ecosystems take a lot of time, and the right approach is to go step by step.

  • On human approval and overnight runs

Question: If I kick off tasks in the evening and the agent needs human approval mid-run, does it stop or keep running?

It is an agent configuration. For example, the BRD agent generates user stories but it needs human approval before it moves to the next step. The agent sends you an alert on your mobile, similar to how Claude's dispatch works. You can approve from your phone. If you sleep through it, it waits until the next morning.
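That approval gate can be sketched as a blocking checkpoint in the agent loop. The names are hypothetical, and a queue stands in for the mobile approve/reject channel:

```python
import queue

def run_with_checkpoint(steps: list[str], approvals: queue.Queue) -> list[str]:
    """Run steps in order, pausing before each one until an approval arrives.

    If no approval ever comes (the developer sleeps through the alert),
    the agent simply stays parked at the checkpoint until one does.
    """
    done = []
    for step in steps:
        verdict = approvals.get()        # blocks until a human responds
        if verdict != "approve":
            break                        # rejected: stop the run here
        done.append(step)
    return done

approvals = queue.Queue()
for v in ("approve", "approve", "reject"):
    approvals.put(v)
result = run_with_checkpoint(["write stories", "implement", "merge"], approvals)
print(result)   # ['write stories', 'implement']
```

The blocking `get()` is the whole design: the run neither proceeds without a human nor dies while waiting, which is what makes overnight kick-offs safe.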

  • On job-specific agents for UI and database work

Question: Do you have separate agents for specific roles like UI design or database architecture?

Yes. For example, we have an integration with the Figma MCP so it can generate designs. Those are separate internal agents. The slides showed the high-level pointers that sit within the software development life cycle. The job-specific ones exist separately.

  • On code quality and model selection

Question: Different models give different code quality. Which model works best for maintaining legacy code?

Opus 4.7, launched two to three days ago, gives the best output right now. But it is expensive. At the same time, 60% of the code a developer writes on any given day is general enough that a basic model can handle it. You do not need the most powerful model for every task. The AI gateway lets us route smaller tasks to a lower-level model and reserve the high-end models for what needs them.
