May 7, 2026
The AI-Native Enterprise Evolution | Saurabh Sahu
Explore Saurabh Sahu's insights on the AI-native enterprise, AI gateways, model governance, the agentic SDLC, and workspace.build for scalable AI adoption, from thegeekconf mini 2026.
Editor's Note: This blog post is adapted from a talk delivered at thegeekconf mini 2026 by Saurabh Sahu, Chief Technical Officer at GeekyAnts. With hands-on experience building AI systems for enterprise teams, Saurabh walks through three real decisions his team made: building an AI gateway for model governance, shifting to an agentic software development life cycle, and developing workspace.build, a tool that runs agents on cloud virtual machines so developers can work across multiple projects without waiting.
The Incident That Changed How We Think About AI Tools
A few weeks back, on a Friday morning at 11 AM, our engineers were shipping code to production through an AI agentic coding tool we had rolled out to all developers. Policy prevents me from naming the vendor, but that is what it was. All of a sudden, the entire engineering team got logged out of the tool.
The reason was a policy change going through on the vendor's side. We started getting a lot of emails. That is when we realised how comfortable we had become with a single tool. We had built entire workflows around it, built agent tools on top of it, and were shipping code through it. One day, productivity dropped to zero.
That day we learned something: a single AI vendor is a single point of failure. You cannot rely on one vendor. If that vendor changes their policy or doubles their price, you are at risk.
Control, Safety, and Configuration
Three needs came out of that day: control, safety, and configuration.
First, control over the models. If users access Opus 4.7 without restriction, costs go through the roof. We needed an admin panel where we could control which teams get access to which models and at what cost.
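To make that concrete, here is a minimal sketch of the kind of rule such an admin panel would enforce. The team names, model aliases, and budget figures are hypothetical placeholders, not our actual policy.

```python
# Minimal sketch of a per-team access rule; teams, model aliases, and
# budgets here are hypothetical placeholders.
TEAM_POLICY = {
    "mobile":   {"models": {"claude-sonnet"},             "monthly_budget_usd": 500},
    "platform": {"models": {"claude-sonnet", "opus-4.7"}, "monthly_budget_usd": 2000},
}

def is_allowed(team: str, model: str, spent_usd: float) -> bool:
    """Allow a request only if the team may use the model and has budget left."""
    policy = TEAM_POLICY.get(team)
    return (
        policy is not None
        and model in policy["models"]
        and spent_usd < policy["monthly_budget_usd"]
    )

print(is_allowed("mobile", "opus-4.7", spent_usd=120.0))  # False: not on the allowlist
```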
Building the AI Gateway
Our CEO, Kumar Pratik, who is here today, called a two-day hackathon. Together with solution architects, we built an AI gateway. Every request, whether from Cursor or Claude Code, goes through this gateway. It gives us a layer in between to control the models, configure local LLMs, configure agents, and set guardrail rules to block certain requests.
Before this, we had no visibility into how people used Claude Code. No logs. The gateway solved that, and it was built on open source: what would have taken a month to build from scratch, we stood up on an open-source project called LiteLLM. If you want to build something similar for your organisation, that is a good place to start.
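One useful property of LiteLLM's proxy is that it exposes an OpenAI-compatible endpoint, so pointing any standard client at the gateway is enough to route traffic through it. A minimal sketch; the gateway URL, virtual key, and model alias below are placeholders, not our actual configuration.

```python
# Every request goes through the gateway, which can log it, meter the
# cost, apply guardrails, and resolve the model alias. The URL, key, and
# alias are illustrative placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://ai-gateway.internal.example/v1",  # the LiteLLM proxy
    api_key="sk-team-scoped-virtual-key",               # issued per team by admins
)

resp = client.chat.completions.create(
    model="claude-sonnet",  # an alias the gateway maps to a real model
    messages=[{"role": "user", "content": "Summarise this pull request."}],
)
print(resp.choices[0].message.content)
```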
This gave us governance over our AI usage. That was learning number one.
Agentic Software Development Life Cycle
The second learning was about how we build software with these models. We moved from a traditional software development life cycle to an agentic one.
We built several agents on our Claude Code setup. The BRD agent takes questions from users, writes user stories, and covers all edge cases. A human must approve before it moves forward. Once approved, it breaks user stories into smaller executable tasks and writes them to a markdown file. The next agent picks up that file and starts implementing the features and writing the code.
We also built a review agent that gives suggestions on pull requests and identifies improvement areas. Finally, a test agent connects to the Chrome MCP, runs all the written test cases, and gives the developer feedback on whether the tests pass or fail. Once approved, the code gets merged.
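Roughly, the flow chains those agents with human gates between them. The sketch below is illustrative only: each agent function is a stub standing in for a real agent (Claude Code in our setup), and only the control flow and approval gates matter.

```python
# Illustrative shape of the agentic SDLC described above. Each agent is a
# stub standing in for a real agent; only the control flow matters here.

def brd_agent(requirements: str) -> str:
    return f"user stories + edge cases for: {requirements}"      # stub

def breakdown_agent(stories: str) -> str:
    return "- [ ] task 1\n- [ ] task 2"                          # stub: tasks.md

def implementation_agent(tasks_md: str) -> str:
    return "diff implementing the tasks"                         # stub

def review_agent(code: str) -> str:
    return "review suggestions for the pull request"             # stub

def test_agent(code: str) -> str:
    return "tests: PASS"                                         # stub: via Chrome MCP

def approved(stage: str, artifact: str) -> bool:
    """Human checkpoint: nothing moves forward without an explicit yes."""
    return input(f"[{stage}]\n{artifact}\napprove? (y/n) > ").strip().lower() == "y"

def run_pipeline(requirements: str) -> None:
    stories = brd_agent(requirements)
    if not approved("BRD", stories):                 # gate 1: approve user stories
        return
    tasks_md = breakdown_agent(stories)              # tasks written to markdown
    code = implementation_agent(tasks_md)            # next agent picks up the file
    review = review_agent(code)
    report = test_agent(code)
    if approved("merge", f"{review}\n{report}"):     # gate 2: approve the merge
        print("merged")

if __name__ == "__main__":
    run_pipeline("Add a password reset flow")
```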
What This Means for Developers
As AI agents take on more of the work, the expectations on developers grow. Developers now need to step into the role of a product owner. They need to level up their communication skills and talk to clients. A single developer now does a lot more because the agent handles a lot of the execution. That was a clear learning for us.
R&D: workspace.build
The next problem we wanted to solve was waiting. With the current agentic setup, a developer sits at their laptop and waits five to ten minutes for an agent to finish before they can approve and move to the next step. We wanted to remove that wait.
The idea was to run agents on a cloud virtual machine. The developer kicks off the agent, closes their laptop, goes home, and does whatever they want. They come back to a dashboard that shows progress, a video recording of what ran on the virtual machine, and logs. They can then approve or reject and decide whether to merge. Running multiple agents in parallel on multiple projects becomes possible, which raises productivity.
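A client-side sketch of that fire-and-forget flow, assuming a hypothetical HTTP API: the endpoint, routes, and payload fields below are invented for illustration, not workspace.build's actual interface.

```python
# Hypothetical client flow for running an agent on a cloud VM: kick off a
# run, close the laptop, and poll the dashboard API later. Endpoints and
# fields are invented for illustration.
import time
import requests

BASE = "https://workspace.example/api"  # placeholder endpoint

def kick_off(repo: str, task: str) -> str:
    resp = requests.post(f"{BASE}/runs", json={"repo": repo, "task": task})
    resp.raise_for_status()
    return resp.json()["run_id"]        # safe to close the laptop now

def status(run_id: str) -> dict:
    # Carries progress, a link to the VM screen recording, and logs.
    return requests.get(f"{BASE}/runs/{run_id}").json()

run_id = kick_off("org/app", "implement the tasks in tasks.md")
while status(run_id)["state"] == "running":
    time.sleep(60)                      # the agent keeps working without you
print("ready to approve or reject:", run_id)
```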
Deployment from workspace.build
We also wanted to solve deployment from within workspace.build. Once the agent work is done, you can deploy to any cluster, whether on-premise or on any cloud, by bringing your own cluster. The approach is GitOps-driven: you push a commit instead of SSHing into a server. Rollbacks are easier, and anyone can configure them.
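In GitOps terms, "deploy" means "commit". A sketch of that flow, assuming a reconciler such as Argo CD or Flux is watching the repo; the manifest path and the tag-rewrite logic are illustrative.

```python
# GitOps-style deploy: change a manifest, commit, push. No SSH; a
# reconciler watching the repo converges the cluster to the new state,
# and a rollback is just `git revert`. Paths and regex are illustrative.
import pathlib
import re
import subprocess

def deploy(new_tag: str, repo: str = "infra-repo") -> None:
    manifest = pathlib.Path(repo, "apps", "web", "deployment.yaml")
    text = manifest.read_text()
    # Bump the image tag, e.g. `image: registry.example/web:v1.2.3`.
    text = re.sub(r"(image:\s*\S+:)\S+", rf"\g<1>{new_tag}", text)
    manifest.write_text(text)
    subprocess.run(["git", "-C", repo, "commit", "-am", f"deploy web {new_tag}"], check=True)
    subprocess.run(["git", "-C", repo, "push"], check=True)

deploy("v1.2.4")
```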
From Incident to Ecosystem
Our ecosystem at GeekyAnts sits on three pillars. First, the AI gateway built on LiteLLM controls the model layer and maintains governance. Second, an agentic software development life cycle where agents handle the steps and humans approve at each checkpoint. Third, workspace.build, where agents run on virtual machines and ship code autonomously.
Q&A
- On tool agnosticism and vendor dependency
The agents we built ran on Claude Code, where the skills were already in place. We made the setup tool-agnostic so we can configure it on Cursor, Windsurf, and other tools. The goal is not to rely on any single tool.
- On managing multiple vendor costs
Question: You said you do not want to rely on one vendor, but using multiple vendors means paying for multiple. How do you manage that?
The AI gateway helps here. We are still configuring it to route to the right models. We have better governance and control over which models are open to which users. The organisation chooses the models, not the individual user.
- On routing tasks to the right model
Question: If you have five LLMs running, how do you decide which task goes to which model?
We are still figuring that out. We are experimenting with Qwen and local LLM setups to find what gives the best output at the right cost. Right now, we have Sonnet open for developers. The direction is to use a smaller, cheaper model for routine tasks and reserve the more powerful models for tasks that need them. We want to control that routing through the AI gateway.
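As a sketch of where that is heading, the routing table below pairs task tiers with model aliases. The tiers and aliases are illustrative, and in practice this decision would live in the AI gateway rather than in client code.

```python
# Illustrative routing policy: cheap or local models for routine work,
# stronger models reserved for tasks that need them. Tiers and aliases
# are placeholders; the gateway would own this decision in practice.
ROUTES = {
    "routine":  "qwen-local",     # boilerplate, lint fixes, commit messages
    "standard": "claude-sonnet",  # day-to-day feature work
    "hard":     "opus-4.7",       # architecture changes, tricky debugging
}

def pick_model(task_tier: str) -> str:
    return ROUTES.get(task_tier, ROUTES["standard"])

print(pick_model("routine"))  # qwen-local
```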
- On migrating legacy enterprise software
Question: What is your take on using AI agents to migrate enterprise-level legacy systems, for example, from Oracle to SQL Server?
That needs to be broken down into smaller steps. Moving an enterprise stack is a huge task. We did this with an enterprise company in the US around 2022, and it took around two years. We did the frontend first, then cloud migration, and kept breaking it into smaller chunks. It is a long and debatable topic. Large enterprise ecosystems take a lot of time, and the right approach is to go step by step.
- On human approval and overnight runs
Question: If I kick off tasks in the evening and the agent needs human approval mid-run, does it stop or keep running?
It depends on the agent configuration. For example, the BRD agent generates user stories but needs human approval before it moves to the next step. The agent sends you an alert on your mobile, similar to how Claude's dispatch works. You can approve from your phone. If you sleep through it, it waits until the next morning.
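A minimal sketch of that checkpoint behaviour: notify, then block until approval arrives, however long that takes. The alert function and the approval store are hypothetical stand-ins.

```python
# Sketch of a blocking checkpoint: the agent alerts the developer's phone
# and simply waits; sleep through it and the run resumes in the morning.
# The alert function and approval store are hypothetical stand-ins.
import time

APPROVALS: set[str] = set()     # in practice, a shared store the phone app writes to

def send_mobile_alert(message: str) -> None:
    print(f"(push) {message}")  # stand-in for a real push-notification service

def wait_for_approval(stage: str, poll_seconds: int = 300) -> None:
    send_mobile_alert(f"'{stage}' needs your approval")
    while stage not in APPROVALS:   # blocks until someone approves
        time.sleep(poll_seconds)
```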
- On job-specific agents for UI and database work
Question: Do you have separate agents for specific roles like UI design or database architecture?
Yes. For example, we have an integration with the Figma MCP so it can generate designs. Those are separate internal agents. The slides showed the high-level pointers that sit within the software development life cycle. The job-specific ones exist separately.
- On code quality and model selection
Question: Different models give different code quality. Which model works best for maintaining legacy code?