May 7, 2026
The AI-Native Enterprise Evolution | Saurabh Sahu
Explore Saurabh Sahu’s insights on AI-native enterprise, AI gateways, model governance, agentic SDLC, and workspace.build for scalable AI adoption from thegeekconf mini 2026.
Editor's Note: This blog post is adapted from a talk delivered at thegeekconf mini 2026 by Saurabh Sahu, Chief Technical Officer at GeekyAnts. With hands-on experience building AI systems for enterprise teams, Saurabh walks through three real decisions his team made: building an AI gateway for model governance, shifting to an agentic software development life cycle, and developing workspace.build, a tool that runs agents on cloud virtual machines so developers can work across multiple projects without waiting.
The Incident That Changed How We Think About AI Tools
A few weeks back, on a Friday morning at 11 AM, our engineers were shipping code to production using an AI agentic coding tool we had rolled out to all developers. I am not supposed to name it as per policy, but that is what it was. All of a sudden, the entire engineering team got logged out of the tool.
The cause was a policy change on the vendor's side. Emails from the team started pouring in. That is when we realised how comfortable we had become with a single tool. We had built entire workflows around it, built agent tools on top of it, and were shipping code through it. Then, in a single day, productivity dropped to zero.
That day we learned something: a single vendor AI is a single point of failure. You cannot rely on one vendor. If that vendor changes their policy or doubles their price, you are at risk.
Control, Safety, and Configuration
Three needs came out of that day: control, safety, and configuration.
First, control over the models. If users access Opus 4.7 without restriction, costs go through the roof. We needed an admin panel where we could control which teams get access to which models and at what cost.
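To make the idea of per-team model access concrete, here is a minimal sketch of the kind of rule an admin panel might enforce. It is an illustration under stated assumptions, not our gateway's actual code: the team names, budget figures, and the `TeamPolicy` and `authorize` names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TeamPolicy:
    """Hypothetical per-team rule: which models are allowed, and a spend cap."""
    allowed_models: set[str]
    monthly_budget_usd: float
    spent_usd: float = 0.0

    def can_use(self, model: str, estimated_cost_usd: float) -> bool:
        # Deny models that are not on the team's allow-list.
        if model not in self.allowed_models:
            return False
        # Deny requests that would push the team over its monthly budget.
        return self.spent_usd + estimated_cost_usd <= self.monthly_budget_usd


# Illustrative teams and limits (assumptions, not real figures).
POLICIES = {
    "platform": TeamPolicy({"sonnet", "opus"}, monthly_budget_usd=500.0),
    "interns": TeamPolicy({"sonnet"}, monthly_budget_usd=50.0),
}


def authorize(team: str, model: str, estimated_cost_usd: float) -> bool:
    """Return True only if the team exists and its policy allows the request."""
    policy = POLICIES.get(team)
    return policy is not None and policy.can_use(model, estimated_cost_usd)
```

With a rule like this, the organisation decides which models are open to whom, and an unknown team or an over-budget request is refused by default.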
Building the AI Gateway
Our CEO, Kumar Pratik, who is here today, called a two-day hackathon. Together with solution architects, we built an AI gateway. Every request, whether from Cursor or Claude, goes through this gateway. It gives us a layer in between to control the models, configure local LLMs, configure agents, and set guardrail rules to block certain requests.
Before this, with Claude Code, we had no visibility into how people used it. No logs. The gateway solved that. We built it on an open source project called LiteLLM, which saved us what would otherwise have been about a month of work. If you want to build something similar for your organisation, that is a good place to start.
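The two things the gateway gave us, guardrails and logs, can be sketched in a few lines of plain Python. This is not LiteLLM's API and not our production code. The blocked patterns, the `gateway` function, and the `fake_model` stand-in are assumptions made only for illustration.

```python
import re
import time
from typing import Callable

# Hypothetical guardrail rules: refuse prompts that appear to contain secrets.
BLOCKED_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                     # shape of an AWS access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),   # PEM private key header
]

# In-memory request log; a real gateway would persist this somewhere durable.
REQUEST_LOG: list[dict] = []


def violates_guardrail(prompt: str) -> bool:
    return any(pattern.search(prompt) for pattern in BLOCKED_PATTERNS)


def gateway(user: str, model: str, prompt: str,
            call_model: Callable[[str, str], str]) -> str:
    """Log every request, block the ones that break a rule, forward the rest."""
    entry = {"ts": time.time(), "user": user, "model": model, "blocked": False}
    REQUEST_LOG.append(entry)
    if violates_guardrail(prompt):
        entry["blocked"] = True
        raise PermissionError("request blocked by guardrail")
    return call_model(model, prompt)


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return f"[{model}] ok"
```

Because every tool talks to the gateway rather than to a vendor directly, the log fills up regardless of which editor or agent sent the request.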
This gave us governance over our AI usage. That was learning number one.
Agentic Software Development Life Cycle
The second learning was about how we build software using these models. We moved from a traditional software development life cycle to an agentic software development life cycle.
We built several agents on our Claude Code setup. The BRD agent takes questions from users, writes user stories, and covers all edge cases. A human must approve before it moves forward. Once approved, it breaks user stories into smaller executable tasks and writes them to a markdown file. The next agent picks up that file and starts implementing the features and writing the code.
We also built a review agent that gives suggestions on pull requests and identifies improvement areas. Finally, a test agent connects to the Chrome MCP, runs all the written test cases, and gives the developer feedback on whether the tests pass or fail. Once approved, the code gets merged.
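The hand-off between agents, with a human checkpoint in the middle, can be sketched as a small pipeline. This is a simplified illustration and not how the agents are wired in Claude Code. The stage names, the `stub` helper, and the `approve` callback are hypothetical stand-ins for real agents and a real approval step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Stage:
    name: str
    run: Callable[[str], str]      # takes the previous artifact, returns a new one
    needs_approval: bool = False   # pause for a human after this stage


def stub(label: str) -> Callable[[str], str]:
    """Hypothetical agent: just appends a label to the artifact it receives."""
    return lambda artifact: f"{artifact} > {label}"


STAGES = [
    Stage("brd", stub("user stories"), needs_approval=True),
    Stage("plan", stub("tasks.md")),
    Stage("implement", stub("code")),
    Stage("review", stub("review notes")),
    Stage("test", stub("test report"), needs_approval=True),
]


def run_pipeline(stages: list[Stage], request: str,
                 approve: Callable[[str, str], bool]) -> tuple[list[str], Optional[str]]:
    """Run stages in order; stop and return None if a human rejects a checkpoint."""
    artifact = request
    history: list[str] = []
    for stage in stages:
        artifact = stage.run(artifact)
        history.append(stage.name)
        if stage.needs_approval and not approve(stage.name, artifact):
            return history, None
    return history, artifact
```

Calling `run_pipeline(STAGES, "feature request", approve)` with an `approve` function that returns `False` stops right after the BRD stage, which mirrors the rule that nothing moves forward without a human sign-off.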
What This Means for Developers
As AI agents take on more of the work, the expectations on developers grow. Developers now need to step into the role of a product owner. They need to level up their communication skills and talk to clients. A single developer now does a lot more because the agent handles a lot of the execution. That was a clear learning for us.
R&D: workspace.build
The next problem we wanted to solve was waiting. With the current agentic setup, a developer sits at their laptop and waits five to ten minutes for an agent to finish before they can approve and move to the next step. We wanted to remove that wait.
The idea was to run agents on a cloud virtual machine. The developer kicks off the agent, closes their laptop, goes home, and does whatever they want. They come back to a dashboard that shows progress, a video recording of what ran on the virtual machine, and logs. They can then approve or reject and decide whether to merge. Running multiple agents in parallel on multiple projects becomes possible, which raises productivity.
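A rough way to picture several agents working on different projects at once is a set of concurrent tasks that each report back when they are ready for review. The sketch below uses `asyncio` with a short sleep in place of real work on a virtual machine. The project names and the `run_agent` function are assumptions and say nothing about how workspace.build is implemented.

```python
import asyncio


async def run_agent(project: str, seconds: float) -> dict:
    """Hypothetical agent run; the sleep stands in for work on a remote VM."""
    await asyncio.sleep(seconds)
    return {"project": project, "status": "awaiting_review"}


async def run_all(projects: list[str]) -> list[dict]:
    # gather() runs the agents concurrently and returns results in input order.
    return await asyncio.gather(*(run_agent(p, 0.01) for p in projects))


results = asyncio.run(run_all(["project-a", "project-b", "project-c"]))
```

The developer's part shrinks to reading the resulting list and approving or rejecting each entry.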
Deployment from workspace.build
We also wanted to solve deployment from within workspace.build. Once the agent work is done, you can deploy to any cluster, whether on-premise or on any cloud, by bringing your own cluster. The approach is GitOps-driven. You push a commit instead of SSHing into a server. Rollbacks are easier, and anyone can configure them.
From Incident to Ecosystem
Our ecosystem at GeekyAnts sits on three pillars. First, the AI gateway built on LiteLLM controls the model layer and maintains governance. Second, an agentic software development life cycle where agents handle the steps and humans approve at each checkpoint. Third, workspace.build, where agents run on virtual machines and ship code autonomously.
Q&A
- On tool agnosticism and vendor dependency
The agents we built were on Claude Code. The skills were already there. We made the setup tool-agnostic so we can configure it on Cursor, Windsurf, and other tools. The goal is not to rely on any single tool.
- On managing multiple vendor costs
Question: You said you do not want to rely on one vendor, but using multiple vendors means paying for multiple. How do you manage that?
The AI gateway helps here. We are still configuring it to route to the right models. We have better governance and control over which models are open to which users. The organisation chooses the models, not the individual user.
- On routing tasks to the right model
Question: If you have five LLMs running, how do you decide which task goes to which model?
We are still figuring that out. We are experimenting with Qwen and local LLM setups to find what gives the best output at the right cost. Right now, we have Sonnet open for developers. The direction is to use a lower-level model for routine tasks and reserve the more powerful models for tasks that need them. We want to control that routing through the AI gateway.
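As a starting point for that routing, one simple option is a lookup that sends known routine task types to a cheaper model and everything else to a stronger one. This is only a sketch of the direction described above. The task types and model labels are placeholders, and a real router would likely need more signals than a task name.

```python
# Placeholder model labels (assumptions, not real model identifiers).
ROUTES = {
    "routine": "small-local-model",
    "complex": "large-hosted-model",
}

# Hypothetical examples of tasks considered routine.
ROUTINE_TASKS = {"rename", "format", "docstring"}


def pick_model(task_type: str) -> str:
    """Route routine tasks to the cheaper tier, everything else to the stronger one."""
    tier = "routine" if task_type in ROUTINE_TASKS else "complex"
    return ROUTES[tier]
```

Defaulting unknown tasks to the stronger model trades cost for safety; flipping that default is a one-line change once there is data on which tasks the smaller model handles well.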
- On migrating legacy enterprise software
Question: What is your take on using AI agents to migrate enterprise-level legacy systems, for example, from Oracle to SQL Server?
That needs to be broken down into smaller steps. Moving an enterprise stack is a huge task. We did this with an enterprise company in the US around 2022, and it took around two years. We did the frontend first, then cloud migration, and kept breaking it into smaller chunks. It is a long and debatable topic. Large enterprise ecosystems take a lot of time, and the right approach is to go step by step.
- On human approval and overnight runs
Question: If I kick off tasks in the evening and the agent needs human approval mid-run, does it stop or keep running?
That depends on how the agent is configured. For example, the BRD agent generates user stories, but it needs human approval before it moves to the next step. The agent sends you an alert on your mobile, similar to how Claude's dispatch works. You can approve from your phone. If you sleep through it, it waits until the next morning.
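The pause-and-wait behaviour can be modelled as a small state machine: the run stops at a checkpoint and stays there until someone decides. The sketch below is illustrative only. The `AgentRun` class, the state names, and the step names are assumptions, and the notification itself is left as a comment.

```python
from enum import Enum


class RunState(Enum):
    RUNNING = "running"
    WAITING = "waiting_for_approval"
    DONE = "done"
    REJECTED = "rejected"


class AgentRun:
    """Hypothetical run: a list of (step_name, needs_approval) pairs."""

    def __init__(self, steps: list[tuple[str, bool]]):
        self.steps = list(steps)
        self.index = 0
        self.state = RunState.RUNNING

    def advance(self) -> None:
        """Execute steps until a checkpoint is reached or the steps run out."""
        while self.state is RunState.RUNNING and self.index < len(self.steps):
            _name, needs_approval = self.steps[self.index]
            self.index += 1
            if needs_approval:
                # A real system would send the mobile alert here.
                self.state = RunState.WAITING
                return
        if self.state is RunState.RUNNING:
            self.state = RunState.DONE

    def decide(self, approved: bool) -> None:
        """Record the human decision and resume only if it was an approval."""
        if self.state is not RunState.WAITING:
            raise RuntimeError("no approval is pending")
        self.state = RunState.RUNNING if approved else RunState.REJECTED
        if approved:
            self.advance()
```

Calling `advance()` again while the run is waiting does nothing, which is the "it waits until the next morning" behaviour in code form.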
- On job-specific agents for UI and database work
Question: Do you have separate agents for specific roles like UI design or database architecture?
Yes. For example, we have an integration with the Figma MCP so it can generate designs. Those are separate internal agents. The slides showed the high-level pointers that sit within the software development life cycle. The job-specific ones exist separately.
- On code quality and model selection
Question: Different models give different code quality. Which model works best for maintaining legacy code?