May 7, 2026
The AI-Native Enterprise Evolution | Saurabh Sahu
Explore Saurabh Sahu's insights on the AI-native enterprise, AI gateways, model governance, the agentic SDLC, and workspace.build for scalable AI adoption, from thegeekconf mini 2026.
Editor's Note: This blog post is adapted from a talk delivered at thegeekconf mini 2026 by Saurabh Sahu, Chief Technical Officer at GeekyAnts. With hands-on experience building AI systems for enterprise teams, Saurabh walks through three real decisions his team made: building an AI gateway for model governance, shifting to an agentic software development life cycle, and developing workspace.build, a tool that runs agents on cloud virtual machines so developers can work across multiple projects without waiting.
The Incident That Changed How We Think About AI Tools
A few weeks back, on a Friday morning at 11 AM, our engineers were shipping code to production through an AI agentic coding tool we had rolled out to all developers. Policy prevents me from naming the vendor, but that is what it was. All of a sudden, the entire engineering team got logged out of the tool.
The reason was a policy change going through on the vendor's side. We started getting a lot of emails. That is when we realised how comfortable we had become with a single tool. We had built entire workflows around it, built agent tools on top of it, and were shipping code through it. One day, productivity dropped to zero.
That day we learned something: a single AI vendor is a single point of failure. You cannot rely on one vendor. If that vendor changes their policy or doubles their price, you are at risk.
Control, Safety, and Configuration
Three needs came out of that day: control, safety, and configuration.
First, control over the models. If users access Opus 4.7 without restriction, costs go through the roof. We needed an admin panel where we could control which teams get access to which models and at what cost.
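To make that concrete, here is a minimal sketch of the kind of rule such an admin panel would enforce. The team names, model aliases, and budget figures are hypothetical placeholders, not our actual policy.

```python
# Minimal sketch of a per-team access rule; teams, model aliases, and
# budgets here are hypothetical placeholders.
TEAM_POLICY = {
    "mobile":   {"models": {"claude-sonnet"},             "monthly_budget_usd": 500},
    "platform": {"models": {"claude-sonnet", "opus-4.7"}, "monthly_budget_usd": 2000},
}

def is_allowed(team: str, model: str, spent_usd: float) -> bool:
    """Allow a request only if the team may use the model and has budget left."""
    policy = TEAM_POLICY.get(team)
    return (
        policy is not None
        and model in policy["models"]
        and spent_usd < policy["monthly_budget_usd"]
    )

print(is_allowed("mobile", "opus-4.7", spent_usd=120.0))  # False: not on the allowlist
```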
Building the AI Gateway
Our CEO, Kumar Pratik, who is here today, called a two-day hackathon. Together with solution architects, we built an AI gateway. Every request, whether from Cursor or Claude Code, goes through this gateway. It gives us a layer in between to control the models, configure local LLMs, configure agents, and set guardrail rules to block certain requests.
Before this, we had no visibility into how people used Claude Code. No logs. The gateway solved that, and it was built on open source: what would have taken a month to build from scratch, we stood up on an open-source project called LiteLLM. If you want to build something similar for your organisation, that is a good place to start.
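One useful property of LiteLLM's proxy is that it exposes an OpenAI-compatible endpoint, so pointing any standard client at the gateway is enough to route traffic through it. A minimal sketch; the gateway URL, virtual key, and model alias below are placeholders, not our actual configuration.

```python
# Every request goes through the gateway, which can log it, meter the
# cost, apply guardrails, and resolve the model alias. The URL, key, and
# alias are illustrative placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://ai-gateway.internal.example/v1",  # the LiteLLM proxy
    api_key="sk-team-scoped-virtual-key",               # issued per team by admins
)

resp = client.chat.completions.create(
    model="claude-sonnet",  # an alias the gateway maps to a real model
    messages=[{"role": "user", "content": "Summarise this pull request."}],
)
print(resp.choices[0].message.content)
```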
This gave us governance over our AI usage. That was learning number one.
Agentic Software Development Life Cycle
The second learning was about how we build software with these models. We moved from a traditional software development life cycle to an agentic one.
We built several agents on our Claude Code setup. The BRD agent takes questions from users, writes user stories, and covers all edge cases. A human must approve before it moves forward. Once approved, it breaks user stories into smaller executable tasks and writes them to a markdown file. The next agent picks up that file and starts implementing the features and writing the code.
We also built a review agent that gives suggestions on pull requests and identifies improvement areas. Finally, a test agent connects to the Chrome MCP, runs all the written test cases, and gives the developer feedback on whether the tests pass or fail. Once approved, the code gets merged.
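Roughly, the flow chains those agents with human gates between them. The sketch below is illustrative only: each agent function is a stub standing in for a real agent (Claude Code in our setup), and only the control flow and approval gates matter.

```python
# Illustrative shape of the agentic SDLC described above. Each agent is a
# stub standing in for a real agent; only the control flow matters here.

def brd_agent(requirements: str) -> str:
    return f"user stories + edge cases for: {requirements}"      # stub

def breakdown_agent(stories: str) -> str:
    return "- [ ] task 1\n- [ ] task 2"                          # stub: tasks.md

def implementation_agent(tasks_md: str) -> str:
    return "diff implementing the tasks"                         # stub

def review_agent(code: str) -> str:
    return "review suggestions for the pull request"             # stub

def test_agent(code: str) -> str:
    return "tests: PASS"                                         # stub: via Chrome MCP

def approved(stage: str, artifact: str) -> bool:
    """Human checkpoint: nothing moves forward without an explicit yes."""
    return input(f"[{stage}]\n{artifact}\napprove? (y/n) > ").strip().lower() == "y"

def run_pipeline(requirements: str) -> None:
    stories = brd_agent(requirements)
    if not approved("BRD", stories):                 # gate 1: approve user stories
        return
    tasks_md = breakdown_agent(stories)              # tasks written to markdown
    code = implementation_agent(tasks_md)            # next agent picks up the file
    review = review_agent(code)
    report = test_agent(code)
    if approved("merge", f"{review}\n{report}"):     # gate 2: approve the merge
        print("merged")

if __name__ == "__main__":
    run_pipeline("Add a password reset flow")
```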
What This Means for Developers
As AI agents take on more of the work, the expectations on developers grow. Developers now need to step into the role of a product owner. They need to level up their communication skills and talk to clients. A single developer now does a lot more because the agent handles a lot of the execution. That was a clear learning for us.
R&D: workspace.build
The next problem we wanted to solve was waiting. With the current agentic setup, a developer sits at their laptop and waits five to ten minutes for an agent to finish before they can approve and move to the next step. We wanted to remove that wait.
The idea was to run agents on a cloud virtual machine. The developer kicks off the agent, closes their laptop, goes home, and does whatever they want. They come back to a dashboard that shows progress, a video recording of what ran on the virtual machine, and logs. They can then approve or reject and decide whether to merge. Running multiple agents in parallel on multiple projects becomes possible, which raises productivity.
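A client-side sketch of that fire-and-forget flow, assuming a hypothetical HTTP API: the endpoint, routes, and payload fields below are invented for illustration, not workspace.build's actual interface.

```python
# Hypothetical client flow for running an agent on a cloud VM: kick off a
# run, close the laptop, and poll the dashboard API later. Endpoints and
# fields are invented for illustration.
import time
import requests

BASE = "https://workspace.example/api"  # placeholder endpoint

def kick_off(repo: str, task: str) -> str:
    resp = requests.post(f"{BASE}/runs", json={"repo": repo, "task": task})
    resp.raise_for_status()
    return resp.json()["run_id"]        # safe to close the laptop now

def status(run_id: str) -> dict:
    # Carries progress, a link to the VM screen recording, and logs.
    return requests.get(f"{BASE}/runs/{run_id}").json()

run_id = kick_off("org/app", "implement the tasks in tasks.md")
while status(run_id)["state"] == "running":
    time.sleep(60)                      # the agent keeps working without you
print("ready to approve or reject:", run_id)
```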
Deployment from workspace.build
We also wanted to solve deployment from within workspace.build. Once the agent work is done, you can deploy to any cluster, whether on-premise or on any cloud, by bringing your own cluster. The approach is GitOps-driven: you push a commit instead of SSHing into a server. Rollbacks are easier, and anyone can configure them.
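In GitOps terms, "deploy" means "commit". A sketch of that flow, assuming a reconciler such as Argo CD or Flux is watching the repo; the manifest path and the tag-rewrite logic are illustrative.

```python
# GitOps-style deploy: change a manifest, commit, push. No SSH; a
# reconciler watching the repo converges the cluster to the new state,
# and a rollback is just `git revert`. Paths and regex are illustrative.
import pathlib
import re
import subprocess

def deploy(new_tag: str, repo: str = "infra-repo") -> None:
    manifest = pathlib.Path(repo, "apps", "web", "deployment.yaml")
    text = manifest.read_text()
    # Bump the image tag, e.g. `image: registry.example/web:v1.2.3`.
    text = re.sub(r"(image:\s*\S+:)\S+", rf"\g<1>{new_tag}", text)
    manifest.write_text(text)
    subprocess.run(["git", "-C", repo, "commit", "-am", f"deploy web {new_tag}"], check=True)
    subprocess.run(["git", "-C", repo, "push"], check=True)

deploy("v1.2.4")
```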
From Incident to Ecosystem
Our ecosystem at GeekyAnts sits on three pillars. First, the AI gateway built on LiteLLM controls the model layer and maintains governance. Second, an agentic software development life cycle where agents handle the steps and humans approve at each checkpoint. Third, workspace.build, where agents run on virtual machines and ship code autonomously.
Q&A
- On tool agnosticism and vendor dependency
The agents we built ran on Claude Code, where the skills were already in place. We made the setup tool-agnostic so we can configure it on Cursor, Windsurf, and other tools. The goal is not to rely on any single tool.
- On managing multiple vendor costs
Question: You said you do not want to rely on one vendor, but using multiple vendors means paying for multiple. How do you manage that?
The AI gateway helps here. We are still configuring it to route to the right models. We have better governance and control over which models are open to which users. The organisation chooses the models, not the individual user.
- On routing tasks to the right model
Question: If you have five LLMs running, how do you decide which task goes to which model?
We are still figuring that out. We are experimenting with Qwen and local LLM setups to find what gives the best output at the right cost. Right now, we have Sonnet open for developers. The direction is to use a smaller, cheaper model for routine tasks and reserve the more powerful models for tasks that need them. We want to control that routing through the AI gateway.
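As a sketch of where that is heading, the routing table below pairs task tiers with model aliases. The tiers and aliases are illustrative, and in practice this decision would live in the AI gateway rather than in client code.

```python
# Illustrative routing policy: cheap or local models for routine work,
# stronger models reserved for tasks that need them. Tiers and aliases
# are placeholders; the gateway would own this decision in practice.
ROUTES = {
    "routine":  "qwen-local",     # boilerplate, lint fixes, commit messages
    "standard": "claude-sonnet",  # day-to-day feature work
    "hard":     "opus-4.7",       # architecture changes, tricky debugging
}

def pick_model(task_tier: str) -> str:
    return ROUTES.get(task_tier, ROUTES["standard"])

print(pick_model("routine"))  # qwen-local
```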
- On migrating legacy enterprise software
Question: What is your take on using AI agents to migrate enterprise-level legacy systems, for example, from Oracle to SQL Server?
That needs to be broken down into smaller steps. Moving an enterprise stack is a huge task. We did this with an enterprise company in the US around 2022, and it took around two years. We did the frontend first, then cloud migration, and kept breaking it into smaller chunks. It is a long and debatable topic. Large enterprise ecosystems take a lot of time, and the right approach is to go step by step.
- On human approval and overnight runs
Question: If I kick off tasks in the evening and the agent needs human approval mid-run, does it stop or keep running?
It depends on the agent configuration. For example, the BRD agent generates user stories but needs human approval before it moves to the next step. The agent sends you an alert on your mobile, similar to how Claude's dispatch works. You can approve from your phone. If you sleep through it, it waits until the next morning.
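A minimal sketch of that checkpoint behaviour: notify, then block until approval arrives, however long that takes. The alert function and the approval store are hypothetical stand-ins.

```python
# Sketch of a blocking checkpoint: the agent alerts the developer's phone
# and simply waits; sleep through it and the run resumes in the morning.
# The alert function and approval store are hypothetical stand-ins.
import time

APPROVALS: set[str] = set()     # in practice, a shared store the phone app writes to

def send_mobile_alert(message: str) -> None:
    print(f"(push) {message}")  # stand-in for a real push-notification service

def wait_for_approval(stage: str, poll_seconds: int = 300) -> None:
    send_mobile_alert(f"'{stage}' needs your approval")
    while stage not in APPROVALS:   # blocks until someone approves
        time.sleep(poll_seconds)
```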
- On job-specific agents for UI and database work
Question: Do you have separate agents for specific roles like UI design or database architecture?
Yes. For example, we have an integration with the Figma MCP so it can generate designs. Those are separate internal agents. The slides showed the high-level pointers that sit within the software development life cycle. The job-specific ones exist separately.
- On code quality and model selection
Question: Different models give different code quality. Which model works best for maintaining legacy code?