May 5, 2026
The Autonomous Factory: Architecting Agentic Workflows with Clean Code Guards | Akash Kamerkar
Akash Kamerkar’s thegeekconf mini 2026 talk explores the ACDC framework for building safer agentic workflows with clean code guards, sandbox testing, and AI-driven software development.
Editor's Note: This blog post is adapted from a talk delivered at thegeekconf mini 2026 by Akash Kamerkar, Data Scientist at ABB and founding member of Devil Squad. With experience building and teaching agentic systems, Akash walks through the anatomy of AI agents, the breakdown of traditional SDLC under agentic development, and the ACDC framework: a Guide, Generate, Verify, Solve cycle that brings deterministic code quality checks into the generation phase.
Anatomy of an AI Agent
The LLM as the Brain
We have many agents and agentic models today, but the core, the brain, is the LLM. Current state-of-the-art models include GPT-4, Claude 3.7, Gemini 1.5, and Llama 4. Let's look at how these models evolved.
Evolution of AI Models
Before 2020, we used ML models for prediction: classification, regression, and anomaly detection. Around 2022, LLMs went mainstream. We use them to generate content and text; multimodal models can also generate images. By 2024, chat assistants backed by reasoning models were in the picture: you give them text, and they produce output. But they are stateless; they have no memory between interactions. That limitation is what gave rise to agents.
What Makes an Agent Different
The Building Blocks of Agentic Systems
Stateless vs. Stateful Models
A basic LLM is stateless: you pass a prompt, it returns an output, and it retains nothing. A stateful model has memory and a runtime. It stores each prompt and uses that history to produce better outputs for the next prompt.
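The difference can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's API: `fake_llm` is a hypothetical stand-in for a model call that only reports how many lines of text it was shown, and `StatefulSession` is an assumed name for a wrapper that replays the conversation history on every turn.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical model stub: reports how many lines of text it received."""
    return f"saw {len(prompt.splitlines())} line(s)"


def stateless_call(llm: Callable[[str], str], prompt: str) -> str:
    """Each call sees only the current prompt; nothing is retained."""
    return llm(prompt)


@dataclass
class StatefulSession:
    """Keeps a running history and sends it along with every new prompt."""
    llm: Callable[[str], str]
    history: List[str] = field(default_factory=list)

    def ask(self, prompt: str) -> str:
        self.history.append(f"user: {prompt}")
        # The model is shown the full history, not just the latest prompt.
        reply = self.llm("\n".join(self.history))
        self.history.append(f"assistant: {reply}")
        return reply
```

Calling `stateless_call` twice gives the model one line each time, while the second `ask` on a `StatefulSession` gives it three lines: the first prompt, the first reply, and the new prompt.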
Chaining vs. Workflow
Claude Ecosystem Evolution
Claude started in 2023 as a chatbot: give it a prompt, get an answer. We used it for research, reading, summarising, analysis, and conversational Q&A. In early 2025, Claude Code arrived. Claude Code can access files on your laptop, run commands, and act as a capable agentic coding system.
Evolution of the SDLC
Traditional SDLC
In the traditional SDLC, you start with application or software requirements, move to design, pass the features to developers to code, send the code to QA for testing, and then ship to production. That is the classic software development lifecycle.
Where the Traditional SDLC Breaks Down with Agents
When you generate code with agents, through Cursor or similar tools, problems emerge. You write a prompt, and the agent generates 100 lines of code in seconds. But a human reviewer takes at least an hour to read and validate those 100 lines. Every step in the pipeline — from requirements to design, from design to code — still requires human feedback. We generate at machine speed but verify at human speed, and that mismatch slows the entire process.
The core issue: Agile methods were built to make humans faster. They cannot manage a system that works at the speed of agents.
Four Pillars of Traditional SDLC
Drawbacks of the Traditional Approach
1. Review Latency
Agents write in seconds. Humans review in hours. That is the first and most visible gap.
2. Verification Gap
Today, we generate code and push it to production through a CI/CD pipeline. The pipeline catches issues through SonarQube, MetaDefender, and Trivy checks, but only at the CI stage. The problem is that we generate the entire codebase first and verify afterwards. What if we ran those checks at the moment of code generation? Running SonarQube, MetaDefender, and Trivy at generation time would produce clean code from the start.
3. Cascading Hallucination
Without real guards, agents build broken code. Around 60% of a typical codebase lacks proper SonarQube or Trivy checks. When you pass that codebase to an agent, the agent builds on top of that mess. The output inherits and amplifies the existing problems. That is a cascading hallucination.
4. Context Trap
Jira tickets lack repository awareness. When a human works on a bug, they understand the full context: which feature caused it, what the surrounding code does. When an agent receives that same ticket, it only sees the code. It misses the context behind the issue. That gap leads to incomplete or incorrect fixes.
The Numbers Behind the Problem
The ACDC Framework
Sonar's solution is the Agent-Centric Development Cycle (ACDC): Guide, then Generate, then Verify, then Solve. It is a cycle for autonomous development that is compatible with agentic workflows.
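As a rough sketch of how the four phases might chain together, the function below wires them up as plain callables. Everything here is an assumption made for illustration: `acdc_cycle`, its parameters, and the `max_rounds` cap are hypothetical names, not part of any Sonar product.

```python
from typing import Callable, List


def acdc_cycle(
    task: str,
    guide: Callable[[str], str],
    generate: Callable[[str], str],
    verify: Callable[[str], List[str]],
    solve: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Guide -> Generate -> Verify -> Solve, repeated until verify is clean."""
    prompt = guide(task)        # Guide: attach rules and context to the task
    code = generate(prompt)     # Generate: any code agent can sit here
    findings = verify(code)     # Verify: deterministic checks return findings
    rounds = 0
    while findings and rounds < max_rounds:
        code = solve(code, findings)   # Solve: remediate the reported findings
        findings = verify(code)        # Re-verify every remediation attempt
        rounds += 1
    if findings:
        raise RuntimeError(f"unresolved findings after {max_rounds} rounds: {findings}")
    return code
```

The cap on rounds is a design choice for the sketch: without it, a remediation step that never converges would loop forever, so the function hands the remaining findings back to a human instead.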
Why Human PR Review Fails at Agent Scale
When a human developer commits, the diff is small: two or three lines. A senior engineer can review that in 15 to 30 minutes. When an agent generates code, the diff is 100 to 200 lines. The senior engineer still gets the PR, but they will not review it with the same care. If the code works, that tends to be enough. Quality issues get missed. ACDC addresses this.
Step 1: Guide
This step is about context. Pass your architecture rules and codebase structure to the LLM before it generates anything. You can use Sonar Context if you have it, or create an agents.md file with your architecture metadata. The agent then generates code with awareness of your standards and constraints.
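One way to picture the agents.md route is a helper that reads the file and puts its contents ahead of the task. This is a minimal sketch under stated assumptions: `build_guided_prompt` is a hypothetical helper, and the sample rules are invented for illustration rather than taken from the talk.

```python
from pathlib import Path

# Invented example of what an agents.md file might contain.
SAMPLE_AGENTS_MD = """\
# Architecture rules
- Layered design: api -> service -> repository
- No direct database access from the api layer
- Every public function needs a unit test
"""


def build_guided_prompt(task: str, rules_file: Path) -> str:
    """Prepend the architecture rules in rules_file to the task prompt."""
    if not rules_file.exists():
        raise FileNotFoundError(f"guide file not found: {rules_file}")
    rules = rules_file.read_text(encoding="utf-8").strip()
    return (
        "Follow these architecture rules when generating code:\n"
        f"{rules}\n\n"
        f"Task: {task}"
    )
```

The resulting string is what would be handed to the code agent, so the rules reach the model before it writes a single line.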
Step 2: Generate
Use any code generation tool: Cursor, GitHub Copilot, Claude Code, or anything else. This step is the generation itself.
Step 3: Verify
Instead of running SonarQube and MetaDefender checks at the CI/CD stage, run them here, at generation time. Integrate Sonar scans, sandbox code execution, and Trivy checks into this step. The code the agent produces then passes all quality gates before it enters the pipeline.
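To make "checks at generation time" concrete, here is a small deterministic gate written with Python's standard `ast` module. It is only a stand-in: it does not call SonarQube, MetaDefender, or Trivy, and the `BANNED_CALLS` rule set and `verify_python_source` name are assumptions chosen for the example.

```python
import ast
from typing import List

# Illustrative rule set only; real scanners apply far larger rule catalogues.
BANNED_CALLS = {"eval", "exec"}


def verify_python_source(source: str) -> List[str]:
    """Return a list of findings; an empty list means the gate passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    findings: List[str] = []
    for node in ast.walk(tree):
        # Flag direct calls such as eval(...) or exec(...).
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BANNED_CALLS
        ):
            findings.append(
                f"line {node.lineno}: call to {node.func.id}() is not allowed"
            )
    return findings
```

Because the check is deterministic, the same generated code always yields the same findings, which is what lets an agent loop on it without a human in between.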
Step 4: Solve
Three Structural Shifts from ACDC
1. Agentic Sandbox
The remediation agent resolves bugs inside an isolated sandbox. If something breaks, it breaks there, not in your actual codebase.
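A very reduced sketch of the idea: run candidate code in a separate process, inside a throwaway directory, with a timeout. The names `run_in_sandbox` and `SandboxResult` are hypothetical, and a temporary directory plus a subprocess is an assumption made to keep the example self-contained. It is not a security boundary; a production sandbox would more likely use containers or VMs.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SandboxResult:
    ok: bool
    stdout: str
    stderr: str


def run_in_sandbox(source: str, timeout_s: float = 5.0) -> SandboxResult:
    """Execute source in a child process inside a temporary working directory."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate.py"
        script.write_text(source, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: Python isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return SandboxResult(False, "", f"timed out after {timeout_s}s")
        return SandboxResult(proc.returncode == 0, proc.stdout, proc.stderr)
```

A crash or a non-zero exit shows up as `ok=False` in the result, and the temporary directory is deleted afterwards, so a failed attempt leaves nothing behind in the working tree.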
2. Dynamic Context
Passing the entire codebase to an LLM is a problem. The agent does not need all of it. Too much context degrades the output just as much as too little. Dynamic context means identifying the minimum relevant context for each task and passing that. Not too much, not too little.
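As a toy illustration of picking minimum relevant context, the function below ranks files by how many identifiers they share with the task description and keeps only the top few. The keyword-overlap scoring and the `select_context` name are assumptions for this sketch; real systems may rely on embeddings, call graphs, or other signals.

```python
import re
from typing import Dict, List


def select_context(task: str, files: Dict[str, str], max_files: int = 2) -> List[str]:
    """Return up to max_files paths whose contents overlap most with the task."""
    task_terms = set(re.findall(r"[a-z_]+", task.lower()))
    scored = []
    for path, text in files.items():
        file_terms = set(re.findall(r"[a-z_]+", text.lower()))
        overlap = len(task_terms & file_terms)
        if overlap:  # files with no shared terms are left out entirely
            scored.append((overlap, path))
    # Highest overlap first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:max_files]]
```

For a task such as "fix rounding bug in compute_invoice_total", only the file that defines `compute_invoice_total` would be selected, and unrelated modules never reach the prompt.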
3. Deterministic Verification
Results from ACDC
The numbers, drawn from research, show meaningful improvements. A project that takes 10 days with standard agentic development takes 4 days with ACDC. Bug count post-CI drops from 100 to 40, a 60% reduction. Security hotspots flagged in the CI pipeline drop by 80%.
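The percentages follow from a simple before-and-after calculation. The helper below is only a worked check of the arithmetic, with `percent_reduction` as a hypothetical name; applying the same formula to the delivery time (10 days to 4 days) also gives 60%.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from before to after."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) * 100 / before


bug_reduction = percent_reduction(100, 40)   # bugs found post-CI
time_reduction = percent_reduction(10, 4)    # project duration in days
```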
Implementing ACDC in Your Organisation
For the guide phase, no special tooling is needed. Create an agents.md file with your architecture metadata and pass it to the LLM. If you want Sonar integration, use Sonar Context. For the generate phase, use Cursor, Claude Code, or any code agent. For the verify phase, integrate Sonar scans and sandbox execution at generation time rather than at CI. For the solve phase, route bug tickets through the remediation agent.
Audience Q&A
Large PR Memory Issues
Audience: We create prompts for PR review, but partner code PRs can contain 6,000 to 7,000 files. Running that on a local machine consumes all memory and forces a restart. How do we solve that?
Akash: Run it in a sandbox. Instead of running on your local machine or a VM, route it through a sandbox. That is the solve phase of the ACDC framework; whatever runs in the sandbox stays in the sandbox and does not affect your local environment.
Governance of the Solve Phase
Audience: The solve phase is useful, but it still needs governance. Someone needs to review what the remediation agent produces.
Tools, Tradeoffs, and What's Next
A Note on Using LLMs for Presentations
This presentation was generated using Claude Sonnet, not Opus. Opus consumes tokens at a much higher rate. I bought Claude Pro yesterday, gave a prompt to the Claude 4.7 model, and exhausted my token quota in one session. I bought additional credits, switched to Sonnet, and finished the deck. If you want to generate presentations with an LLM, use Sonnet. Gemini produces lower-quality output for this.
The Cost Question