May 5, 2026

The Autonomous Factory: Architecting Agentic Workflows with Clean Code Guards | Akash Kamerkar

Akash Kamerkar’s thegeekconf mini 2026 talk explores the ACDC framework for building safer agentic workflows with clean code guards, sandbox testing, and AI-driven software development.

Author

Apoorva Pathak, Content Writer


Editor's Note: This blog post is adapted from a talk delivered at thegeekconf mini 2026 by Akash Kamerkar, Data Scientist at ABB and founding member of Devil Squad. With experience building and teaching agentic systems, Akash walks through the anatomy of AI agents, the breakdown of traditional SDLC under agentic development, and the ACDC framework: a Guide, Generate, Verify, Solve cycle that brings deterministic code quality checks into the generation phase.

My name is Akash. I work as a data scientist at ABB and wear multiple hats. I am a founding member of Devil Squad, where we work in the DevRel space in India. I do content creation. I have around 80K followers on LinkedIn and am part of the LinkedIn Editorial Program, which is for creators who produce good content. I teach data science on platforms like Physics Wallah, GeeksForGeeks, and Great Learning. I started working in the freelancing space and have something of my own in the works. This talk covers four areas. First, the anatomy of an AI agent. Second, the evolution of the SDLC. The first two sections are for those newer to the field. For those with more experience, we go into a Sonar ACDC deep dive and an implementation playbook.

Anatomy of an AI Agent

The LLM as the Brain

We have many agents and agentic models today, but the core, the brain, is the LLM. The current state-of-the-art models include GPT-4, Claude 3.7, Gemini 1.5, and LLaMA 4. The base is the LLM. Let's look at how these models evolved.

Evolution of AI Models

Before 2020, we used ML models for prediction: classification, regression, and anomaly detection. In 2022, LLMs arrived. We use them to generate content and text; multimodal models can also generate images. In 2024, chat assistants came into the picture. These are reasoning models: you give them text, and they produce output. But they are stateless; they have no memory between interactions. That limitation is what gave rise to agents.

What Makes an Agent Different

Agents are stateful. They have memory and access to external tools, so they act with more than just language; they act with context. Consider a travel booking scenario: you want to go to Pond and need to book tickets. One agent fetches the flight details, a second agent fetches the hotel details, each does its own task, and together they return a complete result. This is how agentic systems work. The bottom line is that we are moving from AI as a tool to AI as a worker.

The Building Blocks of Agentic Systems

Stateless vs. Stateful Models

A basic LLM is stateless: you pass a prompt, it returns an output, and it retains nothing. A stateful model has memory and a runtime. It stores each prompt and uses that history to produce better outputs for the next prompt.
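
A minimal sketch of the difference, in Python. `call_llm` is a vendor-agnostic placeholder for any chat-completion API, not a specific SDK; the point is only where the history lives.

```python
from typing import Dict, List

def call_llm(messages: List[Dict[str, str]]) -> str:
    """Placeholder for any chat-completion API call (vendor-agnostic)."""
    raise NotImplementedError

# Stateless: every prompt stands alone; nothing is retained between calls.
def stateless_ask(prompt: str) -> str:
    return call_llm([{"role": "user", "content": prompt}])

# Stateful: a runtime keeps the conversation history and replays it with
# every new prompt, so later answers can build on earlier ones.
class StatefulAgent:
    def __init__(self) -> None:
        self.history: List[Dict[str, str]] = []

    def ask(self, prompt: str) -> str:
        self.history.append({"role": "user", "content": prompt})
        reply = call_llm(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply
```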

Chaining vs. Workflow

In chaining, you give the LLM a fixed sequence of instructions: do step one, then step two, then step three. In a workflow, there is a feedback loop. The system has tools, code, and a verification step. For example, if you ask an agent to generate LinkedIn content and the output reads like AI-generated text, the workflow detects that, loops back, and tries to produce more human-sounding content. That feedback loop is what defines a workflow. Workflows are powered by orchestrators like LangGraph and CrewAI.
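
A minimal sketch of the two shapes, reusing the same vendor-agnostic `call_llm` placeholder. `sounds_ai_generated` is a hypothetical verifier standing in for whatever check the workflow actually uses; a real orchestrator like LangGraph or CrewAI would manage this loop for you.

```python
def call_llm(messages):
    """Same vendor-agnostic placeholder as in the earlier sketch."""
    raise NotImplementedError

def sounds_ai_generated(text: str) -> bool:
    """Hypothetical verifier: a heuristic or a second model acting as critic."""
    return "as an ai language model" in text.lower()  # illustrative stand-in only

# Chaining: a fixed sequence of steps, no feedback loop.
def chained_pipeline(topic: str) -> str:
    outline = call_llm([{"role": "user", "content": f"Outline a LinkedIn post about {topic}"}])
    draft = call_llm([{"role": "user", "content": f"Write the post from this outline:\n{outline}"}])
    return call_llm([{"role": "user", "content": f"Polish this draft:\n{draft}"}])

# Workflow: generation plus a verification step that can loop back.
def workflow_pipeline(topic: str, max_retries: int = 3) -> str:
    draft = call_llm([{"role": "user", "content": f"Write a LinkedIn post about {topic}"}])
    for _ in range(max_retries):
        if not sounds_ai_generated(draft):   # verification passes: done
            break
        draft = call_llm([{"role": "user",   # feedback loop: try again
                           "content": f"Rewrite this so it reads more naturally:\n{draft}"}])
    return draft
```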

Claude Ecosystem Evolution

Claude launched in 2023 as a chatbot: give it a prompt, get an answer. We used it for research, reading, summarising, analysis, and conversational Q&A. Around 2024–25, Claude Code arrived. Claude Code can access files on your laptop, run commands, and act as a capable agentic coding system.

Do you know who built Claude Cowork and how long it took? Anthropic built it, and the first iteration took two weeks, built using Claude Code itself. They are not just building solutions; they are productising those solutions at scale. That a production-grade tool got built in two weeks is the point.

Evolution of the SDLC

Traditional SDLC

In the traditional SDLC, you start with application or software requirements, move to design, pass the features to developers to code, send the code to QA for testing, and then ship to production. That is the classic software development lifecycle.

Where the Traditional SDLC Breaks Down with Agents

When you generate code with agents, through Cursor or similar tools, problems emerge. You write a prompt, and the agent generates 100 lines of code in seconds. But a human reviewer takes at least an hour to read and validate those 100 lines. Every step in the pipeline — from requirements to design, from design to code — still requires human feedback. We generate at machine speed but verify at human speed, and that mismatch slows the entire process.

The core issue: Agile methods were built to make humans faster. They cannot manage a system that works at the speed of agents.

Four Pillars of Traditional SDLC

The traditional SDLC rests on four models: Waterfall, Agile, Iterative, and V-Model. The industry most commonly uses Waterfall and Agile. All four require human intervention at each step. These models were built for humans, not for agentic development.

Drawbacks of the Traditional Approach

1. Review Latency

Agents write in seconds. Humans review in hours. That is the first and most visible gap.

2. Verification Gap

Today, we generate code and push it to production through a CI/CD pipeline. The pipeline catches issues through SonarQube, MetaDefender, and Trivy checks, but only at the CI stage. The problem is that we generate the entire codebase first, then verify. What if we ran those checks at the moment of code generation? Running SonarQube, MetaDefender, and Trivy at generation time would produce clean code from the start.

3. Cascading Hallucination

Without real guards, agents build broken code. Around 60% of a typical codebase lacks proper SonarQube or Trivy checks. When you pass that codebase to an agent, the agent builds on top of that mess. The output inherits and amplifies the existing problems. That is a cascading hallucination.

4. Context Trap

Jira tickets lack repository awareness. When a human works on a bug, they understand the full context: which feature caused it, what the surrounding code does. When an agent receives that same ticket, it only sees the code. It misses the context behind the issue. That gap leads to incomplete or incorrect fixes.

The Numbers Behind the Problem

Research shows that 40–45% of AI-generated code contains security flaws. AI-generated Java has a 72% security failure rate. 24.2% of AI-introduced issues persist in production as silent debt, and manual review coverage is 17% lower. The point is, agents create applications in weeks, but they also accumulate silent technical debt and messy code that we ignore. All of this we can address with the ACDC framework.

The ACDC Framework

Sonar's solution is the Agent-Centric Development Cycle: ACDC. It is a Guide, Generate, Verify, Solve cycle for autonomous development, compatible with agentic workflows.

Why Human PR Review Fails at Agent Scale

When a human developer commits, the diff is small: two or three lines. A senior can review that in 15 to 30 minutes. When an agent generates code, the diff is 100 to 200 lines. The senior still gets the PR, but they will not review it with the same care. If the code works, that tends to be enough. Quality issues get missed. ACDC addresses this.

Step 1: Guide

This step is about context. Pass your architecture rules and codebase structure to the LLM before it generates anything. You can use Sonar Context if you have it, or create an agents.md file with your architecture metadata. The agent then generates code with awareness of your standards and constraints.
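
A minimal sketch of the Guide step under these assumptions: the rules live in a plain agents.md file at the repo root and the agent is driven through a chat-style prompt. The file contents shown in the comment are only illustrative; the structure is up to your team.

```python
from pathlib import Path

def build_guided_messages(task: str, guide_path: str = "agents.md") -> list:
    """Prepend architecture rules from agents.md so the agent generates code
    with awareness of the project's standards and constraints."""
    guide = Path(guide_path).read_text(encoding="utf-8")
    return [
        {"role": "system", "content": f"Follow these project rules strictly:\n\n{guide}"},
        {"role": "user", "content": task},
    ]

# Illustrative agents.md content (entirely up to the team):
#   ## Architecture
#   - Services talk over REST; no cross-service database access.
#   ## Conventions
#   - Python 3.11, type hints required, pytest for tests.
```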

Step 2: Generate

Use any code generation tool: Cursor, GitHub Copilot, Claude Code, or anything else. This step is the generation itself.

Step 3: Verify

Instead of running SonarQube and MetaDefender checks at the CI/CD stage, run them here, at generation time. Integrate sonar-scan, sandbox code execution, and Trivy checks into this step. The code the agent produces passes all quality gates before it enters the pipeline.
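
A rough sketch of what generation-time verification could look like, assuming the SonarScanner and Trivy CLIs are installed and the project already has a Sonar configuration. The exact flags, quality-gate handling, and sandboxing will differ per setup.

```python
import subprocess

def verify_generated_code(repo_path: str) -> bool:
    """Run quality and security checks at generation time, before the code
    ever reaches the CI pipeline. Returns True only if every check passes."""
    checks = [
        # Static analysis via SonarScanner (assumes a sonar-project.properties
        # or equivalent configuration already exists for the project).
        ["sonar-scanner", f"-Dsonar.projectBaseDir={repo_path}"],
        # Filesystem vulnerability scan via Trivy; non-zero exit on findings.
        ["trivy", "fs", "--exit-code", "1", repo_path],
    ]
    for cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"Check failed: {' '.join(cmd)}\n{result.stdout}")
            return False
    return True
```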

Step 4: Solve

Sonar's remediation agent handles automated bug resolution. You create a ticket for a bug or hotfix and pass it to the agent. The agent resolves the issue and tests the fix inside a sandbox. Nothing touches the actual codebase until the test cases pass. If something breaks, it breaks in the sandbox. Once all tests pass, the fix is merged to the main branch and moves to production.
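
A sketch of the sandbox pattern behind the Solve step. `generate_fix`, `apply_patch`, and `merge_to_main` are hypothetical placeholders, and Sonar's actual remediation agent is a product rather than this code; the point is only the shape of the loop: isolate, fix, test, and merge only on green.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

def generate_fix(ticket: str, work_dir: Path): ...   # placeholder: agent drafts the change
def apply_patch(work_dir: Path, patch): ...          # placeholder: write the change into the copy
def merge_to_main(repo_path: str, patch): ...        # placeholder: merge only after tests pass

def remediate_in_sandbox(repo_path: str, ticket: str) -> bool:
    """Apply an agent-generated fix in an isolated copy of the repo and run
    the tests there. The real codebase is untouched until everything passes."""
    with tempfile.TemporaryDirectory() as sandbox:
        work_dir = Path(sandbox) / "repo"
        shutil.copytree(repo_path, work_dir)          # isolated working copy

        patch = generate_fix(ticket, work_dir)
        apply_patch(work_dir, patch)

        tests = subprocess.run(["pytest", "-q"], cwd=work_dir)
        if tests.returncode != 0:
            return False                              # it broke in the sandbox, not in main

    merge_to_main(repo_path, patch)
    return True
```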

Three Structural Shifts from ACDC

1. Agentic Sandbox

The remediation agent resolves bugs inside an isolated sandbox. If something breaks, it breaks there, not in your actual codebase.

2. Dynamic Context

Passing the entire codebase to an LLM is a problem. The agent does not need all of it. Too much context degrades the output just as much as too little. Dynamic context means identifying the minimum relevant context for each task and passing that. Not too much, not too little.
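
One possible sketch of dynamic context selection, using simple keyword overlap to rank files. A real system might use embeddings, the import graph, or Sonar Context instead, but the shape is the same: score, rank, and pass only the top few files to the agent.

```python
from pathlib import Path

def select_context(task: str, repo_path: str, max_files: int = 5) -> list:
    """Score each source file by keyword overlap with the task and keep only
    the top few, instead of passing the whole codebase to the LLM."""
    task_tokens = set(task.lower().split())
    scored = []
    for path in Path(repo_path).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        score = sum(text.count(token) for token in task_tokens)
        if score > 0:
            scored.append((score, path))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [path for _, path in scored[:max_files]]
```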

3. Deterministic Verification

Quality checks run at generation time, not after the code reaches CI. The code that reaches your pipeline already satisfies SonarQube, Trivy, and other checks. Verification becomes a precondition, not a late gate.

Results from ACDC

The numbers, drawn from research, show meaningful improvements. A project that takes 10 days with standard agentic development takes 4 days with ACDC. Bug count post-CI drops from 100 to 40, a 60% reduction. Security hotspots flagged in the CI pipeline drop by 80%.

Implementing ACDC in Your Organisation

For the Guide phase, no special tooling is needed. Create an agents.md file with your architecture metadata and pass it to the LLM. If you want Sonar integration, use Sonar Context. For Generate, use Cursor, Claude Code, or any code agent. For Verify, integrate SonarScan and sandbox execution at generation time rather than at CI. For Solve, route bug tickets through the remediation agent.
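
Tying the phases together, a hedged orchestration sketch that reuses the placeholder functions from the earlier sketches (`build_guided_messages`, `call_llm`, `verify_generated_code`, `remediate_in_sandbox`); `write_changes` is another hypothetical placeholder for applying the agent's output to the repo.

```python
def write_changes(repo_path: str, generated: str): ...   # placeholder: apply agent output to the repo

def acdc_cycle(task: str, repo_path: str) -> bool:
    # Guide: load agents.md and build a context-aware prompt.
    messages = build_guided_messages(task)

    # Generate: any code agent (Cursor, Claude Code, ...) sits behind this call.
    generated = call_llm(messages)
    write_changes(repo_path, generated)

    # Verify: quality and security gates run now, not at CI.
    if not verify_generated_code(repo_path):
        return False

    # Solve: route remaining bug work through the sandboxed remediation loop.
    return remediate_in_sandbox(repo_path, ticket=task)
```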

Audience Q&A

Large PR Memory Issues

Audience: We create prompts for PR review, but partner code PRs can contain 6,000 to 7,000 files. Running that on a local machine consumes all memory and forces a restart. How do we solve that?

Akash: Run it in a sandbox. Instead of running on your local machine or a VM, route it through a sandbox. That is the solve phase of the ACDC framework; whatever runs in the sandbox stays in the sandbox and does not affect your local environment.

Governance of the Solve Phase

Audience: The solve phase is useful, but it still needs governance. Someone needs to review what the remediation agent produces.

Akash: That is correct. Human review cannot disappear when you develop with agents, but we can minimise it. That is the point.

Tools, Tradeoffs, and What's Next

A Note on Using LLMs for Presentations

This presentation was generated using Claude Sonnet, not Opus. Opus consumes tokens at a much higher rate. I bought Claude Pro yesterday, gave a prompt to the Claude 4.7 model, and exhausted my token quota in one session. I bought additional credits, switched to Sonnet, and finished the deck. If you want to generate presentations with an LLM, use Sonnet. Gemini produces lower-quality output for this.

The Cost Question

We talk about replacing developers with agents from Anthropic, OpenAI, and Google, but at what cost? You pay for those API calls. There are token limits and capability boundaries. A company has to decide: do you keep your developers, or do you replace them with agents? That is a decision each organisation has to make.
