Apr 7, 2026
How We Built a Real-Time AI System That Stops Fraud in 200ms
A breakdown of how we built an AI fraud detection system that makes accurate decisions in under 200ms without blocking legitimate transactions.
Financial fraud evolves faster than rule books can be updated. To address this, our internal engineering teams built a high-scale detection system that processes live payment flows across 9 fraud categories under a strict 200ms service level agreement (SLA).

Why Real-Time AI Fraud Detection Is Difficult to Build
Most fraud detection systems face a four-way tension. The system must be fast enough to operate within a live payment flow. It must be accurate enough to avoid false positives. It must be explainable enough for regulators. And it must be adaptive enough to handle fraud patterns that did not exist last month.
The Transaction Pipeline: Seven Agents, One Decision
Every transaction travels through a sequential pipeline. Each agent builds on the output of the previous one, with total end-to-end latency under 200ms in production.

After the pipeline completes, six post-decision tasks execute in parallel without adding to user-facing latency: graph updates, trust score adjustments, shadow mode scoring, database persistence, WebSocket broadcast, and Prometheus metrics recording.
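The split between the decision path and the post-decision work can be sketched with `asyncio`. This is a minimal illustration, not the production code: the function names, the toy decision rule, and the task labels are all hypothetical stand-ins for the six tasks listed above.

```python
import asyncio

# Hypothetical labels for the six post-decision tasks named in the text.
POST_TASKS = ["graph_update", "trust_score", "shadow_scoring",
              "db_persist", "websocket_broadcast", "metrics"]

# Strong references so pending background tasks are not garbage-collected.
_background: set = set()

async def decide(txn: dict) -> str:
    """Stand-in for the synchronous decision path (toy rule, an assumption)."""
    return "DECLINE" if txn["amount"] > 10_000 else "APPROVE"

async def post_task(name: str, txn: dict, log: list) -> None:
    """Stand-in for one post-decision task, such as a database write."""
    await asyncio.sleep(0.01)  # simulated I/O
    log.append(name)

async def handle(txn: dict, log: list) -> str:
    decision = await decide(txn)
    for name in POST_TASKS:
        task = asyncio.create_task(post_task(name, txn, log))
        _background.add(task)
        task.add_done_callback(_background.discard)
    return decision  # returned before any post-decision task has run

async def demo() -> tuple:
    log: list = []
    decision = await handle({"amount": 50}, log)
    done_at_return = len(log)            # nothing has completed yet
    await asyncio.gather(*_background)   # drain, for the demo only
    return decision, done_at_return, sorted(log)
```

Calling `asyncio.run(demo())` shows the decision being returned while the post-decision log is still empty. A real service would also need error handling and back-pressure for the background tasks.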
Four Sub-Agents, 30+ Features

Agent A1, the Signal Processor, orchestrates four specialized sub-agents in parallel:
- Device Intelligence: Scores device trust based on age, OS, and emulator detection.
- Geo-Risk Analyzer: Detects impossible travel, VPN/Tor usage, and IP reputation.
- Behavioral Baseline: Computes per-customer Z-scores. It compares each transaction against that specific customer's history, not population averages, making it resistant to account warming strategies.
- Threat Intelligence: Cross-references breach databases and dark web correlation feeds.
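To make the Behavioral Baseline idea concrete, here is a minimal sketch of a per-customer Z-score. The function name, the `min_history` cutoff, and the handling of zero variance are assumptions for illustration. The article does not describe how the production agent handles these cases.

```python
from statistics import mean, stdev

def customer_z_score(amount: float, history: list[float], min_history: int = 5) -> float:
    """Z-score of `amount` against this customer's own past amounts."""
    if len(history) < min_history:
        return 0.0  # assumption: too little history yields no behavioral signal
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    if sigma == 0:
        # Assumption: any deviation from a perfectly constant history is extreme.
        return 0.0 if amount == mu else float("inf")
    return (amount - mu) / sigma

# The same 500 payment is routine for one customer and extreme for another.
small_spender = [40, 50, 60, 50, 50]
large_spender = [400, 500, 600, 500, 500]
z_small = customer_z_score(500, small_spender)
z_large = customer_z_score(500, large_spender)
```

Here `z_large` is 0 while `z_small` is far above any plausible alert threshold, which is the difference a population-wide average would hide.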
The Risk Scorer: 38 Models, One Calibrated Score
The Risk Scorer is where all signals converge. The system uses a four-way adaptive ensemble that blends signal sources based on the type of transaction being evaluated.
| Component | Count | Role | Default Weight |
|---|---|---|---|
| LightGBM | 9 models (one per category) | Primary ML signal | 60% |
| XGBoost | 9 models | Ensemble partner | 40% |
| Isolation Forest | 10 models | Unsupervised anomaly detection | Dynamic |
| Isotonic Calibrators | 9 models | Probability calibration | Post-processing |
| Rule Engines | 9 YAML-driven engines | Domain logic/compliance | Dynamic |
| Graph Scores | NetworkX DiGraph | Network topology risk | Dynamic |
Three blend modes govern runtime weighting. rule_dominant gives rules 50% weight and ML 20%, which suits heavily regulated categories such as KYC fraud. ml_dominant inverts that ratio (ML 50%, rules 20%) and is preferred when novel attack patterns have not yet been encoded into rules. balanced splits at 40/30.
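A weighted blend of this kind can be sketched as follows. The rules and ML weights come from the text, and the 60/40 LightGBM/XGBoost split comes from the table. The even split of the remaining 30% between anomaly and graph scores is an assumption made only so the weights sum to 1. The balanced mode is omitted because the text does not say which component receives which share.

```python
# Split of the ML share between the two gradient-boosted models (from the table).
LGBM_W, XGB_W = 0.6, 0.4

# "rules" and "ml" weights are from the text; "anomaly" and "graph" are assumed.
BLEND_MODES = {
    "rule_dominant": {"rules": 0.5, "ml": 0.2, "anomaly": 0.15, "graph": 0.15},
    "ml_dominant":   {"rules": 0.2, "ml": 0.5, "anomaly": 0.15, "graph": 0.15},
}

def blend(mode: str, lgbm: float, xgb: float,
          rules: float, anomaly: float, graph: float) -> float:
    """Combine component scores (each in [0, 1]) into one risk score."""
    w = BLEND_MODES[mode]
    ml = LGBM_W * lgbm + XGB_W * xgb
    return (w["rules"] * rules + w["ml"] * ml
            + w["anomaly"] * anomaly + w["graph"] * graph)

# Rules fire strongly while the ML models see little risk.
score_rule = blend("rule_dominant", lgbm=0.2, xgb=0.2, rules=0.9, anomaly=0.5, graph=0.5)
score_ml = blend("ml_dominant", lgbm=0.2, xgb=0.2, rules=0.9, anomaly=0.5, graph=0.5)
```

With identical inputs, the rule-dominant mode produces the higher score because the strong rule signal carries more weight. In the described system the blended score would then pass through the isotonic calibrator, which is not shown here.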
Graph-Level Fraud Detection: Finding Mule Networks
Individual signals have limits. A hundred accounts funneling money through a single intermediary form a mule network, a pattern that only becomes visible at the network level. The Graph Anomaly Agent maintains a live NetworkX directed graph of transaction relationships. It identifies nodes with abnormally high in-degree and circular money flows that would pass undetected if evaluated in isolation.
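The two graph signals mentioned here, high in-degree and circular flows, can be sketched without any dependencies. The version below uses plain dictionaries rather than NetworkX so it runs anywhere. In NetworkX the equivalent operations would be `DiGraph.in_degree()` and `networkx.simple_cycles()`. The account names and the in-degree threshold are hypothetical.

```python
def build_graph(transfers):
    """Directed graph as adjacency sets: sender -> set of receivers."""
    graph = {}
    for sender, receiver in transfers:
        graph.setdefault(sender, set()).add(receiver)
        graph.setdefault(receiver, set())  # make sure sinks appear as nodes
    return graph

def high_in_degree(graph, threshold):
    """Nodes that receive money from at least `threshold` distinct senders."""
    in_deg = {}
    for receivers in graph.values():
        for node in receivers:
            in_deg[node] = in_deg.get(node, 0) + 1
    return {node for node, deg in in_deg.items() if deg >= threshold}

def find_cycle(graph):
    """Return one directed cycle as a list of nodes, or None (recursive DFS)."""
    state = {}  # "active" while on the DFS stack, "done" afterwards
    path = []

    def visit(node):
        state[node] = "active"
        path.append(node)
        for nxt in sorted(graph[node]):
            if state.get(nxt) == "active":
                return path[path.index(nxt):]
            if nxt not in state:
                found = visit(nxt)
                if found:
                    return found
        state[node] = "done"
        path.pop()
        return None

    for node in sorted(graph):
        if node not in state:
            found = visit(node)
            if found:
                return found
    return None

# Five accounts funnel into "mule", which cycles funds through "x" and "y".
transfers = [(f"a{i}", "mule") for i in range(1, 6)]
transfers += [("mule", "x"), ("x", "y"), ("y", "mule")]
g = build_graph(transfers)
suspects = high_in_degree(g, threshold=5)
cycle = find_cycle(g)
```

No single transfer in this example looks unusual, yet the graph view flags `mule` as a collection point and exposes the loop through `x` and `y`. A live graph at production scale would need an iterative traversal and incremental updates rather than this recursive full scan.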
Decision Logic and Continuous Learning
Every transaction resolves to one of three states: APPROVE, CHALLENGE, or DECLINE. The system also includes a background continuous learning pipeline:
- Drift Detector: Monitors Population Stability Index (PSI) to identify distribution shifts.
- Shadow Mode: Scores live transactions with challenger models without affecting decisions, tracking performance before promotion.
- Adaptive Thresholds: Shifts thresholds when live FPR and FNR drift from targets.
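As an illustration of the drift check, the Population Stability Index compares the binned distribution of a feature at training time with its live distribution. The sketch below uses the standard PSI formula. The binning, the `eps` floor for empty bins, and any alerting threshold are assumptions, since the article does not state them.

```python
import math

def psi(expected, actual, eps=1e-6):
    """PSI between two binned distributions given as per-bin proportions."""
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # floor empty bins to avoid log(0) and division by zero
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total

baseline = [0.5, 0.5]   # share of transactions per bin at training time
live = [0.8, 0.2]       # share per bin in current traffic
drift = psi(baseline, live)
```

Identical distributions give a PSI of 0, and the value grows as the live distribution moves away from the baseline. A commonly cited rule of thumb treats values above roughly 0.25 as a significant shift, but the cutoff this system uses is not given in the text.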
Compliance Reporting: Six Regulatory Frameworks, One Report Per Transaction
Every decision generates a compliance report tailored to the relevant jurisdiction. The Reasoning Generator holds 27 deterministic templates (9 fraud categories x 3 decisions) covering regulatory references, required actions, and audit log entries across RBI, PMLA, FATF, PCI DSS, GDPR, and SOX.
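One way to picture the 27 templates is a lookup table keyed by fraud category and decision. The sketch below is hypothetical throughout: the category names are placeholders because the article does not list the nine categories, and the template wording is invented for illustration.

```python
from itertools import product

DECISIONS = ("APPROVE", "CHALLENGE", "DECLINE")

# Placeholder names: the article states there are 9 categories but does not list them.
CATEGORIES = tuple(f"category_{i}" for i in range(1, 10))

# One fixed template per (category, decision) pair, 9 x 3 = 27 in total.
TEMPLATES = {
    (cat, dec): f"[{cat}] Decision: {dec}. Top factors: {{factors}}."
    for cat, dec in product(CATEGORIES, DECISIONS)
}

def render(category: str, decision: str, factors: list[str]) -> str:
    """Fill the fixed template for this pair; unknown pairs raise KeyError."""
    template = TEMPLATES[(category, decision)]
    return template.format(factors=", ".join(factors))

report = render("category_1", "DECLINE", ["geo_risk", "device_age"])
```

Because rendering is a dictionary lookup plus string formatting, the same inputs always yield the same report text, which is the property an auditor relies on.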
Architecture Decisions That Shaped the System
Three design choices had more structural impact than any others.
Deterministic reasoning over generative explanations.
Structured templates replace LLM-generated compliance reports. Determinism and validation are non-negotiable when an auditor reads the output. The same SHAP factors that drive the risk score drive the human-readable explanation, with no secondary model to introduce drift or inconsistency.
Fire-and-forget post-decision parallelism.
Separating the synchronous decision path from all graph updates, trust adjustments, and database writes made the 200ms SLA achievable and sustainable under load.
Layered intelligence over monolithic models.
Building Trust, One Decision at a Time
Fraud detection is a trust problem. Every false positive erodes customer confidence. Every false negative erodes financial system integrity. The challenge is building a system precise enough to minimize both at scale, in real time.