Mapping Mythology with GenAI: Building a Mahabharata Chatbot Using GraphRAG
Dive into a Mahabharata chatbot built with GraphRAG & Neo4j! GenAI meets epic data to deliver smart, contextual answers rooted in India’s greatest story.
Editor’s Note: This blog is adapted from the talk by Siddhant Agarwal, Developer Relations lead for APAC at Neo4j. In this session, he shared how he combined GenAI, graph theory, and ancient Indian epics to create a knowledge-rich chatbot powered by Neo4j and GraphRAG—bringing the Mahabharata to life through data, relationships, and intelligent querying.
From Graphs to Epics: Where It All Began
Hi, I am Siddhant Agarwal. I lead DevRel for APAC at Neo4j, a native graph database company that’s been at the forefront of graph technology for nearly two decades. Before Neo4j, I worked at IBM, Google, and a few startups. And for the past 10+ years, I’ve been building communities, solving developer problems, and exploring the overlap between data and human context.
Neo4j is built on graph theory—nodes and relationships forming a web of connected information. It’s how LinkedIn shows you second-degree connections or how fraud detection systems map suspicious behavior. But this time, I wanted to use it for something... different.
I wanted to answer a simple question: What happens when Generative AI meets one of the most complex stories ever told—the Mahabharata?
Why Mahabharata? Because It’s a Network
The Mahabharata is not just an epic. It’s a data goldmine. With over 1.8 million words, 200+ key characters, and countless relationships—father-son, siblings, rivals, mentors—it reads like a massive interconnected web of knowledge. Perfect for a graph-based representation.
But traditional storage systems struggled to model it intuitively. Tables and joins made the data hard to explore, and building a chatbot on top of that structure would have been chaotic. That’s where Neo4j came in.
I started by manually mapping out nodes (characters) and edges (relationships). Using Cypher, Neo4j’s query language, I created 191 nodes and over 500 connections, each with meaningful metadata—names, roles, relationships, and more. But this was just the foundation.
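To make the manual mapping step concrete, here is a minimal sketch of how character nodes and relationships might be turned into parameterized Cypher statements. It is not the code from the talk: the `Character` label, the `role` property, the relationship types, and the three sample characters are assumptions chosen for illustration.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    source: str
    rel_type: str  # e.g. "FATHER_OF"
    target: str


# Hypothetical sample data; the talk does not list the actual nodes or labels.
characters = [
    {"name": "Drona", "role": "Teacher"},
    {"name": "Kripi", "role": "Wife of Drona"},
    {"name": "Ashwatthama", "role": "Warrior"},
]
relationships = [
    Relationship("Drona", "FATHER_OF", "Ashwatthama"),
    Relationship("Kripi", "MOTHER_OF", "Ashwatthama"),
]


def character_query(character):
    """Return (cypher, params) that creates or updates one character node."""
    cypher = "MERGE (c:Character {name: $name}) SET c.role = $role"
    return cypher, {"name": character["name"], "role": character["role"]}


def relationship_query(rel):
    """Return (cypher, params) that links two existing character nodes."""
    # Relationship types cannot be passed as Cypher parameters,
    # so validate the type before interpolating it into the query text.
    if not rel.rel_type.replace("_", "").isalpha():
        raise ValueError(f"unsafe relationship type: {rel.rel_type!r}")
    cypher = (
        "MATCH (a:Character {name: $source}) "
        "MATCH (b:Character {name: $target}) "
        f"MERGE (a)-[:{rel.rel_type}]->(b)"
    )
    return cypher, {"source": rel.source, "target": rel.target}


statements = [character_query(c) for c in characters]
statements += [relationship_query(r) for r in relationships]

for cypher, params in statements:
    print(cypher, params)
```

Each pair could then be passed to a Neo4j driver session, for example `session.run(cypher, params)`. Using `MERGE` rather than `CREATE` keeps the script safe to re-run without duplicating nodes.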
Phase Two: Bringing the Epic to Life
The goal was not to stop at a static dataset. I wanted people to interact with the Mahabharata. So I built an early chatbot interface on top of Neo4j. You could ask it questions like, “Who were Ashwatthama’s parents?” and it would reply with one-word answers—rudimentary, but functional.
That version worked for conferences. But when someone asked, “Can it answer contextual questions too?”, I realized it needed to do more. I had only scratched the surface of what was possible.
Scaling the Dataset: 18 Books, 5,400 Pages, 10 M+ Characters
To make the chatbot truly comprehensive, I had to ingest the entire Mahabharata corpus: 18 books of around 300 pages each. That comes to roughly 5,400 pages, 10.8 million characters, and about 2.7 million tokens, which matters once you start counting LLM compute costs.
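The corpus figures follow from simple arithmetic. The sketch below reproduces them under two assumptions that the text implies but does not state: about 2,000 characters per page and about 4 characters per token (a common rule of thumb for English text).

```python
BOOKS = 18
PAGES_PER_BOOK = 300     # "around 300 pages" per the text
CHARS_PER_PAGE = 2_000   # assumption implied by 10.8M characters / 5,400 pages
CHARS_PER_TOKEN = 4      # assumption: rough rule of thumb for English text

pages = BOOKS * PAGES_PER_BOOK
characters = pages * CHARS_PER_PAGE
tokens = characters // CHARS_PER_TOKEN

print(f"{pages:,} pages, {characters:,} characters, {tokens:,} tokens")
# prints: 5,400 pages, 10,800,000 characters, 2,700,000 tokens
```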
Handling this volume with standard GenAI approaches would be expensive and inefficient. That’s when I turned to Vector RAG.
Beyond Vector Search: Enter GraphRAG
Vector RAG helps by chunking documents into small sections, converting them into embeddings, and storing them in a vector database. At query time, the question is embedded too, the most similar chunks are retrieved, and those chunks are added to the prompt sent to the LLM. But even that has limitations, mainly around context, relationship traversal, and answer depth.
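The Vector RAG flow can be sketched in a few lines. This is a toy illustration, not the pipeline from the talk: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and the sample passage and all function names are made up for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text, sentences_per_chunk=2):
    """Split text into chunks of a few sentences each."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def embed(text):
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, index, top_k=1):
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, chunks):
    """Combine retrieved chunks with the question before calling an LLM."""
    context = "\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


passage = (
    "Drona taught archery to the princes of Hastinapura. "
    "Arjuna became his finest student. "
    "Bhishma vowed never to marry or claim the throne. "
    "His vow shaped the succession for generations."
)

# "Vector store": each chunk kept alongside its embedding.
index = [(chunk, embed(chunk)) for chunk in chunk_text(passage)]

question = "Who taught archery to Arjuna?"
top_chunks = retrieve(question, index)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

Each chunk is scored in isolation, so a question whose answer spans several distant passages depends on all of them ranking highly at once.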
GraphRAG solves this.
Instead of storing isolated vector chunks, GraphRAG builds knowledge graphs by extracting entities, mapping relationships, and connecting context-rich content across the dataset. This allows for semantic traversal, entity resolution, and far more accurate responses.
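A small sketch shows what relationship traversal adds. It assumes the extraction step has already produced (subject, relation, object) triples; the triples, relation names, and function names below are illustrative, and a real GraphRAG setup would run this traversal as a query inside Neo4j rather than in Python.

```python
from collections import defaultdict, deque

# Hypothetical triples of the kind an entity-extraction step might produce.
triples = [
    ("Shantanu", "FATHER_OF", "Bhishma"),
    ("Shantanu", "MARRIED", "Satyavati"),
    ("Bhishma", "TOOK", "Vow of celibacy"),
    ("Satyavati", "MOTHER_OF", "Vichitravirya"),
    ("Vichitravirya", "KING_OF", "Hastinapura"),
]


def build_graph(triples):
    """Index every triple under both of its endpoints."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((subject, relation, obj))
        graph[obj].append((subject, relation, obj))
    return graph


def neighborhood(graph, start, max_hops=2):
    """Collect all facts reachable within max_hops of start (breadth-first)."""
    seen = {start}
    facts = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for fact in graph.get(node, []):
            if fact not in facts:
                facts.append(fact)
            subject, _, obj = fact
            other = obj if subject == node else subject
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return facts


graph = build_graph(triples)
for subject, relation, obj in neighborhood(graph, "Bhishma", max_hops=2):
    print(f"{subject} -[{relation}]-> {obj}")
```

Raising `max_hops` pulls in facts further from the starting entity, such as the link from Satyavati to Vichitravirya, which a chunk-by-chunk similarity search has no direct way to follow.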
Neo4j’s LLM Graph Builder Tool made this seamless. I connected it to my instance, uploaded the PDFs, and the tool did the rest—no code, no APIs, no stress. The result: over 51,000 nodes and 500,000 relationships, all extracted from raw text and visually explorable in Neo4j.
The Chatbot, Reimagined
With the knowledge graph in place, I connected it to a GenAI interface using GPT-4. Users could now ask nuanced questions like “Why did Bhishma take a vow of celibacy?” or “How did that affect the throne of Hastinapura?” and get clear, relevant answers rooted in the Mahabharata’s structure and narrative.
Even better, the tool allowed me to test different LLMs, customize chunking parameters, handle orphan nodes, de-duplicate overlapping entities (e.g., “Bangalore” vs. “BLR”), and add post-processing layers like community detection.
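As a rough illustration of two of these post-processing steps, the sketch below merges duplicate entities and flags orphan nodes. It uses a hand-written alias table, which is a simplification and not how the LLM Graph Builder itself detects duplicates; the alias table, sample nodes, and function names are all hypothetical.

```python
# Hypothetical alias table mapping variant names to one canonical name.
ALIASES = {"BLR": "Bangalore", "Bengaluru": "Bangalore"}


def canonical(name):
    """Map an entity name to its canonical form, if an alias is known."""
    return ALIASES.get(name, name)


def merge_duplicates(nodes, edges):
    """Collapse aliased nodes and rewrite edges to point at canonical names."""
    merged_nodes = sorted({canonical(n) for n in nodes})
    merged_edges = sorted({(canonical(s), r, canonical(t)) for s, r, t in edges})
    return merged_nodes, merged_edges


def find_orphans(nodes, edges):
    """Return nodes that take part in no relationship."""
    connected = {n for s, _, t in edges for n in (s, t)}
    return [n for n in nodes if n not in connected]


nodes = ["Bangalore", "BLR", "Conference", "Mysore"]
edges = [
    ("Conference", "HELD_IN", "BLR"),
    ("Conference", "HELD_IN", "Bangalore"),
]

merged_nodes, merged_edges = merge_duplicates(nodes, edges)
print(merged_nodes)   # ['Bangalore', 'Conference', 'Mysore']
print(merged_edges)   # [('Conference', 'HELD_IN', 'Bangalore')]
print(find_orphans(merged_nodes, merged_edges))  # ['Mysore']
```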
Full Control, Full Flexibility
For those who prefer working with code, we have launched the GraphRAG Python package, which gives you full programmatic control. It supports several retrievers, including vector, Cypher, and full-text, and lets you fine-tune everything from schema extraction to token limits.
Neo4j also remains schema-flexible. You can dynamically modify or generate schemas as your data evolves—ideal for anyone building domain-specific graph applications.
What’s Next: Multilingual and Mixed Reality
The next steps? I am working on multilingual support to make the chatbot accessible in Indian languages, because you can’t build a mythological AI system in English alone. Beyond that, I am exploring VR interfaces that let you converse with characters like Arjuna or Krishna in a fully immersive environment.
This journey—from ancient epics to GenAI, from static data to dynamic intelligence—has only just begun.