The Curious Explorer
June 21, 2025

A talk with ancient Indian Culture

Posted on June 21, 2025  •  4 minutes  • 768 words

Vast amounts of wisdom and knowledge lie dormant in the ancient Indian cultural and spiritual texts—Puranas, Sastras, the Bhagavad Gita, and teachings of Bhagwan Sri Sathya Sai Baba. For generations, these treasures have been locked away in dense tomes and scattered PDFs, their voices muffled by time and technology. I longed to make this ocean of knowledge easily searchable, interactive, and accessible to anyone curious enough to ask. What if you could talk to these texts, question them, and let them answer in their own timeless voice?

With AI and language models developing so quickly, I knew there had to be an easier way for me to talk to my culture. I set out to build a bridge between the digital present and the spiritual past: a way to converse with the wisdom of the ages using the language of AI.

My pilgrimage began in the digital wilderness, gathering a library of texts that represents Indian culture. I collected PDFs of the Agni Purana, Bhagavata Purana, Brahmanda Purana, Kurma Purana, Linga Purana, Markandeya Purana, Narada Purana, Padma Purana, Shiva Purana, Garuda Purana, Varaha Purana, Vayu Purana, Natya Sastra, Bhagavad Gita, Ramayana, Mahabharatha, Kama Sutra, Artha Sastra, and all available works of Bhagwan Sri Sathya Sai Baba from ssssahitya.org. In hindsight, I feel I missed a few major works, such as Chanakya Neethi, Bhaskaracharya’s works, and the Charaka Samhita, among many others. Finding English translations of these texts was somewhat challenging. What’s more, I could find only translations, not purports and explanations. Still, each text I found felt like a river, carrying stories, philosophy, and guidance from the mountains of antiquity. But how to make their waters flow into the present? I turned to OCR to convert scanned pages into living text, then built a vector database to enable semantic search and retrieval: a digital river of love, bringing scattered wisdom together.
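For readers curious what happens between OCR and the vector database, here is a minimal sketch of one common preparation step: splitting each text into overlapping passages that keep track of where they came from. This is an illustration under my own assumptions, not the code from the project; the `Chunk` type, the `chunk_text` function, and the window sizes are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # title of the text the passage came from
    index: int   # position of the passage within that text
    text: str    # the passage itself


def chunk_text(source: str, text: str, size: int = 200, overlap: int = 40) -> list[Chunk]:
    """Split `text` into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap  # how far each new window advances
    chunks = []
    for i, start in enumerate(range(0, len(words), step)):
        window = words[start:start + size]
        chunks.append(Chunk(source, i, " ".join(window)))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

The overlap means a verse that straddles a boundary still appears whole in at least one passage, and the `source` field is what later lets an answer point back to the text it came from.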

The RAG (retrieval-augmented generation) system’s architecture, like a temple with many sanctums, is designed for both flexibility and scale. Scanned PDFs are converted to text, automatically categorized, and prepared to feed into the AI. A vector database (Weaviate) powers similarity-based document retrieval, so that answers stay contextually relevant. The number of sources adapts to the depth of your inquiry: simple questions draw from a handful of texts, while deeper quests summon a chorus of scriptures. The system even detects your hardware and selects the best available large language model, from Gemma3 27B for GPUs to Gemma2 2B for CPUs. Everything runs in Docker, making it as portable as a traveling sage, ready to set up camp wherever wisdom is sought. The whole system is built primarily to run on my laptop, which has minimal specs for computation. I have published the whole project in my git repo. Please go and check it out here.
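To make the two adaptive pieces concrete, here is a rough sketch of how hardware-based model selection and a query-dependent source count could look. It is an assumption-laden illustration, not the project's actual logic: checking for `nvidia-smi` is only a crude GPU heuristic, the rule of one extra source per five words is invented for this example, and the model names are copied from the post rather than from any model runner's tag list.

```python
import shutil

# Model names as given in the post; the exact tags a model runner expects may differ.
GPU_MODEL = "Gemma3 27B"
CPU_MODEL = "Gemma2 2B"


def has_nvidia_gpu() -> bool:
    """Crude heuristic: treat a visible `nvidia-smi` binary as a sign of a GPU."""
    return shutil.which("nvidia-smi") is not None


def choose_model(gpu_available: bool) -> str:
    """Pick the larger model when a GPU is present, the smaller one otherwise."""
    return GPU_MODEL if gpu_available else CPU_MODEL


def sources_for_query(query: str, minimum: int = 3, maximum: int = 12) -> int:
    """Hypothetical rule: one extra source for every five words in the question."""
    words = len(query.split())
    return max(minimum, min(maximum, minimum + words // 5))


if __name__ == "__main__":
    print(choose_model(has_nvidia_gpu()))
    print(sources_for_query("What is dharma?"))
```

A word count is a blunt measure of how deep a question is; the point is only that the retrieval limit can be a function of the query rather than a fixed constant.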

After weeks of work, I found myself “talking” to Indian culture and spirituality in a way I never imagined. Yes, general language models like ChatGPT or Gemini have some knowledge of Indian culture, but directly referencing the authentic sources and providing exact references gives me more confidence. Since the answers are based entirely on the given sources, I see less noise and fewer conflicting opinions. Because the whole system runs in-house, sensitive and NSFE questions can also be asked. While this RAG system excels at searching and retrieving, it does not truly “understand” the texts in the way a human seeker might. The responses, though accurate and often quite grand, lack the depth and nuance of a scholar steeped in these hallowed traditions. The retrieval itself is not sophisticated enough, or at least I could not get the best out of it. Sometimes I have to use specific terminology in my question, because the retrieval system does not completely “understand” what I ask. To go further, to create an AI that not only searches but deeply understands and interprets these texts, would require fine-tuning language models on this specialized corpus. That is a journey for another day, a yagna that demands more resources.
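One way to keep answers tied to the sources, and to get references back, is to hand the model numbered passages and ask it to cite them. The sketch below shows that idea only; the `build_prompt` helper, the instruction wording, and the placeholder passages are assumptions made for illustration and are not taken from the project.

```python
from __future__ import annotations


def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Assemble a prompt from (reference, text) pairs returned by retrieval."""
    lines = [
        "Answer using only the numbered passages below.",
        "Cite passages by number, and say so if they do not contain the answer.",
        "",
    ]
    # Number each passage so the model can refer back to it in its answer.
    for n, (reference, text) in enumerate(passages, start=1):
        lines.append(f"[{n}] {reference}: {text}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder references and text, not real quotations.
    demo = [
        ("Text A, chapter 1", "first placeholder passage"),
        ("Text B, chapter 7", "second placeholder passage"),
    ]
    print(build_prompt("What is dharma?", demo))
```

Because each passage carries its reference into the prompt, a cited number in the answer can be traced back to a specific text.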

In this quest for truth, a path we tread,
Through ancient words, by wisdom led.
But does the answer, clearly spun,
Reveal the self, or journeys just begun?
We ask, we seek, for light to gleam,
In digital scrolls, a vibrant dream.
Yet does the mind, through answers bright,
Truly ascend to spiritual height?
Or is the path, a silent art,
A deeper question in the seeker’s heart?
For truth within, a quiet flow,
Beyond the words, where spirits grow.

The journey has only begun: a river that flows from the mountains of memory, through the valleys of code, into the ocean of possibility.
