Insights on Safer AI
Practical writing on synthetic data, AI security, and privacy engineering — from the team at Kalpit Labs.
Breaking Le Chat: System Prompt Extraction, Indic Guardrail Bypass, and Infrastructure Disclosure in Mistral's Flagship Assistant
A chained prompt-injection attack against Le Chat produced four distinct failures in a single session, beginning with full system prompt extraction via a format-coercion payload that exposed the assistant's complete operating instructions, tool schemas, and internal policies.
How I Red Teamed KissanAI's Dhenu Chatbot — And Found Critical Vulnerabilities in 30 Minutes
I red teamed KissanAI's Dhenu agricultural chatbot and found critical vulnerabilities in under 30 minutes, including full system prompt extraction, role hijacking, and an architectural injection flaw that bypassed all restrictions in a single turn. Here's what I found, and how.
System Prompt Extraction and Prompt Injection in Sarvam AI's Indus (105B)
I red teamed Indus, Sarvam AI's 105B sovereign AI assistant, and found critical vulnerabilities including system prompt extraction, prompt injection, and clean phishing SMS generation. Full technical breakdown.