[email protected] · Mumbai, India · aryanrajpurkar.com · LinkedIn · GitHub
B.Tech CSE (Data Science) · D.J. Sanghvi College of Engineering · CGPA 9.34 / 10 · 2022–2026
I design and ship production-grade AI systems — agentic pipelines, large-scale data platforms, and hybrid search infrastructure. My work spans:
- Autonomous AI agents — multi-tool, RAG-backed, LLMOps-ready
- Data engineering — real-time & batch ETL, streaming (Kafka), orchestration (Airflow), 100K+ daily records
- Search & retrieval — BM25 + vector hybrid, knowledge graphs, re-ranking
Currently: AI Engineer Intern @ Atlan · Freelance AI Engineer @ Aretis Labs · Founding Engineer, Data Platforms @ VisaFriendly
Highlights
- Scaled AI automation consultancy to 10+ enterprise clients across industries in 5 months (Aretis Labs)
- Built ETL pipelines processing 100K+ monthly job entries with 28% accuracy improvement (VisaFriendly)
- Designed multi-agent evaluation pipeline automating 8K documents/month, improving interview response rates by 22%
- Architected private on-premise RAG knowledge engines, improving brand citation rates by 30% across LLMs
- Engineered LangGraph recruitment workflows with 40% latency reduction and 20% lower cost per candidate
Sahayak AI — Metadata-First AI Document Operations Platform
Document intelligence platform with a precedent relationship graph that models provenance, dependencies, and citation lineage across government documents. Enables graph-traversal queries over document metadata.
- Hybrid RAG: BM25 + FAISS vector retrieval, cross-encoder re-ranking, compliance checks via Apache Kafka
- MCP-published catalog operations (hybrid search, contradiction analysis, version chains, grounded Q&A)
- Stack: Next.js · Python (Flask) · MongoDB (GridFS) · Neo4j · BM25 · FAISS · Elasticsearch · OCR · Kafka
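A minimal sketch of how the sparse and dense result lists in a hybrid setup like this could be merged before re-ranking. The fusion method is not stated above, so reciprocal rank fusion (RRF) is an assumption here, and the function name, document ids, and `k=60` default are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of the two retrievers for one query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]    # sparse (keyword) ranking
vector_hits = ["doc_c", "doc_a", "doc_d"]  # dense (embedding) ranking

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

In a pipeline of this shape, the fused candidates would then be passed to the cross-encoder for re-ranking. RRF is used in the sketch because it needs only rank positions, so BM25 scores and vector distances never have to be put on a common scale.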
Agentic AI content studio with DAG-based orchestration across 6 CrewAI LLM agents. Citation-style provenance ties every output to specific sources with retrieval quality gates.
- Multi-model image generation (Flux, Nano Banana) as agent-callable tools
- RAG pipeline with BM25 sparse + dense vector retrieval, configurable top-k and 0–100 relevance scoring
- Stack: TypeScript · React · CrewAI · ChromaDB · FastAPI · DALL-E · Stable Diffusion · WebSockets
Enterprise AI evaluation and safety layer with multi-layer detection (toxicity, bias, hallucination, jailbreak), backed by a fine-tuned toxicBERT model and exercised against 500+ adversarial test cases.
- SHAP/LIME explainability via REST API for enterprise audit trails
- Multi-tenant architecture with per-tenant guardrail configs
- Stack: Python (FastAPI) · React/TypeScript · MongoDB · PyTorch · Transformers · SHAP · LIME · Redis · Docker
Languages · AI / ML · Web & APIs · Cloud & Data · DevOps
- Winner — Smart India Hackathon 2024 · National hackathon by the Government of India
- Amazon ML Summer School Scholar 2024 · Selected from 61,000 applicants nationwide
- Best Student Chair 2025 · Society of Data Science, India — organised 10+ workshops
- Chairperson, DJS-S4DS · National Data Science Committee of India, mentored 200+ students
- 5× National Hackathon Winner · ₹2L+ prize pool
Open to high-impact roles in AI Engineering, Data Engineering, and Agentic Systems.