SOMIL JAIN somiljain7

BONJOUR MON AMI 👋

Somil Jain (@coderatwork7)

🌐 Portfolio: somiljain7.github.io
📍 Location: Bengaluru, Karnataka, India
💼 Current Role: Founding ML Engineer @ Vaani

👨‍💻 About Me

I'm an ML engineer and researcher who specializes in breaking systems apart, understanding their failure modes, and rebuilding them to withstand production reality. My work sits at the intersection of language research, speech systems, and production-grade AI infrastructure.

I've moved past the era of impressive demos. What drives me now is latency optimization, system robustness, and shipping AI models that actually survive user traffic at scale. I believe the most exciting ML work happens when you're debugging edge cases at 3 AM, not when you're hitting 99% on a clean benchmark.

What gets me excited:

🎯 Production ML systems that handle millions of requests without breaking
🗣️ Voice AI that feels natural, responsive, and culturally aware
🔬 Research with real-world impact — papers are great, but I care more about what ships
⚡ Low-latency architectures where every millisecond counts
🌍 Multilingual & localized AI that works for more than just English speakers

🚀 Current Focus

🧑‍💻 Founding ML Engineer @ Vaani

Building the core product capabilities and voice AI infrastructure that power hundreds of voice AI companies and millions of deployments globally.

Key responsibilities:

Architecting scalable ML pipelines for real-time speech processing
Optimizing inference latency for STT, LLM, and TTS components
Building production-grade voice AI systems that handle edge cases gracefully
Developing tools and infrastructure for rapid experimentation and deployment
Working on multilingual and code-mixed speech recognition

Impact:

Enabling voice AI at scale for diverse use cases across industries
Contributing to infrastructure that processes millions of voice interactions
Pushing the boundaries of what's possible in real-time conversational AI

💼 Professional Experience

🤖 AI/ML Developer @ Aerocraft Engineering

Developed machine learning solutions for aerospace applications, focusing on predictive maintenance and operational optimization.

🔬 Junior Research Fellow @ NIT Karnataka, Surathkal

Conducted research in machine learning and AI, contributing to academic publications and experimental systems in collaboration with faculty and fellow researchers.

🎤 Tech Speaker & Educator

Delivered technical talks and seminars on Machine Learning and Blockchain at various institutions
Mentored students and junior developers in ML/AI fundamentals
Led workshops on practical ML implementation and deployment strategies

🤝 President, Mozilla Campus Club BVUCOEP (2021–2022)

Led a community of 100+ students passionate about open-source technology
Organized hackathons, workshops, and tech talks on web technologies and open-source contribution
Fostered collaboration between students and the broader Mozilla community
Promoted web literacy and open-source values on campus

🧩 Featured Projects

🕵️ Moriarty's Game – Real-time Multimodal Voice Agent

An immersive voice AI experience featuring "Moriarty Bhai" — a localized Hindi/English persona that brings character and cultural context to conversational AI.

What makes it special:

Ultra-low latency pipeline: Real-time STT → LLM → Emotional TTS loop optimized for <500ms end-to-end latency
Cultural localization: Seamless code-switching between Hindi and English with culturally appropriate responses
Emotional intelligence: Dynamic TTS that adapts tone and emotion based on conversation context
Immersive experience: Designed as an interactive game/experience, not just a chatbot
Production-grade architecture: Built to handle real user interactions, not just demos

Technical highlights:

Custom STT preprocessing for Hindi-English code-mixing
Optimized LLM inference with streaming responses
Real-time emotional TTS generation with voice cloning
WebSocket-based bidirectional communication for minimal latency
Robust error handling and graceful degradation
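
A minimal sketch of the streaming idea behind an STT → LLM → TTS loop, for illustration only; it is not the project's code. The stage functions (`fake_stt`, `fake_llm`, `fake_tts`), the `mic` source, and `run_pipeline` are hypothetical stand-ins that simulate work with short sleeps instead of calling real models. It shows why streaming helps latency: each chunk travels through all three stages before the next one is pulled, so time to first audio does not grow with utterance length.

```python
import asyncio
import time
from typing import AsyncIterator, List, Optional, Tuple


async def mic(chunks: List[bytes]) -> AsyncIterator[bytes]:
    """Hypothetical audio source: yields pre-recorded chunks one by one."""
    for chunk in chunks:
        yield chunk


async def fake_stt(audio: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Stand-in STT stage: 'transcribes' each chunk by decoding it."""
    async for chunk in audio:
        await asyncio.sleep(0.01)  # simulated recognition time
        yield chunk.decode("utf-8")


async def fake_llm(words: AsyncIterator[str]) -> AsyncIterator[str]:
    """Stand-in LLM stage: streams one reply token per input word."""
    async for word in words:
        await asyncio.sleep(0.01)  # simulated generation time
        yield word.upper()


async def fake_tts(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stand-in TTS stage: 'synthesizes' each token into audio bytes."""
    async for token in tokens:
        await asyncio.sleep(0.01)  # simulated synthesis time
        yield token.encode("utf-8")


async def run_pipeline(chunks: List[bytes]) -> Tuple[List[bytes], Optional[float]]:
    """Run the chained stages and record time to first audio output."""
    start = time.perf_counter()
    first_audio_latency: Optional[float] = None
    output: List[bytes] = []
    async for audio in fake_tts(fake_llm(fake_stt(mic(chunks)))):
        if first_audio_latency is None:
            first_audio_latency = time.perf_counter() - start
        output.append(audio)
    return output, first_audio_latency


if __name__ == "__main__":
    audio_out, latency = asyncio.run(run_pipeline([b"namaste", b"dost"]))
    print(audio_out, f"first audio after {latency * 1000:.0f} ms")
```

In a real system the stages would likely run as concurrent tasks connected by queues or WebSocket streams so that they overlap; the chained generators here only demonstrate the chunk-by-chunk flow.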

Built for: OpenAI x Peak XV Ventures Hackathon
🔗 Repository: [Coming soon]
🎥 Demo: [Coming soon]

🗣️ Speech Recognition for Low-Resource Languages

Research and implementation of ASR systems for Indian languages with limited training data.

Key contributions:

Transfer learning from high-resource to low-resource language models
Data augmentation techniques for speech in noisy environments
Fine-tuning strategies for code-mixed conversations

🔊 Real-time Audio Processing Pipeline

Built a production-ready audio processing system for voice applications.

Features:

WebRTC integration for low-latency audio streaming
Noise suppression and audio enhancement
VAD (Voice Activity Detection) for efficient processing
Multi-speaker diarization
Scalable microservice architecture
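
To make the VAD step concrete, here is a small energy-based sketch using only the standard library. It is an assumption-laden illustration, not the pipeline's implementation: the function names, the 160-sample frame (10 ms at an assumed 16 kHz sample rate), the RMS threshold, and the hangover length are all hypothetical choices, and production systems often use a trained VAD model instead.

```python
import math
from typing import List, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(
    samples: Sequence[float],
    frame_size: int = 160,    # assumed: 10 ms frames at 16 kHz
    threshold: float = 0.02,  # assumed RMS level separating speech from silence
    hangover: int = 2,        # frames to keep "speech" on after energy drops
) -> List[bool]:
    """Return one speech/non-speech flag per complete frame."""
    flags: List[bool] = []
    remaining = 0  # hangover frames left before switching back to silence
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        if frame_rms(frame) > threshold:
            remaining = hangover
            flags.append(True)
        elif remaining > 0:
            remaining -= 1
            flags.append(True)
        else:
            flags.append(False)
    return flags


if __name__ == "__main__":
    silence = [0.0] * 320
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(320)]
    print(detect_speech(silence + tone + silence + silence))
```

The hangover counter keeps short pauses inside a word from being clipped, at the cost of passing a few extra silent frames downstream.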

🛠️ Technical Stack

Languages

Primary: Python (PyTorch, TensorFlow), C++ (for performance-critical components)
Secondary: JavaScript/TypeScript (for tooling and APIs), Bash/Shell scripting

ML/AI Frameworks & Tools

Deep Learning: PyTorch (primary), TensorFlow, JAX
NLP/Speech: HuggingFace Transformers, Whisper, Wav2Vec2, FastSpeech2
LLMs: OpenAI API, Anthropic Claude, Llama, Mistral
Audio Processing: librosa, soundfile, pydub, WebRTC
ML Ops: Weights & Biases, MLflow, TensorBoard

Infrastructure & DevOps

Containerization: Docker, Docker Compose, Kubernetes
Cloud Platforms: AWS (SageMaker, Lambda, EC2), GCP (Vertex AI, Cloud Run)
CI/CD: GitHub Actions, GitLab CI, Jenkins
Monitoring: Prometheus, Grafana, ELK Stack
Databases: PostgreSQL, Redis, MongoDB, Vector DBs (Pinecone, Weaviate)

Specialized Tools

Speech Tools: Kaldi, ESPnet, Coqui TTS, XTTS
Optimization: ONNX, TensorRT, quantization techniques
API Development: FastAPI, Flask, gRPC
Real-time Communication: WebSockets, WebRTC, Socket.io

📚 Research Interests

I'm particularly interested in the following areas:

🗣️ Speech Processing: ASR, TTS, speaker recognition, emotion detection
🌐 Multilingual NLP: Code-mixing, low-resource languages, cross-lingual transfer
⚡ Model Optimization: Quantization, pruning, knowledge distillation for edge deployment
🎭 Conversational AI: Dialog systems, persona consistency, context management
🔊 Audio Understanding: Sound event detection, audio classification, music information retrieval
🤖 Production ML: MLOps, model monitoring, A/B testing, serving infrastructure

Open to collaborations on:

Speech and language technology for Indian languages
Real-time multimodal AI systems
Open-source ML tools and frameworks
Research with clear paths to production deployment

📝 Writing & Talks

I occasionally write about ML engineering, production AI systems, and lessons learned from shipping models to production.

Topics I cover:

Production ML war stories and debugging techniques
Latency optimization for real-time AI systems
Building voice AI that doesn't sound like a robot
The gap between research and production (and how to bridge it)

Blog/Medium links coming soon

🤝 Let's Connect

I'm always interested in connecting with fellow ML engineers, researchers, and builders. Whether you want to discuss a potential collaboration, talk about production ML challenges, or just chat about the latest in AI — feel free to reach out!

📫 Contact Information

Platform	Link
💬 Telegram	@coderatwork7
📧 Email	somiljain71100@gmail.com
💼 LinkedIn	somiljain7
🐙 GitHub	@somiljain7
🐦 Twitter/X	@coderatwork7
🌐 Portfolio	somiljain7.github.io

💡 Best ways to reach me:

Quick questions: Twitter/X DM
Technical discussions: Email or LinkedIn
Casual chat: Telegram
Collaboration proposals: Email with subject line "Collaboration: [Brief Topic]"

🎯 Current Goals (2026)

Ship production voice AI systems serving 10M+ users
Open-source a key component of our voice AI stack
Contribute to 3+ impactful open-source ML projects
Write 12+ technical blog posts on production ML
Speak at 2+ ML/AI conferences or meetups
Mentor 10+ aspiring ML engineers

🏆 Achievements & Recognition

🏅 OpenAI x Peak XV Ventures Hackathon - Moriarty's Game (Voice AI track)
🎓 Junior Research Fellow - Selected for research position at NIT Karnataka
👥 Mozilla Campus Club President - Led a 100+ member community (2021–2022)
📢 Tech Speaker - Delivered talks at multiple institutions on ML & Blockchain

💭 Philosophy

"The best ML models are the ones that ship. The second-best are the ones that actually work when users touch them. Everything else is just interesting math."

I believe in:

Pragmatism over perfectionism — 90% accuracy in production beats 99% in a notebook
Latency as a feature — fast models create better user experiences
Learning by shipping — you learn more from one production failure than ten successful benchmarks
Open collaboration — the best ideas come from diverse perspectives
Continuous improvement — today's state-of-the-art is tomorrow's baseline

🔍 What I'm Looking For

I'm always interested in:

🤝 Collaborations on speech/NLP projects, especially for Indian languages
💼 Opportunities to work on challenging production ML problems
🧑‍🤝‍🧑 Connections with researchers and engineers pushing the boundaries of voice AI
📚 Knowledge sharing — if you're working on similar problems, let's chat!

📜 License

This README is licensed under CC BY 4.0. Feel free to use it as inspiration for your own profile!

I like hard problems, real systems, and work that ships.

Let's build something that matters. 🚀

Last updated: February 2026