Reinforcement Learning · Computer Vision · Large-Scale MLOps
🌐 Portfolio • 📄 CV • 💼 LinkedIn • ✉️ Email
I'm a Master's student in Computer Science (AI) at USC, with a BS from Sharif University of Technology. I build agents that learn under uncertainty and ship them on infrastructure that scales.
- 🔭 Researching: adversarial co-evolution of RL and VLM/LLM agents
- 🛠️ Recently shipped: PPO agents for imperfect-information games, MoE steering at inference time, probing frameworks for speech transformers
- 🌱 Learning: ROS, control theory, advanced MLOps
- 🤝 Open to collaborate on: robotics simulation, medical imaging
- 💬 Ask me about: PPO and offline RL, computer vision, MLOps pipelines on GCP/AWS
RL & Simulation: Stable-Baselines3 · PettingZoo · Gymnasium · Ollama · vLLM
| Project | What it does | Stack |
|---|---|---|
| Risk-Scaled Steering in MoE | Token-aware steering for MoE LLMs: 3D delta tensors dynamically scale expert activations to improve safety at inference time. | vLLM · PyTorch · HF |
| Linguistic-Agnostic SER | Probing framework that measures how speech-emotion transformers encode paralinguistic vs. acoustic information across hidden layers. | PyTorch · HF |
| Adversarial Co-Evolution | Trains PPO agents against LLM opponents in imperfect-information card games via curriculum learning and knowledge distillation. | PPO · Ollama |
| Multi-Modal Sentiment Classification | Sentiment classification over image-text conversations, exploring how multimodal cues change over time. | PyTorch · Pandas |