Yash Vardhan Malik
Full-stack developer and data practitioner
Summary
Full-stack developer and data practitioner with hands-on experience in AI/ML, data analysis, and production engineering — shaped through multiple industry internships and real-world projects.
Education
B.Tech Mechanical Engineering
10th: 94%
Experience
Centre for Railway Information Systems (CRIS)
- Processed 450K+ freight records and engineered 20+ domain-specific features for ML-based time prediction pipelines on real Indian Railways operational data.
- Boosted model accuracy by 27% using LightGBM with hyperparameter tuning; benchmarked 5+ algorithms (including Random Forest, XGBoost, and SVR), achieving R² of 0.78 (loading) and 0.72 (unloading).
TRINITI
- Completed coursework in data analytics, statistics, and visualization tools (Excel, MySQL, Tableau, Python); applied statistical methods to clean, transform, and analyze complex datasets, and delivered insights through reports and dashboards to support business decision-making.
General Software & Consultancy Services (GSCS)
- Managed the AppKube cloud monitoring project overseeing 10+ AWS EC2 instances; reduced system monitoring time by 30% through automated dashboards and real-time alert pipelines.
- Delivered EC2 reporting solutions serving 5+ enterprise client environments, improving operational observability and incident response efficiency across production infrastructure.
Projects
Built a full-stack RAG platform where creators paste two YouTube URLs and ask why one outperformed the other — powered by a LangGraph StateGraph with 4-class query routing, fastembed ONNX BGE-small embeddings on pgvector HNSW, 3-stage transcript cascade (youtube-transcript-api → Supadata.ai → Groq Whisper V3), and SSE streaming with inline timestamp citations.
- Engineered timestamp-aware chunking with deterministic intro_5s / intro_15s segments, enabling metadata-filtered hook retrieval.
- Deployed on Render + Vercel + Neon with LangGraph PostgresSaver checkpointing, at $0.013/session (3.5× cheaper than a GPT-4 equivalent).
Built and deployed a full-stack SaaS platform enabling developers to index any GitHub repository and query codebases via natural language — powered by a LangChain + RAG pipeline with pgvector (384-dim HuggingFace BGE embeddings) and Groq Llama 3.1 for context-aware answers with file-level references.
- Engineered a meeting intelligence module that transcribes audio (MP3/WAV/M4A up to 50MB) via Groq Whisper Large V3 and uses an LLM to extract timestamped action items, decisions, and topics, scoped per project for contextual relevance.
- Implemented AI commit summarization, team collaboration via invite links, Razorpay credit billing (1 credit = 1 file indexed), and Clerk auth — backed by Neon PostgreSQL, Uploadthing, and Vercel.
Built a multi-agent trading simulator encoding strategies from 15+ legendary investors into role-based agents (fundamentals, sentiment, risk arbitrage) with an agentic orchestration layer for signal-sharing and unified portfolio decision-making.
- Backtested agent-driven strategies on historical market data — yielded 12% higher returns and 10% lower drawdown vs. baseline index strategies across multiple market regimes; demonstrated applied multi-agent system design with measurable financial impact.
Technical Skills
Skills: Python, SQL, MERN stack, Next.js, FastAPI, LangGraph, LangChain, AWS (EC2, S3, IAM), Docker, Vercel, Git, GitHub Actions, Postman, Clerk, Razorpay, Tableau, Power BI, Excel
Awards & Achievements
- IBM DevOps, Agile & Design Thinking (IBMCE) — IBM (Jun 2025)
- Introduction to AI — Infosys Springboard (Jun 2025)
- OfficeMaster On PowerBI (BE 10X) — OfficeMaster (Jul 2024)
- NamasteDev NodeJS | ReactJS — NamasteDev (Feb 2025 / Sep 2024)