Sanjana Soma

AI & Software Engineer

Building intelligent systems with LLMs, RAG pipelines, and full-stack infrastructure.

sanjana.s.soma@gmail.com

Experience

Software Developer

Vibho Technologies

Jun 2023 - Jul 2024

Engineered scalable partner-facing web applications using ReactJS/Redux, building modular UI components that improved load performance by 25% and reduced development redundancy by 20%
Diagnosed and resolved 30+ production integration and performance issues monthly, improving reliability of distributed client-server interactions

Software Engineer

Mindgraph Technologies

Jan 2023 - Jun 2023

Developed scalable data ingestion modules for a Customer Data Platform processing 90M+ records, enabling large-scale analytics for enterprise clients
Optimized SQL queries and backend data pipelines, reducing data retrieval latency by 18%

Projects

Outreach Atelier

github.com/SanjanaSoma11/outreach_atelier

AI Cold Email Tool

Built a full-stack outreach tool using FastAPI and Claude API for personalized email generation across 4 tones (Formal, Conversational, Story-Driven, Data-Driven), with Groq LLaMA 3.3 70B as a resilient fallback for provider failure
Integrated DuckDuckGo for real-time prospect research and Notion API for outreach tracking

AI Book Notes Summarizer

github.com/SanjanaSoma11/book-notes-summarizer

Built a RAG pipeline using Groq LLaMA 3.3 70B and HuggingFace embeddings to produce audience-specific summaries from PDF documents across 4 modes, with structured citation grounding to limit hallucination

Implemented Zod schema validation with auto-repair to self-correct malformed LLM JSON outputs, ensuring pipeline reliability in production; deployed live on Vercel

Long Context Reasoning Framework

Arizona State University

Engineered a custom multi-agent data synthesis pipeline using LLaMA-3.1-8B-Instruct on the SOL Supercomputer, generating 10,000-word reasoning benchmarks through iterative prompting and a QA modifier agent that stripped answer-leaking context from questions
Built and manually verified a dataset of 15 long-context narratives and 108 QA pairs covering arithmetic, temporal, and logical reasoning; ran comparative accuracy audit across GPT-4, Perplexity AI, and LLaMA
Identified a “Lost in the Middle” retrieval bottleneck: model accuracy dropped significantly when key information was positioned near word 2,500 in 10,000-word contexts, consistent across models

Education

Arizona State University

MS in Computer Science

Aug 2024 - May 2026

Skills

AI / ML

RAG Pipelines

LLM Evaluation

Prompt Engineering

NLP

HuggingFace

Groq

Frameworks

FastAPI

ReactJS

NodeJS

Django

TensorFlow

Pandas

PySpark

Programming

Python

JavaScript

SQL

C++

Java

C

Systems & Tools

Git

REST APIs

Distributed Data Pipelines

Vercel

Zod

Linux/Unix

Connect

linkedin.com/in/soma-sanjana

github.com/SanjanaSoma11
