2026
WhatsApp PharmAgent
AI pharmacy assistant on WhatsApp with multi-pipeline orchestration over Azure
- Python
- Azure Functions
- Azure OpenAI (GPT-4.1)
- Cosmos DB
- Document Intelligence
- Speech-to-Text
- Twilio WhatsApp API
- Next.js
The problem
Small Indian pharmacies juggle paper invoices, manual stock counts, and phone-based reorders. Pharmacists already live on WhatsApp for supplier communication, so PharmAgent puts the entire inventory workflow on the channel they already use: no app install, no training, no separate POS device. A pharmacist can photograph a supplier invoice, send a voice note for a sale, or type a stock query, all in the same chat.
Architecture
A single Azure Function receives the Twilio webhook and delegates to a router that classifies the inbound message by type (image, audio, text, or session command). Three downstream services handle the actual work.

The invoice service uploads images to Blob Storage, calls Document Intelligence with the prebuilt-invoice model, augments the result with regex fallbacks for missing batch numbers and expiry dates, and stores an editable session in Cosmos DB so the pharmacist can correct fields over WhatsApp before confirming. The voice service downloads audio from Twilio, transcribes it via the Azure Speech REST API, and feeds the transcript through GPT-4.1 with a structured JSON intent schema. The text service goes straight to the LLM for intent classification.

All three converge on a shared inventory service that handles the actual stock CRUD, with confirmation prompts before any destructive action. State for pending confirmations and in-flight invoice edits lives in a Cosmos sessions container keyed by phone number.
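The classification step can be sketched roughly as below. This is a minimal illustration, not the real router: the handler names are hypothetical, and it assumes Twilio's standard inbound webhook form fields (`MediaContentType0` for attached media, `Body` for text).

```python
def classify_message(form: dict) -> str:
    """Classify a Twilio WhatsApp webhook payload by content type."""
    media_type = form.get("MediaContentType0", "")
    body = form.get("Body", "").strip().lower()
    if media_type.startswith("image/"):
        return "invoice"          # photo of a supplier invoice
    if media_type.startswith("audio/"):
        return "voice"            # voice note describing a sale
    if body in {"confirm", "cancel", "edit"}:
        return "session_command"  # reply to a pending confirmation
    return "text"                 # free-form stock query


def route(form: dict, handlers: dict) -> str:
    """Dispatch to the matching downstream service (handlers is a
    hypothetical mapping of message class -> callable)."""
    return handlers[classify_message(form)](form)
```

Session commands are checked before falling through to the LLM path so that confirmation replies never cost a model call.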
What I owned
System architecture and stack choices (Azure Functions, Cosmos DB, Twilio, GPT-4.1). Master orchestrator and message routing. Cosmos DB schema design across all four containers and the full data layer. Twilio WhatsApp integration. The natural-language query pipeline end to end, including intent classification, inventory CRUD, and the confirmation flow. The invoice OCR and voice billing pipelines were built jointly with a teammate. The Next.js dashboard and its deployment were handled by another teammate.
Hardest technical decision
Choosing Cosmos DB over Postgres. Cosmos's schema flexibility was important because invoice OCR returns inconsistent field shapes across distributors, and pending session state is a natural document model. The trade-off was Request Unit (RU) cost on hot reads. I designed around it by partitioning every container by phone number, which keeps the access patterns single-partition and predictable, and by keeping session documents short-lived. If read volume grew, the right next move would be a Redis cache in front of inventory lookups, but at the current scale the design holds.
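A sketch of what a short-lived session document might look like under this design. The field names and the `/phone` partition key path are assumptions for illustration; the short lifetime uses Cosmos DB's per-item `ttl` property, which deletes the item automatically once container-level TTL is enabled.

```python
import time

SESSION_TTL_SECONDS = 15 * 60  # assumed: pending edits expire after 15 minutes


def make_invoice_session(phone: str, extracted_fields: dict) -> dict:
    """Build a pending invoice-edit session document.

    Partitioning by phone number (assumed key path /phone) means every
    read and write for one pharmacist is a cheap single-partition op.
    """
    return {
        "id": f"session-{phone}",    # one live session per pharmacist
        "phone": phone,              # partition key
        "type": "invoice_edit",
        "fields": extracted_fields,  # editable over WhatsApp before confirm
        "ttl": SESSION_TTL_SECONDS,  # Cosmos per-item TTL keeps it short-lived
        "createdAt": int(time.time()),
    }
```

Because the session `id` is derived from the phone number, a point read with `(id, partition_key)` fetches the pending state at the minimum RU charge.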
What I'd do differently
Rebuild the orchestrator as an explicit conversational state machine rather than stateless intent routing. The current design routes each message in isolation and relies on the LLM to recover context from the prior turn, which works for clean conversations but is brittle for repair flows like a user clarifying a misheard product name across two messages. A state machine with named conversation states would also make the confirmation logic easier to test in isolation. I would also put a Redis cache in front of the inventory container. Repeated "do we have X" queries within a single session burn Cosmos RUs unnecessarily, and the cache invalidation rule is simple because every write goes through the inventory service.
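The state-machine idea above can be sketched as a transition table. The states and events here are invented for illustration; the point is that each `(state, event)` pair is an explicit, individually testable entry instead of context the LLM must infer.

```python
from enum import Enum, auto


class ConvState(Enum):
    IDLE = auto()
    EDITING_INVOICE = auto()
    AWAITING_CONFIRMATION = auto()
    CLARIFYING_PRODUCT = auto()


# (current state, event) -> next state; events mirror the router's
# message classes plus hypothetical ones like "ambiguous_product"
TRANSITIONS = {
    (ConvState.IDLE, "invoice"): ConvState.EDITING_INVOICE,
    (ConvState.EDITING_INVOICE, "text"): ConvState.EDITING_INVOICE,
    (ConvState.EDITING_INVOICE, "confirm"): ConvState.IDLE,
    (ConvState.IDLE, "ambiguous_product"): ConvState.CLARIFYING_PRODUCT,
    (ConvState.CLARIFYING_PRODUCT, "text"): ConvState.IDLE,
    (ConvState.AWAITING_CONFIRMATION, "confirm"): ConvState.IDLE,
    (ConvState.AWAITING_CONFIRMATION, "cancel"): ConvState.IDLE,
}


def step(state: ConvState, event: str) -> ConvState:
    # Unknown (state, event) pairs keep the current state, so a stray
    # message mid-repair-flow cannot silently reset the conversation.
    return TRANSITIONS.get((state, event), state)
```

The repair flow in the text above (clarifying a misheard product name across two messages) becomes a named state rather than something reconstructed from the prior turn.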
Other projects
Production E-commerce Platform
End-to-end e-commerce platform for a Gen Z clothing brand, built as Founding Engineer at Crescentia One LLP
- Owned the full backend (Express, Razorpay, Delhivery, support and email flows, rate limiting, cron) and led the engineering for a production e-commerce platform built from scratch.
- Made the infrastructure call for Hetzner over EC2 t3.medium/large at roughly a third of the per-hour cost, and Cloudflare R2 over S3 for image storage (R2 has zero egress fees and lower per-GB pricing). Handled the deployment end to end.
- Led client engagement and architecture decisions across the build.
- Next.js
- Express.js
- TypeScript
- PostgreSQL (Neon)
- Prisma
- Better Auth
GoYaana Travel Chatbot
Conversational AI travel planner on AWS Lambda with LLM-driven itinerary generation
- Built a serverless Java 21 backend on AWS Lambda that plans end-to-end trips from Bangalore to 118+ destinations, deployed via AWS SAM with API Gateway.
- Used a dual-LLM strategy on Groq (Llama 3.3 70B for routing and date extraction, Llama 3.1 8B for batch description generation) to keep itinerary generation under the 29-second API Gateway timeout.
- Containerised a Python Gradio frontend in Docker with FPDF2 and DejaVu fonts for Unicode and rupee support in downloadable PDF itineraries.
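The dual-LLM split can be illustrated with a small routing helper (shown in Python rather than the project's Java; the `pick_model` function and task labels are hypothetical, and the Groq model IDs are assumed current):

```python
# Assumed Groq model identifiers for the two tiers
STRONG_MODEL = "llama-3.3-70b-versatile"  # routing and date extraction
FAST_MODEL = "llama-3.1-8b-instant"       # batch description generation

PRECISION_TASKS = {"routing", "date_extraction"}


def pick_model(task: str) -> str:
    """Send precision-critical tasks to the 70B model and bulk text
    generation to the 8B model, keeping total latency inside the
    29-second API Gateway integration timeout."""
    return STRONG_MODEL if task in PRECISION_TASKS else FAST_MODEL
```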
- Java 21
- AWS Lambda
- AWS SAM
- API Gateway
- Groq (Llama 3.3 70B + Llama 3.1 8B)
- Python
TrackWell
Cross-platform mobile app for AI-powered nutrition tracking, fitness planning, and GPS-tracked activity
- Three distinct Gemini AI integrations: vision-based food recognition from camera images, personalised daily fitness plan generation from onboarding inputs, and weekly health insights synthesised from 7 days of tracking data.
- Real-time GPS walk tracker with foreground location streaming, Haversine-based distance calculation, and MET-based calorie estimation that uses the user's actual weight rather than a fixed multiplier.
- Production touches across the app: 6-hour cache on weekly AI insights to keep API costs sane, debounced FatSecret OAuth 1.0 search, Clerk auth with Google OAuth, scheduled local notifications, and per-day log persistence in Firestore.
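The distance and calorie maths behind the walk tracker can be sketched as below (in Python rather than the app's TypeScript; function names and the tolerance constants are illustrative). Haversine gives great-circle distance between GPS fixes, and the standard MET formula scales energy by the user's actual body weight:

```python
import math


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two GPS fixes."""
    R = 6_371_000.0  # mean Earth radius, metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))


def walk_calories(met: float, weight_kg: float, duration_min: float) -> float:
    """Standard MET estimate: kcal = MET x weight(kg) x hours.

    Using the user's real weight instead of a fixed multiplier is what
    makes the estimate personal.
    """
    return met * weight_kg * (duration_min / 60.0)
```

For example, one degree of latitude comes out near 111 km, and a 60-minute walk at 3.5 MET for a 70 kg user estimates 245 kcal.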
- React Native (Expo SDK 54)
- TypeScript
- Expo Router
- Google Gemini 3 Flash
- Clerk
- Firebase Firestore