2026
WhatsApp PharmAgent
AI pharmacy assistant on WhatsApp with multi-pipeline orchestration over Azure
- Python
- Azure Functions
- Azure OpenAI (GPT-4.1)
- Cosmos DB
- Document Intelligence
- Speech-to-Text
- Twilio WhatsApp API
- Next.js
The problem
Small Indian pharmacies juggle paper invoices, manual stock counts, and phone-based reorders. Pharmacists already live on WhatsApp for supplier communication, so PharmAgent puts the entire inventory workflow on the channel they already use: no app install, no training, no separate POS device. A pharmacist can photograph a supplier invoice, send a voice note for a sale, or type a stock query, all in the same chat.
Architecture
A single Azure Function receives the Twilio webhook and delegates to a router that classifies the inbound message by type (image, audio, text, or session command). Three downstream services handle the actual work.

The invoice service uploads images to Blob Storage, calls Document Intelligence with the prebuilt-invoice model, augments the result with regex fallbacks for missing batch numbers and expiry dates, and stores an editable session in Cosmos DB so the pharmacist can correct fields over WhatsApp before confirming. The voice service downloads audio from Twilio, transcribes it via the Azure Speech REST API, and feeds the transcript through GPT-4.1 with a structured JSON intent schema. The text service goes straight to the LLM for intent classification.

All three converge on a shared inventory service that handles the actual stock CRUD, with confirmation prompts before any destructive action. State for pending confirmations and in-flight invoice edits lives in a Cosmos sessions container keyed by phone number.
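The classification step can be sketched roughly as below. This is a minimal illustration, not the real router: the handler names are hypothetical, and it assumes Twilio's standard inbound webhook form fields (`MediaContentType0` for attached media, `Body` for text).

```python
def classify_message(form: dict) -> str:
    """Classify a Twilio WhatsApp webhook payload by content type."""
    media_type = form.get("MediaContentType0", "")
    body = form.get("Body", "").strip().lower()
    if media_type.startswith("image/"):
        return "invoice"          # photo of a supplier invoice
    if media_type.startswith("audio/"):
        return "voice"            # voice note describing a sale
    if body in {"confirm", "cancel", "edit"}:
        return "session_command"  # reply to a pending confirmation
    return "text"                 # free-form stock query


def route(form: dict, handlers: dict) -> str:
    """Dispatch to the matching downstream service (handlers is a
    hypothetical mapping of message class -> callable)."""
    return handlers[classify_message(form)](form)
```

Session commands are checked before falling through to the LLM path so that confirmation replies never cost a model call.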
What I owned
System architecture and stack choices (Azure Functions, Cosmos DB, Twilio, GPT-4.1). Master orchestrator and message routing. Cosmos DB schema design across all four containers and the full data layer. Twilio WhatsApp integration. The natural-language query pipeline end to end, including intent classification, inventory CRUD, and the confirmation flow. The invoice OCR and voice billing pipelines were built jointly with a teammate. The Next.js dashboard and its deployment were handled by another teammate.
Hardest technical decision
Choosing Cosmos DB over Postgres. Cosmos's schema flexibility was important because invoice OCR returns inconsistent field shapes across distributors, and pending session state is a natural document model. The trade-off was Request Unit (RU) cost on hot reads. I designed around it by partitioning every container by phone number, which keeps the access patterns single-partition and predictable, and by keeping session documents short-lived. If read volume grew, the right next move would be a Redis cache in front of inventory lookups, but at the current scale the design holds.
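A sketch of what a short-lived session document might look like under this design. The field names and the `/phone` partition key path are assumptions for illustration; the short lifetime uses Cosmos DB's per-item `ttl` property, which deletes the item automatically once container-level TTL is enabled.

```python
import time

SESSION_TTL_SECONDS = 15 * 60  # assumed: pending edits expire after 15 minutes


def make_invoice_session(phone: str, extracted_fields: dict) -> dict:
    """Build a pending invoice-edit session document.

    Partitioning by phone number (assumed key path /phone) means every
    read and write for one pharmacist is a cheap single-partition op.
    """
    return {
        "id": f"session-{phone}",    # one live session per pharmacist
        "phone": phone,              # partition key
        "type": "invoice_edit",
        "fields": extracted_fields,  # editable over WhatsApp before confirm
        "ttl": SESSION_TTL_SECONDS,  # Cosmos per-item TTL keeps it short-lived
        "createdAt": int(time.time()),
    }
```

Because the session `id` is derived from the phone number, a point read with `(id, partition_key)` fetches the pending state at the minimum RU charge.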
What I'd do differently
Rebuild the orchestrator as an explicit conversational state machine rather than stateless intent routing. The current design routes each message in isolation and relies on the LLM to recover context from the prior turn, which works for clean conversations but is brittle for repair flows like a user clarifying a misheard product name across two messages. A state machine with named conversation states would also make the confirmation logic easier to test in isolation. I would also put a Redis cache in front of the inventory container. Repeated "do we have X" queries within a single session burn Cosmos RUs unnecessarily, and the cache invalidation rule is simple because every write goes through the inventory service.
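The state-machine idea above can be sketched as a transition table. The states and events here are invented for illustration; the point is that each `(state, event)` pair is an explicit, individually testable entry instead of context the LLM must infer.

```python
from enum import Enum, auto


class ConvState(Enum):
    IDLE = auto()
    EDITING_INVOICE = auto()
    AWAITING_CONFIRMATION = auto()
    CLARIFYING_PRODUCT = auto()


# (current state, event) -> next state; events mirror the router's
# message classes plus hypothetical ones like "ambiguous_product"
TRANSITIONS = {
    (ConvState.IDLE, "invoice"): ConvState.EDITING_INVOICE,
    (ConvState.EDITING_INVOICE, "text"): ConvState.EDITING_INVOICE,
    (ConvState.EDITING_INVOICE, "confirm"): ConvState.IDLE,
    (ConvState.IDLE, "ambiguous_product"): ConvState.CLARIFYING_PRODUCT,
    (ConvState.CLARIFYING_PRODUCT, "text"): ConvState.IDLE,
    (ConvState.AWAITING_CONFIRMATION, "confirm"): ConvState.IDLE,
    (ConvState.AWAITING_CONFIRMATION, "cancel"): ConvState.IDLE,
}


def step(state: ConvState, event: str) -> ConvState:
    # Unknown (state, event) pairs keep the current state, so a stray
    # message mid-repair-flow cannot silently reset the conversation.
    return TRANSITIONS.get((state, event), state)
```

The repair flow in the text above (clarifying a misheard product name across two messages) becomes a named state rather than something reconstructed from the prior turn.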
Other projects
Production E-commerce Platform
End-to-end e-commerce platform for a Gen Z clothing brand, built as Founding Engineer at Crescentia One LLP
- Owned the full backend (Express, Razorpay, Delhivery, support and email flows, rate limiting, cron) and led the engineering for a production e-commerce platform built from scratch.
- Made the infrastructure call for Hetzner over EC2 t3.medium/large at roughly a third of the per-hour cost, and Cloudflare R2 over S3 for image storage (R2 has zero egress fees and lower per-GB pricing). Handled the deployment end to end.
- Led client engagement and architecture decisions across the build.
- Next.js
- Express.js
- TypeScript
- PostgreSQL (Neon)
- Prisma
- Better Auth
GoYaana Travel Chatbot
Conversational AI travel planner on AWS Lambda with LLM-driven itinerary generation
- Built a serverless Java 21 backend on AWS Lambda that plans end-to-end trips from Bangalore to 118+ destinations, deployed via AWS SAM with API Gateway.
- Used a dual-LLM strategy on Groq (Llama 3.3 70B for routing and date extraction, Llama 3.1 8B for batch description generation) to keep itinerary generation under the 29-second API Gateway timeout.
- Containerised a Python Gradio frontend in Docker with FPDF2 and DejaVu fonts for Unicode and rupee support in downloadable PDF itineraries.
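The dual-LLM split can be illustrated with a small routing helper (shown in Python rather than the project's Java; the `pick_model` function and task labels are hypothetical, and the Groq model IDs are assumed current):

```python
# Assumed Groq model identifiers for the two tiers
STRONG_MODEL = "llama-3.3-70b-versatile"  # routing and date extraction
FAST_MODEL = "llama-3.1-8b-instant"       # batch description generation

PRECISION_TASKS = {"routing", "date_extraction"}


def pick_model(task: str) -> str:
    """Send precision-critical tasks to the 70B model and bulk text
    generation to the 8B model, keeping total latency inside the
    29-second API Gateway integration timeout."""
    return STRONG_MODEL if task in PRECISION_TASKS else FAST_MODEL
```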
- Java 21
- AWS Lambda
- AWS SAM
- API Gateway
- Groq (Llama 3.3 70B + Llama 3.1 8B)
- Python
TrackWell
Cross-platform mobile app for AI-powered nutrition tracking, fitness planning, and GPS-tracked activity
- Three distinct Gemini AI integrations: vision-based food recognition from camera images, personalised daily fitness plan generation from onboarding inputs, and weekly health insights synthesised from 7 days of tracking data.
- Real-time GPS walk tracker with foreground location streaming, Haversine-based distance calculation, and MET-based calorie estimation that uses the user's actual weight rather than a fixed multiplier.
- Production touches across the app: 6-hour cache on weekly AI insights to keep API costs sane, debounced FatSecret OAuth 1.0 search, Clerk auth with Google OAuth, scheduled local notifications, and per-day log persistence in Firestore.
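The distance and calorie maths behind the walk tracker can be sketched as below (in Python rather than the app's TypeScript; function names and the tolerance constants are illustrative). Haversine gives great-circle distance between GPS fixes, and the standard MET formula scales energy by the user's actual body weight:

```python
import math


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two GPS fixes."""
    R = 6_371_000.0  # mean Earth radius, metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * R * math.asin(math.sqrt(a))


def walk_calories(met: float, weight_kg: float, duration_min: float) -> float:
    """Standard MET estimate: kcal = MET x weight(kg) x hours.

    Using the user's real weight instead of a fixed multiplier is what
    makes the estimate personal.
    """
    return met * weight_kg * (duration_min / 60.0)
```

For example, one degree of latitude comes out near 111 km, and a 60-minute walk at 3.5 MET for a 70 kg user estimates 245 kcal.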
- React Native (Expo SDK 54)
- TypeScript
- Expo Router
- Google Gemini 3 Flash
- Clerk
- Firebase Firestore