LLM Structured Document Extraction Pipeline
Currently developing a constrained extraction pipeline converting real-world document images into structured JSON using a fine-tuned vision-language model.
• Handles complex, domain-specific schemas across multiple vendor types
• Guarantees syntactically valid JSON output at every generation step
• Trained on thousands of real Indian commercial invoices
• Research paper in preparation - targeting international publication
Stack: Python · PyTorch · HuggingFace Transformers · LoRA/PEFT
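The validity guarantee in the bullets above can be illustrated with a toy token mask. This is a minimal sketch under stated assumptions, not the pipeline's actual implementation: the vocabulary, the keys and values, and the names allowed_next, toy_scores, and constrained_decode are all hypothetical, and a real system would apply the same mask to the logits of the model's tokenizer vocabulary.

```python
import json

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
KEYS = ['"vendor"', '"total"']
VALUES = ['"ACME"', '42']

def allowed_next(tokens):
    """Return the tokens that keep a flat JSON object syntactically valid."""
    if not tokens:
        return {'{'}
    last = tokens[-1]
    if last == '{':
        return set(KEYS) | {'}'}
    if last in KEYS:
        return {':'}
    if last == ':':
        return set(VALUES)
    if last in VALUES:
        return {',', '}'}
    if last == ',':
        return set(KEYS)
    if last == '}':
        return {'<eos>'}
    return set()

def toy_scores(tokens):
    """Stand-in for model scores; unconstrained, it would emit ':' first."""
    return {':': 5.0, '"vendor"': 4.0, '"ACME"': 3.0, '}': 2.0, ',': 1.0,
            '{': 0.5, '"total"': 0.2, '42': 0.1, '<eos>': 0.0}

def constrained_decode(score_fn, max_steps=32):
    """Greedy decoding restricted to grammar-allowed tokens at every step."""
    tokens = []
    for _ in range(max_steps):
        allowed = sorted(allowed_next(tokens))  # sorted for determinism
        scores = score_fn(tokens)
        best = max(allowed, key=lambda t: scores.get(t, float('-inf')))
        if best == '<eos>':
            break
        tokens.append(best)
    return ''.join(tokens)

output = constrained_decode(toy_scores)
parsed = json.loads(output)  # parses because only valid tokens were allowed
```

The mask only enforces syntax for this toy grammar; whether the extracted values are correct still depends on the model.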
Available for similar contract work in document AI, structured output generation, and VLM fine-tuning.
Built a full-stack healthcare recommendation platform matching patients to providers based on symptoms, location, and specialization.
• Symptom checker with urgency assessment and specialist recommendations
• Location-based facility finder with map integration
• Appointment scheduling with provider matching logic
• Personalized health tips based on user profile
• Secure authentication and user profile management
Stack: Django · Python · SQLite · Tailwind CSS
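A matching step of the kind described above (specialization plus location) might look roughly like the following. It is a sketch under assumptions, not the project's code: the Provider fields, the 25 km default radius, and the function names are hypothetical, and the distance is computed with the standard haversine formula.

```python
from dataclasses import dataclass
from math import radians, sin, cos, asin, sqrt

@dataclass
class Provider:
    # Hypothetical fields for illustration only.
    name: str
    specialization: str
    lat: float
    lon: float

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))

def match_providers(providers, specialization, lat, lon, max_km=25.0):
    """Keep providers with the requested specialization inside the radius,
    sorted nearest first."""
    hits = []
    for p in providers:
        if p.specialization != specialization:
            continue
        distance = haversine_km(lat, lon, p.lat, p.lon)
        if distance <= max_km:
            hits.append((distance, p))
    hits.sort(key=lambda pair: pair[0])
    return [p for _, p in hits]

# Made-up coordinates for the usage example.
providers = [
    Provider("B", "cardiology", 12.98, 77.60),
    Provider("C", "dermatology", 12.97, 77.59),
    Provider("D", "cardiology", 13.50, 77.59),
    Provider("A", "cardiology", 12.97, 77.59),
]
nearby = match_providers(providers, "cardiology", 12.97, 77.59)
```

Here nearby contains A and then B: C has a different specialization and D falls outside the radius.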
Full project overview (screenshots + feature walkthrough) available in the GitHub repo: https://github.com/kruthikhak/Care-Connect/blob/main/CareConnect_Overview.pdf
Built a full-stack network intrusion detection system (NIDS) using XGBoost trained on 2.83M network flows. Every prediction includes a SHAP explanation showing which network features triggered the alert.
• 99.57% accuracy, 0.9998 ROC-AUC
• 0.52 ms inference latency - 19x faster than the 10 ms target
• 374 KB model (~267x smaller than deep learning alternatives)
• Detects 14 attack types without decrypting a single packet
Stack: Python · XGBoost · SHAP · FastAPI · React · Supabase
Live: edge-defense-ui.vercel.app
The live demo uses pre-loaded sample flows from the CIC-IDS2017 dataset. To test live traffic analysis, use the sample CSV provided in the GitHub repo (live_analysis_results.csv).
GitHub: https://github.com/kruthikhak/edge-defense-api
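To give a feel for what a per-prediction explanation contains, here is a simplified stand-in. The project uses SHAP with XGBoost; this sketch instead uses a linear score, where (assuming independent features) each feature's Shapley value has the closed form weight times (value minus baseline). The feature names, weights, and baseline values are made up for illustration.

```python
def linear_attributions(weights, baseline, x):
    """Per-feature contribution w_i * (x_i - baseline_i) for a linear score."""
    return {name: weights[name] * (x[name] - baseline[name]) for name in weights}

def top_features(attributions, k=3):
    """Rank features by absolute contribution, largest first."""
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]

# Hypothetical flow features; not taken from the project's model.
weights = {"syn_flag_count": 0.8, "fwd_packets_per_s": 0.002, "flow_duration": -0.1}
baseline = {"syn_flag_count": 1.0, "fwd_packets_per_s": 100.0, "flow_duration": 2.0}
flow = {"syn_flag_count": 6.0, "fwd_packets_per_s": 1100.0, "flow_duration": 1.0}

attributions = linear_attributions(weights, baseline, flow)
ranking = [name for name, _ in top_features(attributions)]

# The contributions add up to the score difference from the baseline flow.
score = sum(weights[n] * flow[n] for n in weights)
base_score = sum(weights[n] * baseline[n] for n in weights)
```

In this toy case the SYN flag count dominates the alert. Tree-based SHAP produces the same kind of additive per-feature breakdown for XGBoost models, computed differently.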