Health Insurance Policy Enquiry System (RAG)

Overview

This repository implements a Retrieval-Augmented Generation (RAG) system that answers user questions about health insurance policies using semantic search combined with LLM reasoning.

The system uses Pinecone as the vector store, Gemini as the LLM, FastAPI for the backend, and Streamlit for the frontend UI.

System Workflow

  1. The user uploads a policy document (PDF, DOCX, TXT, or MD)
  2. The system extracts the text, splits it into semantic chunks, and generates an embedding for each chunk
  3. The embeddings are stored in Pinecone with metadata (file, page, chunk)
  4. The user asks a question, which is embedded and used to query Pinecone for the most relevant chunks
  5. The retrieved chunks are inserted into a prompt template, and Gemini generates a concise answer with a citation
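
The workflow above can be illustrated with a minimal, dependency-free sketch. It is not the code in health_rag.py. Here a bag-of-words counter stands in for the real embedding model, a Python list stands in for the Pinecone index, and the prompt is only built, not sent to Gemini. All function names, the fixed-size word-window chunking, and the prompt wording are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (a simple stand-in for semantic chunking)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(pages, filename):
    """Steps 2-3: chunk each page, embed each chunk, keep (file, page, chunk) metadata."""
    index = []
    for page_no, page_text in enumerate(pages, start=1):
        for chunk_no, chunk in enumerate(chunk_text(page_text)):
            index.append({
                "vector": embed(chunk),
                "text": chunk,
                "metadata": {"file": filename, "page": page_no, "chunk": chunk_no},
            })
    return index


def retrieve(index, question, top_k=2):
    """Step 4: embed the question and return the top_k most similar records."""
    query = embed(question)
    ranked = sorted(index, key=lambda r: cosine(query, r["vector"]), reverse=True)
    return ranked[:top_k]


def build_prompt(question, records):
    """Step 5: place retrieved chunks, tagged with their source, into a prompt template."""
    context = "\n".join(
        f"[{r['metadata']['file']} p.{r['metadata']['page']}] {r['text']}"
        for r in records
    )
    return (
        "Answer concisely using only the context below and cite the source.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    pages = [
        "The annual deductible is 500 dollars per member.",
        "Dental cleaning is covered twice per year.",
    ]
    index = build_index(pages, "policy.pdf")
    hits = retrieve(index, "What is the deductible?", top_k=1)
    print(build_prompt("What is the deductible?", hits))
```

Keeping the file, page, and chunk metadata next to each vector is what lets the final answer cite where a statement came from.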

Project Structure

.
├── document_processing.py    # Extracts text, validates, preprocesses
├── health_rag.py             # RAG pipeline: embeddings, Pinecone, queries, prompts
├── main.py                   # FastAPI backend (upload + ask-question)
├── streamlit_app.py          # Frontend UI
├── requirements.txt
├── .env.example
└── README.md
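
As a rough idea of the validation and preprocessing role described for document_processing.py, the sketch below checks the file extension against the four supported types and normalizes whitespace in extracted text. The function names and the specific cleanup rules are hypothetical, not taken from the repository, and actual text extraction from PDF or DOCX files is left out.

```python
import re
from pathlib import Path

# Supported upload types, as listed in the workflow (PDF/DOCX/TXT/MD).
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md"}


def validate_upload(filename):
    """Return the lowercased extension, or raise ValueError for unsupported files."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext or '(none)'}")
    return ext


def preprocess(text):
    """Normalize line endings, collapse runs of spaces/tabs, and limit blank lines."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = re.sub(r"[ \t]+", " ", text)      # collapse horizontal whitespace
    text = re.sub(r"\n{3,}", "\n\n", text)   # keep at most one blank line in a row
    return text.strip()
```

Rejecting unsupported files before extraction keeps bad input out of the chunking and embedding steps.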


Quick Start (Local)