Space: S143/SamaviyaInsurance (status: Configuration error)

S143 committed on Sep 28, 2025

Commit 880952d (verified) · 1 parent: 2e0db40

Upload 3 files

Files changed (3)

README.md +183 -17
gitattributes.txt +38 -0
requirements.txt +8 -0

README.md CHANGED

@@ -1,17 +1,183 @@
----
-title: SamaviyaInsurance
-emoji: 💬
-colorFrom: yellow
-colorTo: purple
-sdk: gradio
-sdk_version: 5.42.0
-app_file: app.py
-pinned: false
-hf_oauth: true
-hf_oauth_scopes:
-- inference-api
-license: mit
-short_description: This is an insurance bot that helps us answer questions.
----
-An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).

+DDS Insurance Q&A — RAG Assistant (Pinecone + OpenAI + Gradio)
+Summary: A beginner-friendly, document-grounded insurance bot that you can replicate and deploy on Hugging Face Spaces. It answers only from your uploaded insurance documents using LlamaIndex + Pinecone (serverless) + OpenAI with a simple, polite system prompt.
+What You’ll Get
+Deployed Space URL you can share.
+Grounded answers (if the answer is not in your docs, the bot politely says it can’t find it).
+Simple UI with an FAQ dropdown + free-text question box.
+Clean structure designed for easy replication.
+Features
+Answers strictly from your data/ documents (RAG).
+Pinecone serverless index (AWS us-east-1, cosine, 1536-dim).
+OpenAI for embeddings (text-embedding-3-small) and LLM (gpt-4o-mini).
+Gradio interface with a required, centered logo (data/dds_logo.png).
+Beginner-friendly defaults and error messages.
+Repository Structure
+.
+├─ data/                     # Your insurance docs + required logo
+│  └─ dds_logo.png           # REQUIRED (shown in header)
+├─ app.py                    # Main app: indexing + query + Gradio UI
+├─ requirements.txt          # Dependencies
+└─ README.md                 # This file
+Configuration (in app.py)
+EMBED_MODEL = "text-embedding-3-small"   # 1536-dim
+LLM_MODEL   = "gpt-4o-mini"
+TOP_K       = 4                          # retrieval depth
+System Prompt (keeps answers grounded + polite):
+SYSTEM_PROMPT = """You are Aisha, a polite and professional Insurance assistant.
+Answer ONLY using the information found in the indexed insurance document(s).
+If the answer is not in the document(s), say: "I couldn’t find that in the document."
+Keep responses concise, helpful, and courteous.
+"""
+FAQ List (editable):
+FAQS = [
+    "",
+    "What benefits are covered under the policy?",
+    "How do I file a claim and what documents are required?",
+    "What are the exclusions and limitations?",
+    "Is pre-authorization needed for hospitalization?",
+    "What is the reimbursement timeline?",
+    "How are outpatient vs inpatient services handled?",
+    "How can I check my network hospitals/clinics?",
+    "What is the co-pay or deductible policy?",
+]
+Deploy to Hugging Face Spaces (Beginner-Friendly)
+1) Create a Space
+Go to Hugging Face → Spaces → New Space
+SDK: Gradio
+Visibility/licensing: your choice
+2) Add Project Files
+Upload these into your Space:
+app.py
+requirements.txt
+README.md
+Create folder data/ and upload:
+Your insurance documents (PDF/TXT/MD…)
+dds_logo.png (mandatory; exact filename)
+Tip: Your Space file tree should match the Repository Structure above.
+3) Set Secrets (Environment Variables)
+In Space → Settings → Variables and secrets, add:
+OPENAI_API_KEY → your OpenAI key
+PINECONE_API_KEY → your Pinecone key
+No legacy Pinecone environment URL needed. This app uses pinecone-client ≥ 5 with serverless.
+4) Build & Run
+Spaces auto-install from requirements.txt.
+Default CPU hardware is fine.
+Entry point auto-detected from app.py.
+On first start, the app will:
+Ensure a Pinecone serverless index:
+dds-insurance-index · cosine · 1536-dim · aws/us-east-1
+Read and index documents from data/
+Launch the Gradio UI
+Your deployed link is simply the Space URL once its status is Running.
+5) Updating Documents Later
+Upload/change files in data/
+Click Restart on the Space so it re-indexes your documents
+Troubleshooting (Common Issues)
+“Missing PINECONE_API_KEY or OPENAI_API_KEY”
+Add both secrets in Space → Settings → Variables and secrets.
+Pinecone 401 / “Malformed domain”
+Ensure you’re on pinecone-client>=5.0.1 (already in requirements.txt).
+Use a valid Pinecone API key; no environment URL needed for serverless.
+“Logo not found: data/dds_logo.png”
+Upload an image named exactly dds_logo.png into the data/ folder.
+“No documents found in data/”
+Upload at least one doc (PDF/TXT/MD) into data/, then Restart the Space.
+OpenAI authorization/rate-limit errors
+Confirm key validity and model access; reduce usage if rate-limited.
+Slow first load
+First run installs dependencies and builds the index; later runs are faster.
+Manual Test Checklist
+Ask a question clearly answered in your docs → the response should draw on that content.
+Ask something not in your docs → bot should say it can’t find it.
+Adjust TOP_K in app.py to see how answer completeness changes.
+Requirements (from requirements.txt)
+gradio>=4.44.0
+pinecone-client>=5.0.1
+openai>=1.51.0
+llama-index>=0.11.0
+llama-index-vector-stores-pinecone>=0.3.0
+llama-index-embeddings-openai>=0.3.0
+llama-index-llms-openai>=0.2.0
+tiktoken>=0.7.0
+Customization Ideas
+Swap LLMs by editing LLM_MODEL.
+Add a file uploader to refresh docs from the UI.
+Add metadata filters (e.g., policy type).
+Log queries to refine the FAQ list.
+License
+Add your chosen license (e.g., MIT) as LICENSE.
+Acknowledgments
+Thanks to LlamaIndex, Pinecone, OpenAI, and Gradio for the tooling that makes this simple and reproducible.
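
Review note: the Troubleshooting entries in this README imply a set of startup checks (both API keys set, data/dds_logo.png present, at least one document in data/). app.py is not part of this commit, so the following is only a minimal sketch of what such checks might look like, not the author's implementation. The function name preflight and the constant names are hypothetical; the message strings are taken from the README above.

```python
import os
from pathlib import Path

# Names taken from the README; the constants themselves are illustrative.
REQUIRED_SECRETS = ("OPENAI_API_KEY", "PINECONE_API_KEY")
LOGO_NAME = "dds_logo.png"
DOC_SUFFIXES = {".pdf", ".txt", ".md"}


def preflight(data_dir="data", env=None):
    """Return a list of problem messages; an empty list means all checks passed."""
    env = os.environ if env is None else env
    data = Path(data_dir)
    problems = []

    # 1) Both secrets must be present and non-empty.
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        problems.append("Missing " + " or ".join(missing))

    # 2) The logo must exist under the exact filename.
    if not (data / LOGO_NAME).is_file():
        problems.append(f"Logo not found: {data.as_posix()}/{LOGO_NAME}")

    # 3) At least one document with a supported suffix must be present.
    docs = []
    if data.is_dir():
        docs = [p for p in data.iterdir()
                if p.is_file() and p.suffix.lower() in DOC_SUFFIXES]
    if not docs:
        problems.append(f"No documents found in {data.as_posix()}/")

    return problems
```

Calling preflight() before building the index and printing each returned message would surface all three problems at once instead of failing on the first one.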

gitattributes.txt ADDED

@@ -0,0 +1,38 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+insurance.pdf filter=lfs diff=lfs merge=lfs -text
+data/insurance.pdf filter=lfs diff=lfs merge=lfs -text
+data/dds_logo.png filter=lfs diff=lfs merge=lfs -text
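
Review note: Git reads attribute rules from a file named .gitattributes, so a file committed as gitattributes.txt would not take effect under that name. To see which paths the rules above are meant to route through LFS, a rough check can be sketched in Python. This is an approximation under stated assumptions: slash-free patterns are matched against the file's basename and patterns containing a slash against the full path, which does not reproduce all of Git's matching rules (for example `**`). The helper names lfs_patterns and is_lfs_tracked are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def lfs_patterns(attributes_text):
    """Collect the patterns whose attribute list includes filter=lfs."""
    patterns = []
    for line in attributes_text.splitlines():
        parts = line.split()
        if len(parts) < 2 or parts[0].startswith("#"):
            continue  # skip blank lines, comments, and bare patterns
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Approximate match: basename for slash-free patterns, full path otherwise."""
    name = PurePosixPath(path).name
    return any(
        fnmatchcase(path if "/" in pattern else name, pattern)
        for pattern in patterns
    )


RULES = """\
*.zip filter=lfs diff=lfs merge=lfs -text
data/insurance.pdf filter=lfs diff=lfs merge=lfs -text
data/dds_logo.png filter=lfs diff=lfs merge=lfs -text
"""

patterns = lfs_patterns(RULES)
print(is_lfs_tracked("data/dds_logo.png", patterns))  # True
print(is_lfs_tracked("app.py", patterns))             # False
```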

requirements.txt ADDED

@@ -0,0 +1,8 @@

+gradio>=4.44.0
+pinecone-client>=5.0.1
+openai>=1.51.0
+llama-index>=0.11.0
+llama-index-vector-stores-pinecone>=0.3.0
+llama-index-embeddings-openai>=0.3.0
+llama-index-llms-openai>=0.2.0
+tiktoken>=0.7.0
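
Review note: every line in this file is a lower bound of the form name>=version. A minimal sketch of how those bounds could be read and compared against a concrete version follows. It assumes plain dotted numeric versions only (no pre-release tags, extras, or other specifiers), and the function names parse_min_versions and meets_minimum are hypothetical. The version 5.42.0 used in the example is the sdk_version from the front matter removed in the README hunk.

```python
def parse_min_versions(requirements_text):
    """Map each package name to its minimum version as a tuple of ints."""
    minimums = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ">=" not in line:
            continue  # this sketch only understands 'name>=version'
        name, version = line.split(">=", 1)
        minimums[name.strip().lower()] = tuple(
            int(part) for part in version.strip().split(".")
        )
    return minimums


def meets_minimum(installed, minimum):
    """Compare a dotted numeric version string against a minimum tuple."""
    return tuple(int(part) for part in installed.split(".")) >= minimum


REQUIREMENTS = """\
gradio>=4.44.0
pinecone-client>=5.0.1
openai>=1.51.0
"""

minimums = parse_min_versions(REQUIREMENTS)
print(meets_minimum("5.42.0", minimums["gradio"]))  # True
```

In a running environment, importlib.metadata.version(name) from the standard library could supply the installed version string for each package.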