Space: Eniiyanu/Kaanta

Oluwaferanmi committed on Oct 17

Commit 143273c (1 parent: 61a314c)

This is the latest change

Files changed (2)

EMBEDDING_MODEL_FIX.md +143 -0
orchestrator.py +51 -24

EMBEDDING_MODEL_FIX.md ADDED

	@@ -0,0 +1,143 @@

+# Embedding Model Issue - Solutions
+## Problem
+```
+[WARN] RAG not initialized: Can't load the model for 'sentence-transformers/all-MiniLM-L6-v2'
+```
+This happens because:
+1. Hugging Face Spaces have limited, non-persistent disk space (~50GB on the free tier)
+2. The embedding model must be downloaded on first use (~90MB of weights)
+3. First-time downloads can fail because of network or disk-space issues
+## Solutions
+### Option 1: Disable RAG (Recommended for Now)
+Set environment variable in Hugging Face Space settings:
+```bash
+DISABLE_RAG=true
+```
+**Result:**
+- ✅ Service starts immediately
+- ✅ Tax calculations work perfectly
+- ❌ Tax optimization unavailable (requires RAG)
+- ❌ Tax Q&A unavailable (requires RAG)
+**Use this if:** You only need tax calculations, not optimization recommendations.
+### Option 2: Wait for Model Download
+The download may succeed on a later restart, since each restart retries it. Expect to need 2-3 attempts.
+**Steps:**
+1. Don't set `DISABLE_RAG`
+2. Restart the Space multiple times
+3. Check logs for: `[INFO] Embedding model cached successfully`
+**Use this if:** You need full tax optimization features.
+### Option 3: Use Smaller Embedding Model
+Change in `orchestrator.py`:
+```python
+# Instead of:
+EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
+# Use:
+EMBED_MODEL = "sentence-transformers/paraphrase-MiniLM-L3-v2"  # 3 layers instead of 6: smaller and faster, lower retrieval quality
+```
+### Option 4: Pre-build Docker Image with Model
+Add to `Dockerfile`:
+```dockerfile
+# After RUN pip install...
+RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"
+```
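+This downloads the model at build time, so it is already in the image when the Space starts. The cache directory used during the build must be the one the app reads at runtime.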
+## Current Status
+The code now:
+1. ✅ Tries to pre-download the model
+2. ✅ Provides clear error messages
+3. ✅ Continues without RAG if model fails
+4. ✅ Supports `DISABLE_RAG` environment variable
+## What Works Without RAG
+| Feature | Status |
+|---------|--------|
+| Tax Calculations (PIT, CIT, VAT) | ✅ Works |
+| Tax Rules Engine | ✅ Works |
+| `/v1/query` endpoint (calculations) | ✅ Works |
+| `/v1/query` endpoint (Q&A) | ❌ Requires RAG |
+| `/v1/optimize` endpoint | ❌ Requires RAG |
+| Transaction classification | ⚠️ Basic patterns only |
+## Recommendation for Production
+**For auth-backend integration:**
+Since you mainly need transaction classification and tax calculations (not Q&A), you have two options:
+### Option A: Disable RAG, Use Basic Classification
+```bash
+# In HF Space settings
+DISABLE_RAG=true
+```
+Transaction classification will fall back to pattern matching (salary, pension, rent, etc.) without an LLM.
+### Option B: Wait for Model Download
+Keep restarting the Space until the model downloads successfully. This gives you the full optimization features.
+## Testing After Fix
+### Test 1: Health Check
+```bash
+curl https://your-space.hf.space/health
+```
+Expected with RAG disabled:
+```json
+{
+  "status": "ok",
+  "rag_ready": false
+}
+```
+### Test 2: Tax Calculation (Should Work)
+```bash
+curl -X POST https://your-space.hf.space/v1/query \
+  -H "Content-Type: application/json" \
+  -d '{
+    "question": "Calculate PIT for 5000000 annual income",
+    "tax_type": "PIT"
+  }'
+```
+### Test 3: Tax Optimization (Requires RAG)
+```bash
+curl -X POST https://your-space.hf.space/v1/optimize \
+  -H "Content-Type: application/json" \
+  -d '{
+    "user_id": "test",
+    "transactions": [...],
+    "tax_year": 2025
+  }'
+```
+Returns 503 if RAG is disabled.
+## Next Steps
+1. **For immediate use:** Set `DISABLE_RAG=true`
+2. **For full features:** Wait for model download or use Option 4 (pre-build)
+3. **For production:** Consider upgraded Space hardware or persistent storage for more resources
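
Note on EMBEDDING_MODEL_FIX.md: the health check and calculation check under "Testing After Fix" can also be scripted. The following is a minimal sketch, not part of this commit. It assumes the placeholder base URL from the curl examples and the `rag_ready` field shown in the expected `/health` response; the helper names (`build_request`, `rag_is_ready`, `fetch`, `run_smoke_test`) are hypothetical.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://your-space.hf.space"  # placeholder taken from the curl examples


def build_request(path, payload=None):
    """Build a GET request (no payload) or a JSON POST request for the Space."""
    url = BASE_URL.rstrip("/") + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def rag_is_ready(health_body):
    """Interpret the /health JSON body; a missing field counts as not ready."""
    return bool(json.loads(health_body).get("rag_ready", False))


def fetch(request, timeout=30):
    """Send the request and return (status_code, body_text), even for HTTP errors."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")


def run_smoke_test():
    """Run Test 1 (health) and Test 2 (calculation) and print the outcomes."""
    status, body = fetch(build_request("/health"))
    print("health:", status, "rag_ready =", rag_is_ready(body))
    calc = {"question": "Calculate PIT for 5000000 annual income", "tax_type": "PIT"}
    status, _ = fetch(build_request("/v1/query", calc))
    print("calculation:", status)
```

Calling `run_smoke_test()` after setting `BASE_URL` to the real Space URL performs both checks. The optimization check is left out because its `transactions` payload is elided in the document.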

orchestrator.py CHANGED

@@ -38,6 +38,24 @@ GROQ_MODEL = "llama-3.1-8b-instant"
 # Use /tmp for vector store in Hugging Face Spaces (writable directory)
 VECTOR_STORE_DIR = os.getenv('VECTOR_STORE_DIR', '/tmp/vector_store')
+# Allow disabling RAG entirely for resource-constrained environments
+DISABLE_RAG = os.getenv('DISABLE_RAG', 'false').lower() in ('true', '1', 'yes')
+# Pre-download embedding model to cache
+def _ensure_embedding_model_cached(model_name: str) -> bool:
+    """Pre-download embedding model to avoid runtime errors"""
+    try:
+        from sentence_transformers import SentenceTransformer
+        print(f"[INFO] Pre-downloading embedding model: {model_name}", file=sys.stderr)
+        model = SentenceTransformer(model_name)
+        print(f"[INFO] Embedding model cached successfully", file=sys.stderr)
+        return True
+    except Exception as e:
+        print(f"[WARN] Failed to cache embedding model: {e}", file=sys.stderr)
+        print(f"[INFO] This is common in Hugging Face Spaces with limited disk space", file=sys.stderr)
+        print(f"[INFO] Set DISABLE_RAG=true to skip RAG initialization", file=sys.stderr)
+        return False
 CALC_KEYWORDS = {
     "compute", "calculate", "calc", "how much tax", "tax due", "paye", "cit", "vat to pay",
     "what will i pay", "liability", "estimate", "breakdown", "net pay", "withholding"
@@ -264,31 +282,40 @@ class Orchestrator:
         # RAG
         rag = None
-        try:
-            src = Path(PDF_SOURCE)
-            # Use writable directory for Hugging Face Spaces
-            vector_store_path = Path(VECTOR_STORE_DIR)
-            # Create directory if it doesn't exist and is writable
+        if DISABLE_RAG:
+            print(f"[INFO] RAG disabled via DISABLE_RAG environment variable", file=sys.stderr)
+        else:
             try:
-                vector_store_path.mkdir(parents=True, exist_ok=True)
-            except (PermissionError, OSError) as mkdir_err:
-                print(f"[WARN] Cannot create vector_store directory: {mkdir_err}", file=sys.stderr)
-                print(f"[INFO] RAG will be disabled. Tax calculations will still work.", file=sys.stderr)
-                raise
-            ds = DocumentStore(persist_dir=vector_store_path, embedding_model=EMBED_MODEL)
-            pdfs = ds.discover_pdfs(src)
-            if not pdfs:
-                print(f"[WARN] No PDFs found under {src}. RAG disabled.", file=sys.stderr)
-                raise FileNotFoundError(f"No PDFs found under {src}")
-            ds.build_vector_store(pdfs, force_rebuild=False)
-            # RAGPipeline reads GROQ_API_KEY from env via langchain_groq; ensure .env loaded
-            rag = RAGPipeline(doc_store=ds, model=GROQ_MODEL, temperature=0.1)
-            print("[INFO] RAG pipeline initialized successfully", file=sys.stderr)
-        except Exception as e:
-            print(f"[WARN] RAG not initialized: {e}", file=sys.stderr)
-            print(f"[INFO] Service will continue without RAG. Tax calculations available.", file=sys.stderr)
+                # Pre-download embedding model
+                if not _ensure_embedding_model_cached(EMBED_MODEL):
+                    print(f"[WARN] Embedding model not available. RAG disabled.", file=sys.stderr)
+                    raise RuntimeError("Embedding model unavailable")
+                src = Path(PDF_SOURCE)
+                # Use writable directory for Hugging Face Spaces
+                vector_store_path = Path(VECTOR_STORE_DIR)
+                # Create directory if it doesn't exist and is writable
+                try:
+                    vector_store_path.mkdir(parents=True, exist_ok=True)
+                except (PermissionError, OSError) as mkdir_err:
+                    print(f"[WARN] Cannot create vector_store directory: {mkdir_err}", file=sys.stderr)
+                    print(f"[INFO] RAG will be disabled. Tax calculations will still work.", file=sys.stderr)
+                    raise
+                ds = DocumentStore(persist_dir=vector_store_path, embedding_model=EMBED_MODEL)
+                pdfs = ds.discover_pdfs(src)
+                if not pdfs:
+                    print(f"[WARN] No PDFs found under {src}. RAG disabled.", file=sys.stderr)
+                    raise FileNotFoundError(f"No PDFs found under {src}")
+                ds.build_vector_store(pdfs, force_rebuild=False)
+                # RAGPipeline reads GROQ_API_KEY from env via langchain_groq; ensure .env loaded
+                rag = RAGPipeline(doc_store=ds, model=GROQ_MODEL, temperature=0.1)
+                print("[INFO] RAG pipeline initialized successfully", file=sys.stderr)
+            except Exception as e:
+                print(f"[WARN] RAG not initialized: {e}", file=sys.stderr)
+                print(f"[INFO] Service will continue without RAG. Tax calculations available.", file=sys.stderr)
         # Tax Optimizer
         optimizer = None
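
Note on orchestrator.py: Option 2 in EMBEDDING_MODEL_FIX.md relies on restarting the Space until the download succeeds. One possible alternative is to retry inside the process before falling back to running without RAG. The following is a minimal sketch, not part of this commit. `retry_with_backoff` and its parameters are hypothetical, and the loader is passed in as a callable so the helper itself does not depend on `sentence_transformers`.

```python
import sys
import time


def retry_with_backoff(load, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call load() up to `attempts` times, doubling the delay after each failure.

    Returns True on the first success and False if every attempt raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            load()
            return True
        except Exception as exc:  # broad on purpose, mirroring the handler in the commit
            print(f"[WARN] Attempt {attempt}/{attempts} failed: {exc}", file=sys.stderr)
            if attempt < attempts:
                # Wait base_delay, then 2x, 4x, ... before the next attempt
                sleep(base_delay * 2 ** (attempt - 1))
    return False
```

Under these assumptions, `_ensure_embedding_model_cached` could wrap its download as `retry_with_backoff(lambda: SentenceTransformer(model_name))`. Startup would take longer on failure, but a transient network error would no longer need a manual restart.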