Kaanta / EMBEDDING_MODEL_FIX.md

Oluwaferanmi

This is the latest change

143273c 2 months ago

Embedding Model Issue - Solutions

Problem

[WARN] RAG not initialized: Can't load the model for 'sentence-transformers/all-MiniLM-L6-v2'

This happens because:

Hugging Face Spaces has limited disk space (~50GB)
The embedding model has to be downloaded on first use (roughly 90MB of weights for all-MiniLM-L6-v2)
First-time downloads can fail due to network or disk-space issues

Solutions

Option 1: Disable RAG (Recommended for Now)

Set this environment variable in the Hugging Face Space settings:

DISABLE_RAG=true

Result:

✅ Service starts immediately
✅ Tax calculations work perfectly
❌ Tax optimization unavailable (requires RAG)
❌ Tax Q&A unavailable (requires RAG)

Use this if: You only need tax calculations, not optimization recommendations.
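For illustration, here is a minimal sketch of how a service could read this flag at startup. The helper name `rag_disabled` and the set of accepted spellings are assumptions; the actual parsing in the Kaanta code may differ.

```python
import os

# Assumed set of "truthy" spellings; the real service may accept fewer.
_TRUTHY = {"1", "true", "yes", "on"}


def rag_disabled(environ=None):
    """Return True when DISABLE_RAG is set to a truthy value.

    `environ` defaults to os.environ; passing a dict makes this easy to test.
    """
    env = os.environ if environ is None else environ
    return env.get("DISABLE_RAG", "").strip().lower() in _TRUTHY
```

With this sketch, `DISABLE_RAG=true` and `DISABLE_RAG=TRUE` both disable RAG, while an unset or empty variable leaves it enabled.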

Option 2: Wait for Model Download

The download often succeeds on a later restart. It may take 2-3 tries.

Steps:

Leave DISABLE_RAG unset
Restart the Space until the download succeeds
Check the logs for: [INFO] Embedding model cached successfully

Use this if: You need full tax optimization features.
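Instead of restarting the Space by hand, the same retry idea could be applied inside the process. The sketch below is one possible approach, not the project's actual code: `load_with_retries`, its parameters, and the log wording are hypothetical, and `loader` stands for whatever callable constructs the embedding model.

```python
import time


def load_with_retries(loader, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call loader() until it succeeds or the attempts run out.

    Waits base_delay, 2 * base_delay, ... seconds between failed attempts
    and re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return loader()
        except Exception as exc:  # download failures surface as varied error types
            # Illustrative log line, not the service's real log format.
            print(f"[WARN] model load failed (attempt {attempt}/{attempts}): {exc}")
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A call might look like `load_with_retries(lambda: SentenceTransformer(EMBED_MODEL))`, assuming `EMBED_MODEL` is the constant from orchestrator.py.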

Option 3: Use Smaller Embedding Model

Change in orchestrator.py:

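# Instead of:
EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"

# Use a model with fewer layers and the same 384-dimensional output:
EMBED_MODEL = "sentence-transformers/paraphrase-MiniLM-L3-v2"  # 3 layers instead of 6: smaller, faster
# Rebuild any existing vector index after switching; embeddings from different models are not comparable.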

Option 4: Pre-build Docker Image with Model

Add to Dockerfile:

# After RUN pip install...
RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"

This downloads the model at build time, so it is already in the image when the Space starts.
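To check that the model really is in the image, one option is to look for it in the Hugging Face hub cache. The sketch below assumes the standard hub cache layout (`models--<org>--<name>/snapshots/...` under `HF_HUB_CACHE` or `$HF_HOME/hub`), a sentence-transformers version that stores models there, and no `SENTENCE_TRANSFORMERS_HOME` override. The helper names are hypothetical. If the image builds as one user and runs as another, the two may resolve different cache directories.

```python
import os
from pathlib import Path


def hub_cache_dir(environ=None) -> Path:
    """Resolve the hub cache directory from HF_HUB_CACHE or HF_HOME."""
    env = os.environ if environ is None else environ
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"])
    hf_home = env.get("HF_HOME") or str(Path.home() / ".cache" / "huggingface")
    return Path(hf_home) / "hub"


def is_model_cached(model_id: str, cache_dir: Path) -> bool:
    """Return True if at least one snapshot of model_id exists in cache_dir."""
    repo_dir = cache_dir / ("models--" + model_id.replace("/", "--"))
    snapshots = repo_dir / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())
```

For example, `is_model_cached("sentence-transformers/all-MiniLM-L6-v2", hub_cache_dir())` could run as a final build step or at startup, before the service attempts a download.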

Current Status

The code now:

✅ Tries to pre-download the model
✅ Provides clear error messages
✅ Continues without RAG if model fails
✅ Supports DISABLE_RAG environment variable
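The fallback behaviour listed above can be sketched roughly as follows. This illustrates the pattern, not the code in orchestrator.py: `RagState`, `init_rag`, and `build_retriever` are hypothetical names, and the error strings are placeholders.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class RagState:
    """Holds the retriever if RAG came up, otherwise the reason it did not."""
    retriever: Optional[Any] = None
    error: Optional[str] = None

    @property
    def rag_ready(self) -> bool:
        return self.retriever is not None


def init_rag(build_retriever: Callable[[], Any], disabled: bool = False) -> RagState:
    """Try to build the retriever; never raise, so the service can still start."""
    if disabled:
        return RagState(error="RAG disabled by configuration")
    try:
        return RagState(retriever=build_retriever())
    except Exception as exc:  # e.g. the embedding model failed to download
        return RagState(error=f"RAG not initialized: {exc}")
```

Keeping the failure reason next to the `rag_ready` flag lets a health endpoint report the state without the calculation endpoints depending on it.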

What Works Without RAG

| Feature | Status |
| --- | --- |
| Tax Calculations (PIT, CIT, VAT) | ✅ Works |
| Tax Rules Engine | ✅ Works |
| `/v1/query` endpoint (calculations) | ✅ Works |
| `/v1/query` endpoint (Q&A) | ❌ Requires RAG |
| `/v1/optimize` endpoint | ❌ Requires RAG |
| Transaction classification | ⚠️ Basic patterns only |

Recommendation for Production

For auth-backend integration:

Since you mainly need transaction classification and tax calculations (not Q&A), you have two options:

Option A: Disable RAG, Use Basic Classification

# In HF Space settings
DISABLE_RAG=true

Transaction classification will use pattern matching (salary, pension, rent, etc.) without an LLM.
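As a rough idea of what pattern-based classification looks like, here is a minimal sketch. The categories salary, pension, and rent come from the text above; the regular expressions, the `uncategorized` fallback, and the function name are illustrative assumptions, not the service's actual rules.

```python
import re

# Hypothetical keyword table: first matching pattern wins.
PATTERNS = [
    ("salary", re.compile(r"\b(salary|payroll|wages?)\b", re.IGNORECASE)),
    ("pension", re.compile(r"\b(pension|pfa)\b", re.IGNORECASE)),
    ("rent", re.compile(r"\b(rent|lease)\b", re.IGNORECASE)),
]


def classify(description: str) -> str:
    """Return the first category whose pattern matches the description."""
    for category, pattern in PATTERNS:
        if pattern.search(description):
            return category
    return "uncategorized"
```

The word boundaries keep "rent" from matching inside words such as "current", which is the kind of false positive simple substring checks tend to produce.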

Option B: Wait for Model Download

Restart the Space until the model downloads successfully. This gives you full optimization features.

Testing After Fix

Test 1: Health Check

curl https://your-space.hf.space/health

Expected with RAG disabled:

{
  "status": "ok",
  "rag_ready": false
}
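The same check can be scripted, for example from the auth-backend. This sketch uses only the standard library and assumes the response has exactly the `status` and `rag_ready` fields shown above. The function names are hypothetical.

```python
import json
import urllib.request


def parse_health(body: str) -> bool:
    """Return the rag_ready flag from a /health response body."""
    payload = json.loads(body)
    if payload.get("status") != "ok":
        raise RuntimeError(f"service unhealthy: {payload!r}")
    return bool(payload.get("rag_ready", False))


def fetch_health(base_url: str, timeout: float = 10.0) -> bool:
    """GET <base_url>/health and return whether RAG is ready."""
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

A caller could use `fetch_health("https://your-space.hf.space")` to decide whether to send requests to `/v1/optimize` or skip them while RAG is unavailable.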

Test 2: Tax Calculation (Should Work)

curl -X POST https://your-space.hf.space/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Calculate PIT for 5000000 annual income",
    "tax_type": "PIT"
  }'

Test 3: Tax Optimization (Requires RAG)

curl -X POST https://your-space.hf.space/v1/optimize \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "test",
    "transactions": [...],
    "tax_year": 2025
  }'

This returns 503 if RAG is disabled.

Next Steps

For immediate use: Set DISABLE_RAG=true
For full features: Wait for model download or use Option 4 (pre-build)
For production: Consider upgrading to Hugging Face Spaces Pro for more resources