Embedding Model Issue - Solutions
Problem
[WARN] RAG not initialized: Can't load the model for 'sentence-transformers/all-MiniLM-L6-v2'
This happens because:
- Hugging Face Spaces has limited disk space (~50GB)
- The embedding model needs to download (~400MB)
- First-time downloads can fail due to network/space issues
Solutions
Option 1: Disable RAG (Recommended for Now)
Set environment variable in Hugging Face Space settings:
DISABLE_RAG=true
Result:
- ✅ Service starts immediately
- ✅ Tax calculations work perfectly
- ❌ Tax optimization unavailable (requires RAG)
- ❌ Tax Q&A unavailable (requires RAG)
Use this if: You only need tax calculations, not optimization recommendations.
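As a rough illustration of how a service could read this flag at startup, here is a minimal sketch. The helper name `rag_disabled` and the accepted truthy spellings are assumptions for illustration, not taken from the actual code.

```python
import os


def rag_disabled(env=None):
    """Return True when DISABLE_RAG is set to a truthy value.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    The accepted spellings below are an assumption, not the service's real rule.
    """
    env = os.environ if env is None else env
    value = env.get("DISABLE_RAG", "")
    return value.strip().lower() in {"1", "true", "yes", "on"}
```

Accepting a few spellings avoids surprises when the value is entered as `True` or `1` in the Space settings.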
Option 2: Wait for Model Download
The download may succeed on a later restart; it can take 2-3 attempts.
Steps:
- Don't set DISABLE_RAG
- Restart the Space multiple times
- Check logs for:
[INFO] Embedding model cached successfully
Use this if: You need full tax optimization features.
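Instead of restarting the Space by hand, the load could also be retried inside the process. The sketch below is one possible approach, not the service's actual behavior: `load_with_retries` is a hypothetical helper, and it assumes a failed model load raises `OSError`.

```python
import time


def load_with_retries(loader, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call `loader()` up to `attempts` times with exponential backoff.

    `loader` is any zero-argument callable that returns the model.
    Assumption: a failed download or load surfaces as OSError.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return loader()
        except OSError as exc:
            last_error = exc
            if attempt < attempts:
                # Wait base_delay, then 2x, 4x, ... before the next attempt.
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"model load failed after {attempts} attempts") from last_error


# Possible usage (requires sentence-transformers and network access):
# from sentence_transformers import SentenceTransformer
# model = load_with_retries(
#     lambda: SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# )
```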
Option 3: Use Smaller Embedding Model
Change in orchestrator.py:
# Instead of:
EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
# Use:
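EMBED_MODEL = "sentence-transformers/paraphrase-MiniLM-L3-v2" # Smaller, faster (3 layers instead of 6)
# Note: rebuild any existing vector index after switching models; embeddings from different models are not interchangeable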
Option 4: Pre-build Docker Image with Model
Add to Dockerfile:
# After RUN pip install...
RUN python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"
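This downloads the model at build time, so it is baked into the image and does not need to be fetched at startup, provided the runtime reads from the same cache directory.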
Current Status
The code now:
- ✅ Tries to pre-download the model
- ✅ Provides clear error messages
- ✅ Continues without RAG if model fails
- ✅ Supports the DISABLE_RAG environment variable
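The fallback behavior listed above can be sketched roughly as follows. This is a hypothetical outline rather than the code in orchestrator.py: `init_rag`, its parameters, and the returned tuple are illustrative names, and only the `[WARN]` message format is taken from the log line shown in the Problem section.

```python
def init_rag(loader, disabled=False, log=print):
    """Try to load the embedding model; never raise.

    Returns (model, rag_ready). `loader` is a zero-argument callable that
    returns the model. All names here are illustrative assumptions.
    """
    if disabled:
        # RAG was switched off explicitly, so skip the download entirely.
        return None, False
    try:
        model = loader()
    except Exception as exc:
        # Log and continue so calculation endpoints stay available.
        log(f"[WARN] RAG not initialized: {exc}")
        return None, False
    return model, True
```

The `rag_ready` value returned here corresponds to the field of the same name in the health check response.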
What Works Without RAG
| Feature | Status |
|---|---|
| Tax Calculations (PIT, CIT, VAT) | ✅ Works |
| Tax Rules Engine | ✅ Works |
| /v1/query endpoint (calculations) | ✅ Works |
| /v1/query endpoint (Q&A) | ❌ Requires RAG |
| /v1/optimize endpoint | ❌ Requires RAG |
| Transaction classification | ⚠️ Basic patterns only |
Recommendation for Production
For auth-backend integration:
Since you mainly need transaction classification and tax calculations (not Q&A), you have two options:
Option A: Disable RAG, Use Basic Classification
# In HF Space settings
DISABLE_RAG=true
Transaction classification will use pattern matching (salary, pension, rent, etc.) without LLM.
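To give a sense of what pattern-based classification looks like, here is a minimal sketch. The category names come from the examples above; the regular expressions, the `classify` function, and the "uncategorized" fallback are assumptions, not the service's real rules.

```python
import re

# Hypothetical keyword patterns; the real service's rules may differ.
PATTERNS = [
    ("salary", re.compile(r"\b(salary|payroll|wages?)\b", re.IGNORECASE)),
    ("pension", re.compile(r"\bpension\b", re.IGNORECASE)),
    ("rent", re.compile(r"\b(rent|lease)\b", re.IGNORECASE)),
]


def classify(description):
    """Return the first category whose pattern matches the description."""
    for category, pattern in PATTERNS:
        if pattern.search(description):
            return category
    return "uncategorized"
```

Matching on whole words keeps a description such as "Current account fee" from being classified as rent.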
Option B: Wait for Model Download
Restart the Space until the model downloads successfully. This gives you the full optimization features.
Testing After Fix
Test 1: Health Check
curl https://your-space.hf.space/health
Expected with RAG disabled:
{
"status": "ok",
"rag_ready": false
}
Test 2: Tax Calculation (Should Work)
curl -X POST https://your-space.hf.space/v1/query \
-H "Content-Type: application/json" \
-d '{
"question": "Calculate PIT for 5000000 annual income",
"tax_type": "PIT"
}'
Test 3: Tax Optimization (Requires RAG)
curl -X POST https://your-space.hf.space/v1/optimize \
-H "Content-Type: application/json" \
-d '{
"user_id": "test",
"transactions": [...],
"tax_year": 2025
}'
Returns 503 if RAG is disabled.
Next Steps
- For immediate use: Set DISABLE_RAG=true
- For full features: Wait for model download or use Option 4 (pre-build)
- For production: Consider upgrading to Hugging Face Spaces Pro for more resources