
Cache Location Mismatch - The Real Problem

What You Observed

Build Logs (✅ Success)

--> RUN python -c "from sentence_transformers import SentenceTransformer; ..."
Downloading embedding model...
Model cached successfully
DONE 5.5s

Runtime Logs (❌ Failure)

[INFO] Pre-downloading embedding model: sentence-transformers/all-MiniLM-L6-v2
No sentence-transformers model found with name sentence-transformers/all-MiniLM-L6-v2
[WARN] Failed to cache embedding model: Can't load the model...

The Problem

The model downloads successfully during build but can't be found at runtime. This is a cache location mismatch.

What Happens

  1. During the Docker build (as the root user):

    • Model downloads to: /root/.cache/huggingface/
    • Build succeeds ✅
  2. At runtime (as a different, non-root user):

    • App looks for the model in its own cache (e.g. /home/user/.cache/huggingface/), which is empty
    • /root/.cache/ is unreadable anyway (permission denied)
    • Falls back to downloading from Hugging Face
    • Download fails (network/space constraints)
    • RAG disabled ❌

Why This Happens

Hugging Face Spaces runs the container as different users at build time and at runtime:

  • Build time: root user
  • Runtime: non-root user (for security)

Default cache locations:

  • sentence-transformers: ~/.cache/torch/sentence_transformers/
  • transformers: ~/.cache/huggingface/
  • ~ resolves to a different path for each user (see the sketch below)
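
A quick way to see the mismatch (a standalone sketch, not part of the fix): ~ expands against the current user's home directory, so the same default setting points at different locations during build and at runtime.

import os

# As root this prints /root/.cache/huggingface;
# as a non-root user it prints e.g. /home/user/.cache/huggingface
print(os.path.expanduser("~/.cache/huggingface"))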

The Solution

Set Explicit Cache Directories

Use /app/.cache/, which is accessible to both the build user and the runtime user.

Implementation

1. Dockerfile Changes

# Set cache environment variables (accessible location)
ENV TRANSFORMERS_CACHE=/app/.cache/huggingface \
    HF_HOME=/app/.cache/huggingface \
    SENTENCE_TRANSFORMERS_HOME=/app/.cache/sentence-transformers

# Create directories with proper permissions
RUN mkdir -p /app/.cache/huggingface /app/.cache/sentence-transformers \
    && chmod -R 777 /app/.cache

# Download model to /app/.cache (not /root/.cache)
RUN python -c "from sentence_transformers import SentenceTransformer; \
    SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"
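
Optionally, a short script run at the end of the build can fail fast if the weights didn't land under /app/.cache. This is a sketch, not part of the current Dockerfile, and check_cache.py is a hypothetical name (it would be invoked with RUN python check_cache.py):

# check_cache.py (hypothetical) - fail the build if the cache is empty
import pathlib
import sys

roots = [
    pathlib.Path("/app/.cache/sentence-transformers"),
    pathlib.Path("/app/.cache/huggingface"),
]
weights = [p for root in roots for p in root.rglob("*.safetensors")]
if not weights:
    sys.exit("No model weights under /app/.cache - build-time download failed")
print(f"Found {len(weights)} weight file(s) in /app/.cache")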

2. orchestrator.py Changes

# Ensure runtime uses the same cache directories as the build.
# These must be set before importing transformers / sentence_transformers,
# which read the variables at import time.
import os

os.environ.setdefault('TRANSFORMERS_CACHE', '/app/.cache/huggingface')
os.environ.setdefault('HF_HOME', '/app/.cache/huggingface')
os.environ.setdefault('SENTENCE_TRANSFORMERS_HOME', '/app/.cache/sentence-transformers')
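
Ordering matters here: if the libraries are imported first, they have already resolved their cache paths. A minimal sketch of the pre-download step these log messages come from (a reconstruction, not the actual orchestrator.py code):

import logging
import os

# Set before the import below - the library reads this at import time
os.environ.setdefault('SENTENCE_TRANSFORMERS_HOME', '/app/.cache/sentence-transformers')

from sentence_transformers import SentenceTransformer  # noqa: E402

EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
log = logging.getLogger(__name__)

def preload_embedding_model() -> bool:
    log.info("Pre-downloading embedding model: %s", EMBED_MODEL)
    try:
        SentenceTransformer(EMBED_MODEL)  # resolves from /app/.cache, no network needed
        log.info("Embedding model cached successfully")
        return True
    except Exception as exc:
        log.warning("Failed to cache embedding model: %s", exc)
        return False  # caller disables RAG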

How This Fixes It

Before (❌ Broken)

Build:   /root/.cache/huggingface/  ← Model downloaded here
Runtime: /home/user/.cache/huggingface/  ← Looking here (empty!)
Result:  Model not found, download fails

After (✅ Fixed)

Build:   /app/.cache/huggingface/  ← Model downloaded here
Runtime: /app/.cache/huggingface/  ← Looking here (found!)
Result:  Model loaded successfully

Expected Logs After Fix

Build Logs

--> RUN python -c "from sentence_transformers import SentenceTransformer; ..."
Downloading embedding model...
Downloading (…)ce_transformers_config.json: 100%|██████████| 116/116
Downloading (…)_Pooling/config.json: 100%|██████████| 190/190
Downloading (…)b52ce780/config.json: 100%|██████████| 612/612
Downloading model.safetensors: 100%|██████████| 90.9M/90.9M
Model cached successfully
DONE 5.5s

Runtime Logs

[INFO] Pre-downloading embedding model: sentence-transformers/all-MiniLM-L6-v2
[INFO] Embedding model cached successfully  ← No download, uses cached model
[INFO] RAG pipeline initialized successfully
[INFO] Tax Optimizer initialized successfully
INFO:     Application startup complete.

Why Previous Attempt Failed

Your first fix downloaded the model during the build but didn't set the cache location:

  • ✅ Model downloaded
  • ❌ Downloaded to /root/.cache/
  • ❌ Runtime couldn't access it
  • ❌ Tried to re-download, failed

Deploy the Fix

# Stage changes
git add Dockerfile orchestrator.py CACHE_FIX_EXPLAINED.md QUICK_FIX.md

# Commit
git commit -m "Fix: Set explicit cache directories for embedding model

- Set TRANSFORMERS_CACHE, HF_HOME, SENTENCE_TRANSFORMERS_HOME to /app/.cache
- Create cache directories with proper permissions
- Ensure build and runtime use same cache location
- Fixes model not found error at runtime"

# Push to Hugging Face
git push

Verification

After deployment, check logs for:

✅ Success Indicators

[INFO] Embedding model cached successfully
[INFO] RAG pipeline initialized successfully
[INFO] Tax Optimizer initialized successfully

❌ Failure Indicators (if still broken)

No sentence-transformers model found with name...
[WARN] Failed to cache embedding model...
[WARN] RAG not initialized...
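
To prove locally that the cache is self-contained, force offline mode and load the model; HF_HUB_OFFLINE is the Hub's standard offline switch, and the load will raise instead of silently re-downloading (a local sanity check, not part of the deployed app):

import os

# Forbid network access to the Hub - the load must come from the local cache
os.environ["HF_HUB_OFFLINE"] = "1"

from sentence_transformers import SentenceTransformer  # noqa: E402

# Raises if the model is not fully present in the cache
SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
print("Model loaded from local cache - no download attempted")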

Alternative Solutions (If This Still Fails)

Option 1: Use Hugging Face Hub API

Instead of local model, use Hugging Face Inference API:

import os

from langchain_huggingface import HuggingFaceEndpointEmbeddings

embeddings = HuggingFaceEndpointEmbeddings(
    model="sentence-transformers/all-MiniLM-L6-v2",
    huggingfacehub_api_token=os.getenv("HF_TOKEN"),
)
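
A quick smoke test of the endpoint (assuming HF_TOKEN is set in the Space secrets):

# Embeddings are computed on Hugging Face's servers - nothing is cached locally
vector = embeddings.embed_query("What is the capital gains tax rate?")
print(len(vector))  # 384 dimensions for all-MiniLM-L6-v2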

Option 2: Use Smaller Model

EMBED_MODEL = "sentence-transformers/paraphrase-MiniLM-L3-v2"  # ~61MB vs ~91MB for all-MiniLM-L6-v2
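
Assuming EMBED_MODEL is the name orchestrator.py hands to SentenceTransformer (the surrounding code here is illustrative), the swap is a one-line change:

from sentence_transformers import SentenceTransformer

EMBED_MODEL = "sentence-transformers/paraphrase-MiniLM-L3-v2"
model = SentenceTransformer(EMBED_MODEL)  # smaller download, slightly lower retrieval quality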

Option 3: Disable RAG

# In HF Space settings
DISABLE_RAG=true
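
For the flag to take effect, orchestrator.py has to check it before touching any model. A minimal sketch, assuming a guard like this at startup (init_rag_pipeline is a hypothetical initializer):

import os

# Skip all embedding/RAG setup when the flag is set in the Space settings
if os.getenv("DISABLE_RAG", "").lower() in ("1", "true", "yes"):
    rag_enabled = False
else:
    rag_enabled = init_rag_pipeline()  # hypothetical initializer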

Technical Details

Environment Variables Used

Variable                     Purpose                       Value
TRANSFORMERS_CACHE           Transformers library cache    /app/.cache/huggingface
HF_HOME                      Hugging Face Hub cache        /app/.cache/huggingface
SENTENCE_TRANSFORMERS_HOME   Sentence Transformers cache   /app/.cache/sentence-transformers

Why /app/.cache/?

  • /app/ is the WORKDIR in Dockerfile
  • Accessible to all users in container
  • Persists across build and runtime
  • Can set permissions explicitly

Why chmod -R 777?

  • Ensures all users can read and write the cache
  • Necessary for the non-root runtime user
  • Acceptable in a single-purpose container environment
  • Alternative: use chown to assign the directory to a specific user

Summary

Problem: Model cached in user-specific directory during build, inaccessible at runtime
Solution: Use shared /app/.cache/ directory for both build and runtime
Result: Model loads from the local cache at runtime, no re-download needed

This is a common issue in Docker deployments with multi-stage builds or user switching.