Aqs-shispare committed
Commit ce55e28 · 1 Parent(s): 8a91479

rag service

RAG_REFACTOR.md ADDED
@@ -0,0 +1,196 @@
# RAG System Refactoring - Workspace-Scoped with Supabase pgvector

## Overview

The RAG system has been refactored to use Supabase pgvector as the only vector store, with workspace-scoped indexing and querying. This ensures that:

1. ✅ Only user workspace code is indexed (not extension source code)
2. ✅ Each workspace is isolated using `workspace_id`
3. ✅ All embeddings are stored in Supabase (cloud-only, no local vector DBs)
4. ✅ Incremental indexing on file save/create/delete
5. ✅ Free-tier friendly (efficient queries, no background loops)

## Architecture Changes

### Backend

#### New RAG Service (`rag_service_supabase.py`)
- **Stateless**: No local storage, all data in Supabase
- **Workspace-scoped**: All operations require `workspace_id`
- **Supabase pgvector**: Uses PostgreSQL with the pgvector extension
- **Embeddings**: HuggingFace `sentence-transformers/all-MiniLM-L6-v2` (384 dimensions)

#### Database Schema
```sql
CREATE TABLE code_embeddings (
    id UUID PRIMARY KEY,
    workspace_id TEXT NOT NULL,
    file_path TEXT NOT NULL,
    content TEXT NOT NULL,
    embedding vector(384),
    chunk_index INTEGER,
    total_chunks INTEGER,
    file_size INTEGER,
    created_at TIMESTAMPTZ,
    updated_at TIMESTAMPTZ
);
```
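For illustration, a row matching this schema could be assembled as in the sketch below. The `build_embedding_row` helper is hypothetical, not part of the service's actual API, and the supabase-py call shown in the comment is just one way such rows could be written.

```python
import uuid
from datetime import datetime, timezone

def build_embedding_row(workspace_id, file_path, content, embedding,
                        chunk_index, total_chunks):
    """Build one dict matching the code_embeddings schema above."""
    now = datetime.now(timezone.utc).isoformat()
    return {
        "id": str(uuid.uuid4()),
        "workspace_id": workspace_id,
        "file_path": file_path,
        "content": content,
        "embedding": embedding,  # list of 384 floats from the embedding model
        "chunk_index": chunk_index,
        "total_chunks": total_chunks,
        "file_size": len(content.encode("utf-8")),
        "created_at": now,
        "updated_at": now,
    }

# With supabase-py, rows like this could then be inserted via:
#   supabase.table("code_embeddings").insert(rows).execute()
```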
#### New API Endpoints
- `POST /api/rag/index/workspace` - Index workspace files
- `POST /api/rag/index/file` - Index a single file
- `DELETE /api/rag/index/file` - Delete file embeddings
- `POST /api/rag/query` - Query with workspace_id (updated)
- `GET /api/rag/stats?workspace_id=...` - Get stats for workspace (updated)

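As a sketch of how a client might call the query endpoint (the base URL and the synchronous `httpx` usage are assumptions for local development; the extension itself makes these requests from TypeScript):

```python
BASE_URL = "http://localhost:8000/api"  # assumed local dev address

def build_query_payload(workspace_id: str, query: str, max_chunks: int = 5) -> dict:
    """Payload shape expected by POST /api/rag/query."""
    return {
        "workspace_id": workspace_id,
        "query": query,
        "max_chunks": max_chunks,
    }

def rag_query(workspace_id: str, query: str) -> str:
    # httpx is already a backend dependency (see requirements.txt)
    import httpx
    resp = httpx.post(f"{BASE_URL}/rag/query",
                      json=build_query_payload(workspace_id, query))
    resp.raise_for_status()
    return resp.json()["context"]
```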
### Frontend (VS Code Extension)

#### Workspace Detection
- Detects active workspace on extension activation
- Generates stable `workspace_id` (MD5 hash of workspace path)
- Automatically indexes workspace files on activation

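The same ID scheme can be sketched in Python; the 16-character truncation mirrors the hashing in the retired `rag_service.py`, but treat the exact length as an implementation detail:

```python
import hashlib

def workspace_id_for(path: str) -> str:
    """Stable workspace ID: MD5 of the workspace path.

    Deterministic, so the same folder always maps to the same rows
    in code_embeddings. Truncation to 16 hex chars is an assumption
    carried over from the old RAGService._hash_workspace_path.
    """
    return hashlib.md5(path.encode("utf-8")).hexdigest()[:16]
```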
#### File Event Handlers
- `onDidSaveTextDocument` → Updates embeddings for saved files
- `onDidCreateFiles` → Indexes new files
- `onDidDeleteFiles` → Deletes embeddings for deleted files

#### Updated Chat Flow
- All chat messages include `workspace_id`
- RAG queries are scoped to the current workspace
- Context retrieval is workspace-aware

## Setup Instructions

### 1. Run Supabase Migration

Execute the migration SQL in your Supabase SQL Editor:

```bash
# File: augmas-backend/supabase_migrations/001_create_code_embeddings.sql
```

This creates:
- `code_embeddings` table with pgvector support
- Indexes for efficient querying
- `match_code_embeddings` RPC function for vector similarity search

### 2. Update Environment Variables

Ensure your `.env` file has:
```env
SUPABASE_URL=https://your-project-id.supabase.co
SUPABASE_KEY=your-service-role-key
```

### 3. Install Dependencies

```bash
cd augmas-backend
pip install -r requirements.txt
```

New dependency: `numpy>=1.24.0,<2.0.0`

### 4. Restart Backend

The backend now initializes the new RAG service automatically:
```python
rag_service = RAGServiceSupabase()
```

## Usage

### Extension Activation

1. Open a workspace in VS Code
2. The extension automatically:
   - Detects the workspace
   - Generates a `workspace_id`
   - Scans and indexes all eligible files
   - Sets up file event handlers

### Incremental Indexing

- **File Save**: Automatically updates embeddings
- **File Create**: Automatically indexes new files
- **File Delete**: Automatically removes embeddings

### Chat with RAG

When you send a message in the chat:
1. The extension includes `workspace_id` in the request
2. The backend performs a vector similarity search scoped to the workspace
3. Relevant code chunks are injected into the LLM prompt
4. The response includes workspace-aware context

## Key Features

### Workspace Isolation
- Each workspace has a unique `workspace_id`
- All queries are filtered by `workspace_id`
- No cross-workspace contamination

### Efficient Vector Search
- Uses the Supabase RPC function `match_code_embeddings` for efficient pgvector queries
- Falls back to Python-based similarity if the RPC fails
- Optimized for free-tier limits (limited result sets)

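Since the embeddings are stored normalized, the Python fallback amounts to a brute-force cosine-similarity ranking, which reduces to a dot product. A minimal sketch with numpy follows (the function name and row shape are assumptions for illustration, not the service's actual internals):

```python
import numpy as np

def top_k_by_similarity(query_vec, rows, k=5):
    """Brute-force fallback: rank rows by cosine similarity to query_vec.

    rows: list of dicts each holding an 'embedding' key (list of floats).
    Embeddings are assumed L2-normalized, so a dot product equals
    cosine similarity.
    """
    if not rows:
        return []
    matrix = np.array([r["embedding"] for r in rows], dtype=np.float32)
    query = np.asarray(query_vec, dtype=np.float32)
    scores = matrix @ query                      # one similarity per row
    order = np.argsort(scores)[::-1][:k]         # indices of top-k scores
    return [(rows[i], float(scores[i])) for i in order]
```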
### Free-Tier Friendly
- No background reindex loops
- Incremental updates only
- Efficient batch operations
- Respects Supabase rate limits

### Stateless Backend
- No local vector stores
- No filesystem access
- All data in Supabase
- Horizontally scalable

## Migration from Old System

The old `RAGService` (using Qdrant) is no longer used. The new system:

1. **No migration needed**: Old Qdrant data is not migrated
2. **Fresh start**: Each workspace starts with an empty index
3. **Automatic indexing**: Files are indexed on first activation

## Troubleshooting

### "Supabase client not initialized"
- Check `SUPABASE_URL` and `SUPABASE_KEY` in `.env`
- Ensure you're using the **service_role** key (not the anon key)

### "RPC function not found"
- Run the migration SQL in the Supabase SQL Editor
- Ensure the `match_code_embeddings` function exists

### "No workspace detected"
- Open a folder in VS Code (File → Open Folder)
- The extension requires an active workspace folder

### Slow indexing
- Large workspaces may take time to index initially
- Subsequent updates are incremental and fast
- Check the Supabase dashboard for query performance

## Performance Considerations

### Free Tier Limits
- Supabase Free Tier: 500 MB database, 2 GB bandwidth
- Vector search is efficient but limited to ~1000 results per query
- Batch inserts are optimized to avoid rate limits

### Optimization Tips
1. Exclude large/minified files (already handled)
2. Use `.gitignore` patterns (the extension respects them)
3. Index only source files (not build artifacts)

## Future Enhancements

- [ ] Background indexing progress indicator
- [ ] Manual reindex command
- [ ] Index statistics in extension UI
- [ ] Support for multiple workspace folders
- [ ] Incremental chunk updates (not full file reindex)
api/main.py CHANGED
@@ -10,7 +10,7 @@ from utils.config import get_settings, get_environment
 from api.routes import router
 from services.langchain_service import LangChainService
 from services.memory_service import MemoryService
-from services.rag_service import RAGService
+from services.rag_service_supabase import RAGServiceSupabase
 
 # Get settings
 settings = get_settings()
@@ -18,7 +18,7 @@ settings = get_settings()
 # Service instances (will be initialized in lifespan)
 langchain_service: LangChainService = None
 memory_service: MemoryService = None
-rag_service: RAGService = None
+rag_service: RAGServiceSupabase = None
 
 
 @asynccontextmanager
@@ -49,19 +49,11 @@ async def lifespan(app: FastAPI):
         print(f"❌ Failed to initialize Memory service: {e}")
         raise
 
-    # Initialize RAG service
+    # Initialize RAG service (stateless, workspace-scoped)
     try:
-        rag_service = RAGService(
-            workspace_root=settings.workspace_root,
-            storage_path=settings.storage_path
-        )
+        rag_service = RAGServiceSupabase()
         await rag_service.initialize()
-        print("✅ RAG service initialized")
-
-        if not rag_service.is_ready() and settings.enable_rag:
-            print("📚 Starting background workspace indexing...")
-            import asyncio
-            asyncio.create_task(rag_service.index_workspace(show_progress=False))
+        print("✅ RAG service initialized (Supabase pgvector)")
     except Exception as e:
         print(f"⚠️ RAG service initialization warning: {e}")
 
api/routes.py CHANGED
@@ -6,7 +6,7 @@ import logging
 
 from services.langchain_service import LangChainService, CodeContext, FileContext
 from services.memory_service import MemoryService
-from services.rag_service import RAGService
+from services.rag_service_supabase import RAGServiceSupabase
 from auth.dependencies import get_current_user as get_current_user_id
 
 logger = logging.getLogger(__name__)
@@ -16,7 +16,7 @@ router = APIRouter()
 # Service instances (should be initialized in main.py and passed as dependencies)
 langchain_service: Optional[LangChainService] = None
 memory_service: Optional[MemoryService] = None
-rag_service: Optional[RAGService] = None
+rag_service: Optional[RAGServiceSupabase] = None
 
 
 def get_langchain_service(request: Request) -> LangChainService:
@@ -33,7 +33,7 @@ def get_memory_service(request: Request) -> MemoryService:
     return service
 
 
-def get_rag_service(request: Request) -> RAGService:
+def get_rag_service(request: Request) -> RAGServiceSupabase:
     service = getattr(request.app.state, 'rag_service', None)
     if service is None:
         raise HTTPException(status_code=500, detail="RAG service not initialized")
@@ -44,6 +44,7 @@ def get_rag_service(request: Request) -> RAGService:
 
 class ChatRequest(BaseModel):
     message: str
+    workspace_id: Optional[str] = None
     context: Optional[Dict[str, Any]] = None
     conversation_id: Optional[str] = None
     current_file: Optional[Dict[str, str]] = None
@@ -62,9 +63,26 @@ class FileReference(BaseModel):
 
 class RAGQueryRequest(BaseModel):
     query: str
+    workspace_id: str
     max_chunks: int = 5
 
 
+class IndexWorkspaceRequest(BaseModel):
+    workspace_id: str
+    files: List[Dict[str, str]]  # List of {path: str, content: str}
+
+
+class IndexFileRequest(BaseModel):
+    workspace_id: str
+    file_path: str
+    content: str
+
+
+class DeleteFileRequest(BaseModel):
+    workspace_id: str
+    file_path: str
+
+
 class ModelSwitchRequest(BaseModel):
     model_id: str
 
@@ -221,7 +239,7 @@ async def get_current_user(
 async def chat(
     request: ChatRequest,
     langchain: LangChainService = Depends(get_langchain_service),
-    rag: RAGService = Depends(get_rag_service)
+    rag: RAGServiceSupabase = Depends(get_rag_service)
 ):
     """Process chat message"""
     try:
@@ -247,10 +265,14 @@ async def chat(
         if file_context:
             context.referenced_files.append(file_context)
 
-        # Get RAG context
+        # Get RAG context (workspace-scoped)
         rag_context = ""
-        if rag.is_ready():
-            rag_context = await rag.get_relevant_context(request.message, max_chunks=5)
+        if request.workspace_id and rag.is_ready():
+            rag_context = await rag.get_relevant_context(
+                request.workspace_id,
+                request.message,
+                max_chunks=5
+            )
 
         # Process query
         response = await langchain.process_query(
@@ -273,9 +295,9 @@ async def chat(
 @router.post("/rag/query")
 async def rag_query(
     request: RAGQueryRequest,
-    rag: RAGService = Depends(get_rag_service)
+    rag: RAGServiceSupabase = Depends(get_rag_service)
 ):
-    """Query RAG for relevant context"""
+    """Query RAG for relevant context (workspace-scoped)"""
     try:
         if not rag.is_ready():
             raise HTTPException(
@@ -283,7 +305,11 @@ async def rag_query(
                 detail="RAG not ready. Please index workspace first."
             )
 
-        context = await rag.get_relevant_context(request.query, request.max_chunks)
+        context = await rag.get_relevant_context(
+            request.workspace_id,
+            request.query,
+            request.max_chunks
+        )
         return {"context": context}
 
     except HTTPException:
@@ -292,38 +318,66 @@ async def rag_query(
         raise HTTPException(status_code=500, detail=str(e))
 
 
-@router.post("/rag/index")
-async def index_workspace(rag: RAGService = Depends(get_rag_service)):
-    """Index the workspace"""
+@router.post("/rag/index/workspace")
+async def index_workspace(
+    request: IndexWorkspaceRequest,
+    rag: RAGServiceSupabase = Depends(get_rag_service)
+):
+    """Index workspace files"""
     try:
-        result = await rag.index_workspace(show_progress=False)
+        result = await rag.index_workspace(request.workspace_id, request.files)
         return result
     except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
 
 
-@router.post("/rag/reindex")
-async def reindex_workspace(rag: RAGService = Depends(get_rag_service)):
-    """Reindex the entire workspace"""
+@router.post("/rag/index/file")
+async def index_file(
+    request: IndexFileRequest,
+    rag: RAGServiceSupabase = Depends(get_rag_service)
+):
+    """Index a single file"""
     try:
-        result = await rag.reindex_workspace()
+        result = await rag.index_workspace(
+            request.workspace_id,
+            [{'path': request.file_path, 'content': request.content}]
+        )
+        return result
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+@router.delete("/rag/index/file")
+async def delete_file(
+    workspace_id: str,
+    file_path: str,
+    rag: RAGServiceSupabase = Depends(get_rag_service)
+):
+    """Delete embeddings for a file"""
+    try:
+        result = await rag.delete_file(workspace_id, file_path)
        return result
     except Exception as e:
         raise HTTPException(status_code=500, detail=str(e))
 
 
 @router.get("/rag/stats")
-async def get_rag_stats(rag: RAGService = Depends(get_rag_service)):
-    """Get RAG indexing statistics"""
-    return rag.get_index_stats()
+async def get_rag_stats(
+    workspace_id: str,
+    rag: RAGServiceSupabase = Depends(get_rag_service)
+):
+    """Get RAG indexing statistics for a workspace"""
+    try:
+        return await rag.get_index_stats(workspace_id)
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
 
 
 @router.get("/rag/status")
-async def get_rag_status(rag: RAGService = Depends(get_rag_service)):
+async def get_rag_status(rag: RAGServiceSupabase = Depends(get_rag_service)):
     """Get RAG service status"""
     return {
-        "ready": rag.is_ready(),
-        "files_indexed": rag.get_indexed_files_count()
+        "ready": rag.is_ready()
     }
 
requirements.txt CHANGED
@@ -32,3 +32,4 @@ passlib[bcrypt]==1.7.4
 httpx>=0.24.0,<0.26.0
 aiofiles==23.2.1
 colorama==0.4.6
+numpy>=1.24.0,<2.0.0
services/memory_service.py CHANGED
@@ -412,42 +412,6 @@ class MemoryService:
         except Exception as e:
             logger.error(f"Error updating chat last used: {e}")
 
-    def create_chat(self, user_id: str, title: str) -> str:
-        """Create chat (synchronous for compatibility)"""
-        if not self.client:
-            raise ValueError("Supabase client not initialized")
-
-        now = int(time.time() * 1000)
-
-        result = self.client.table("chat_sessions").insert({
-            "title": title,
-            "created": now,
-            "last_used": now,
-            "user_id": user_id
-        }).execute()
-
-        return result.data[0]["id"]
-
-    def list_chats(self, user_id: str) -> List[Dict]:
-        """List chats (synchronous for compatibility)"""
-        if not self.client:
-            raise ValueError("Supabase client not initialized")
-
-        result = self.client.table("chat_sessions") \
-            .select("*") \
-            .eq("user_id", user_id) \
-            .order("last_used", desc=True) \
-            .execute()
-
-        return result.data
-
-    def delete_chat(self, user_id: str, chat_id: str):
-        """Delete chat (synchronous for compatibility)"""
-        self._verify_chat(user_id, chat_id)
-
-        self.client.table("chat_messages").delete().eq("chat_id", chat_id).execute()
-        self.client.table("chat_sessions").delete().eq("id", chat_id).execute()
-
     # =========================
     # MESSAGES
     # =========================
@@ -573,50 +537,29 @@ class MemoryService:
             "most_active_chat": None
         }
 
-    def add_message(self, user_id: str, chat_id: str, role: str, content: str):
-        """Add message (synchronous for compatibility)"""
-        if role not in ("user", "assistant"):
-            raise ValueError("Invalid role")
-
-        self._verify_chat(user_id, chat_id)
-
-        self.client.table("chat_messages").insert({
-            "chat_id": chat_id,
-            "role": role,
-            "content": content,
-            "timestamp": int(time.time() * 1000)
-        }).execute()
-
-        self.client.table("chat_sessions").update({
-            "last_used": int(time.time() * 1000)
-        }).eq("id", chat_id).execute()
-
-    def get_messages(self, user_id: str, chat_id: str) -> List[Dict]:
-        """Get messages (synchronous for compatibility)"""
-        self._verify_chat(user_id, chat_id)
-
-        result = self.client.table("chat_messages") \
-            .select("*") \
-            .eq("chat_id", chat_id) \
-            .order("timestamp", desc=False) \
-            .execute()
-
-        return result.data
-
     # =========================
     # INTERNAL
     # =========================
 
-    def _verify_chat(self, user_id: str, chat_id: str):
-        """Verify that chat belongs to user"""
+    async def _verify_chat(self, user_id: str, chat_id: str):
+        """Verify that chat belongs to user (ASYNC VERSION)"""
         if not self.client:
             raise ValueError("Supabase client not initialized")
 
-        result = self.client.table("chat_sessions") \
-            .select("id") \
-            .eq("id", chat_id) \
-            .eq("user_id", user_id) \
-            .execute()
+        try:
+            result = self.client.table("chat_sessions") \
+                .select("id") \
+                .eq("id", chat_id) \
+                .eq("user_id", user_id) \
+                .execute()
 
-        if not result.data:
-            raise ValueError("Chat not found or access denied")
+            if not result.data:
+                logger.warning(f"Chat verification failed - chat_id: {chat_id}, user_id: {user_id}")
+                raise ValueError("Chat not found or access denied")
+
+            logger.debug(f"Chat verified - chat_id: {chat_id}, user_id: {user_id}")
+        except ValueError:
+            raise  # Re-raise our custom error
+        except Exception as e:
+            logger.error(f"Error verifying chat: {e}")
+            self._handle_supabase_error(e, f"_verify_chat(chat_id={chat_id}, user_id={user_id})")
services/rag_service.py DELETED
@@ -1,628 +0,0 @@
-import json
-import hashlib
-import asyncio
-import logging
-from pathlib import Path
-from typing import List, Dict, Set, Optional, Tuple
-from dataclasses import dataclass, field
-import aiofiles
-from langchain_core.documents import Document
-from langchain_text_splitters import RecursiveCharacterTextSplitter
-from langchain_huggingface import HuggingFaceEmbeddings
-from langchain_qdrant import QdrantVectorStore
-from qdrant_client import QdrantClient, models
-from qdrant_client.http.models import Distance, VectorParams
-
-# Configure logging
-logging.basicConfig(
-    level=logging.INFO,
-    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
-)
-logger = logging.getLogger(__name__)
-
-
-@dataclass
-class IndexingState:
-    indexed_files: Set[str] = field(default_factory=set)
-    failed_files: Dict[str, str] = field(default_factory=dict)
-    last_indexed_at: int = 0
-    version: str = "3.0"
-
-
-@dataclass
-class FileProcessResult:
-    file_path: str
-    success: bool
-    documents: Optional[List[Document]] = None
-    error: Optional[str] = None
-    size: int = 0
-
-
-class RAGService:
-    """Service for indexing and querying codebase with RAG using HuggingFace embeddings and Qdrant"""
-
-    VERSION = "3.0"
-    BATCH_SIZE = 10
-    MAX_FILE_SIZE = 500_000  # 500KB
-    EMBEDDING_BATCH_SIZE = 32
-    RATE_LIMIT_DELAY = 0.1
-    MAX_CONCURRENT_READS = 5
-    CHECKPOINT_INTERVAL = 20
-    MAX_RETRIES = 3
-
-    # Popular code embedding models from HuggingFace
-    # Options: "BAAI/bge-small-en-v1.5", "sentence-transformers/all-MiniLM-L6-v2"
-    EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
-
-    BINARY_EXTENSIONS = {
-        '.png', '.jpg', '.jpeg', '.gif', '.bmp', '.ico',
-        '.mp3', '.mp4', '.avi', '.mov', '.wav',
-        '.zip', '.tar', '.gz', '.rar', '.7z',
-        '.exe', '.dll', '.so', '.dylib',
-        '.pdf', '.doc', '.docx', '.xls', '.xlsx',
-        '.woff', '.woff2', '.ttf', '.eot'
-    }
-
-    EXCLUDE_PATTERNS = [
-        'node_modules', '.git', 'dist', 'build',
-        '.vscode', 'coverage', '__pycache__',
-        '.pytest_cache', '.next', 'out', '.DS_Store',
-        'venv', 'env', '.env', 'vendor'
-    ]
-
-    FILE_EXTENSIONS = [
-        '.ts', '.js', '.py', '.jsx', '.tsx', '.java',
-        '.go', '.php', '.rs', '.cpp', '.c', '.h', '.hpp',
-        '.cs', '.rb', '.swift', '.kt', '.md', '.txt',
-        '.json', '.yml', '.yaml', '.sh', '.sql', '.r'
-    ]
-
-    def __init__(self, workspace_root: str, storage_path: str):
-        self.workspace_root = Path(workspace_root)
-        self.storage_path = Path(storage_path)
-
-        logger.info(f"Initializing RAG service for workspace: {workspace_root}")
-
-        # Initialize HuggingFace embeddings
-        try:
-            self.embeddings = HuggingFaceEmbeddings(
-                model_name=self.EMBEDDING_MODEL,
-                model_kwargs={'device': 'cpu'},
-                encode_kwargs={'normalize_embeddings': True}
-            )
-            logger.info(f"Loaded embedding model: {self.EMBEDDING_MODEL}")
-        except Exception as e:
-            logger.error(f"Failed to load embedding model: {e}")
-            raise
-
-        # Text splitter for code
-        self.text_splitter = RecursiveCharacterTextSplitter(
-            chunk_size=1000,
-            chunk_overlap=200,
-            separators=["\n\n", "\n", " ", ""]
-        )
-
-        self.vector_store: Optional[QdrantVectorStore] = None
-        self.qdrant_client: Optional[QdrantClient] = None
-        self.indexing_state = IndexingState()
-        self.is_indexing = False
-
-        # Setup storage paths
-        workspace_hash = self._hash_workspace_path(str(workspace_root))
-        self.base_path = self.storage_path / "vector_stores" / workspace_hash
-        self.state_file = self.base_path / "indexing_state.json"
-        self.qdrant_path = self.base_path / "qdrant_storage"
-        self.collection_name = f"codebase_{workspace_hash}"
-
-    def _hash_workspace_path(self, path: str) -> str:
-        """Create hash of workspace path for storage"""
-        return hashlib.md5(path.encode()).hexdigest()[:16]
-
-    async def initialize(self):
-        """Initialize RAG service and load existing index"""
-        try:
-            self.base_path.mkdir(parents=True, exist_ok=True)
-            self.qdrant_path.mkdir(parents=True, exist_ok=True)
-
-            await self._load_indexing_state()
-
-            # Version check
-            if self.indexing_state.version != self.VERSION:
-                logger.warning(f"Version mismatch. Expected {self.VERSION}, got {self.indexing_state.version}. Resetting index.")
-                await self.reset_index()
-                return
-
-            # Initialize Qdrant client
-            try:
-                self.qdrant_client = QdrantClient(path=str(self.qdrant_path))
-
-                # Check if collection exists
-                collections = self.qdrant_client.get_collections().collections
-                collection_exists = any(c.name == self.collection_name for c in collections)
-
-                if collection_exists:
-                    # Load existing vector store
-                    self.vector_store = QdrantVectorStore(
-                        client=self.qdrant_client,
-                        collection_name=self.collection_name,
-                        embedding=self.embeddings
-                    )
-
-                    collection_info = self.qdrant_client.get_collection(self.collection_name)
-                    vector_count = collection_info.points_count
-
-                    logger.info(f"Loaded existing index with {len(self.indexing_state.indexed_files)} files, {vector_count} vectors")
-                else:
-                    logger.info("No existing index found. Ready for first-time indexing.")
-
-            except Exception as e:
-                logger.error(f"Failed to initialize Qdrant: {e}")
-                await self.reset_index()
-
-        except Exception as e:
-            logger.error(f"Initialization failed: {e}")
-            raise
-
-    async def _load_indexing_state(self):
-        """Load indexing state from disk"""
-        try:
-            if self.state_file.exists():
-                async with aiofiles.open(self.state_file, 'r') as f:
-                    content = await f.read()
-                data = json.loads(content)
-
-                self.indexing_state = IndexingState(
-                    indexed_files=set(data.get('indexedFiles', [])),
-                    failed_files=dict(data.get('failedFiles', {})),
-                    last_indexed_at=data.get('lastIndexedAt', 0),
-                    version=data.get('version', '1.0')
-                )
-                logger.debug(f"Loaded indexing state: {len(self.indexing_state.indexed_files)} files")
-        except Exception as e:
-            logger.error(f"Failed to load indexing state: {e}")
-            self.indexing_state = IndexingState()
-
-    async def _save_indexing_state(self):
-        """Save indexing state to disk"""
-        try:
-            state_data = {
-                'indexedFiles': list(self.indexing_state.indexed_files),
-                'failedFiles': self.indexing_state.failed_files,
-                'lastIndexedAt': self.indexing_state.last_indexed_at,
-                'version': self.indexing_state.version
-            }
-
-            async with aiofiles.open(self.state_file, 'w') as f:
-                await f.write(json.dumps(state_data, indent=2))
-
-            logger.debug("Saved indexing state")
-        except Exception as e:
-            logger.error(f"Failed to save indexing state: {e}")
-
-    async def index_workspace(self, show_progress: bool = True) -> Dict[str, any]:
-        """Index the entire workspace"""
-        if self.is_indexing:
-            logger.warning("Indexing already in progress")
-            return {'error': 'Indexing already in progress'}
-
-        self.is_indexing = True
-        logger.info("Starting workspace indexing")
-
-        try:
-            all_files = await self._get_workspace_files()
-            new_files = [
-                f for f in all_files
-                if f not in self.indexing_state.indexed_files
-            ]
-
-            if not new_files:
-                self.is_indexing = False
-                logger.info("All files already indexed")
-                return {
-                    'success': True,
-                    'message': 'All files already indexed',
-                    'indexed': len(self.indexing_state.indexed_files)
-                }
-
-            logger.info(f"Found {len(new_files)} new files to index (total workspace: {len(all_files)})")
-
-            await self._process_files_in_batches(new_files)
-
-            success_count = len(new_files) - len([f for f in new_files if f in self.indexing_state.failed_files])
-
-            self.is_indexing = False
-            logger.info(f"Indexing complete: {success_count} files indexed successfully")
-
-            return {
-                'success': True,
-                'message': f'Indexed {success_count} files',
-                'indexed': success_count,
-                'failed': len(self.indexing_state.failed_files),
-                'total': len(self.indexing_state.indexed_files)
-            }
-
-        except Exception as e:
-            self.is_indexing = False
-            logger.error(f"Indexing error: {e}", exc_info=True)
-            return {'error': str(e)}
-
-    async def _process_files_in_batches(self, files: List[str]):
-        """Process files in batches with rate limiting"""
-        total_files = len(files)
-        processed_count = 0
-        documents_buffer = []
-
-        logger.info(f"Starting batch processing of {total_files} files")
-
-        for i in range(0, len(files), self.BATCH_SIZE):
-            batch = files[i:i + self.BATCH_SIZE]
-            batch_start = i + 1
-            batch_end = min(i + self.BATCH_SIZE, total_files)
-
-            logger.info(f"Processing files {batch_start}-{batch_end} of {total_files}")
-
-            try:
-                batch_results = await self._process_batch_with_concurrency(batch)
-
-                for result in batch_results:
-                    if result.success and result.documents:
-                        documents_buffer.extend(result.documents)
-                        self.indexing_state.indexed_files.add(result.file_path)
-                        self.indexing_state.failed_files.pop(result.file_path, None)
-                    elif not result.success:
-                        self.indexing_state.failed_files[result.file_path] = result.error or "Unknown error"
-                        logger.warning(f"Failed to index {Path(result.file_path).name}: {result.error}")
-
-                # Add documents to vector store in batches
-                if len(documents_buffer) >= self.EMBEDDING_BATCH_SIZE:
-                    await self._add_documents_to_qdrant(documents_buffer)
-                    documents_buffer = []
-
-                processed_count += len(batch)
-
-                # Checkpoint
-                if processed_count % self.CHECKPOINT_INTERVAL == 0:
-                    await self._save_checkpoint()
-                    logger.info(f"Checkpoint saved: {processed_count}/{total_files} files processed")
-
-                # Rate limiting
-                await asyncio.sleep(self.RATE_LIMIT_DELAY)
-
-            except Exception as e:
-                logger.error(f"Batch {batch_start}-{batch_end} processing error: {e}", exc_info=True)
-                for file_path in batch:
-                    self.indexing_state.failed_files[file_path] = "Batch processing failed"
-
-        # Add remaining documents
-        if documents_buffer:
-            await self._add_documents_to_qdrant(documents_buffer)
-
-        await self._save_checkpoint()
-        logger.info(f"Indexing complete: {processed_count} files processed")
-
-    async def _process_batch_with_concurrency(
-        self,
-        file_paths: List[str]
-    ) -> List[FileProcessResult]:
-        """Process a batch of files with controlled concurrency"""
308
- results = []
309
-
310
- for i in range(0, len(file_paths), self.MAX_CONCURRENT_READS):
311
- chunk = file_paths[i:i + self.MAX_CONCURRENT_READS]
312
-
313
- tasks = [self._index_file_with_retry(fp) for fp in chunk]
314
- chunk_results = await asyncio.gather(*tasks, return_exceptions=True)
315
-
316
- for j, result in enumerate(chunk_results):
317
- if isinstance(result, Exception):
318
- results.append(FileProcessResult(
319
- file_path=chunk[j],
320
- success=False,
321
- error=str(result),
322
- size=0
323
- ))
324
- else:
325
- results.append(FileProcessResult(
326
- file_path=chunk[j],
327
- success=True,
328
- documents=result[0],
329
- size=result[1]
330
- ))
331
-
332
- return results
333
-
334
- async def _index_file_with_retry(
335
- self,
336
- file_path: str
337
- ) -> Tuple[List[Document], int]:
338
- """Index a file with retry logic"""
339
- last_error = None
340
-
341
- for attempt in range(self.MAX_RETRIES):
342
- try:
343
- documents = await self._index_file(file_path)
344
- size = Path(file_path).stat().st_size
345
- return documents, size
346
- except Exception as e:
347
- last_error = e
348
- if attempt < self.MAX_RETRIES - 1:
349
- delay = 0.5 * (2 ** attempt)
350
- logger.debug(f"Retry {attempt + 1} for {Path(file_path).name} after {delay}s")
351
- await asyncio.sleep(delay)
352
-
353
- raise last_error or Exception("Unknown indexing error")
-
-    async def _index_file(self, file_path: str) -> List[Document]:
-        """Index a single file"""
-        path = Path(file_path)
-
-        # Size check
-        file_size = path.stat().st_size
-        if file_size > self.MAX_FILE_SIZE:
-            logger.debug(f"Skipping {path.name}: file too large ({file_size} bytes)")
-            return []
-
-        # Binary check
-        if self._is_binary_file(file_path):
-            return []
-
-        # Read file content
-        try:
-            async with aiofiles.open(file_path, 'r', encoding='utf-8') as f:
-                content = await f.read()
-        except UnicodeDecodeError:
-            logger.debug(f"Skipping {path.name}: encoding error")
-            return []
-        except Exception as e:
-            if 'No such file' in str(e):
-                return []
-            raise
-
-        # Content validation
-        if not content or len(content.strip()) < 10:
-            return []
-
-        # Minified file check
-        if self._is_minified_file(content, file_path):
-            logger.debug(f"Skipping {path.name}: minified file")
-            return []
-
-        # Split into chunks
-        try:
-            chunks = await asyncio.to_thread(self.text_splitter.split_text, content)
-        except Exception as e:
-            logger.warning(f"Failed to split {path.name}: {e}")
-            return []
-
-        # Create documents
-        relative_path = str(path.relative_to(self.workspace_root))
-        documents = []
-
-        for idx, chunk in enumerate(chunks):
-            documents.append(Document(
-                page_content=chunk,
-                metadata={
-                    'source': relative_path,
-                    'filename': path.name,
-                    'extension': path.suffix,
-                    'indexed_at': int(asyncio.get_event_loop().time() * 1000),
-                    'chunk_index': idx,
-                    'total_chunks': len(chunks),
-                    'file_size': file_size
-                }
-            ))
-
-        return documents
-
-    def _is_binary_file(self, file_path: str) -> bool:
-        """Check if file is binary"""
-        ext = Path(file_path).suffix.lower()
-        return ext in self.BINARY_EXTENSIONS
-
-    def _is_minified_file(self, content: str, file_path: str) -> bool:
-        """Check if file is minified"""
-        ext = Path(file_path).suffix.lower()
-        if ext not in ['.js', '.css', '.json']:
-            return False
-
-        lines = content.split('\n')
-        if not lines:
-            return False
-
-        avg_line_length = len(content) / len(lines)
-        return avg_line_length > 500 or '.min.' in file_path
-
-    async def _add_documents_to_qdrant(self, documents: List[Document]):
-        """Add documents to Qdrant vector store"""
-        if not documents:
-            return
-
-        logger.info(f"Adding {len(documents)} document chunks to Qdrant")
-
-        try:
-            if not self.vector_store:
-                # Create collection and initialize vector store
-                embedding_dim = len(self.embeddings.embed_query("test"))
-
-                self.qdrant_client.create_collection(
-                    collection_name=self.collection_name,
-                    vectors_config=VectorParams(
-                        size=embedding_dim,
-                        distance=Distance.COSINE
-                    )
-                )
-
-                self.vector_store = QdrantVectorStore(
-                    client=self.qdrant_client,
-                    collection_name=self.collection_name,
-                    embedding=self.embeddings
-                )
-                logger.info(f"Created Qdrant collection: {self.collection_name}")
-
-            # Add documents in batches
-            for i in range(0, len(documents), self.EMBEDDING_BATCH_SIZE):
-                batch = documents[i:i + self.EMBEDDING_BATCH_SIZE]
-                await asyncio.to_thread(self.vector_store.add_documents, batch)
-                await asyncio.sleep(0.05)
-
-            logger.info(f"Successfully added {len(documents)} chunks to vector store")
-
-        except Exception as e:
-            logger.error(f"Failed to add documents to Qdrant: {e}", exc_info=True)
-            raise
-
-    async def _save_checkpoint(self):
-        """Save checkpoint during indexing"""
-        try:
-            self.indexing_state.last_indexed_at = int(asyncio.get_event_loop().time() * 1000)
-            await self._save_indexing_state()
-            logger.debug("Checkpoint saved")
-        except Exception as e:
-            logger.error(f"Failed to save checkpoint: {e}")
-
-    async def _get_workspace_files(self) -> List[str]:
-        """Get all eligible workspace files"""
-        files = []
-
-        for ext in self.FILE_EXTENSIONS:
-            for file_path in self.workspace_root.rglob(f'*{ext}'):
-                # Check exclude patterns
-                if any(pattern in str(file_path) for pattern in self.EXCLUDE_PATTERNS):
-                    continue
-
-                # Skip minified files
-                if '.min.' in file_path.name:
-                    continue
-
-                files.append(str(file_path))
-
-        logger.debug(f"Found {len(files)} eligible files in workspace")
-        return files
-
-    async def search_similar_code(
-        self,
-        query: str,
-        k: int = 5
-    ) -> List[Document]:
-        """Search for similar code chunks"""
-        if not self.vector_store:
-            logger.error("Vector store not initialized")
-            raise ValueError("Vector store not initialized. Please index workspace first.")
-
-        try:
-            logger.debug(f"Searching for: {query[:50]}...")
-            results = await asyncio.to_thread(
-                self.vector_store.similarity_search,
-                query,
-                k=k
-            )
-            logger.debug(f"Found {len(results)} similar documents")
-            return results
-        except Exception as e:
-            logger.error(f"Search error: {e}", exc_info=True)
-            return []
-
-    async def get_relevant_context(
-        self,
-        query: str,
-        max_chunks: int = 5
-    ) -> str:
-        """Get relevant context for a query"""
-        try:
-            docs = await self.search_similar_code(query, max_chunks)
-            if not docs:
-                return ""
-
-            context_parts = []
-            for i, doc in enumerate(docs, 1):
-                source = doc.metadata.get('source', 'Unknown')
-                chunk_idx = doc.metadata.get('chunk_index', 0)
-                context_parts.append(
-                    f"[Context {i}] File: {source} (chunk {chunk_idx})\n{doc.page_content}\n---"
-                )
-
-            return "\n\n".join(context_parts)
-
-        except Exception as e:
-            logger.error(f"Failed to get relevant context: {e}")
-            return ""
-
-    async def reindex_workspace(self) -> Dict[str, any]:
-        """Reindex entire workspace from scratch"""
-        logger.info("Starting full reindex")
-        await self.reset_index()
-        return await self.index_workspace(show_progress=True)
-
-    async def reset_index(self):
-        """Reset the entire index"""
-        try:
-            logger.info("Resetting index")
-
-            # Delete Qdrant collection
-            if self.qdrant_client:
-                try:
-                    self.qdrant_client.delete_collection(self.collection_name)
-                    logger.info(f"Deleted Qdrant collection: {self.collection_name}")
-                except Exception as e:
-                    logger.warning(f"Could not delete collection: {e}")
-
-            self.vector_store = None
-            self.indexing_state = IndexingState()
-
-            # Clean up storage
-            if self.base_path.exists():
-                import shutil
-                shutil.rmtree(self.base_path)
-
-            self.base_path.mkdir(parents=True, exist_ok=True)
-            self.qdrant_path.mkdir(parents=True, exist_ok=True)
-
-            await self._save_indexing_state()
-
-            # Reinitialize Qdrant client
-            self.qdrant_client = QdrantClient(path=str(self.qdrant_path))
-
-            logger.info("Index reset successfully")
-
-        except Exception as e:
-            logger.error(f"Failed to reset index: {e}", exc_info=True)
-            raise
-
-    def get_indexed_files_count(self) -> int:
-        """Get count of indexed files"""
-        return len(self.indexing_state.indexed_files)
-
-    def get_index_stats(self) -> Dict[str, any]:
-        """Get indexing statistics"""
-        from datetime import datetime
-
-        last_indexed = "Never"
-        if self.indexing_state.last_indexed_at > 0:
-            dt = datetime.fromtimestamp(self.indexing_state.last_indexed_at / 1000)
-            last_indexed = dt.strftime('%Y-%m-%d %H:%M:%S')
-
-        vector_count = 0
-        if self.qdrant_client and self.vector_store:
-            try:
-                collection_info = self.qdrant_client.get_collection(self.collection_name)
-                vector_count = collection_info.points_count
-            except:
-                pass
-
-        return {
-            'total_indexed': len(self.indexing_state.indexed_files),
-            'total_failed': len(self.indexing_state.failed_files),
-            'vector_count': vector_count,
-            'last_indexed_at': last_indexed,
-            'is_ready': self.vector_store is not None,
-            'is_indexing': self.is_indexing,
-            'version': self.indexing_state.version,
-            'embedding_model': self.EMBEDDING_MODEL
-        }
-
-    def is_ready(self) -> bool:
-        """Check if RAG service is ready"""
-        return (
-            self.vector_store is not None and
-            len(self.indexing_state.indexed_files) > 0
-        )
services/rag_service_supabase.py ADDED
@@ -0,0 +1,595 @@
+ import hashlib
+ import asyncio
+ import logging
+ from typing import List, Dict, Optional, Tuple
+ from dataclasses import dataclass
+ from datetime import datetime
+ import os
+
+ from langchain_core.documents import Document
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
+ from langchain_huggingface import HuggingFaceEmbeddings
+ from supabase import create_client, Client
+
+ # Configure logging
+ logging.basicConfig(
+     level=logging.INFO,
+     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+ )
+ logger = logging.getLogger(__name__)
+
+
+ @dataclass
+ class FileProcessResult:
+     file_path: str
+     success: bool
+     documents: Optional[List[Document]] = None
+     error: Optional[str] = None
+     size: int = 0
+
+
+ class RAGServiceSupabase:
+     """Service for indexing and querying codebase with RAG using Supabase pgvector"""
+
+     VERSION = "4.0"
+     BATCH_SIZE = 10  # Files per batch
+     MAX_FILE_SIZE = 500_000  # 500KB
+     EMBEDDING_BATCH_SIZE = 32  # Chunks per embedding batch
+     RATE_LIMIT_DELAY = 0.1
+     MAX_CONCURRENT_READS = 5
+
+     # Using sentence-transformers/all-MiniLM-L6-v2 (384 dimensions)
+     EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
+     EMBEDDING_DIM = 384
+
+     BINARY_EXTENSIONS = {
+         '.png', '.jpg', '.jpeg', '.gif', '.bmp', '.ico',
+         '.mp3', '.mp4', '.avi', '.mov', '.wav',
+         '.zip', '.tar', '.gz', '.rar', '.7z',
+         '.exe', '.dll', '.so', '.dylib',
+         '.pdf', '.doc', '.docx', '.xls', '.xlsx',
+         '.woff', '.woff2', '.ttf', '.eot'
+     }
+
+     EXCLUDE_PATTERNS = [
+         'node_modules', '.git', 'dist', 'build',
+         '.vscode', 'coverage', '__pycache__',
+         '.pytest_cache', '.next', 'out', '.DS_Store',
+         'venv', 'env', '.env', 'vendor'
+     ]
+
+     FILE_EXTENSIONS = [
+         '.ts', '.js', '.py', '.jsx', '.tsx', '.java',
+         '.go', '.php', '.rs', '.cpp', '.c', '.h', '.hpp',
+         '.cs', '.rb', '.swift', '.kt', '.md', '.txt',
+         '.json', '.yml', '.yaml', '.sh', '.sql', '.r'
+     ]
+
+     def __init__(self):
+         """Initialize RAG service with Supabase"""
+         logger.info("Initializing RAG service with Supabase pgvector")
+
+         # Initialize Supabase client
+         supabase_url = os.getenv("SUPABASE_URL")
+         supabase_key = os.getenv("SUPABASE_KEY")
+
+         if not supabase_url or not supabase_key:
+             logger.warning("Supabase URL or key not configured. RAG features will not work.")
+             logger.warning("Please set SUPABASE_URL and SUPABASE_KEY environment variables.")
+             self.client: Optional[Client] = None
+         else:
+             try:
+                 self.client = create_client(supabase_url, supabase_key)
+                 # Test connection
+                 try:
+                     self.client.table("code_embeddings").select("id").limit(1).execute()
+                     logger.info("Supabase client initialized and verified")
+                 except Exception as e:
+                     logger.error(f"Failed to verify Supabase connection: {e}")
+                     self.client = None
+             except Exception as e:
+                 logger.error(f"Failed to initialize Supabase client: {e}")
+                 self.client = None
+
+         # Initialize HuggingFace embeddings
+         try:
+             self.embeddings = HuggingFaceEmbeddings(
+                 model_name=self.EMBEDDING_MODEL,
+                 model_kwargs={'device': 'cpu'},
+                 encode_kwargs={'normalize_embeddings': True}
+             )
+             logger.info(f"Loaded embedding model: {self.EMBEDDING_MODEL}")
+         except Exception as e:
+             logger.error(f"Failed to load embedding model: {e}")
+             raise
+
+         # Text splitter for code
+         self.text_splitter = RecursiveCharacterTextSplitter(
+             chunk_size=1000,
+             chunk_overlap=200,
+             separators=["\n\n", "\n", " ", ""]
+         )
+
+         self.is_indexing = False
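The splitter above is configured with `chunk_size=1000` and `chunk_overlap=200`, so consecutive chunks share roughly 200 characters of context. As a rough standalone sketch of what overlapping fixed-size chunking does (a simplification: `RecursiveCharacterTextSplitter` actually tries the separator hierarchy `"\n\n"`, `"\n"`, `" "` first rather than cutting at fixed offsets):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive fixed-window chunking with overlap (illustrative only)."""
    step = chunk_size - overlap  # each chunk starts 800 chars after the previous one
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# 2500 characters of varied text -> windows starting at 0, 800, 1600, 2400
text = "".join(chr(65 + i % 26) for i in range(2500))
chunks = chunk_text(text)
```

The tail of each chunk equals the head of the next, which is what lets retrieval recover statements that would otherwise be cut at a chunk boundary.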
+
+     @staticmethod
+     def _hash_workspace_path(path: str) -> str:
+         """Create stable hash of workspace path"""
+         return hashlib.md5(path.encode()).hexdigest()
+
+     async def initialize(self):
+         """Initialize RAG service (no-op for stateless service)"""
+         if self.client is None:
+             logger.warning("RAG service initialized without Supabase client")
+         else:
+             logger.info("RAG service initialized successfully")
+         return True
+
+     async def index_workspace(
+         self,
+         workspace_id: str,
+         files: List[Dict[str, str]]
+     ) -> Dict[str, any]:
+         """
+         Index workspace files
+
+         Args:
+             workspace_id: Unique identifier for the workspace
+             files: List of dicts with 'path' and 'content' keys
+         """
+         if self.client is None:
+             return {'error': 'Supabase client not initialized'}
+
+         if self.is_indexing:
+             logger.warning("Indexing already in progress")
+             return {'error': 'Indexing already in progress'}
+
+         self.is_indexing = True
+         logger.info(f"Starting workspace indexing for workspace_id: {workspace_id}")
+
+         try:
+             total_files = len(files)
+             if total_files == 0:
+                 self.is_indexing = False
+                 return {
+                     'success': True,
+                     'message': 'No files to index',
+                     'indexed': 0
+                 }
+
+             logger.info(f"Indexing {total_files} files")
+
+             success_count = 0
+             failed_count = 0
+
+             # Process files in batches
+             for i in range(0, total_files, self.BATCH_SIZE):
+                 batch = files[i:i + self.BATCH_SIZE]
+                 batch_start = i + 1
+                 batch_end = min(i + self.BATCH_SIZE, total_files)
+
+                 logger.info(f"Processing files {batch_start}-{batch_end} of {total_files}")
+
+                 try:
+                     batch_results = await self._process_batch(workspace_id, batch)
+
+                     for result in batch_results:
+                         if result.success:
+                             success_count += 1
+                         else:
+                             failed_count += 1
+                             logger.warning(f"Failed to index {result.file_path}: {result.error}")
+
+                     # Rate limiting
+                     await asyncio.sleep(self.RATE_LIMIT_DELAY)
+
+                 except Exception as e:
+                     logger.error(f"Batch {batch_start}-{batch_end} processing error: {e}", exc_info=True)
+                     failed_count += len(batch)
+
+             self.is_indexing = False
+             logger.info(f"Indexing complete: {success_count} files indexed, {failed_count} failed")
+
+             return {
+                 'success': True,
+                 'message': f'Indexed {success_count} files',
+                 'indexed': success_count,
+                 'failed': failed_count,
+                 'total': total_files
+             }
+
+         except Exception as e:
+             self.is_indexing = False
+             logger.error(f"Indexing error: {e}", exc_info=True)
+             return {'error': str(e)}
+
+     async def _process_batch(
+         self,
+         workspace_id: str,
+         files: List[Dict[str, str]]
+     ) -> List[FileProcessResult]:
+         """Process a batch of files"""
+         results = []
+
+         # Process files concurrently
+         tasks = [self._index_file(workspace_id, file_data) for file_data in files]
+         batch_results = await asyncio.gather(*tasks, return_exceptions=True)
+
+         for i, result in enumerate(batch_results):
+             if isinstance(result, Exception):
+                 results.append(FileProcessResult(
+                     file_path=files[i].get('path', 'unknown'),
+                     success=False,
+                     error=str(result),
+                     size=0
+                 ))
+             else:
+                 results.append(result)
+
+         return results
+
+     async def _index_file(
+         self,
+         workspace_id: str,
+         file_data: Dict[str, str]
+     ) -> FileProcessResult:
+         """Index a single file"""
+         file_path = file_data.get('path', '')
+         content = file_data.get('content', '')
+
+         try:
+             # Size check
+             if len(content.encode('utf-8')) > self.MAX_FILE_SIZE:
+                 return FileProcessResult(
+                     file_path=file_path,
+                     success=False,
+                     error="File too large",
+                     size=len(content.encode('utf-8'))
+                 )
+
+             # Binary check
+             if self._is_binary_file(file_path):
+                 return FileProcessResult(
+                     file_path=file_path,
+                     success=False,
+                     error="Binary file",
+                     size=len(content.encode('utf-8'))
+                 )
+
+             # Content validation
+             if not content or len(content.strip()) < 10:
+                 return FileProcessResult(
+                     file_path=file_path,
+                     success=False,
+                     error="Empty or too short",
+                     size=len(content.encode('utf-8'))
+                 )
+
+             # Minified file check
+             if self._is_minified_file(content, file_path):
+                 return FileProcessResult(
+                     file_path=file_path,
+                     success=False,
+                     error="Minified file",
+                     size=len(content.encode('utf-8'))
+                 )
+
+             # Split into chunks
+             try:
+                 chunks = await asyncio.to_thread(self.text_splitter.split_text, content)
+             except Exception as e:
+                 logger.warning(f"Failed to split {file_path}: {e}")
+                 return FileProcessResult(
+                     file_path=file_path,
+                     success=False,
+                     error=f"Split error: {e}",
+                     size=len(content.encode('utf-8'))
+                 )
+
+             # Delete existing embeddings for this file
+             await self._delete_file_embeddings(workspace_id, file_path)
+
+             # Create documents and embeddings
+             documents = []
+             for idx, chunk in enumerate(chunks):
+                 documents.append(Document(
+                     page_content=chunk,
+                     metadata={
+                         'source': file_path,
+                         'filename': os.path.basename(file_path),
+                         'chunk_index': idx,
+                         'total_chunks': len(chunks),
+                         'file_size': len(content.encode('utf-8'))
+                     }
+                 ))
+
+             # Generate embeddings and store in Supabase
+             await self._store_embeddings(workspace_id, file_path, documents)
+
+             return FileProcessResult(
+                 file_path=file_path,
+                 success=True,
+                 documents=documents,
+                 size=len(content.encode('utf-8'))
+             )
+
+         except Exception as e:
+             logger.error(f"Error indexing file {file_path}: {e}", exc_info=True)
+             return FileProcessResult(
+                 file_path=file_path,
+                 success=False,
+                 error=str(e),
+                 size=len(content.encode('utf-8')) if content else 0
+             )
+
+     async def _store_embeddings(
+         self,
+         workspace_id: str,
+         file_path: str,
+         documents: List[Document]
+     ):
+         """Store document embeddings in Supabase"""
+         if not documents:
+             return
+
+         try:
+             # Generate embeddings in batches
+             for i in range(0, len(documents), self.EMBEDDING_BATCH_SIZE):
+                 batch = documents[i:i + self.EMBEDDING_BATCH_SIZE]
+
+                 # Generate embeddings
+                 texts = [doc.page_content for doc in batch]
+                 embeddings_list = await asyncio.to_thread(
+                     self.embeddings.embed_documents,
+                     texts
+                 )
+
+                 # Prepare records for insertion
+                 records = []
+                 for j, (doc, embedding) in enumerate(zip(batch, embeddings_list)):
+                     # Convert embedding to list format for Supabase
+                     embedding_list = embedding if isinstance(embedding, list) else embedding.tolist()
+
+                     records.append({
+                         'workspace_id': workspace_id,
+                         'file_path': file_path,
+                         'content': doc.page_content,
+                         'embedding': embedding_list,
+                         'chunk_index': doc.metadata.get('chunk_index', j),
+                         'total_chunks': doc.metadata.get('total_chunks', len(documents)),
+                         'file_size': doc.metadata.get('file_size', 0)
+                     })
+
+                 # Insert into Supabase
+                 if records:
+                     self.client.table("code_embeddings").insert(records).execute()
+                     logger.debug(f"Stored {len(records)} embeddings for {file_path}")
+
+             logger.info(f"Stored {len(documents)} chunks for {file_path}")
+
+         except Exception as e:
+             logger.error(f"Failed to store embeddings for {file_path}: {e}", exc_info=True)
+             raise
+
+     async def _delete_file_embeddings(self, workspace_id: str, file_path: str):
+         """Delete all embeddings for a file"""
+         try:
+             self.client.table("code_embeddings").delete().eq(
+                 "workspace_id", workspace_id
+             ).eq("file_path", file_path).execute()
+             logger.debug(f"Deleted embeddings for {file_path}")
+         except Exception as e:
+             logger.error(f"Failed to delete embeddings for {file_path}: {e}")
+
+     def _is_binary_file(self, file_path: str) -> bool:
+         """Check if file is binary"""
+         ext = os.path.splitext(file_path)[1].lower()
+         return ext in self.BINARY_EXTENSIONS
+
+     def _is_minified_file(self, content: str, file_path: str) -> bool:
+         """Check if file is minified"""
+         ext = os.path.splitext(file_path)[1].lower()
+         if ext not in ['.js', '.css', '.json']:
+             return False
+
+         lines = content.split('\n')
+         if not lines:
+             return False
+
+         avg_line_length = len(content) / len(lines)
+         return avg_line_length > 500 or '.min.' in file_path
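The minification heuristic above flags `.js`/`.css`/`.json` files whose average line length exceeds 500 characters, or whose path contains `.min.`. A self-contained mirror of the same check (note that `content.split('\n')` always returns at least one element, so the empty-list guard never fires):

```python
import os

def is_minified(content: str, file_path: str) -> bool:
    """Mirror of the heuristic: very long average lines, or a '.min.' name."""
    if os.path.splitext(file_path)[1].lower() not in {'.js', '.css', '.json'}:
        return False
    lines = content.split('\n')  # never empty, so no division by zero
    avg_line_length = len(content) / len(lines)
    return avg_line_length > 500 or '.min.' in file_path
```

So a single 600-character line of JavaScript is flagged, multi-line source is not, `app.min.js` is flagged regardless of content, and non-candidate extensions like `.md` always pass.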
+
+     async def search_similar_code(
+         self,
+         workspace_id: str,
+         query: str,
+         k: int = 5
+     ) -> List[Document]:
+         """Search for similar code chunks using pgvector via Supabase RPC"""
+         if self.client is None:
+             logger.error("Supabase client not initialized")
+             return []
+
+         try:
+             # Generate query embedding
+             query_embedding = await asyncio.to_thread(
+                 self.embeddings.embed_query,
+                 query
+             )
+
+             # Convert to list format
+             query_embedding_list = query_embedding if isinstance(query_embedding, list) else query_embedding.tolist()
+
+             # Use Supabase RPC function for efficient vector search
+             try:
+                 response = self.client.rpc(
+                     'match_code_embeddings',
+                     {
+                         'query_embedding': query_embedding_list,
+                         'workspace_filter': workspace_id,
+                         'match_threshold': 0.3,  # Lower threshold for more results
+                         'match_count': k
+                     }
+                 ).execute()
+
+                 if not response.data:
+                     return []
+
+                 # Convert to Document objects
+                 documents = []
+                 for row in response.data:
+                     documents.append(Document(
+                         page_content=row['content'],
+                         metadata={
+                             'source': row['file_path'],
+                             'filename': os.path.basename(row['file_path']),
+                             'chunk_index': row.get('chunk_index', 0),
+                             'total_chunks': row.get('total_chunks', 1),
+                             'similarity': float(row.get('similarity', 0.0))
+                         }
+                     ))
+
+                 logger.debug(f"Found {len(documents)} similar documents via RPC")
+                 return documents
+
+             except Exception as rpc_error:
+                 # Fallback to Python-based similarity if RPC fails
+                 logger.warning(f"RPC search failed, using fallback: {rpc_error}")
+                 return await self._fallback_search(workspace_id, query_embedding_list, k)
+
+         except Exception as e:
+             logger.error(f"Search error: {e}", exc_info=True)
+             return []
+
+     async def _fallback_search(
+         self,
+         workspace_id: str,
+         query_embedding_list: List[float],
+         k: int
+     ) -> List[Document]:
+         """Fallback search using Python-based similarity computation"""
+         try:
+             # Fetch embeddings for workspace (limited for free tier)
+             response = self.client.table("code_embeddings").select(
+                 "id, workspace_id, file_path, content, chunk_index, total_chunks, embedding"
+             ).eq("workspace_id", workspace_id).limit(500).execute()
+
+             if not response.data:
+                 return []
+
+             # Compute cosine similarity
+             import numpy as np
+
+             query_vec = np.array(query_embedding_list)
+             query_vec_norm = np.linalg.norm(query_vec)
+
+             similarities = []
+             for row in response.data:
+                 embedding = row['embedding']
+                 if embedding:
+                     doc_vec = np.array(embedding)
+                     doc_vec_norm = np.linalg.norm(doc_vec)
+
+                     if doc_vec_norm > 0 and query_vec_norm > 0:
+                         similarity = np.dot(query_vec, doc_vec) / (query_vec_norm * doc_vec_norm)
+                         similarities.append((similarity, row))
+
+             # Sort by similarity and take top k
+             similarities.sort(key=lambda x: x[0], reverse=True)
+             top_results = similarities[:k]
+
+             # Convert to Document objects
+             documents = []
+             for similarity, row in top_results:
+                 documents.append(Document(
+                     page_content=row['content'],
+                     metadata={
+                         'source': row['file_path'],
+                         'filename': os.path.basename(row['file_path']),
+                         'chunk_index': row.get('chunk_index', 0),
+                         'total_chunks': row.get('total_chunks', 1),
+                         'similarity': float(similarity)
+                     }
+                 ))
+
+             return documents
+
+         except Exception as e:
+             logger.error(f"Fallback search error: {e}")
+             return []
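The fallback above pulls up to 500 rows and ranks them client-side by cosine similarity. Since the service stores embeddings with `normalize_embeddings=True`, cosine similarity reduces to a plain dot product, though the code recomputes norms defensively. A minimal standalone sketch of that ranking step:

```python
import numpy as np

def rank_by_cosine(query_vec, rows, k):
    """rows: list of (row_id, embedding); returns top-k (similarity, row_id) pairs."""
    q = np.asarray(query_vec, dtype=float)
    qn = np.linalg.norm(q)
    scored = []
    for row_id, emb in rows:
        v = np.asarray(emb, dtype=float)
        vn = np.linalg.norm(v)
        if qn > 0 and vn > 0:  # skip zero vectors, as the fallback does
            scored.append((float(np.dot(q, v) / (qn * vn)), row_id))
    scored.sort(key=lambda x: x[0], reverse=True)
    return scored[:k]

# toy 2-d "embeddings": 'a' is parallel to the query, 'b' orthogonal, 'c' in between
rows = [('a', [1.0, 0.0]), ('b', [0.0, 1.0]), ('c', [1.0, 1.0])]
top = rank_by_cosine([1.0, 0.0], rows, k=2)
```

This is O(n) per query, which is why it is only a fallback: the RPC path lets pgvector do the same ranking with an index on the database side.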
520
+
521
+ async def get_relevant_context(
522
+ self,
523
+ workspace_id: str,
524
+ query: str,
525
+ max_chunks: int = 5
526
+ ) -> str:
527
+ """Get relevant context for a query"""
528
+ try:
529
+ docs = await self.search_similar_code(workspace_id, query, max_chunks)
530
+ if not docs:
531
+ return ""
532
+
533
+ context_parts = []
534
+ for i, doc in enumerate(docs, 1):
535
+ source = doc.metadata.get('source', 'Unknown')
536
+ chunk_idx = doc.metadata.get('chunk_index', 0)
537
+ similarity = doc.metadata.get('similarity', 0.0)
538
+ context_parts.append(
539
+ f"[Context {i}] File: {source} (chunk {chunk_idx}, similarity: {similarity:.3f})\n{doc.page_content}\n---"
540
+ )
541
+
542
+ return "\n\n".join(context_parts)
543
+
544
+ except Exception as e:
545
+ logger.error(f"Failed to get relevant context: {e}")
546
+ return ""
547
+
548
+ async def delete_file(self, workspace_id: str, file_path: str) -> Dict[str, any]:
549
+ """Delete embeddings for a file"""
550
+ if self.client is None:
551
+ return {'error': 'Supabase client not initialized'}
552
+
553
+ try:
554
+ await self._delete_file_embeddings(workspace_id, file_path)
555
+ return {
556
+ 'success': True,
557
+ 'message': f'Deleted embeddings for {file_path}'
558
+ }
559
+ except Exception as e:
560
+ logger.error(f"Failed to delete file embeddings: {e}")
561
+ return {'error': str(e)}
562
+
563
+ async def get_index_stats(self, workspace_id: str) -> Dict[str, any]:
564
+ """Get indexing statistics for a workspace"""
565
+ if self.client is None:
566
+ return {'error': 'Supabase client not initialized'}
567
+
568
+ try:
569
+ # Count embeddings for this workspace
570
+ response = self.client.table("code_embeddings").select(
571
+ "id, file_path",
572
+ count="exact"
573
+ ).eq("workspace_id", workspace_id).execute()
574
+
575
+ total_vectors = response.count if hasattr(response, 'count') else len(response.data)
576
+
577
+ # Count unique files
578
+ unique_files = set()
579
+ for row in response.data:
580
+ unique_files.add(row['file_path'])
581
+
582
+ return {
583
+ 'workspace_id': workspace_id,
584
+ 'total_vectors': total_vectors,
585
+ 'total_files': len(unique_files),
586
+ 'is_ready': total_vectors > 0,
587
+ 'embedding_model': self.EMBEDDING_MODEL
588
+ }
589
+ except Exception as e:
590
+ logger.error(f"Failed to get index stats: {e}")
591
+ return {'error': str(e)}
592
+
593
+ def is_ready(self, workspace_id: Optional[str] = None) -> bool:
594
+ """Check if RAG service is ready (always true if client is initialized)"""
595
+ return self.client is not None
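
The fallback path above ranks rows by computing cosine similarity in Python when the pgvector RPC is unavailable. A minimal standalone sketch of that ranking logic, using hypothetical 2-dimensional sample rows and plain-Python math in place of numpy:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of L2 norms; 0.0 when either vector is zero,
    # mirroring the norm guards in the fallback search above.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_rows(query_vec, rows, k=5):
    # Mirrors the fallback: score every row with a non-empty embedding,
    # sort descending by similarity, keep the top k.
    scored = [(cosine_similarity(query_vec, r["embedding"]), r)
              for r in rows if r.get("embedding")]
    scored.sort(key=lambda x: x[0], reverse=True)
    return scored[:k]

# Hypothetical sample data (real embeddings are 384-dimensional).
rows = [
    {"file_path": "a.py", "embedding": [1.0, 0.0]},
    {"file_path": "b.py", "embedding": [0.0, 1.0]},
    {"file_path": "c.py", "embedding": [0.7, 0.7]},
]
top = rank_rows([1.0, 0.2], rows, k=2)
print([r["file_path"] for _, r in top])  # → ['a.py', 'c.py']
```

This is a sketch of the scoring step only; the real service additionally wraps each top row in a `Document` with its file metadata.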
supabase_migrations/001_create_code_embeddings.sql ADDED
@@ -0,0 +1,68 @@
+ -- Enable pgvector extension
+ CREATE EXTENSION IF NOT EXISTS vector;
+
+ -- Create code_embeddings table for workspace-scoped RAG
+ CREATE TABLE IF NOT EXISTS code_embeddings (
+     id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+     workspace_id TEXT NOT NULL,
+     file_path TEXT NOT NULL,
+     content TEXT NOT NULL,
+     embedding vector(384), -- sentence-transformers/all-MiniLM-L6-v2 (384 dimensions)
+     chunk_index INTEGER DEFAULT 0,
+     total_chunks INTEGER DEFAULT 1,
+     file_size INTEGER DEFAULT 0,
+     created_at TIMESTAMPTZ DEFAULT NOW(),
+     updated_at TIMESTAMPTZ DEFAULT NOW()
+ );
+
+ -- Create indexes for efficient querying; the composite (workspace_id, file_path)
+ -- index also makes per-file deletion efficient
+ CREATE INDEX IF NOT EXISTS idx_code_embeddings_workspace_id ON code_embeddings(workspace_id);
+ CREATE INDEX IF NOT EXISTS idx_code_embeddings_workspace_file ON code_embeddings(workspace_id, file_path);
+ CREATE INDEX IF NOT EXISTS idx_code_embeddings_embedding ON code_embeddings
+     USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
+
+ -- Add comments
+ COMMENT ON TABLE code_embeddings IS 'Stores code embeddings for RAG with workspace isolation';
+ COMMENT ON COLUMN code_embeddings.workspace_id IS 'Hash of workspace path for isolation';
+ COMMENT ON COLUMN code_embeddings.embedding IS 'Vector embedding of code chunk (384 dimensions)';
+
+ -- Create function for efficient vector similarity search
+ CREATE OR REPLACE FUNCTION match_code_embeddings(
+     query_embedding vector(384),
+     workspace_filter text,
+     match_threshold float DEFAULT 0.5,
+     match_count int DEFAULT 5
+ )
+ RETURNS TABLE (
+     id uuid,
+     workspace_id text,
+     file_path text,
+     content text,
+     chunk_index integer,
+     total_chunks integer,
+     file_size integer,
+     similarity float
+ )
+ LANGUAGE plpgsql
+ AS $$
+ BEGIN
+     RETURN QUERY
+     SELECT
+         code_embeddings.id,
+         code_embeddings.workspace_id,
+         code_embeddings.file_path,
+         code_embeddings.content,
+         code_embeddings.chunk_index,
+         code_embeddings.total_chunks,
+         code_embeddings.file_size,
+         1 - (code_embeddings.embedding <=> query_embedding) AS similarity
+     FROM code_embeddings
+     WHERE code_embeddings.workspace_id = workspace_filter
+         AND 1 - (code_embeddings.embedding <=> query_embedding) > match_threshold
+     ORDER BY code_embeddings.embedding <=> query_embedding
+     LIMIT match_count;
+ END;
+ $$;
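
The function above reports `similarity = 1 - (embedding <=> query_embedding)`, relying on pgvector's `<=>` operator being cosine *distance* (1 minus cosine similarity). A minimal sketch of that relation, with hypothetical low-dimensional vectors in place of the 384-dimensional embeddings:

```python
import math

def pgvector_cosine_distance(a, b):
    # Models pgvector's <=> operator: cosine distance = 1 - cosine similarity.
    # Assumes non-zero vectors, as pgvector does.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def match_score(query, doc):
    # What match_code_embeddings returns as `similarity`.
    return 1.0 - pgvector_cosine_distance(query, doc)

q = [1.0, 0.0, 0.0]
print(match_score(q, [1.0, 0.0, 0.0]))  # identical vectors → 1.0
print(match_score(q, [0.0, 1.0, 0.0]))  # orthogonal vectors → 0.0
```

With the default `match_threshold` of 0.5, only rows scoring above 0.5 on this scale are returned, and `ORDER BY embedding <=> query_embedding` (ascending distance) yields the same ordering as descending similarity.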