Store matrices as numpy arrays instead of Python lists f2e89c2 gary-boon Claude Opus 4.5 committed on about 19 hours ago
Add per-step memory cleanup for large model support a94eb19 gary-boon Claude Opus 4.5 committed on about 20 hours ago
Fix RAM exhaustion for large token generation 959074d gary-boon Claude Opus 4.5 committed on about 20 hours ago
feat: add auto_complete parameter for token generation bb689ce gary-boon Claude Opus 4.5 committed on about 22 hours ago
fix: add QKV extraction support for Mistral/Devstral architecture d1d37a8 gary-boon Claude Opus 4.5 committed on about 23 hours ago
feat: implement lazy-loading for attention matrices 929ba88 gary-boon Claude Opus 4.5 committed on about 23 hours ago
Add avg_entropy calculation for attention heads 66a46b6 gary-boon Claude Opus 4.5 committed on 3 days ago
Revert QKV visualization fixes - need better approach for data streaming d0b7e29 gary-boon Claude Opus 4.5 committed on 8 days ago
Limit QKV matrices to top 5 heads per layer to reduce response size decb5ab gary-boon Claude Opus 4.5 committed on 8 days ago
Fix QKV matrix extraction for Mistral/Devstral architecture 9056859 gary-boon Claude Opus 4.5 committed on 8 days ago
Fix QKV visualization for Mistral/Devstral architecture 4ec134b gary-boon Claude Opus 4.5 committed on 8 days ago
Add future considerations doc for response size optimization 3e67ea2 gary-boon Claude Opus 4.5 committed on 9 days ago
Fix: Import time module at top level for SSE events 15a862b gary-boon Claude Opus 4.5 committed on 9 days ago
Add SSE streaming endpoint for real-time analysis progress 172a186 gary-boon Claude Opus 4.5 committed on 9 days ago
feat: Include token metadata in analysis response ee0f6c9 gary-boon Claude Opus 4.5 committed on 9 days ago
feat: Implement tier-based model filtering by device type 6bf9f5c gary-boon Claude Opus 4.5 committed on 9 days ago
Fix: Add attn_implementation="eager" to model switch function f94a7ae gary-boon Claude Opus 4.5 committed on 9 days ago
Add Phase 5: Performance optimizations to phased plan 383a328 gary-boon Claude Opus 4.5 committed on 10 days ago
Add tokenSections boundaries and update system prompt c6f4cc5 gary-boon Claude Opus 4.5 committed on 10 days ago
Fix: Handle MistralCommonTokenizer pad_token setter e20ccaf gary-boon Claude Opus 4.5 committed on 10 days ago
Integrate mistral-common for correct Devstral tokenization ed06dcb gary-boon Claude Opus 4.5 committed on 10 days ago
Remove mistral_common to fix dependency conflict 3d9d9ee gary-boon Claude Opus 4.5 committed on 10 days ago
Use mistral_common for proper Devstral prompt formatting 3e80769 gary-boon Claude Opus 4.5 committed on 10 days ago
Add system prompt support for instruction-tuned models 2860768 gary-boon Claude Opus 4.5 committed on 10 days ago
fix: Simpler prompt format and temperature=0 for Devstral 76020ee gary-boon Claude Opus 4.5 committed on 10 days ago
fix: Sanitize JSON response for NaN/Inf float values 99f6209 gary-boon Claude Opus 4.5 committed on 10 days ago
fix: Check chat_template is set before using apply_chat_template 474927d gary-boon Claude Opus 4.5 committed on 10 days ago
fix: Add chat template support for Devstral instruct model 8d85da8 gary-boon Claude Opus 4.5 committed on 10 days ago
fix: Convert bfloat16 to float32 for numpy compatibility cb6f39c gary-boon Claude Opus 4.5 committed on 11 days ago
fix: Use eager attention for output_attentions support 5333b21 gary-boon Claude Opus 4.5 committed on 11 days ago
fix: Skip heavy ML deps in CI security checks ba27c0c gary-boon Claude Opus 4.5 committed on 11 days ago
fix: Update torch to 2.3+ for transformers compatibility 1b73605 gary-boon Claude Opus 4.5 committed on 11 days ago
fix: Update transformers for Devstral support b788304 gary-boon Claude Opus 4.5 committed on 11 days ago
docs: Mark GPU HF Space Devstral deployment complete 65c6e2e gary-boon Claude Opus 4.5 committed on 11 days ago
docs: Update phased plan with Phase 2/2b/2c completion status 688efad gary-boon Claude Opus 4.5 committed on 11 days ago
Update .env.spark.example: TORCH_DTYPE now auto-detected 543454f gary-boon Claude Opus 4.5 committed on 11 days ago
Update plan: Phase 1 paused due to GB10 GPU support e694533 gary-boon Claude Opus 4.5 committed on 11 days ago
Add DEVICE env var to force CPU mode on DGX Spark 5f122aa gary-boon Claude Opus 4.5 committed on 11 days ago
Use NGC PyTorch 24.08 for Python 3.10 compatibility a2875a2 gary-boon Claude Opus 4.5 committed on 11 days ago
Use NVIDIA NGC PyTorch container for GB10 support a4cfbff gary-boon Claude Opus 4.5 committed on 11 days ago
Try PyTorch nightly for GB10/sm_121 GPU support a009a49 gary-boon Claude Opus 4.5 committed on 11 days ago
Make zarr/numcodecs imports optional for ARM64 compatibility 6435a75 gary-boon Claude Opus 4.5 committed on 11 days ago
Skip zarr/numcodecs in Spark build (ARM64 incompatible) d129e37 gary-boon Claude Opus 4.5 committed on 11 days ago
Fix numcodecs ARM64 compatibility in Dockerfile.spark 772fc80 gary-boon Claude Opus 4.5 committed on 11 days ago
Fix Dockerfile.spark for CUDA 13.0 compatibility a4927aa gary-boon Claude Opus 4.5 committed on 11 days ago
Fix Dockerfile.spark for ARM64 architecture (DGX Spark) 9d00d33 gary-boon Claude Opus 4.5 committed on 11 days ago