LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 20 days ago • 143
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization Paper • 2602.02958 • Published Feb 3 • 34
VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference Paper • 2512.01031 • Published Nov 30, 2025 • 26
StreamingVLM: Real-Time Understanding for Infinite Video Streams Paper • 2510.09608 • Published Oct 10, 2025 • 53
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 189
MolmoAct: Action Reasoning Models that can Reason in Space Paper • 2508.07917 • Published Aug 11, 2025 • 45
Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation Paper • 2507.01957 • Published Jul 2, 2025 • 23
Radial Attention: O(nlog n) Sparse Attention with Energy Decay for Long Video Generation Paper • 2506.19852 • Published Jun 24, 2025 • 42
SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity Paper • 2506.16500 • Published Jun 19, 2025 • 17
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention Paper • 2502.14866 • Published Feb 20, 2025 • 13
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss Paper • 2402.05008 • Published Feb 7, 2024 • 23