MLLM-CL: Continual Learning for Multimodal Large Language Models Paper • 2506.05453 • Published Jun 5, 2025 • 4
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published Jan 23 • 33
MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head Paper • 2601.07832 • Published Jan 12 • 52
Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning Paper • 2601.06943 • Published Jan 11 • 216
AT^2PO: Agentic Turn-based Policy Optimization via Tree Search Paper • 2601.04767 • Published Jan 8 • 28
TongSIM: A General Platform for Simulating Intelligent Machines Paper • 2512.20206 • Published Dec 23, 2025 • 28
Kinematify: Open-Vocabulary Synthesis of High-DoF Articulated Objects Paper • 2511.01294 • Published Nov 3, 2025 • 14
TabTune: A Unified Library for Inference and Fine-Tuning Tabular Foundation Models Paper • 2511.02802 • Published Nov 4, 2025 • 16
Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench Paper • 2510.26865 • Published Oct 30, 2025 • 12
Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning Paper • 2511.02818 • Published Nov 4, 2025 • 15
Value Drifts: Tracing Value Alignment During LLM Post-Training Paper • 2510.26707 • Published Oct 30, 2025 • 13
NaviTrace: Evaluating Embodied Navigation of Vision-Language Models Paper • 2510.26909 • Published Oct 30, 2025 • 14
Visual Backdoor Attacks on MLLM Embodied Decision Making via Contrastive Trigger Learning Paper • 2510.27623 • Published Oct 31, 2025 • 13
$\left|\,\circlearrowright\,\text{BUS}\,\right|$: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles Paper • 2511.01340 • Published Nov 3, 2025 • 13
MotionStream: Real-Time Video Generation with Interactive Motion Controls Paper • 2511.01266 • Published Nov 3, 2025 • 32
TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning Paper • 2511.01833 • Published Nov 3, 2025 • 16