TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published Dec 1, 2025 • 72
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 184
Visual Programmability: A Guide for Code-as-Thought in Chart Understanding Paper • 2509.09286 • Published Sep 11, 2025 • 11
Visual-CoG: Stage-Aware Reinforcement Learning with Chain of Guidance for Text-to-Image Generation Paper • 2508.18032 • Published Aug 25, 2025 • 42
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 211
VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues Paper • 2502.12084 • Published Feb 17, 2025 • 32
ConText: Driving In-context Learning for Text Removal and Segmentation Paper • 2506.03799 • Published Jun 4, 2025 • 1
GameFactory: Creating New Games with Generative Interactive Videos Paper • 2501.08325 • Published Jan 14, 2025 • 67