Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation Paper • 2602.02214 • Published 5 days ago • 23
TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models Paper • 2512.02014 • Published Dec 1, 2025 • 73
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 187
Visual Programmability: A Guide for Code-as-Thought in Chart Understanding Paper • 2509.09286 • Published Sep 11, 2025 • 11
Visual-CoG: Stage-Aware Reinforcement Learning with Chain of Guidance for Text-to-Image Generation Paper • 2508.18032 • Published Aug 25, 2025 • 42
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 212
VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues Paper • 2502.12084 • Published Feb 17, 2025 • 33
ConText: Driving In-context Learning for Text Removal and Segmentation Paper • 2506.03799 • Published Jun 4, 2025 • 1
GameFactory: Creating New Games with Generative Interactive Videos Paper • 2501.08325 • Published Jan 14, 2025 • 67