Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published 17 days ago • 125
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation Paper • 2511.11434 • Published Nov 14 • 44
Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance Paper • 2510.24711 • Published Oct 28 • 19
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction Paper • 2509.19297 • Published Sep 23 • 24
Neighboring Autoregressive Modeling for Efficient Visual Generation Paper • 2503.10696 • Published Mar 12 • 8
GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation Paper • 2411.18499 • Published Nov 27, 2024 • 18
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression Paper • 2410.08584 • Published Oct 11, 2024 • 12