DialectGen: Benchmarking and Improving Dialect Robustness in Multimodal Generation • Paper • 2510.14949 • Published Oct 16, 2025 • 5
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation • Paper • 2510.02283 • Published Oct 2, 2025 • 96
ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness • Paper • 2504.10514 • Published Apr 10, 2025 • 48
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model • Paper • 2503.05132 • Published Mar 7, 2025 • 57