Vision-aligned Latent Reasoning for Multi-modal Large Language Model • Paper • 2602.04476 • Published 14 days ago • 14
MARS: Modular Agent with Reflective Search for Automated AI Research • Paper • 2602.02660 • Published 16 days ago • 63
Dual-Stream Diffusion for World-Model Augmented Vision-Language-Action Model • Paper • 2510.27607 • Published Oct 31, 2025 • 10
HAMLET: Switch your Vision-Language-Action Model into a History-Aware Policy • Paper • 2510.00695 • Published Oct 1, 2025 • 6
Contrastive Representation Regularization for Vision-Language-Action Models • Paper • 2510.01711 • Published Oct 2, 2025 • 4
Verifier-free Test-Time Sampling for Vision Language Action Models • Paper • 2510.05681 • Published Oct 7, 2025 • 4
Identity-Preserving Text-to-Video Generation by Frequency Decomposition • Paper • 2411.17440 • Published Nov 26, 2024 • 37