Meta-CoT: Enhancing Granularity and Generalization in Image Editing • Paper • 2604.24625 • Published 3 days ago • 23
Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation • Paper • 2604.24763 • Published 3 days ago • 57
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company • Paper • 2604.22446 • Published 6 days ago • 110
Video Analysis and Generation via a Semantic Progress Function • Paper • 2604.22554 • Published 6 days ago • 57
ELT: Elastic Looped Transformers for Visual Generation • Paper • 2604.09168 • Published 20 days ago • 20
ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation • Paper • 2604.23099 • Published 5 days ago • 2
Stabilizing Efficient Reasoning with Step-Level Advantage Selection • Paper • 2604.24003 • Published 3 days ago • 5
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis • Paper • 2604.24198 • Published 3 days ago • 17
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation • Paper • 2604.24764 • Published 3 days ago • 110
MedVR: Annotation-Free Medical Visual Reasoning via Agentic Reinforcement Learning • Paper • 2604.08203 • Published 20 days ago • 2
Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets • Paper • 2604.22294 • Published 6 days ago • 15
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond • Paper • 2604.22748 • Published 6 days ago • 211
AgentSPEX: An Agent SPecification and EXecution Language • Paper • 2604.13346 • Published 16 days ago • 160
WorldMark: A Unified Benchmark Suite for Interactive Video World Models • Paper • 2604.21686 • Published 7 days ago • 36
UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling • Paper • 2604.19734 • Published 9 days ago • 29
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model • Paper • 2604.20796 • Published 8 days ago • 235
Tstars-Tryon 1.0: Robust and Realistic Virtual Try-On for Diverse Fashion Items • Paper • 2604.19748 • Published 9 days ago • 248
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation • Paper • 2604.18168 • Published 10 days ago • 97