MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives • Paper • 2512.14699 • Published 1 day ago • 14
Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling • Paper • 2512.12675 • Published 3 days ago • 37
SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder • Paper • 2512.11749 • Published 5 days ago • 34
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation • Paper • 2512.07831 • Published 9 days ago • 16
MultiShotMaster: A Controllable Multi-Shot Video Generation Framework • Paper • 2512.03041 • Published 15 days ago • 62
Monet: Reasoning in Latent Visual Space Beyond Images and Language • Paper • 2511.21395 • Published 21 days ago • 15
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO • Paper • 2511.16669 • Published 27 days ago • 31
Simulating the Visual World with Artificial Intelligence: A Roadmap • Paper • 2511.08585 • Published Nov 11 • 29
MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs • Paper • 2511.07250 • Published Nov 10 • 17
VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models • Paper • 2511.02712 • Published Nov 4 • 4
OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes • Paper • 2510.26800 • Published Oct 30 • 21
VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning • Paper • 2510.25772 • Published Oct 29 • 32
VR-Thinker: Boosting Video Reward Models through Thinking-with-Image Reasoning • Paper • 2510.10518 • Published Oct 12 • 18
PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning • Paper • 2510.13809 • Published Oct 15 • 37
AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration • Paper • 2510.10395 • Published Oct 12 • 30
UniMMVSR: A Unified Multi-Modal Framework for Cascaded Video Super-Resolution • Paper • 2510.08143 • Published Oct 9 • 20