The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment • Paper • 2511.20614 • Published Nov 25, 2025 • 37
LongCodeZip: Compress Long Context for Code Language Models • Paper • 2510.00446 • Published Oct 1, 2025 • 106
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs • Paper • 2509.22220 • Published Sep 26, 2025 • 65
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding • Paper • 2502.08946 • Published Feb 13, 2025 • 191
CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor • Paper • 2312.07661 • Published Dec 12, 2023 • 19
Holodeck: Language Guided Generation of 3D Embodied AI Environments • Paper • 2312.09067 • Published Dec 14, 2023 • 15
ProNeRF: Learning Efficient Projection-Aware Ray Sampling for Fine-Grained Implicit Neural Radiance Fields • Paper • 2312.08136 • Published Dec 13, 2023 • 6
Clockwork Diffusion: Efficient Generation With Model-Step Distillation • Paper • 2312.08128 • Published Dec 13, 2023 • 13
How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation • Paper • 2312.07424 • Published Dec 12, 2023 • 10
FreeInit: Bridging Initialization Gap in Video Diffusion Models • Paper • 2312.07537 • Published Dec 12, 2023 • 27
FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition • Paper • 2312.07536 • Published Dec 12, 2023 • 18
Foundation Models in Robotics: Applications, Challenges, and the Future • Paper • 2312.07843 • Published Dec 13, 2023 • 16
Honeybee: Locality-enhanced Projector for Multimodal LLM • Paper • 2312.06742 • Published Dec 11, 2023 • 13
"I Want It That Way": Enabling Interactive Decision Support Using Large Language Models and Constraint Programming • Paper • 2312.06908 • Published Dec 12, 2023 • 8