SafeGRPO: Self-Rewarded Multimodal Safety Alignment via Rule-Governed Policy Optimization Paper • 2511.12982 • Published Nov 17, 2025 • 3
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19, 2025 • 118
Backdoor Cleaning without External Guidance in MLLM Fine-tuning Paper • 2505.16916 • Published May 22, 2025 • 17
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning Paper • 2505.11049 • Published May 16, 2025 • 60
ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers Paper • 2504.00502 • Published Apr 1, 2025 • 26
Efficient Inference for Large Reasoning Models: A Survey Paper • 2503.23077 • Published Mar 29, 2025 • 46
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation Paper • 2503.19622 • Published Mar 25, 2025 • 31
A Survey of Safety on Large Vision-Language Models: Attacks, Defenses and Evaluations Paper • 2502.14881 • Published Feb 14, 2025 • 2
VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues Paper • 2502.12084 • Published Feb 17, 2025 • 32
Logical Reasoning in Large Language Models: A Survey Paper • 2502.09100 • Published Feb 13, 2025 • 24
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Paper • 2502.09621 • Published Feb 13, 2025 • 28