VideoSeeker: Incentivizing Instance-level Video Understanding via Native Agentic Tool Invocation Paper • 2605.16079 • Published 15 days ago • 28
RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards Paper • 2605.10899 • Published 19 days ago • 76
MolmoAct2: Action Reasoning Models for Real-world Deployment Paper • 2605.02881 • Published 26 days ago • 345
GUI-G^2: Gaussian Reward Modeling for GUI Grounding Paper • 2507.15846 • Published Jul 21, 2025 • 135
Internal Consistency and Self-Feedback in Large Language Models: A Survey Paper • 2407.14507 • Published Jul 19, 2024 • 48