Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers Paper • 2505.19439 • Published May 26, 2025 • 30
ReSearch Collection Trained models as described in the paper "ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning" • 5 items • Updated Mar 27, 2025 • 7
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning Paper • 2503.19470 • Published Mar 25, 2025 • 19