HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published May 30, 2025 • 43
CiteGuard: Faithful Citation Attribution for LLMs via Retrieval-Augmented Validation Paper • 2510.17853 • Published Oct 15, 2025 • 8
Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs Paper • 2406.04460 • Published Jun 6, 2024 • 1
Scaling LLM Inference with Optimized Sample Compute Allocation Paper • 2410.22480 • Published Oct 29, 2024
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? Paper • 2506.11928 • Published Jun 13, 2025 • 24
AutoCode: LLMs as Problem Setters for Competitive Programming Paper • 2510.12803 • Published Sep 29, 2025 • 1
MessIRve: A Large-Scale Spanish Information Retrieval Dataset Paper • 2409.05994 • Published Sep 9, 2024
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields Paper • 2503.20776 • Published Mar 26, 2025 • 10
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation Paper • 2503.04872 • Published Mar 6, 2025 • 15