EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control Paper • 2511.15248 • Published Nov 19, 2025 • 6
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 111
TFPI Collection Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners • 14 items • Updated Nov 7, 2025