TongSIM: A General Platform for Simulating Intelligent Machines Paper • 2512.20206 • Published 4 days ago • 26
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published 19 days ago • 74 • 4
ReflectEvo: Improving Meta Introspection of Small LLMs by Learning Self-Reflection Paper • 2505.16475 • Published May 22 • 3
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling Paper • 2506.08672 • Published Jun 10 • 30
Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs Paper • 2508.19594 • Published Aug 27 • 2
ICM-Fusion: In-Context Meta-Optimized LoRA Fusion for Multi-Task Adaptation Paper • 2508.04153 • Published Aug 6
Make an Offer They Can't Refuse: Grounding Bayesian Persuasion in Real-World Dialogues without Pre-Commitment Paper • 2510.13387 • Published Oct 15
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia Paper • 2512.03318 • Published 25 days ago