Fine Tuning - a Legion-96 Collection

updated Jun 17, 2025

Fine-Tuning Language Models from Human Preferences

Paper • 1909.08593 • Published Sep 18, 2019 • 3
PromptCoT: Synthesizing Olympiad-level Problems for Mathematical Reasoning in Large Language Models

Paper • 2503.02324 • Published Mar 4, 2025
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study

Paper • 2504.00829 • Published Apr 1, 2025
GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning

Paper • 2504.02546 • Published Apr 3, 2025 • 2
RL of Thoughts: Navigating LLM Reasoning with Inference-time Reinforcement Learning

Paper • 2505.14140 • Published May 20, 2025 • 1
SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM

Paper • 2504.14286 • Published Apr 19, 2025 • 2
General-Reasoner: Advancing LLM Reasoning Across All Domains

Paper • 2505.14652 • Published May 20, 2025 • 24
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

Paper • 2405.15793 • Published May 6, 2024 • 7
VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software

Paper • 2505.24838 • Published May 30, 2025
CAD-Recode: Reverse Engineering CAD Code from Point Clouds

Paper • 2412.14042 • Published Dec 18, 2024 • 6
FlexiDreamer: Single Image-to-3D Generation with FlexiCubes

Paper • 2404.00987 • Published Apr 1, 2024 • 23
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images

Paper • 2504.04753 • Published Apr 7, 2025 • 1
Neural Kernel Surface Reconstruction

Paper • 2305.19590 • Published May 31, 2023
MaTe3D: Mask-guided Text-based 3D-aware Portrait Editing

Paper • 2312.06947 • Published Dec 12, 2023
SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling

Paper • 2506.07636 • Published Jun 9, 2025 • 1