On Vacation 🏝️

xuxin

xx18

https://xinxu-ustc.github.io/

AI & ML interests

None yet

Recent Activity

authored a paper 4 days ago

EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

Paper • 2511.15248 • Published Nov 19, 2025 • 6

upvoted 2 papers about 1 month ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 111

EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

Paper • 2511.15248 • Published Nov 19, 2025 • 6

updated 13 models about 2 months ago

xx18/TFPI-Qwen3-4B-Thinking-2507-Stage3

Text Generation • 4B • Updated Nov 7, 2025 • 23

xx18/DirectRL_Qwen3-4B_baseline2

Text Generation • 4B • Updated Nov 7, 2025 • 9

xx18/DirectRL_Qwen3-4B_baseline1

Text Generation • 4B • Updated Nov 7, 2025 • 4

xx18/TFPI-Qwen3-4B-Stage3_then_RL

Text Generation • 4B • Updated Nov 7, 2025 • 5

xx18/TFPI-Qwen3-4B-Stage3

Text Generation • 4B • Updated Nov 7, 2025 • 9

xx18/TFPI-Qwen3-4B-Stage2

Text Generation • 4B • Updated Nov 7, 2025 • 14

xx18/TFPI-Qwen3-4B-Stage1

Text Generation • 4B • Updated Nov 7, 2025 • 5

xx18/DirectRL_DeepSeek-Qwen-1.5B_baseline2

Text Generation • 2B • Updated Nov 7, 2025 • 4

xx18/DirectRL_DeepSeek-Qwen-1.5B_baseline1

Text Generation • 2B • Updated Nov 7, 2025 • 4

xx18/TFPI-DeepSeek-Qwen-1.5B-Stage3_then_RL

Text Generation • 2B • Updated Nov 7, 2025 • 4

xx18/TFPI-DeepSeek-Qwen-1.5B-Stage3

Text Generation • 2B • Updated Nov 7, 2025 • 4

xx18/TFPI-DeepSeek-Qwen-1.5B-Stage2

Text Generation • 2B • Updated Nov 7, 2025 • 11

xx18/TFPI-DeepSeek-Qwen-1.5B-Stage1

Text Generation • 2B • Updated Nov 7, 2025 • 8

updated a collection about 2 months ago

TFPI

Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners • 14 items • Updated Nov 7, 2025

published a model about 2 months ago

xx18/TFPI-Qwen3-4B-Thinking-2507-Stage3

Text Generation • 4B • Updated Nov 7, 2025 • 23
