edbeeching/decision-transformer-gym-hopper-expert • Reinforcement Learning • Updated Jun 29, 2022 • 305 • 20
mradermacher/Tifa-Deepsex-14b-CoT-i1-GGUF • Reinforcement Learning • 15B • Updated Feb 13, 2025 • 427 • 14
Open-Reasoner-Zero/Open-Reasoner-Zero-7B • Reinforcement Learning • 8B • Updated Apr 7, 2025 • 1.64k • 34