---
license: apache-2.0
base_model:
- Qwen/Qwen2.5-VL-3B-Instruct
tags:
- unsloth
- trl
- grpo
---