Qwen3-GSM8K-LoRA

Qwen3-GSM8K-LoRA is a lightweight LoRA fine-tune of Qwen3-0.6B-Base, adapted for multi-step mathematical reasoning on the GSM8K dataset. The model is trained to produce explicit chain-of-thought reasoning followed by a final numeric answer.

Model type: LoRA fine-tune of Qwen3-0.6B-Base

Task: Mathematical reasoning and step-by-step problem solving

Base model: Qwen/Qwen3-0.6B-Base

Dataset: GSM8K (OpenAI)

Fine-tuning method: Low-Rank Adaptation (LoRA)
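A minimal loading sketch follows. It assumes that the repository Neural-Hacker/Qwen3-GSM8K hosts LoRA adapter weights (not merged weights) and that `torch`, `transformers`, and `peft` are installed. The prompt template in `build_prompt` is hypothetical; this card does not document the format used during training.

```python
def build_prompt(question: str) -> str:
    """Hypothetical prompt template; the training format is not documented."""
    return f"Question: {question.strip()}\nAnswer: Let's think step by step.\n"


def load_model(base_id: str = "Qwen/Qwen3-0.6B-Base",
               adapter_id: str = "Neural-Hacker/Qwen3-GSM8K"):
    """Load the base model and attach the adapter (assumes an adapter-only repo)."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model


def solve(question: str, max_new_tokens: int = 256) -> str:
    """Generate a greedy chain-of-thought answer for one question."""
    tokenizer, model = load_model()
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `solve("...")` downloads both the base model and the adapter on first use. If the repository turns out to contain merged weights, loading it directly with `AutoModelForCausalLM.from_pretrained` would be the appropriate route instead.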

Training Details

Technique: LoRA fine-tuning (rank = 8, alpha = 16, dropout = 0.05)

Epochs: 3

Batch size: 2

Learning rate: 2e-4

Precision: bfloat16 mixed precision
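To make the LoRA settings above concrete, the sketch below computes the scaling factor alpha / rank applied to the low-rank update and the number of trainable parameters added per adapted weight matrix. The `LoraHyperparams` class is a hypothetical helper, and the 1024 x 1024 matrix shape is illustrative only; the card does not state which modules were adapted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    """Hypothetical container for the values listed under Training Details."""
    rank: int = 8
    alpha: int = 16
    dropout: float = 0.05

    @property
    def scaling(self) -> float:
        # LoRA scales the low-rank update B @ A by alpha / rank.
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (rank, d_in) and B has shape (d_out, rank).
        return self.rank * (d_in + d_out)


cfg = LoraHyperparams()
print(cfg.scaling)                      # 2.0
print(cfg.adapter_params(1024, 1024))   # 16384 (illustrative shape, an assumption)
```

With rank 8, each adapted 1024 x 1024 matrix would gain 16,384 trainable parameters against 1,048,576 frozen ones, which is why the fine-tune is described as lightweight.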

Evaluation (GSM8K test set, 1,319 problems)

Qwen3-0.6B-Base: 33.39%

Qwen3-GSM8K-LoRA: 35.41%

Accuracy is measured by exact match on the final numeric answer.
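One plausible way to score exact match on final numeric answers is sketched below. The extraction rule (take the last number in the text, preferring whatever follows the `####` marker that GSM8K reference solutions use) is an assumption; the card does not specify the exact procedure behind the reported figures, and the function names are hypothetical.

```python
import re
from typing import Optional, Sequence

# Matches integers and decimals, allowing thousands separators such as 1,319.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[float]:
    """Return the last number in the text, or None if there is none."""
    # GSM8K references end with "#### <answer>"; use that part when present.
    if "####" in text:
        text = text.rsplit("####", 1)[1]
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions whose final number equals the reference's."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        p, r = extract_final_number(pred), extract_final_number(ref)
        if p is not None and r is not None and p == r:
            correct += 1
    return correct / len(references)


preds = ["She sells 9 eggs at $2 each, so the answer is 18.", "The total is 1,050"]
refs = ["9 * 2 = 18\n#### 18", "500 + 500 = 1000\n#### 1000"]
print(exact_match_accuracy(preds, refs))  # 0.5
```

Taking the last number is a simple heuristic; a stricter scorer might require a fixed answer marker in the model output, which would likely change the measured accuracy.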

Limitations

This version includes preliminary results; further evaluation and dataset reproducibility code will be added.

The model may produce incorrect or verbose reasoning steps on complex multi-step problems.

It is not intended for production or educational use without independent verification of its answers.

License

cc-by-nc-4.0


Model tree for Neural-Hacker/Qwen3-GSM8K

Base model: Qwen/Qwen3-0.6B-Base
