# Qwen3 From Scratch Distillation Checkpoints
This repository contains the chapter 8 distillation checkpoints for the rasbt/qwen3-from-scratch model from *Build a Reasoning Model (From Scratch)*.

The files are raw PyTorch `state_dict` checkpoints intended for use with the `reasoning_from_scratch` package.
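Each file is a plain `state_dict` (a mapping from parameter names to tensors) rather than a pickled model object, so it can be inspected before loading. A minimal sketch with a stand-in `nn.Linear` module and an illustrative filename (the repository's checkpoints load into `Qwen3Model` the same way):

```python
import torch
import torch.nn as nn

# Stand-in module; the repository's checkpoints load into Qwen3Model identically.
model = nn.Linear(4, 2)

# Save and reload a raw state_dict, the format these checkpoint files use.
torch.save(model.state_dict(), "demo_checkpoint.pth")
state_dict = torch.load("demo_checkpoint.pth", map_location="cpu")

# Inspect parameter names and shapes before loading them into a model.
print({name: tuple(t.shape) for name, t in state_dict.items()})
model.load_state_dict(state_dict)
```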
## Available Checkpoints
- `ch08_distill_deepseek_r1`: the 3 DeepSeek-R1 distillation checkpoints used for rows 3-5 in `ch08_main.ipynb`
- `ch08_distill_qwen3_235b_a22b`: the 3 Qwen3 235B A22B distillation checkpoints used for rows 6-8 in `ch08_main.ipynb`
## Usage Example
For a DeepSeek-R1 distillation checkpoint, you can download it via:

```python
from reasoning_from_scratch.qwen3 import download_qwen3_distill_checkpoints

download_qwen3_distill_checkpoints(
    distill_type="deepseek_r1",
    step="06682",
    out_dir="qwen3",
)
```
For a Qwen3 235B A22B distillation checkpoint, use:

```python
from reasoning_from_scratch.qwen3 import download_qwen3_distill_checkpoints

download_qwen3_distill_checkpoints(
    distill_type="qwen3_235b_a22b",
    step="05746",
    out_dir="qwen3",
)
```
Once downloaded, you can load a checkpoint and stream generated text as follows:

```python
from pathlib import Path

import torch

from reasoning_from_scratch.ch02 import (
    get_device,
    generate_text_basic_stream_cache,
)
from reasoning_from_scratch.qwen3 import (
    download_qwen3_distill_checkpoints,
    download_qwen3_small,
    Qwen3Model,
    Qwen3Tokenizer,
    QWEN_CONFIG_06_B,
)

device = get_device()
local_dir = Path("qwen3")

# Download the checkpoint and the reasoning tokenizer
checkpoint_path = download_qwen3_distill_checkpoints(
    distill_type="deepseek_r1",
    step="06682",
    out_dir=local_dir,
)
download_qwen3_small(kind="reasoning", tokenizer_only=True, out_dir=local_dir)

tokenizer = Qwen3Tokenizer(
    tokenizer_file_path=local_dir / "tokenizer-reasoning.json",
    apply_chat_template=True,
    add_generation_prompt=True,
    add_thinking=True,
)

# Load the raw state_dict into the model
model = Qwen3Model(QWEN_CONFIG_06_B)
state_dict = torch.load(checkpoint_path, map_location=device)
model.load_state_dict(state_dict)
model.to(device)
model.eval()

prompt = "Solve: If x + 7 = 19, what is x?"
input_ids = torch.tensor(tokenizer.encode(prompt), device=device).unsqueeze(0)

# Stream generated tokens one at a time
for token in generate_text_basic_stream_cache(
    model=model,
    token_ids=input_ids,
    max_new_tokens=256,
    eos_token_id=tokenizer.eos_token_id,
):
    token_id = token.squeeze(0).item()
    print(tokenizer.decode([token_id]), end="", flush=True)
print()
```
## Notes
- These are the exact epoch checkpoints used in the chapter 8 results table in `ch08_main.ipynb`.
- Both checkpoint families should be used with the reasoning tokenizer for consistency with chapter 8.
## Download Helper Reference
These are the supported `distill_type` values for `download_qwen3_distill_checkpoints(...)`:
- DeepSeek-R1 distillation data:

  ```python
  download_qwen3_distill_checkpoints(
      distill_type="deepseek_r1",
      step="06682",
      out_dir="qwen3",
  )
  ```

  Available DeepSeek-R1 saved steps: `06682`, `13364`, `20046`.
- Qwen3 235B A22B distillation data:

  ```python
  download_qwen3_distill_checkpoints(
      distill_type="qwen3_235b_a22b",
      step="05746",
      out_dir="qwen3",
  )
  ```

  Available Qwen3 235B A22B saved steps: `05746`, `11492`, `17238`.
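If you want all six checkpoints locally, the two step lists above can be looped over. A minimal sketch assuming only the helper signature shown above; the `download_all` wrapper is hypothetical and not part of the package:

```python
# Saved steps per distillation family, as listed above.
DISTILL_STEPS = {
    "deepseek_r1": ["06682", "13364", "20046"],
    "qwen3_235b_a22b": ["05746", "11492", "17238"],
}

def download_all(out_dir="qwen3"):
    # Hypothetical convenience wrapper; requires the reasoning_from_scratch package.
    from reasoning_from_scratch.qwen3 import download_qwen3_distill_checkpoints

    paths = []
    for distill_type, steps in DISTILL_STEPS.items():
        for step in steps:
            paths.append(
                download_qwen3_distill_checkpoints(
                    distill_type=distill_type,
                    step=step,
                    out_dir=out_dir,
                )
            )
    return paths
```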