Qwen3 models (123M/300M/600M) trained from scratch on 2.47B Kazakh and Russian (kk+ru) tokens. Includes tokenizer, datasets, and checkpoints.
Saken Tukenov
stukenov
Recent Activity
updated a dataset about 13 hours ago: stukenov/sozkz-corpus-tokenized-kk-morphbpe256k-v1
published a dataset about 14 hours ago: stukenov/sozkz-corpus-tokenized-kk-morphbpe256k-v1
updated a model 15 days ago: stukenov/sozkz-morphbpe-256k-kk-v1
Models 74
stukenov/sozkz-morphbpe-256k-kk-v1
Token Classification • Updated
stukenov/sozkz-fix-mt5-50m-kk-gec-v1
Text Generation • 50.6M • Updated • 17
stukenov/sozkz-nllb-1b-kk-pretrain-v1
Translation • 1B • Updated • 50
stukenov/sozkz-nllb-1b-kk-gec-v1
1B • Updated • 63
stukenov/sozkz-fix-mt5b-kk-gec-run13-v1
Text Generation • 0.6B • Updated • 7
stukenov/sozkz-fix-qwen-500m-kk-gec-v1
Text Generation • 0.4B • Updated • 246
stukenov/sozkz-fix-qwen-500m-kk-gec-v2
Text Generation • 0.4B • Updated • 96
stukenov/sozkz-fix-qwen-500m-kk-gec-v3
Text Generation • 0.4B • Updated • 658
stukenov/sozkz-fix-qwen-500m-kk-gec-v4
Text Generation • 0.4B • Updated • 653
stukenov/sozkz-core-llama-300m-kk-gec-v1
Text Generation • 0.3B • Updated • 2
Datasets 54
stukenov/sozkz-corpus-tokenized-kk-morphbpe256k-v1
Viewer • Updated • 1.51M
stukenov/sozkz-corpus-segmented-kk-v1
Viewer • Updated • 55.5M • 289
stukenov/sozkz-corpus-gec-benchmark-kk-v1
Viewer • Updated • 1.44k • 163
stukenov/sozkz-corpus-pretrain-gec-mix-v1
Viewer • Updated • 1.77M • 76
stukenov/sozkz-corpus-synthetic-kk-gec-rulebased-v1
Viewer • Updated • 1.06M • 29
stukenov/sozkz-corpus-synthetic-kk-gec-v1
Viewer • Updated • 19.3k • 59
stukenov/sozkz-gec-synthetic-gpt4o-v1
Viewer • Updated • 9.6k • 86
stukenov/sozkz-corpus-clean-v3
Viewer • Updated • 13.5M • 36
stukenov/sozkz-corpus-instruct-kk-alpaca-qwen35-v1
Viewer • Updated • 4.88k • 20 • 1
stukenov/kaznet-crawl-raw
Viewer • Updated • 1.55M • 6 • 1