Qwen3-0.6B-Code-Expert
This project performs full fine-tuning on the Qwen3-0.6B language model to enhance its code reasoning and generation capabilities. Training was conducted exclusively on the nvidia/OpenCodeReasoning dataset, and the model was optimized using the bfloat16 (bf16) data type.
Training Procedure
Dataset Preparation
- The nvidia/OpenCodeReasoning dataset was used.
- Each example consists of code snippets paired with detailed step-by-step reasoning in Chain-of-Thought (CoT) style.
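As a rough illustration of how such examples might be shaped for chat-style training, the sketch below converts one record into a user/assistant message pair. The field names `input` and `output`, the helper name `to_chat_messages`, and the sample record are assumptions made for illustration; they are not taken from this project's training script, and the real records would come from the dataset itself (for instance via the `datasets` library).

```python
from typing import Dict, List


def to_chat_messages(example: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn one record into a user/assistant message pair.

    Assumes (hypothetically) that the problem statement lives in
    `input` and the reasoning plus solution lives in `output`.
    """
    return [
        {"role": "user", "content": example["input"].strip()},
        {"role": "assistant", "content": example["output"].strip()},
    ]


# Hand-written stand-in for a dataset record, not real data.
sample = {
    "input": "Write a function that returns the sum of two integers.\n",
    "output": "<think>Add the two arguments.</think>\n"
              "def add(a, b):\n    return a + b\n",
}

messages = to_chat_messages(sample)
for message in messages:
    print(message["role"], "->", message["content"][:40])
```

Keeping the reasoning and the solution together in the assistant turn is what lets the model learn to emit both, which matches the CoT-style pairing described above.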
Model Loading and Configuration
- Qwen3-0.6B base model weights were loaded via the unsloth library in bf16 precision.
- Full fine-tuning (full_finetuning=True) was applied to all layers for optimal adaptation to code reasoning.
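A minimal sketch of what that loading step could look like with unsloth follows. Only `full_finetuning=True`, bf16 precision, and the base model are stated in this card; the `max_seq_length` value, the disabled quantization flags, and the helper names `build_load_kwargs` and `load_model` are assumptions. The heavy imports are deferred so the settings can be inspected on a machine without a GPU.

```python
def build_load_kwargs(max_seq_length: int = 2048) -> dict:
    """Collect loader settings; max_seq_length is an assumed value."""
    return {
        "model_name": "Qwen/Qwen3-0.6B-Base",
        "max_seq_length": max_seq_length,
        # Assumption: quantized loading is switched off, since all
        # weights are updated during full fine-tuning.
        "load_in_4bit": False,
        "load_in_8bit": False,
        "full_finetuning": True,
    }


def load_model(max_seq_length: int = 2048):
    """Load model and tokenizer in bf16 (requires unsloth and a GPU)."""
    import torch
    from unsloth import FastModel

    kwargs = build_load_kwargs(max_seq_length)
    kwargs["dtype"] = torch.bfloat16
    return FastModel.from_pretrained(**kwargs)


load_kwargs = build_load_kwargs()
print(load_kwargs)
```

Calling `model, tokenizer = load_model()` would then perform the actual download and load.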
Supervised Fine-Tuning
- Employed the Hugging Face TRL library with the Supervised Fine-Tuning (SFT) approach.
- The model was trained to generate correct code solutions along with the corresponding reasoning chains.
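The SFT step might be wired up roughly as below, assuming a recent TRL release in which `SFTTrainer` accepts `processing_class`. Every hyperparameter here (batch size, learning rate, epochs, output directory) is a placeholder chosen for illustration; the card does not report the values actually used, apart from bf16 precision. The helper names are hypothetical.

```python
def build_sft_settings(output_dir: str = "qwen3-code-expert-sft") -> dict:
    """Placeholder training settings; only bf16 comes from the card."""
    return {
        "output_dir": output_dir,
        "per_device_train_batch_size": 2,
        "gradient_accumulation_steps": 8,
        "learning_rate": 2e-5,
        "num_train_epochs": 1,
        "bf16": True,
        "logging_steps": 10,
    }


def train(model, tokenizer, train_dataset):
    """Run supervised fine-tuning (requires trl and a GPU)."""
    from trl import SFTConfig, SFTTrainer

    trainer = SFTTrainer(
        model=model,
        args=SFTConfig(**build_sft_settings()),
        train_dataset=train_dataset,
        processing_class=tokenizer,
    )
    trainer.train()
    return trainer


settings = build_sft_settings()
effective_batch = (
    settings["per_device_train_batch_size"]
    * settings["gradient_accumulation_steps"]
)
print("effective batch size per device:", effective_batch)
```

Gradient accumulation is used in this sketch so that a small per-device batch still yields a larger effective batch, which is a common choice when long reasoning traces make each example memory-hungry.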
Purpose and Outcome
- The model’s capacity for understanding, reasoning about, and generating code was significantly improved through specialized, single-dataset training in bf16 precision.
- Outputs include both intermediate reasoning steps and final code solutions, enabling transparent and interpretable code generation.
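To work with the two parts of an output separately, a small helper along these lines could split the reasoning from the final code. It assumes the reasoning is wrapped in `<think>...</think>` tags, as is conventional for Qwen3-style reasoning output; that format is an assumption here, not something this card specifies, and `split_reasoning` is a hypothetical name.

```python
import re
from typing import Tuple

# Matches the first <think>...</think> block, across newlines.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(generated: str) -> Tuple[str, str]:
    """Return (reasoning, answer) from a generated string.

    If no <think> block is present, the reasoning is empty and the
    whole text is treated as the answer.
    """
    match = THINK_BLOCK.search(generated)
    if match is None:
        return "", generated.strip()
    reasoning = match.group(1).strip()
    answer = generated[match.end():].strip()
    return reasoning, answer


reasoning, answer = split_reasoning(
    "<think>Reverse with slicing.</think>\ndef rev(s):\n    return s[::-1]"
)
print("reasoning:", reasoning)
print("answer:", answer)
```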
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Model tree for suayptalha/Qwen3-0.6B-Code-Expert
Base model
Qwen/Qwen3-0.6B-Base