How to use Columbia-NLP/gemma-2b-zephyr-dpo with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Columbia-NLP/gemma-2b-zephyr-dpo")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Columbia-NLP/gemma-2b-zephyr-dpo")
model = AutoModelForCausalLM.from_pretrained("Columbia-NLP/gemma-2b-zephyr-dpo")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use Columbia-NLP/gemma-2b-zephyr-dpo with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Columbia-NLP/gemma-2b-zephyr-dpo"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Columbia-NLP/gemma-2b-zephyr-dpo",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, run the model with Docker Model Runner:
docker model run hf.co/Columbia-NLP/gemma-2b-zephyr-dpo
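The curl request to the vLLM server above can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `chat` are hypothetical, and the default base URL assumes the server is listening on `localhost:8000` as in the curl example.

```python
import json
import urllib.request

MODEL_ID = "Columbia-NLP/gemma-2b-zephyr-dpo"


def build_chat_payload(prompt, model=MODEL_ID):
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt, base_url="http://localhost:8000"):
    """POST one user message to /v1/chat/completions and return the reply text.

    Assumes an OpenAI-compatible server is already running at base_url.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible responses put the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` issues the same request as the curl command. Passing `base_url="http://localhost:30000"` should target an SGLang server started on port 30000 instead.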
How to use Columbia-NLP/gemma-2b-zephyr-dpo with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Columbia-NLP/gemma-2b-zephyr-dpo" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Columbia-NLP/gemma-2b-zephyr-dpo",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Columbia-NLP/gemma-2b-zephyr-dpo" \
--host 0.0.0.0 \
--port 30000
# Call the server with the same curl request shown above.
How to use Columbia-NLP/gemma-2b-zephyr-dpo with Docker Model Runner:
docker model run hf.co/Columbia-NLP/gemma-2b-zephyr-dpo
We trained google/gemma-2b with DPO on data from argilla/dpo-mix-7k.
We carefully selected the hyperparameters to achieve the best DPO performance.
This model has the same license as the original Gemma model collection.
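For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. It illustrates the objective in general, not the training code used for this model: the function name `dpo_loss` is hypothetical, and `beta=0.1` is an assumed placeholder, since the hyperparameters actually used are not stated here.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of responses.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta is an assumed placeholder value, not the one used for this model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # loss = -log(sigmoid(margin)), written as a numerically stable softplus(-margin)
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference model agree exactly, the margin is zero and the loss equals log 2. The loss falls as the policy assigns relatively more probability to the chosen response than the reference model does.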
| Models | Avg. | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8k |
|---|---|---|---|---|---|---|---|
| google/gemma-2b | 46.37 | 48.38 | 71.77 | 41.77 | 33.08 | 66.77 | 16.91 |
| google/gemma-2b-it | 42.75 | 43.94 | 62.70 | 37.65 | 45.82 | 60.93 | 5.46 |
| wandb/gemma-2b-zephyr-sft | 47.18 | 49.74 | 72.38 | 41.37 | 34.42 | 66.93 | 18.27 |
| wandb/gemma-2b-zephyr-dpo | 46.92 | 49.66 | 72.23 | 41.13 | 34.47 | 66.54 | 17.51 |
| Columbia-NLP/gemma-2b-zephyr-sft | 48.75 | 51.80 | 72.63 | 42.20 | 41.96 | 63.85 | 20.09 |
| Columbia-NLP/gemma-2b-zephyr-dpo | 49.14 | 52.22 | 73.11 | 42.55 | 42.64 | 64.40 | 19.94 |
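To make the change from SFT to DPO in the table above easier to read, the following sketch computes the per-benchmark difference between the two Columbia-NLP rows. The scores are copied from the table; the names `score_deltas`, `SFT_SCORES`, and `DPO_SCORES` are hypothetical.

```python
BENCHMARKS = ["ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8k"]

# Scores copied from the Columbia-NLP rows of the table above.
SFT_SCORES = [51.80, 72.63, 42.20, 41.96, 63.85, 20.09]
DPO_SCORES = [52.22, 73.11, 42.55, 42.64, 64.40, 19.94]


def score_deltas(after, before, names=BENCHMARKS):
    """Return {benchmark: after - before}, rounded to two decimals."""
    return {name: round(a - b, 2) for name, a, b in zip(names, after, before)}


deltas = score_deltas(DPO_SCORES, SFT_SCORES)
for name, delta in deltas.items():
    print(f"{name:<11}{delta:+.2f}")
```

By these numbers, the DPO model scores higher than the SFT model on five of the six benchmarks and slightly lower on GSM8k.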
We evaluate our model with GPT-4-0125-preview as the judge.
| Model | Total | Coding | Extraction | Humanities | Math | Reasoning | Roleplay | STEM | Writing |
|---|---|---|---|---|---|---|---|---|---|
| google/gemma-2b-it | 4.71 | 2.95 | 4.35 | 6.15 | 2.90 | 3.50 | 5.60 | 5.50 | 6.70 |
| wandb/gemma-2b-zephyr-sft | 4.03 | 3.10 | 3.15 | 5.00 | 2.70 | 2.65 | 5.10 | 4.80 | 5.75 |
| wandb/gemma-2b-zephyr-dpo | 4.06 | 2.80 | 2.90 | 5.55 | 2.65 | 2.70 | 5.20 | 4.80 | 5.85 |
| anakin87/gemma-2b-orpo | 4.14 | 3.00 | 3.70 | 6.30 | 2.70 | 2.35 | 5.68 | 4.75 | 4.75 |
| Columbia-NLP/gemma-2b-zephyr-sft | 4.34 | 3.10 | 3.70 | 6.25 | 2.65 | 2.70 | 5.55 | 5.25 | 5.50 |
| Columbia-NLP/gemma-2b-zephyr-dpo | 4.75 | 3.50 | 4.05 | 6.75 | 3.30 | 3.70 | 5.85 | 5.40 | 5.53 |