Qwen3-SK was created by Peter Bednár from the Institute of Artificial Intelligence (FEI TUKE), Marek Dobeš from the Centre of Social and Psychological Sciences (Slovak Academy of Sciences) and ČZ. o.z., and Radovan Garabík from the Ľudovít Štúr Institute of Linguistics (Slovak Academy of Sciences). It is the first instruction model trained specifically for the Slovak language.
We used the multilingual Qwen3-14B-Instruct model, featuring 14 billion parameters, as the foundation. For full-parameter fine-tuning, we utilized:
- Data from the "Araneum Slovacum VII Maximum" web corpus, provided by the UNESCO Chair in Plurilingual and Multicultural Communication at Comenius University and the Ľudovít Štúr Institute of Linguistics.
- Pre-processed data from the Dictionary of the Slovak Language, provided by the Ľudovít Štúr Institute of Linguistics.
- Data from the Encyclopaedia Beliana, provided by the Encyclopedic Institute of the Slovak Academy of Sciences.
The training of the Qwen3-SK model was conducted on the Leonardo and Perun supercomputers. The necessary computing time was secured through a successful project proposal within the national call for access to the Leonardo supercomputer, coordinated by the Computing Centre of the Slovak Academy of Sciences. Computing time on the Perun supercomputer was provided by the Technical University of Košice.
Inference
For inference testing, we utilized a single NVIDIA H200 GPU. To run the model in interactive mode, we allocated resources using the following setup:
# Resource allocation for interactive mode
srun --partition=GPU --nodes=1 --ntasks=1 --cpus-per-task=4 --gres=gpu:1 --time=02:00:00 --pty bash

# Path configuration
WORK=$SLURM_SUBMIT_DIR
export HF_HOME=${WORK}/huggingface

# Environment activation
source ~/miniconda3/etc/profile.d/conda.sh
conda activate venv
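For longer or unattended runs, the same allocation and environment steps can be combined into a single Slurm batch script (a sketch; the partition, resources, and time limit mirror the interactive allocation above, and `inference.py` is a hypothetical filename for the Python script below):

```shell
#!/bin/bash
#SBATCH --partition=GPU
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1
#SBATCH --time=02:00:00

# Cache Hugging Face artifacts under the submission directory
WORK=$SLURM_SUBMIT_DIR
export HF_HOME=${WORK}/huggingface

# Activate the conda environment used for inference
source ~/miniconda3/etc/profile.d/conda.sh
conda activate venv

python inference.py  # hypothetical name for the inference script
```

Submitted with `sbatch`, this reproduces the interactive setup without holding a terminal session open.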
We used the following Python script for inference:
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Model path (adjust to the local checkpoint location)
MODEL_PATH = "/pytorch_model.bin/"

# 1. INITIALISATION OF TOKENIZER
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

def make_prompt(text: str) -> str:
    # Qwen (ChatML) template
    return (
        f"<|im_start|>system\n"
        f"Si užitočný a presný asistent. Odpovedaj výlučne po slovensky. Nepoužívaj čechizmy. "
        f"Pri použití úvodzoviek pre priamu reč alebo významy slov vždy použite tvar „dolné na začiatku, horné na konci“.\n"
        f"<|im_end|>\n"
        f"<|im_start|>user\n"
        f"{text}\n"
        f"<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# 2. INITIALISATION OF LLM (vLLM)
llm = LLM(
    model=MODEL_PATH,
    tensor_parallel_size=1,
    dtype="bfloat16",
    enforce_eager=True,
    max_model_len=32768,
)

sampling_params = SamplingParams(
    temperature=0.2,
    top_p=0.9,
    max_tokens=8192,
    stop=["<|im_end|>", "<|im_start|>", "<|endoftext|>"],
)

# 3. INTERACTIVE LOOP
print("\n--- Interactive mode (type 'exit' to quit) ---")
while True:
    # Input
    user_input = input("\nZadaj text: ").strip()
    if user_input.lower() in ["exit", "quit", "koniec"]:
        break
    if not user_input:
        continue

    # Prompt
    full_prompt = make_prompt(user_input)

    # Generation: vLLM expects a list of prompts
    outputs = llm.generate([full_prompt], sampling_params)

    # Output
    for output in outputs:
        generated_text = output.outputs[0].text.strip()
        print(f"\nODPOVEĎ: {generated_text}")

print("\nEnd.")
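For non-interactive evaluation, vLLM can also process several prompts in a single call, which is typically much faster than looping one prompt at a time. A minimal sketch, reusing the `llm`, `make_prompt`, and `sampling_params` objects defined in the script above (the example questions are illustrative):

```python
# Batch generation: vLLM schedules all prompts together on the GPU.
questions = [
    "Čo je hlavné mesto Slovenska?",
    "Vysvetli rozdiel medzi korpusom a slovníkom.",
]
prompts = [make_prompt(q) for q in questions]

outputs = llm.generate(prompts, sampling_params)
for question, output in zip(questions, outputs):
    answer = output.outputs[0].text.strip()
    print(f"OTÁZKA: {question}\nODPOVEĎ: {answer}\n")
```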
Limitations of the model
The model occasionally exhibits repetitive behavior in its responses. Hallucination rates are comparable to other models of similar scale. Please note that no output moderation or safety filtering techniques have been applied.
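If repetition becomes a problem in practice, it can often be reduced at inference time without retraining. One option in vLLM is a mild `repetition_penalty` in `SamplingParams`; a sketch, with the penalty value as a hypothetical starting point that needs tuning per task:

```python
from vllm import SamplingParams

# A value slightly above 1.0 penalises tokens already present in the
# output; values that are too high noticeably degrade fluency.
sampling_params = SamplingParams(
    temperature=0.2,
    top_p=0.9,
    repetition_penalty=1.1,  # hypothetical starting point, tune per task
    max_tokens=8192,
    stop=["<|im_end|>", "<|im_start|>", "<|endoftext|>"],
)
```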