Qwen3-SK was created by Peter Bednár from the Institute of Artificial Intelligence (FEI TUKE), Marek Dobeš from the Centre of Social and Psychological Sciences (Slovak Academy of Sciences) and ČZ. o.z., and Radovan Garabík from the Ľudovít Štúr Institute of Linguistics (Slovak Academy of Sciences). It is the first instruction model trained specifically for the Slovak language.
We used the multilingual Qwen3-14B-Instruct model, featuring 14 billion parameters, as the foundation. For full-parameter fine-tuning, we utilized:
- Data from the "Araneum Slovacum VII Maximum" web corpus, provided by the UNESCO Chair in Plurilingual and Multicultural Communication at Comenius University and the Ľudovít Štúr Institute of Linguistics.
- Pre-processed data from the Dictionary of the Slovak Language, provided by the Ľudovít Štúr Institute of Linguistics.
- Data from the Encyclopaedia Beliana, provided by the Encyclopedic Institute of the Slovak Academy of Sciences.
The training of the Qwen3-SK model was conducted on the Leonardo and Perun supercomputers. The necessary computing time was secured through a successful project proposal within the national call for access to the Leonardo supercomputer, coordinated by the Computing Centre of the Slovak Academy of Sciences. Computing time on the Perun supercomputer was provided by the Technical University of Košice.
Inference
For inference testing, we utilized a single NVIDIA H200 GPU. To run the model in interactive mode, we allocated resources using the following setup:
# Resource allocation for interactive mode
srun --partition=GPU --nodes=1 --ntasks=1 --cpus-per-task=4 --gres=gpu:1 --time=02:00:00 --pty bash

# Path configuration
WORK=$SLURM_SUBMIT_DIR
export HF_HOME=${WORK}/huggingface

# Environment activation
source ~/miniconda3/etc/profile.d/conda.sh
conda activate venv
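For longer or unattended runs, the same allocation and environment steps can be combined into a single Slurm batch script (a sketch; the partition, resources, and time limit mirror the interactive allocation above, and `inference.py` is a hypothetical filename for the Python script below):

```shell
#!/bin/bash
#SBATCH --partition=GPU
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1
#SBATCH --time=02:00:00

# Cache Hugging Face artifacts under the submission directory
WORK=$SLURM_SUBMIT_DIR
export HF_HOME=${WORK}/huggingface

# Activate the conda environment used for inference
source ~/miniconda3/etc/profile.d/conda.sh
conda activate venv

python inference.py  # hypothetical name for the inference script
```

Submitted with `sbatch`, this reproduces the interactive setup without holding a terminal session open.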
We used the following Python script for inference:
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Model path (adjust to the local checkpoint location)
MODEL_PATH = "/pytorch_model.bin/"

# 1. INITIALISATION OF TOKENIZER
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

def make_prompt(text: str) -> str:
    # Qwen (ChatML) template
    return (
        f"<|im_start|>system\n"
        f"Si užitočný a presný asistent. Odpovedaj výlučne po slovensky. Nepoužívaj čechizmy. "
        f"Pri použití úvodzoviek pre priamu reč alebo významy slov vždy použite tvar „dolné na začiatku, horné na konci“.\n"
        f"<|im_end|>\n"
        f"<|im_start|>user\n"
        f"{text}\n"
        f"<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# 2. INITIALISATION OF LLM (vLLM)
llm = LLM(
    model=MODEL_PATH,
    tensor_parallel_size=1,
    dtype="bfloat16",
    enforce_eager=True,
    max_model_len=32768,
)

sampling_params = SamplingParams(
    temperature=0.2,
    top_p=0.9,
    max_tokens=8192,
    stop=["<|im_end|>", "<|im_start|>", "<|endoftext|>"],
)

# 3. INTERACTIVE LOOP
print("\n--- Interactive mode (type 'exit' to quit) ---")
while True:
    # Input
    user_input = input("\nZadaj text: ").strip()
    if user_input.lower() in ["exit", "quit", "koniec"]:
        break
    if not user_input:
        continue

    # Prompt
    full_prompt = make_prompt(user_input)

    # Generation: vLLM expects a list of prompts
    outputs = llm.generate([full_prompt], sampling_params)

    # Output
    for output in outputs:
        generated_text = output.outputs[0].text.strip()
        print(f"\nODPOVEĎ: {generated_text}")

print("\nEnd.")
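For non-interactive evaluation, vLLM can also process several prompts in a single call, which is typically much faster than looping one prompt at a time. A minimal sketch, reusing the `llm`, `make_prompt`, and `sampling_params` objects defined in the script above (the example questions are illustrative):

```python
# Batch generation: vLLM schedules all prompts together on the GPU.
questions = [
    "Čo je hlavné mesto Slovenska?",
    "Vysvetli rozdiel medzi korpusom a slovníkom.",
]
prompts = [make_prompt(q) for q in questions]

outputs = llm.generate(prompts, sampling_params)
for question, output in zip(questions, outputs):
    answer = output.outputs[0].text.strip()
    print(f"OTÁZKA: {question}\nODPOVEĎ: {answer}\n")
```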
Limitations of the model
The model occasionally exhibits repetitive behavior in its responses. Hallucination rates are comparable to other models of similar scale. Please note that no output moderation or safety filtering techniques have been applied.
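If repetition becomes a problem in practice, it can often be reduced at inference time without retraining. One option in vLLM is a mild `repetition_penalty` in `SamplingParams`; a sketch, with the penalty value as a hypothetical starting point that needs tuning per task:

```python
from vllm import SamplingParams

# A value slightly above 1.0 penalises tokens already present in the
# output; values that are too high noticeably degrade fluency.
sampling_params = SamplingParams(
    temperature=0.2,
    top_p=0.9,
    repetition_penalty=1.1,  # hypothetical starting point, tune per task
    max_tokens=8192,
    stop=["<|im_end|>", "<|im_start|>", "<|endoftext|>"],
)
```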