FLAN-T5 Base Turkish OTT Query Parser
This model is a fine-tuned version of google/flan-t5-base for Turkish OTT and media search query parsing.
The model takes a Turkish natural language media search query as input and generates a structured JSON output. Its purpose is to convert user queries into machine-readable filters that can be used in search, recommendation, filtering, or vector database systems.
What This Model Does
The model parses Turkish OTT/media search queries and can extract information such as:
- genres
- excluded genres
- names
- country filters
- language filters
- mood or theme tags
- title or channel names
- similar title requests
- rating and popularity filters
- live broadcast intent
For example, the query:
yerli dram dizileri olsun ama romantik olmasın
can be converted into:
{
  "intent_type": "content_search",
  "filters": {
    "country_names": ["Türkiye"],
    "genres": ["dram"],
    "exclude_genres": ["romantik"]
  }
}
Intended Use
This model can be used as a query understanding layer in Turkish OTT/media search systems.
A typical usage flow is:
User Query
→ Query Parser Model
→ Structured JSON Filters
→ Search / Filtering / Vector Database
→ Final Results
The model does not directly search for movies, series, or channels. It only extracts structured filters from the user query.
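As a rough illustration of the downstream step, the sketch below applies a parsed result to a small in-memory catalog. The catalog, its item fields (`title`, `country`, `genres`), and the `apply_filters` helper are hypothetical and only handle three of the filter fields; a real system would map the filters onto its own search or database layer. Requiring every requested genre to be present is also an assumption.

```python
# Hypothetical in-memory catalog; the item fields are assumptions made for
# illustration and are not a schema defined by the model.
CATALOG = [
    {"title": "A", "country": "Türkiye", "genres": ["dram"]},
    {"title": "B", "country": "Türkiye", "genres": ["dram", "romantik"]},
    {"title": "C", "country": "ABD", "genres": ["dram"]},
]


def apply_filters(catalog, parsed):
    """Return the catalog items that satisfy the parsed filters."""
    filters = parsed.get("filters", {})
    countries = set(filters.get("country_names", []))
    wanted = set(filters.get("genres", []))
    excluded = set(filters.get("exclude_genres", []))

    results = []
    for item in catalog:
        item_genres = set(item["genres"])
        if countries and item["country"] not in countries:
            continue  # wrong country
        if wanted and not wanted <= item_genres:
            continue  # a requested genre is missing (assumption: all required)
        if excluded & item_genres:
            continue  # contains an excluded genre
        results.append(item)
    return results


parsed = {
    "intent_type": "content_search",
    "filters": {
        "country_names": ["Türkiye"],
        "genres": ["dram"],
        "exclude_genres": ["romantik"],
    },
}
matches = apply_filters(CATALOG, parsed)
print([item["title"] for item in matches])
```

Here only item `A` remains: `B` is dropped by the excluded genre and `C` by the country filter.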
Example Inputs and Outputs
Example 1
Input:
yerli komedi filmleri
Output:
{
  "intent_type": "content_search",
  "filters": {
    "country_names": ["Türkiye"],
    "genres": ["komedi"]
  }
}
Example 2
Input:
trt 1 canlı yayın
Output:
{
  "intent_type": "live_content",
  "filters": {
    "content_tags": ["canli_yayin"],
    "names": ["TRT 1"],
    "exact_match": true
  }
}
Example 3
Input:
popüler korku dizileri
Output:
{
  "intent_type": "content_search",
  "filters": {
    "genres": ["korku"],
    "total_rating_min": ["P75"]
  }
}
How to Use
import json

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Selenaydmrp/flan-t5-base-turkish-ott-query-parser"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id).to(device)

PREFIX = (
    "parse_media_query: Sorgudan yalnızca bulunan medya arama filtrelerini JSON olarak çıkar. "
    "Boş alan yazma. Kullanılabilir alanlar: content_tags, genres, exclude_genres, mood_tags, "
    "names, similar_to_titles, country_names, language_names, rating_count, review_count, "
    "total_rating, release_year, exact_match. Sorgu: "
)


def parse_query(query):
    input_text = PREFIX + query
    inputs = tokenizer(
        input_text,
        return_tensors="pt",
        truncation=True,
        max_length=256
    ).to(device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            num_beams=4,
            do_sample=False
        )

    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)

    try:
        return json.loads(decoded)
    except json.JSONDecodeError:
        return {"raw_output": decoded}


query = "yerli dram dizileri olsun ama romantik olmasın"
result = parse_query(query)
print(json.dumps(result, ensure_ascii=False, indent=2))
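Before passing a result downstream, it may help to check it against the field names listed in the prompt prefix. The following is a minimal sketch, not part of the model or its repository: `PROMPT_FIELDS` and `check_parsed` are hypothetical names, and accepting suffixed keys such as `total_rating_min` (which appears in Example 3) is an assumption.

```python
# Field names copied from the prompt prefix used in the usage example.
PROMPT_FIELDS = {
    "content_tags", "genres", "exclude_genres", "mood_tags", "names",
    "similar_to_titles", "country_names", "language_names", "rating_count",
    "review_count", "total_rating", "release_year", "exact_match",
}


def check_parsed(parsed):
    """Report whether a parser result is usable and list unknown filter keys."""
    # parse_query falls back to {"raw_output": ...} when JSON decoding fails.
    if not isinstance(parsed, dict) or "raw_output" in parsed:
        return {"is_json": False, "unexpected": []}

    filters = parsed.get("filters", {})
    if not isinstance(filters, dict):
        return {"is_json": True, "unexpected": ["filters"]}

    unexpected = []
    for key in filters:
        known = key in PROMPT_FIELDS
        # Assumption: suffixed variants such as total_rating_min are acceptable.
        suffixed = any(key.startswith(field + "_") for field in PROMPT_FIELDS)
        if not (known or suffixed):
            unexpected.append(key)
    return {"is_json": True, "unexpected": unexpected}


report = check_parsed({
    "intent_type": "content_search",
    "filters": {
        "genres": ["korku"],
        "total_rating_min": ["P75"],
        "director": ["x"],  # made-up key to show how unknown fields are reported
    },
})
print(report)
```

A caller could log or drop the keys in `unexpected` rather than reject the whole result; which policy fits depends on the search backend.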
Training Data
The model was fine-tuned on a Turkish OTT/media query parsing dataset.
The dataset contains Turkish search queries and their corresponding structured JSON outputs.
The training examples include:
- genre-based queries
- exclusion queries
- live broadcast queries
- popularity-based queries
- rating-based queries
- country and language filters
- similar-title queries
- short Turkish user queries
Evaluation
Preliminary validation results:
| Metric | Value |
|---|---|
| Train Loss | 0.3083 |
| Validation Loss | 0.0488 |
| Exact Match | 61.47% |
| Valid JSON Rate | 99.78% |
The Exact Match score should be interpreted carefully. In this project, Exact Match is a strict metric and can be affected by formatting differences such as extra spaces, line breaks, field ordering, or small JSON representation differences. Therefore, a prediction may be semantically correct but still counted as incorrect by Exact Match.
For this reason, the Valid JSON Rate and field-level evaluation are also important when measuring the real performance of the model.
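One way to reduce the formatting sensitivity described above is to compare parsed JSON objects instead of raw strings. The sketch below is an illustrative alternative, not the metric used to produce the reported score; `normalized_exact_match` is a hypothetical helper name.

```python
import json


def normalized_exact_match(prediction: str, reference: str) -> bool:
    """Compare two JSON strings while ignoring whitespace and key order."""
    try:
        # Parsed dicts compare equal regardless of key order or spacing.
        return json.loads(prediction) == json.loads(reference)
    except json.JSONDecodeError:
        return False  # invalid JSON never counts as a match


prediction = (
    '{"filters": {"genres": ["dram"], "country_names": ["Türkiye"]}, '
    '"intent_type": "content_search"}'
)
reference = """{
  "intent_type": "content_search",
  "filters": {"country_names": ["Türkiye"], "genres": ["dram"]}
}"""
same = normalized_exact_match(prediction, reference)
print(same)
```

The order of values inside a list still matters in this comparison; whether it should is a design choice for the evaluation.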
For production use, the model should also be evaluated with:
- field-level precision
- field-level recall
- field-level F1 score
- schema validity
- intent accuracy
- real user queries
- downstream search quality
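Field-level precision, recall, and F1 can be computed by treating each prediction as a set of (field, value) pairs. The sketch below uses micro-averaging over all examples. The helper names are hypothetical and the example pair is made up for illustration, not taken from the validation set. Intent accuracy would be measured separately, since `intent_type` is not part of `filters`.

```python
import json


def field_value_pairs(parsed):
    """Flatten the filters of one parsed result into (field, value) pairs."""
    pairs = set()
    for field, value in parsed.get("filters", {}).items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            # Serialise each value so that any JSON type is hashable.
            pairs.add((field, json.dumps(v, ensure_ascii=False, sort_keys=True)))
    return pairs


def micro_prf(predictions, references):
    """Micro-averaged precision, recall, and F1 over (field, value) pairs."""
    tp = fp = fn = 0
    for pred, ref in zip(predictions, references):
        p, r = field_value_pairs(pred), field_value_pairs(ref)
        tp += len(p & r)
        fp += len(p - r)
        fn += len(r - p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


reference = {"filters": {"country_names": ["Türkiye"], "genres": ["dram"],
                         "exclude_genres": ["romantik"]}}
# Illustrative prediction that puts "romantik" under the wrong field.
prediction = {"filters": {"country_names": ["Türkiye"], "genres": ["dram"],
                          "mood_tags": ["romantik"]}}
precision, recall, f1 = micro_prf([prediction], [reference])
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

In this example two of the three predicted pairs are correct, so the prediction gets partial credit where Exact Match would score it as a complete miss.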
Limitations
This model is task-specific and should not be used as a general chatbot.
Known limitations:
- It may produce incorrect filters for ambiguous queries.
- It may confuse mood tags and genre labels.
- It may not recognize rare movie, series, actor, or channel names.
- It may generate valid JSON with semantically incorrect fields.
- It should be tested with real user queries before production use.
License
This model is released under the Apache 2.0 license.
Author
Developed by Selenaydmrp for Turkish OTT/media query understanding and structured search experiments.