Dataset
eladlaor/chat-message-multilabel
How to use eladlaor/chat-message-tagger-deberta-v3 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="eladlaor/chat-message-tagger-deberta-v3")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("eladlaor/chat-message-tagger-deberta-v3")
model = AutoModelForSequenceClassification.from_pretrained("eladlaor/chat-message-tagger-deberta-v3")

A multi-label text classifier that assigns 15 semantic labels to group chat messages. It is built for enriching chat data with structured content signals, which makes it useful for community analytics, newsletter curation, discussion ranking, and moderation tooling.
Given a chat message, the model outputs 15 independent sigmoid scores (0.0 to 1.0), each indicating the probability that the message belongs to that category. Multiple labels can be active simultaneously.
Example:
"Has anyone tried fine-tuning DeBERTa for multi-label? I got NaN gradients in fp16."
| Label | Score | Active |
|---|---|---|
| professional | 0.94 | Yes |
| question | 0.91 | Yes |
| experience_sharing | 0.72 | Yes |
| substantive | 0.88 | Yes |
| how_to | 0.15 | No |
| ... | ... | ... |
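The independent scoring described above can be sketched without the model itself. The snippet below is a minimal illustration, assuming three made-up logits (they are not real model output): it applies a sigmoid to each logit separately and contrasts that with a softmax, where labels compete for a shared probability mass.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function mapping a single logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs: list[float]) -> list[float]:
    """Softmax, shown only for contrast: outputs compete and sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three labels (illustrative values, not model output).
logits = {"professional": 2.75, "question": 2.3, "how_to": -1.7}

sigmoid_scores = {name: sigmoid(z) for name, z in logits.items()}
softmax_scores = dict(zip(logits, softmax(list(logits.values()))))

for name in logits:
    print(f"{name:15s} sigmoid={sigmoid_scores[name]:.2f} softmax={softmax_scores[name]:.2f}")

# Several sigmoid scores can be high at once, so their sum is not tied to 1.
print(f"sum of sigmoid scores: {sum(sigmoid_scores.values()):.2f}")
print(f"sum of softmax scores: {sum(softmax_scores.values()):.2f}")
```

With sigmoid scoring, both `professional` and `question` can score above 0.9 for the same message, which a softmax head could not express.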
| # | Label | Definition | Test F1 | AUC-ROC |
|---|---|---|---|---|
| 1 | professional | Domain-relevant professional substance | 0.839 | 0.882 |
| 2 | question | Asks for information, help, or advice | 0.889 | 0.977 |
| 3 | experience_sharing | First-hand account of trying or building something | 0.631 | 0.844 |
| 4 | resource | Shares a link, tool, paper, or tutorial | 0.746 | 0.928 |
| 5 | opinion | Subjective take, prediction, or stance | 0.626 | 0.853 |
| 6 | how_to | Concrete tip, solution, or workaround | 0.530 | 0.844 |
| 7 | humor | Joke, meme, sarcasm, playful remark | 0.347 | 0.792 |
| 8 | announcement | Community event, meetup, release, group news | 0.723 | 0.969 |
| 9 | off_group_topic | Content unrelated to community's purpose | 0.258 | 0.753 |
| 10 | reaction | Acknowledgment, agreement, thanks, emoji-only | 0.725 | 0.950 |
| 11 | substantive | High information density, summary-worthy | 0.787 | 0.963 |
| 12 | discussion_init | Initiates a new topic or conversation | 0.706 | 0.870 |
| 13 | emotional | Highly emotional tone (excitement, frustration) | 0.437 | 0.852 |
| 14 | disagreement | Adversarial or confrontational disagreement | 0.123 | 0.889 |
| 15 | positive_reinforcement | Encouragement or gratitude | 0.603 | 0.873 |
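As a quick sanity check on the per-label table, the unweighted mean of the 15 F1 values and of the 15 AUC-ROC values can be computed directly. The sketch below only copies the numbers from the table; the result suggests that the reported summary figures (Mean F1 (macro) 0.598, Mean AUC-ROC 0.883) are plain averages over labels, so rare labels count as much as frequent ones.

```python
# Per-label test F1 and AUC-ROC, copied from the table (label order 1-15).
f1 = [0.839, 0.889, 0.631, 0.746, 0.626, 0.530, 0.347, 0.723,
      0.258, 0.725, 0.787, 0.706, 0.437, 0.123, 0.603]
auc = [0.882, 0.977, 0.844, 0.928, 0.853, 0.844, 0.792, 0.969,
       0.753, 0.950, 0.963, 0.870, 0.852, 0.889, 0.873]

# Macro averaging: every label contributes equally, regardless of frequency.
macro_f1 = sum(f1) / len(f1)
macro_auc = sum(auc) / len(auc)

print(f"macro F1:      {macro_f1:.3f}")   # 0.598
print(f"macro AUC-ROC: {macro_auc:.3f}")  # 0.883
```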
import json
import torch
import numpy as np
from transformers import AutoTokenizer, AutoModelForSequenceClassification
model_name = "eladlaor/chat-message-tagger-deberta-v3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()
# Load per-label thresholds (optimized on validation set)
from huggingface_hub import hf_hub_download
thresholds_path = hf_hub_download(model_name, "thresholds.json")
with open(thresholds_path) as f:
    thresholds = json.load(f)
# Label names in model output order
label_names = [
    "professional", "question", "experience_sharing", "resource", "opinion",
    "how_to", "humor", "announcement", "off_group_topic", "reaction",
    "substantive", "discussion_init", "emotional", "disagreement",
    "positive_reinforcement",
]
# Predict
text = "Has anyone tried fine-tuning DeBERTa? I got NaN gradients in fp16."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():
    logits = model(**inputs).logits[0]
    scores = torch.sigmoid(logits).numpy()
# Apply per-label thresholds
for name, score in zip(label_names, scores):
    threshold = thresholds.get(name, 0.5)
    active = "<<" if score > threshold else ""
    print(f" {name:30s} {score:.3f} {active}")
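For downstream enrichment it may be more convenient to store the active labels as a list than to print them. The sketch below is one way to do that, assuming scores and thresholds are plain name-to-float mappings (as the `thresholds.get(name, 0.5)` call above implies). The helper `active_labels`, the threshold values, and the `humor` score are hypothetical and chosen only for illustration.

```python
def active_labels(scores, thresholds, default=0.5):
    """Return the labels whose score exceeds their per-label threshold.

    Labels missing from `thresholds` fall back to `default`.
    """
    return [name for name, score in scores.items()
            if score > thresholds.get(name, default)]

# Hypothetical values for illustration; real thresholds come from thresholds.json.
scores = {"professional": 0.94, "question": 0.91, "humor": 0.40, "how_to": 0.15}
thresholds = {"professional": 0.55, "question": 0.50, "humor": 0.30}

# A structured record that could be attached to the original chat message.
record = {
    "text": "Has anyone tried fine-tuning DeBERTa? I got NaN gradients in fp16.",
    "labels": active_labels(scores, thresholds),
    "scores": scores,
}
print(record["labels"])  # ['professional', 'question', 'humor']
```

Keeping the raw scores next to the thresholded labels lets consumers re-threshold later without rerunning the model.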
| Metric | Value |
|---|---|
| Mean F1 (macro) | 0.598 |
| Hamming Loss | 0.123 |
| Subset Accuracy | 0.193 |
| Mean AUC-ROC | 0.883 |
| Mean ECE | 0.136 |
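Hamming loss and subset accuracy can look contradictory at first, so a small worked example may help. The sketch below uses the standard definitions of both metrics on hypothetical toy data (4 messages, 3 labels); it is not the evaluation code behind the numbers above. Hamming loss counts individual label decisions, while subset accuracy only credits a message when every label is right, which is a strict bar with 15 labels.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of individual (message, label) decisions that are wrong."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total

def subset_accuracy(y_true, y_pred):
    """Fraction of messages whose full label vector is predicted exactly."""
    exact = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return exact / len(y_true)

# Toy data: 4 messages x 3 labels (hypothetical, not from the test set).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]

print(hamming_loss(y_true, y_pred))     # 0.25 (3 of 12 decisions wrong)
print(subset_accuracy(y_true, y_pred))  # 0.25 (1 of 4 messages fully correct)
```

In the toy data, three messages each have a single wrong label, which keeps the Hamming loss low but still costs each of them its exact match.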
| File | Description |
|---|---|
| model.safetensors | Merged model weights (LoRA baked in) |
| config.json | Model config with problem_type=multi_label_classification |
| tokenizer.json | Tokenizer with 7 special tokens ([URL], [MENTION], etc.) |
| thresholds.json | Per-label classification thresholds (optimized on validation set) |
| label_taxonomy.json | Label definitions and examples for downstream consumers |
| evaluation.json | Full per-label metrics on test set |
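For labels where the per-label table shows low F1 but a reasonable AUC-ROC (disagreement, off_group_topic, humor), one option is to skip thresholding and use the raw score for ranking, for example to surface the top few candidates for human review. The sketch below assumes messages have already been scored. The helper `top_k_by_label`, the message texts, and the score values are all hypothetical.

```python
def top_k_by_label(messages, label, k=2):
    """Return the k messages with the highest raw score for `label`."""
    ranked = sorted(messages, key=lambda m: m["scores"][label], reverse=True)
    return ranked[:k]

# Hypothetical scored messages (illustrative values only, not model output).
messages = [
    {"text": "lol my GPU is now a space heater", "scores": {"humor": 0.41}},
    {"text": "Meetup moved to Thursday", "scores": {"humor": 0.03}},
    {"text": "Thanks, that fixed it!", "scores": {"humor": 0.08}},
    {"text": "fp16 stands for 'fairly pointless 16'", "scores": {"humor": 0.36}},
]

# No threshold is applied: only the relative order of scores matters here.
for m in top_k_by_label(messages, "humor"):
    print(f'{m["scores"]["humor"]:.2f}  {m["text"]}')
```

Ranking depends only on the relative order of scores, which is what AUC-ROC measures, so it avoids the fixed cut-off that F1 depends on.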
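disagreement (F1=0.12), off_group_topic (F1=0.26), and humor (F1=0.35) have low F1 due to limited training data. Their AUC-ROC values remain above 0.75 (0.889, 0.753, and 0.792 respectively), so the scores still rank messages usefully, but thresholded predictions for these labels are unreliable.
License: Apache 2.0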
@misc{chat-message-tagger-2026,
  author = {Elad Laor},
  title = {Chat Message Tagger: Multi-Label Classification for Group Chat Messages},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/eladlaor/chat-message-tagger-deberta-v3}
}
Base model
microsoft/deberta-v3-base