A LoRA-adapted faithfulness classifier for RAG systems. Detects whether a generated answer is faithful to the retrieved context.
| Metric | Value |
|---|---|
| Balanced Accuracy | 64.7% |
| F1 Score | 0.589 |
| Cohen's Kappa | 0.217 |
| Inference Latency | 53ms |
Evaluated on a combined test set of 15,976 examples from RAGBench, RAGTruth, and HaluBench.
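For readers less familiar with the three quality metrics in the table, the following is a minimal sketch of how each is computed from a binary confusion matrix. The `Confusion` class, the function names, and the counts are hypothetical illustrations, not the MicroGuard evaluation code or its results. The model card does not state which class is treated as positive or whether F1 is averaged across classes, so the sketch assumes a single positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # positives predicted positive
    fp: int  # negatives predicted positive
    fn: int  # positives predicted negative
    tn: int  # negatives predicted negative


def balanced_accuracy(c: Confusion) -> float:
    # Mean of recall on the positive class and recall on the negative class.
    tpr = c.tp / (c.tp + c.fn)
    tnr = c.tn / (c.tn + c.fp)
    return (tpr + tnr) / 2


def f1_score(c: Confusion) -> float:
    # Harmonic mean of precision and recall for the positive class.
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


def cohens_kappa(c: Confusion) -> float:
    # Observed agreement corrected for the agreement expected by chance.
    n = c.tp + c.fp + c.fn + c.tn
    observed = (c.tp + c.tn) / n
    expected = ((c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn)) / n**2
    return (observed - expected) / (1 - expected)


# Made-up counts, chosen only to make the arithmetic easy to follow.
toy = Confusion(tp=40, fp=10, fn=20, tn=30)
print(f"{balanced_accuracy(toy):.3f}")  # 0.708
print(f"{f1_score(toy):.3f}")           # 0.727
print(f"{cohens_kappa(toy):.3f}")       # 0.400
```

Balanced accuracy and Cohen's kappa both correct for class imbalance, which is why they can sit well below plain accuracy on a skewed test set.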
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
model = PeftModel.from_pretrained(base, "tarun5986/MicroGuard-TinyLlama-1.1B")
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
# Or use the MicroGuard package
from microguard import MicroGuard
guard = MicroGuard(model="tarun5986/MicroGuard-TinyLlama-1.1B", base_model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
result = guard.check(
    context="The Eiffel Tower was built in 1889 by Gustave Eiffel.",
    question="Who built the Eiffel Tower?",
    answer="The Eiffel Tower was built by Gustave Eiffel in 1889.",
)
print(result)  # {'verdict': 'FAITHFUL', 'confidence': 74.2, 'latency_ms': 64.0}
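One way the result might be consumed in a RAG pipeline is to gate answers on the verdict and confidence. The sketch below assumes the result is a dict shaped like the printed example above, with `confidence` on a 0 to 100 scale. The `accept_answer` helper and the `min_confidence` threshold of 70 are hypothetical and not part of the MicroGuard package. Any verdict other than `'FAITHFUL'` is rejected, since the card does not list the other label names.

```python
def accept_answer(result: dict, min_confidence: float = 70.0) -> bool:
    """Return True only for a FAITHFUL verdict at or above the threshold.

    Hypothetical helper: assumes `result` has 'verdict' and 'confidence'
    keys as in the example output, with confidence on a 0-100 scale.
    """
    verdict = result.get("verdict")
    confidence = float(result.get("confidence", 0.0))
    return verdict == "FAITHFUL" and confidence >= min_confidence


# The example output shown above passes a threshold of 70.
example = {"verdict": "FAITHFUL", "confidence": 74.2, "latency_ms": 64.0}
if accept_answer(example):
    print("serve answer")
else:
    print("fall back: regenerate or flag for review")
```

The threshold is arbitrary here. Given the modest agreement scores reported in the table, it would be sensible to tune it on a labeled sample from the target application before relying on it.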
MicroGuard: Sub-Billion Parameter Faithfulness Classification for Real-Time RAG QA
@article{microguard2026,
  title={MicroGuard: Sub-Billion Parameter Faithfulness Classification for Real-Time RAG QA},
  author={Sharma, Tarun},
  journal={IEEE Access},
  year={2026}
}
Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0