ViT-tiny LoRA adapter on Food-101

A LoRA adapter that teaches WinKawaks/vit-tiny-patch16-224 to classify images from the Food-101 dataset (101 food categories) while leaving the original pretrained weights mathematically untouched.

  • Base model: WinKawaks/vit-tiny-patch16-224 (~5.7M params)
  • Dataset: Food-101 (75,750 train / 25,250 test, 101 classes)
  • Method: LoRA on the attention query and value projections, plus a freshly initialized 101-way classification head
  • Demo Space: turhancan97/vit-tiny-imagenet-demo

How it works

The backbone weights are never updated. Instead, a low-rank update $\Delta W = BA$ (with rank $r = 8$) is added to the query and value projections of every attention block, and a separate 101-class linear head is trained on top of the CLS-token features. The full artifact is tiny (~1–2 MB) and additive: disabling the adapter at inference time restores the original backbone exactly, and pairing it with the original 1000-class head recovers the ImageNet-1k model.
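The additive update can be illustrated with a minimal pure-Python sketch. This is not the PEFT implementation: the toy dimensions and the helper names (`rand_matrix`, `matvec`, `lora_forward`) are made up for illustration, and the `alpha / r` scaling is assumed to follow PEFT's default convention.

```python
import random

random.seed(0)
d, r, alpha = 6, 2, 4            # toy sizes; the adapter itself uses r = 8, alpha = 16
scale = alpha / r                # assumed scaling convention (alpha / r)

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

W0 = rand_matrix(d, d)               # frozen pretrained projection
A = rand_matrix(r, d)                # LoRA down-projection (trainable)
B = [[0.0] * r for _ in range(d)]    # LoRA up-projection, zero-initialized

def lora_forward(x, enabled=True):
    base = matvec(W0, x)             # W0 itself is never modified
    if not enabled:
        return base
    delta = matvec(B, matvec(A, x))  # (B A) x without forming B A explicitly
    return [b + scale * dl for b, dl in zip(base, delta)]

x = rand_matrix(1, d)[0]
before_training = lora_forward(x)    # B = 0, so the update is exactly zero

B = rand_matrix(d, r)                # stand-in for trained LoRA weights
adapted = lora_forward(x)
restored = lora_forward(x, enabled=False)
```

Because the update is kept separate from `W0`, switching it off returns the base projection bit for bit, which is the property the adapter relies on.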

adapter_config.json           # PEFT LoRA config
adapter_model.safetensors     # LoRA weights (B, A matrices)
classifier.pt                 # 101-way Linear head (state_dict)
labels.json                   # {"0": "apple_pie", "1": "baby_back_ribs", ...}
preprocessor_config.json      # image processor (224x224, standard ImageNet norm)

Training

Trained with the script at turhancan97/vit-tiny-imagenet-demo/train_lora.py:

python train_lora.py \
  --rank 8 --alpha 16 --dropout 0.1 \
  --target-modules query value \
  --epochs 5 --batch-size 64 --lr 5e-4 \
  --warmup-ratio 0.03 --weight-decay 0.0 \
  --push-to-hub turhancan97/vit-tiny-lora-food101
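For a sense of scale, the flags above imply the following step counts. This is a rough sketch that assumes a single device, no gradient accumulation, a partial final batch that is kept, and the linear warmup/decay schedule that the HF Trainer uses by default; the actual script may differ on any of these points.

```python
import math

TRAIN_IMAGES = 75_750
BATCH_SIZE, EPOCHS = 64, 5
PEAK_LR, WARMUP_RATIO = 5e-4, 0.03

# Assumes one device, no gradient accumulation, partial last batch kept.
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def lr_at(step):
    """Linear warmup to PEAK_LR, then linear decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * (total_steps - step) / (total_steps - warmup_steps)

print(steps_per_epoch, total_steps, warmup_steps)  # 1184 5920 178
```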

Hyperparameters

Setting          Value
LoRA rank        8
LoRA alpha       16
LoRA dropout     0.1
Target modules   query, value
Optimizer        AdamW (HF Trainer default)
Learning rate    5e-4
Batch size       64
Epochs           5
Warmup ratio     0.03
Weight decay     0.0
Precision        FP16
Augmentation     RandomResizedCrop(0.8–1.0), RandomHorizontalFlip

Trainable parameters: 93k of ~5.6M total (1.7%).
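The 93k figure can be reproduced with a short back-of-the-envelope calculation. The sketch assumes the usual ViT-tiny dimensions (hidden size 192, 12 transformer blocks), which the card does not state explicitly, and LoRA matrices without bias terms.

```python
HIDDEN = 192       # ViT-tiny hidden size (assumed, not stated in the card)
LAYERS = 12        # transformer blocks in ViT-tiny (assumed)
RANK = 8
TARGETS = 2        # query and value projections
NUM_CLASSES = 101

# Each adapted projection gets A (RANK x HIDDEN) and B (HIDDEN x RANK).
lora_params = LAYERS * TARGETS * (RANK * HIDDEN + HIDDEN * RANK)
# Linear head: weight matrix plus bias vector.
head_params = HIDDEN * NUM_CLASSES + NUM_CLASSES
trainable = lora_params + head_params

print(lora_params, head_params, trainable)  # 73728 19493 93221
print(f"{trainable / 5.6e6:.1%}")           # 1.7%
```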

Evaluation

Evaluated on the Food-101 test split (25,250 images).

Metric           Value
Top-1 accuracy   85%
Top-5 accuracy   90%
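For readers unfamiliar with the metrics, the sketch below shows how top-k accuracy is typically computed. The function name `topk_accuracy` and the toy scores are illustrative only; this is not the evaluation code used to produce the numbers above.

```python
def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)

# Toy example: 3 samples, 4 classes (made-up scores, not model output).
scores = [
    [0.1, 0.7, 0.1, 0.1],   # true class 1: hit at k=1
    [0.4, 0.3, 0.2, 0.1],   # true class 1: missed at k=1, hit at k=2
    [0.5, 0.1, 0.1, 0.3],   # true class 2: missed at k=1 and k=2
]
labels = [1, 1, 2]
top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
```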

Usage

The adapter uses the standard PEFT format plus a sidecar classifier.pt and labels.json. Minimal loader:

import json
import torch
from huggingface_hub import hf_hub_download
from peft import PeftModel
from torch import nn
from transformers import AutoImageProcessor, AutoModelForImageClassification

BASE = "WinKawaks/vit-tiny-patch16-224"
ADAPTER = "turhancan97/vit-tiny-lora-food101"

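# Image processor and frozen base model (ImageNet-1k weights)
processor = AutoImageProcessor.from_pretrained(BASE, use_fast=True)
base = AutoModelForImageClassification.from_pretrained(BASE)
# Inject the LoRA weights into the query/value projections
model = PeftModel.from_pretrained(base, ADAPTER)

# labels.json stores class indices as string keys; convert them to int
with open(hf_hub_download(ADAPTER, "labels.json"), encoding="utf-8") as f:
    id2label = {int(k): v for k, v in json.load(f).items()}

# Load the sidecar 101-way head and swap it in for the 1000-class ImageNet head
head_state = torch.load(
    hf_hub_download(ADAPTER, "classifier.pt"), map_location="cpu", weights_only=True
)
head = nn.Linear(base.config.hidden_size, len(id2label))
head.load_state_dict(head_state)
model.base_model.model.classifier = head

model.eval()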

Inference:

from PIL import Image

image = Image.open("my_food.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
with torch.inference_mode():
    logits = model(**inputs).logits[0]
topk = logits.softmax(-1).topk(5)
for score, idx in zip(topk.values, topk.indices):
    print(f"{id2label[idx.item()]:30s} {score.item():.3f}")

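Switching back to the base model (ImageNet-1k, 1000 classes) without unloading. Because the loader replaces the classifier with the 101-way head, the original 1000-way head has to be restored as well (here it is reloaded from the base checkpoint):

imagenet_head = AutoModelForImageClassification.from_pretrained(BASE).classifier
with model.disable_adapter(), torch.inference_mode():
    cls = model.base_model.model.vit(**inputs).last_hidden_state[:, 0]  # LoRA off: pristine backbone
    logits = imagenet_head(cls)  # 1000-way ImageNet-1k logits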

Intended use

  • Educational / demo use for showing how LoRA adds new capabilities to a frozen backbone.
  • Classifying photos of prepared food into the Food-101 taxonomy.

Limitations

  • Only 101 food categories; anything outside the taxonomy is still forced into one of those classes.
  • Trained on Food-101, which consists mostly of Western and restaurant-style dishes and contains label noise in the original data.
  • ViT-tiny is a low-capacity backbone; a larger base model would likely get higher accuracy with the same adapter recipe.
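One common way to soften the first limitation is to reject predictions whose top softmax probability is low. The sketch below is only a heuristic and not part of this adapter: the helper names and the 0.5 threshold are hypothetical, the label map is truncated for illustration, and a low maximum probability is no guarantee that an image is actually out of taxonomy.

```python
import math

def softmax(logits):
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_or_reject(logits, id2label, threshold=0.5):
    """Return (label, prob), or (None, prob) if the top probability is below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = id2label[best] if probs[best] >= threshold else None
    return label, probs[best]

# Truncated label map for illustration (the real one has 101 entries).
id2label = {0: "apple_pie", 1: "baby_back_ribs", 2: "baklava"}
confident = predict_or_reject([4.0, 0.5, 0.2], id2label)   # one class dominates
uncertain = predict_or_reject([1.0, 0.9, 1.1], id2label)   # nearly flat scores
```

The threshold would need to be tuned on held-out data before being relied on.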

License

Apache-2.0, matching the base model and the Food-101 dataset license.

Citation

If you use this adapter, please cite the underlying works:

@inproceedings{hu2022lora,
  title={{LoRA}: Low-Rank Adaptation of Large Language Models},
  author={Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
  booktitle={ICLR},
  year={2022}
}

@inproceedings{bossard2014food101,
  title={Food-101 -- Mining Discriminative Components with Random Forests},
  author={Bossard, Lukas and Guillaumin, Matthieu and Van Gool, Luc},
  booktitle={ECCV},
  year={2014}
}