Commit 74d9e65 · 1 Parent(s): a8646ee
Standardize model card

README.md CHANGED
@@ -10,19 +10,20 @@ tags:
 
 # DeltaTok (Tokenizer) — Kinetics-700
 
-
+DeltaTok is a video tokenizer that encodes the vision foundation model (VFM) feature differences between consecutive frames into a single continuous "delta" token, as introduced in [A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens](https://huggingface.co/papers/2604.04913) (CVPR 2026). This approach significantly reduces the token count in video sequences (e.g., 1,024x reduction) and enables efficient generative world modeling.
 
-[**Project Page**](https://deltatok.github.io) | [**GitHub**](https://github.com/amazon-far/deltatok)
+[**Project Page**](https://deltatok.github.io) | [**GitHub**](https://github.com/amazon-far/deltatok) | [**Paper**](https://huggingface.co/papers/2604.04913)
 
-
-
-## Model Description
-
-This repository contains the ViT-B encoder and decoder trained on Kinetics-700 at 512x512 resolution. The model is designed to work with a frozen [DINOv3](https://github.com/facebookresearch/dinov3) ViT-B backbone (not included).
+This repository contains the ViT-B encoder and decoder trained on Kinetics-700 at 512x512 resolution.
 
 ## Usage
 
-
+Requires a frozen [DINOv3](https://github.com/facebookresearch/dinov3) ViT-B backbone. Full training and evaluation code is available in the [DeltaTok GitHub repository](https://github.com/amazon-far/deltatok). To evaluate:
+
+```bash
+python main.py validate -c configs/deltatok_vitb_dinov3_vitb_kinetics.yaml \
+--model.ckpt_path=path/to/deltatok-kinetics/pytorch_model.bin
+```
 
 ## Acknowledgements
 
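Review note: the "1,024x reduction" quoted in the added description can be sanity-checked with simple arithmetic. The sketch below assumes a patch size of 16 for the DINOv3 ViT-B backbone and assumes the reduction is counted per frame, relative to the backbone's patch-token grid. Neither assumption is stated in the model card, and `tokens_per_frame` is a hypothetical helper, not part of the DeltaTok codebase.

```python
def tokens_per_frame(resolution: int, patch_size: int) -> int:
    """Number of patch tokens a ViT produces for one square frame."""
    if resolution % patch_size != 0:
        raise ValueError("resolution must be divisible by patch_size")
    side = resolution // patch_size
    return side * side


RESOLUTION = 512        # from the model card: 512x512 frames
PATCH_SIZE = 16         # assumption: ViT-B/16 patching in the backbone
DELTA_TOKENS = 1        # from the model card: one delta token per frame

patch_tokens = tokens_per_frame(RESOLUTION, PATCH_SIZE)
reduction = patch_tokens // DELTA_TOKENS

print(f"{patch_tokens} patch tokens per frame, {reduction}x reduction")
```

Under these assumptions a 32x32 patch grid collapses to a single token, which is consistent with the figure in the description.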