Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
Paper: arXiv:2406.11579
This model repository provides the official pretrained models used in the paper Duoduo CLIP: Efficient 3D Understanding with Multi-View Images. Model usage instructions and code are available in the GitHub repo.
Note: The initial release contains the main model; the other models used in the paper will be uploaded soon.
The dataset card can be found here.
BibTeX:
@inproceedings{
lee2025duoduo,
title={Duoduo {CLIP}: Efficient 3D Understanding with Multi-View Images},
author={Han-Hung Lee and Yiming Zhang and Angel X Chang},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=iGbuc9ekKK}
}
This work was funded by a CIFAR AI Chair, an NSERC Discovery grant, and a CFI/BCKDF JELF grant.