Update README.md
README.md
CHANGED
@@ -6,4 +6,32 @@ base_model:
 - microsoft/Phi-4-multimodal-instruct
 pipeline_tag: audio-text-to-text
 library_name: adapter-transformers
----
+---
+
+# The model for the paper 'StreamUni: Achieving Streaming Speech Translation with a Unified Large Speech-Language Model'
+<div align="center">
+<a data-pswp-width='1000' data-pswp-height='800' target='_blank' href="https://cdn-uploads.huggingface.co/production/uploads/66680c0505c407bfea87667c/sQXBy7hBmeFYtYbttLuI_.png"><img src="https://cdn-uploads.huggingface.co/production/uploads/66680c0505c407bfea87667c/sQXBy7hBmeFYtYbttLuI_.png" alt="model.png" width="1000"/></a>
+</div>
+
+## Usage
+
+### Requirements
+
+The Phi-4 family has been integrated into `transformers` in version `4.48.2`. To check the installed `transformers` version, run `pip list | grep transformers`.
+We suggest running with Python 3.10.
+Examples of required packages:
+```
+flash_attn==2.7.4.post1
+torch==2.6.0
+transformers==4.48.2
+accelerate==1.3.0
+soundfile==0.13.1
+pillow==11.1.0
+scipy==1.15.2
+torchvision==0.21.0
+backoff==2.2.1
+peft==0.13.2
+```
+
+## Training Datasets
+- https://huggingface.co/ICTNLP/StreamUni
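
The pinned versions listed under Requirements can also be checked from Python rather than with `pip list`. The following is a minimal sketch, not part of the commit: the helper name `check_pins` and the `PINNED` dictionary are hypothetical, and the sketch assumes the distribution names match the names in the list. It uses the standard-library `importlib.metadata` and ignores local build suffixes such as `+cu124` that some `torch` wheels carry.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the Requirements list in the README (assumed to be
# valid distribution names for importlib.metadata lookups).
PINNED = {
    "flash_attn": "2.7.4.post1",
    "torch": "2.6.0",
    "transformers": "4.48.2",
    "accelerate": "1.3.0",
    "soundfile": "0.13.1",
    "pillow": "11.1.0",
    "scipy": "1.15.2",
    "torchvision": "0.21.0",
    "backoff": "2.2.1",
    "peft": "0.13.2",
}


def check_pins(pins, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            # Drop a local version suffix such as "+cu124" before comparing.
            installed = get_version(name).split("+")[0]
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at the pinned version. The `get_version` parameter exists only so the check can be exercised without touching the real environment.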