Audio-Text-to-Text
Adapters
Safetensors
phi4mm
custom_code
guoshoutao committed on
Commit 5d73f67 · verified · 1 Parent(s): 842b9bb

Update README.md


![model.png](https://cdn-uploads.huggingface.co/production/uploads/66680c0505c407bfea87667c/sQXBy7hBmeFYtYbttLuI_.png)

Files changed (1)
  1. README.md +29 -1
README.md CHANGED
@@ -6,4 +6,32 @@ base_model:
  - microsoft/Phi-4-multimodal-instruct
  pipeline_tag: audio-text-to-text
  library_name: adapter-transformers
- ---
+ ---
+
+ # The model for the paper 'StreamUni: Achieving Streaming Speech Translation with a Unified Large Speech-Language Model'
+ <div align="center">
+ <a data-pswp-width='1000' data-pswp-height='800' target='_blank' href="https://cdn-uploads.huggingface.co/production/uploads/66680c0505c407bfea87667c/sQXBy7hBmeFYtYbttLuI_.png"><img src="https://cdn-uploads.huggingface.co/production/uploads/66680c0505c407bfea87667c/sQXBy7hBmeFYtYbttLuI_.png" alt="model.png" width="1000"/></a>
+ </div>
+
+ ## Usage
+
+ ### Requirements
+
+ The Phi-4 family has been integrated into version `4.48.2` of `transformers`. To check the installed `transformers` version, run `pip list | grep transformers`.
+ We suggest running with Python 3.10.
+ Example versions of the required packages:
+ ```
+ flash_attn==2.7.4.post1
+ torch==2.6.0
+ transformers==4.48.2
+ accelerate==1.3.0
+ soundfile==0.13.1
+ pillow==11.1.0
+ scipy==1.15.2
+ torchvision==0.21.0
+ backoff==2.2.1
+ peft==0.13.2
+ ```
+
+ ## Training Datasets
+ - https://huggingface.co/ICTNLP/StreamUni
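
The pinned versions listed under Requirements can also be checked from Python rather than with `pip list`. The following is a minimal sketch and not part of the committed README: the helper name `find_mismatches` and the `PINNED` table are illustrative assumptions, and the only library used is the standard `importlib.metadata` module. It ignores local build tags such as `+cu124`, which is an assumption about how you may want to treat CUDA-specific wheels.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the Requirements list in the README diff.
PINNED = {
    "flash_attn": "2.7.4.post1",
    "torch": "2.6.0",
    "transformers": "4.48.2",
    "accelerate": "1.3.0",
    "soundfile": "0.13.1",
    "pillow": "11.1.0",
    "scipy": "1.15.2",
    "torchvision": "0.21.0",
    "backoff": "2.2.1",
    "peft": "0.13.2",
}


def find_mismatches(pinned, get_version=version):
    """Return {package: (expected, installed_or_None)} for packages that differ.

    `get_version` is injectable so the check can be exercised without
    touching the real environment (hypothetical convenience, not a README API).
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None
        # Drop a local build tag such as "+cu124" before comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

Running the script prints one line per package whose installed version differs from the list, and prints nothing when the environment matches.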