whisper : use correct seek_end when offset is used (#833)
Whenever an `offset_ms` is provided, the value of `seek_end` is
calculated incorrectly. This causes Whisper to keep transcribing
after the end of the file.
The current behavior looks like this:
```
[00:34:40.000 --> 00:34:47.000] This is an example audio file.
[00:34:47.000 --> 00:34:49.000] The text has been redacted
[00:34:49.000 --> 00:34:51.000] This is the end of the audio.
[00:34:51.000 --> 00:34:52.000] ***
[00:34:52.000 --> 00:34:53.000] ***
[00:34:53.000 --> 00:34:54.000] ***
[00:34:55.000 --> 00:34:56.000] ***
...
```
The expected behavior is:
```
[00:34:40.000 --> 00:34:47.000] This is an example audio file.
[00:34:47.000 --> 00:34:49.000] The text has been redacted
[00:34:49.000 --> 00:34:51.000] This is the end of the audio.
- end of program -
```
This commit changes the calculation of the `seek_end` variable to
only add `seek_start` if a custom `duration_ms` is provided.
Otherwise, it defaults to the end of the file.
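The effect of the change can be checked with a small sketch. It is illustrative only: the function names `seek_end_before` and `seek_end_after`, the parameter `n_len` (standing in for the value returned by `whisper_n_len_from_state(state)`), and the sample numbers are assumptions. The pre-fix formula is inferred from the description above, not copied from the removed line.

```cpp
// All seek positions are in 10 ms frames, matching `params.offset_ms/10`.

// ASSUMED pre-fix behavior (inferred from the commit message):
// seek_start is added even when no custom duration is given.
constexpr int seek_end_before(int offset_ms, int duration_ms, int n_len) {
    return offset_ms/10 + (duration_ms == 0 ? n_len : duration_ms/10);
}

// Post-fix behavior, mirroring the new line in this commit:
// seek_start is added only when a custom duration is given.
constexpr int seek_end_after(int offset_ms, int duration_ms, int n_len) {
    return duration_ms == 0 ? n_len : offset_ms/10 + duration_ms/10;
}

// Hypothetical input: a 35-minute file (210000 frames), offset 34:40.
constexpr int kNLen     = 35 * 60 * 100;          // 210000 frames
constexpr int kOffsetMs = (34 * 60 + 40) * 1000;  // 2080000 ms

// Without a duration, the old formula runs past the end of the file.
static_assert(seek_end_before(kOffsetMs, 0, kNLen) == 418000, "overshoots the file");
static_assert(seek_end_after (kOffsetMs, 0, kNLen) == 210000, "stops at the end of the file");

// With a custom duration of 5 s, both formulas agree.
static_assert(seek_end_before(kOffsetMs, 5000, kNLen) == 208500, "offset + duration");
static_assert(seek_end_after (kOffsetMs, 5000, kNLen) == 208500, "offset + duration");
```

With an offset and no duration, the assumed old formula yields an end position of 418000 frames for a 210000-frame file, which matches the symptom of transcribing past the end of the audio.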
Signed-off-by: Thijs Raymakers <[email protected]>
- whisper.cpp +1 -1
@@ -3855,7 +3855,7 @@ int whisper_full_with_state(
     }
 
     const int seek_start = params.offset_ms/10;
-    const int seek_end =
+    const int seek_end = params.duration_ms == 0 ? whisper_n_len_from_state(state) : seek_start + params.duration_ms/10;
 
     // if length of spectrogram is less than 1s (100 samples), then return
     // basically don't process anything that is less than 1s