Offline test improvements #150

Merged
royshil merged 9 commits into locaal-ai:master from offline-test-improvements
Aug 9, 2024

@palana palana commented Aug 7, 2024

These changes make the offline test deterministic in my testing, i.e. the same input file produces the same `segment_labels` in `segments.json`.

palana added 6 commits August 7, 2024 13:40
this should mostly not make a difference, but feels semantically
more correct

There are two issues here:
1. `line_size` may contain padding (this didn't happen in my tests)
2. The FFmpeg documentation (https://git.ffmpeg.org/gitweb/ffmpeg.git/blob/2b5f000d3f6f9e737e918a5438e6c881f65e70e2:/libavutil/frame.h#l405) says:
> For audio, only linesize[0] may be set. For planar audio, each
> channel plane must be the same size.
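
A minimal sketch of how both issues can be sidestepped, assuming planar float audio: size each plane copy from the sample count instead of reading a per-plane line size. `PlanarAudioFrame` and `copy_planes` are hypothetical names used for illustration only; the struct merely mirrors the relevant FFmpeg frame fields and is not the real `AVFrame`, and this is not the code from the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the frame fields that matter here (not the real AVFrame).
struct PlanarAudioFrame {
    const uint8_t *data[8]; // one pointer per channel plane
    int linesize[8];        // for audio, only linesize[0] may be set, and it may include padding
    int nb_samples;         // number of samples per channel
    int channels;           // number of planes in use (assumed 0..8)
};

// Copy every plane using the payload size derived from nb_samples.
// linesize[ch] is deliberately never read: it may be padded or unset for ch > 0.
std::vector<std::vector<float>> copy_planes(const PlanarAudioFrame &frame)
{
    const std::size_t payload_bytes =
        static_cast<std::size_t>(frame.nb_samples) * sizeof(float);
    std::vector<std::vector<float>> planes(static_cast<std::size_t>(frame.channels));
    for (int ch = 0; ch < frame.channels; ++ch) {
        planes[ch].resize(static_cast<std::size_t>(frame.nb_samples));
        if (payload_bytes > 0)
            std::memcpy(planes[ch].data(), frame.data[ch], payload_bytes);
    }
    return planes;
}
```

With real FFmpeg frames the per-sample size would come from the frame's sample format rather than a hard-coded `sizeof(float)`; that part is left out here to keep the sketch self-contained.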

This behaves roughly like libobs: each chunk of audio is
inspected individually by VAD/whisper until processing by either
takes longer than the window length, in which case audio continues
to stream in.
@palana palana force-pushed the offline-test-improvements branch from 59800a1 to 6e5a8af Compare August 7, 2024 11:40
@palana palana force-pushed the offline-test-improvements branch from 6e5a8af to e9581f3 Compare August 7, 2024 11:48

// sleep up to window size in case whisper is processing, so the buffer builds up similar to OBS
auto now = std::chrono::system_clock::now();
if (false && now > max_wait)

I'm kind of undecided whether this `false` should be a parameter, or whether the max wait should be removed.

For running tests deterministically between machines, you currently want to always feed only a single chunk of audio into `gf->input_buffers`.

Honoring `max_wait` gives you something closer to the "plugin within OBS" experience: while whisper inference (etc.) is running (which takes longer than the wait time on my machine), audio buffers continue to fill up, so on the next run of the whisper loop a bigger chunk of audio is fed into VAD.
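
To make the trade-off concrete, here is a simplified model of the two feeding modes on a simulated clock. Everything in it is an assumption for illustration: `plan_feed`, its parameters, and the idea of a fixed processing cost per pass are hypothetical and are not taken from the plugin or the test harness.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns how many audio chunks are handed to VAD/whisper on each loop iteration.
// chunk_ms:       duration of one audio chunk (assumed > 0)
// processing_ms:  simulated cost of one VAD/whisper pass
// honor_max_wait: false = always feed one chunk (deterministic),
//                 true  = feed whatever accumulated while the previous pass ran
std::vector<std::size_t> plan_feed(std::size_t total_chunks, int chunk_ms,
                                   int processing_ms, bool honor_max_wait)
{
    std::vector<std::size_t> fed;
    std::size_t remaining = total_chunks;
    bool first_pass = true;
    while (remaining > 0) {
        std::size_t n = 1;
        if (honor_max_wait && !first_pass) {
            // Chunks that arrived during the previous pass: ceil(processing_ms / chunk_ms).
            const int backlog = (processing_ms + chunk_ms - 1) / chunk_ms;
            n = static_cast<std::size_t>(std::max(1, backlog));
        }
        n = std::min(n, remaining);
        fed.push_back(n);
        remaining -= n;
        first_pass = false;
    }
    return fed;
}
```

In this model the single-chunk mode yields the same feed pattern on every machine, while the `honor_max_wait` mode depends on `processing_ms`, which is exactly the quantity that differs between machines.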

@palana palana force-pushed the offline-test-improvements branch from 6414214 to bb73bcf Compare August 8, 2024 12:02

@royshil royshil left a comment

looks great!

Comment on lines 44 to 45
std::time_t now_time_t = std::chrono::system_clock::to_time_t(now);
std::tm now_tm = *std::localtime(&now_time_t);

I think we don't need this anymore, right?

I kind of like having both timestamps available: local time to orient myself on when a particular run happened, and "running time" to compare relative timing within a run.
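
As a rough illustration of carrying both timestamps in one log prefix, the sketch below formats a local wall-clock time next to a running time. The function name `format_log_prefix` and the output layout are assumptions; the actual log format used by the test is not shown in this conversation.

```cpp
#include <chrono>
#include <cstdio>
#include <ctime>
#include <string>

// Build a prefix such as "2024-08-07 13:40:00 [+1500 ms]".
// local_time:   wall-clock time, e.g. obtained via std::localtime as in the diff
// running_time: time elapsed since the start of the run
std::string format_log_prefix(const std::tm &local_time,
                              std::chrono::milliseconds running_time)
{
    char wall[32];
    if (std::strftime(wall, sizeof(wall), "%Y-%m-%d %H:%M:%S", &local_time) == 0)
        wall[0] = '\0'; // formatting failed; fall back to an empty wall-clock part
    char out[64];
    std::snprintf(out, sizeof(out), "%s [+%lld ms]", wall,
                  static_cast<long long>(running_time.count()));
    return std::string(out);
}
```

The running time would typically be the difference between two `std::chrono::steady_clock` readings, cast to milliseconds, so that it is unaffected by wall-clock adjustments.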

@royshil royshil merged commit 6cc88b1 into locaal-ai:master Aug 9, 2024
9 checks passed
@palana palana deleted the offline-test-improvements branch August 13, 2024 12:04