Improvements implemented in the audio processing module #2390

mrfelpa · 2024-10-15T18:06:21Z

The relative import was causing errors when the script was executed directly, so I replaced "from .utils import exact_div" with "from utils import exact_div". I also implemented a function (get_hann_window) to avoid repeated Hann window calculations.
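To illustrate the caching idea behind get_hann_window, here is a minimal sketch. It is not the code from this PR: it is a dependency-free stand-in that computes the periodic Hann window with the standard library and memoizes it with functools.lru_cache, and the signature (a single n_fft argument) is an assumption. A torch-based version could wrap torch.hann_window in the same way.

```python
import math
from functools import lru_cache
from typing import Tuple


@lru_cache(maxsize=None)
def get_hann_window(n_fft: int) -> Tuple[float, ...]:
    """Return a periodic Hann window of length n_fft, computed once per size.

    Hypothetical signature; the helper in the PR may take other arguments.
    """
    if n_fft <= 0:
        raise ValueError("n_fft must be a positive integer")
    # Periodic Hann window: w[n] = 0.5 - 0.5 * cos(2*pi*n / N)
    return tuple(
        0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_fft) for n in range(n_fft)
    )


window_a = get_hann_window(400)
window_b = get_hann_window(400)  # served from the cache, not recomputed
```

Returning an immutable tuple keeps the cached value safe from accidental in-place modification by callers.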

mrfelpa · 2024-10-17T02:09:29Z

whisper/audio.py

+
+   log_spec = torch.clamp(mel_spec, min=1e-10).log10()
+
+   log_spec_normalized = (log_spec + 4.0) / 4.0


@Nisarg236 I did it separately to try to keep the code clearer, but yes, it is possible to combine the two.
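As a rough illustration of the two options discussed here, the sketch below applies the same clamp, log10, and (x + 4.0) / 4.0 steps from the diff to a plain list of floats, once in two steps and once combined. The function names are hypothetical, and lists stand in for the torch tensors used in whisper/audio.py.

```python
import math
from typing import List


def normalize_two_steps(mel_spec: List[float]) -> List[float]:
    # Step 1: clamp to 1e-10 to avoid log10(0), then take log10
    log_spec = [math.log10(max(v, 1e-10)) for v in mel_spec]
    # Step 2: shift and scale
    return [(v + 4.0) / 4.0 for v in log_spec]


def normalize_combined(mel_spec: List[float]) -> List[float]:
    # Same arithmetic as above, written as a single expression
    return [(math.log10(max(v, 1e-10)) + 4.0) / 4.0 for v in mel_spec]


sample = [1.0, 0.0, 1e-4]
separate = normalize_two_steps(sample)
combined = normalize_combined(sample)
```

Both versions perform identical operations per element, so the choice between them is only about readability.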

- I updated the load_audio function to give better control over the process, with better error handling and resource management. Specifically, ffmpeg output streams are now handled explicitly and decoding errors are caught.
- I implemented a fix in the load_audio function so that direct shell command execution (which can introduce vulnerabilities) is avoided. The ffmpeg command now uses hide_banner to avoid displaying sensitive information.
- Additional input validation checks were added to verify function arguments.
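The points above can be sketched roughly as follows. This is an assumption-laden illustration, not the code from this PR: build_ffmpeg_command and load_audio_bytes are hypothetical names, the validation rules are guesses at what "input validation" covers, and the sketch returns raw PCM bytes rather than an array. The ffmpeg arguments are passed as a list, so no shell is involved.

```python
import subprocess
from typing import List


def build_ffmpeg_command(file: str, sr: int = 16000) -> List[str]:
    """Validate the arguments and build the ffmpeg argument list (hypothetical helper)."""
    if not isinstance(file, str) or not file:
        raise ValueError("file must be a non-empty string")
    if isinstance(sr, bool) or not isinstance(sr, int) or sr <= 0:
        raise ValueError("sr must be a positive integer")
    return [
        "ffmpeg",
        "-hide_banner",  # suppress the version/build banner
        "-nostdin",
        "-threads", "0",
        "-i", file,
        "-f", "s16le",  # raw 16-bit little-endian PCM
        "-ac", "1",  # mono
        "-acodec", "pcm_s16le",
        "-ar", str(sr),  # resample to the requested rate
        "-",  # write to stdout
    ]


def load_audio_bytes(file: str, sr: int = 16000) -> bytes:
    """Run ffmpeg without a shell and return the decoded PCM bytes."""
    cmd = build_ffmpeg_command(file, sr)
    try:
        # A list of arguments (shell=False by default) avoids shell injection
        result = subprocess.run(cmd, capture_output=True, check=True)
    except FileNotFoundError as e:
        raise RuntimeError("ffmpeg executable not found on PATH") from e
    except subprocess.CalledProcessError as e:
        stderr = e.stderr.decode(errors="replace")
        raise RuntimeError(f"Failed to load audio: {stderr}") from e
    return result.stdout
```

Separating command construction from execution makes the validation testable without ffmpeg installed.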

Improvements implemented in the audio processing module

81ef7f3
