Implement an audio visualizer #206

Open
wants to merge 62 commits into
base: audio-waveform

mvaranda commented May 7, 2023

Thanks for creating this app as well as for your help.
I understand that the present code may not be integrated very elegantly, but it seems to do the job.
I also understand that neither of us may have much time to spend on this project, as we both have busy lives (and it is almost summer). I believe that merging this code into the "audio-waveform" branch would make it possible for other developers, and even smart users, to try these features and provide some feedback.
Thanks once again,
Marcelo

@otsaloma
Owner

otsaloma commented May 7, 2023

Thanks, I'll take a closer look when I have time, maybe next weekend.

Marcelo Varanda and others added 2 commits May 13, 2023 08:17
…d. Also, prevent exception when project is closed.
@otsaloma
Owner

otsaloma commented Jun 4, 2023

Sorry about the delay, I now had time to try this. Unfortunately playback freezes for me; this is not related to your code, as the same happens on master. Probably some GStreamer upgrades that I need to sort out. For now, I can only offer some very initial impressions:

  • Creating the initial cache takes a long time – it needs to be in some way opt-in, so that users don't get an annoying surprise delay. Not sure what the best way to do this is – maybe a button in the video toolbar to generate and show the waveform?
  • The mouse UI probably needs changing cursors, otherwise it's undiscoverable.
  • Dark vs. light theme should be automatic, there's some initialization code in applicationman, after which you can probably use Gtk.Settings?
  • The waveforms I see look very "flat" and uninformative. Did you look into whether some preprocessing helps when generating the cache data? Such as selecting a particular channel, limiting to particular frequencies, some kind of normalization, clipping, non-linear scale in the visualization etc.
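To make the normalization and non-linear scale ideas in the last bullet concrete, here is a minimal sketch. It is not code from this pull request: the function name `normalize_peaks` and the `floor_db` parameter are hypothetical, and it assumes the cache already holds one non-negative peak amplitude per drawn column.

```python
import math


def normalize_peaks(peaks, floor_db=-60.0):
    """Map absolute peak amplitudes to heights in 0..1 on a decibel scale.

    peaks: non-negative amplitudes in any linear unit (assumed input).
    floor_db: level relative to the loudest peak that is drawn as zero height.
    """
    peaks = list(peaks)
    loudest = max(peaks, default=0.0)
    if loudest <= 0.0:
        # Pure silence: there is nothing to scale against.
        return [0.0] * len(peaks)
    heights = []
    for p in peaks:
        if p <= 0.0:
            heights.append(0.0)
            continue
        # Normalization: 0 dB corresponds to the loudest peak in the data.
        db = 20.0 * math.log10(p / loudest)
        # Non-linear scale: heights are linear in dB, clamped at the floor.
        heights.append(max(0.0, 1.0 - db / floor_db))
    return heights
```

With this mapping a peak at one tenth of the loudest amplitude is drawn at about two thirds of full height instead of one tenth, which may help quiet speech stand out against loud passages. Whether that actually looks better for subtitle timing would need to be tried on real files.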

@mvaranda
Author

mvaranda commented Jun 4, 2023

Hi Osmo, my comments follow:

  • Creating the initial cache takes a long time – it needs to be in some way opt-in, so that users don't get an annoying surprise delay. Not sure what the best way to do this is – maybe a button in the video toolbar to generate and show the waveform?

    • It only creates the cache if the waveform view is enabled. The user can disable it prior to loading the video. We could add a Cancel button that would stop the cache creation and disable the view.
  • The mouse UI probably needs changing cursors, otherwise it's undiscoverable.

    • I am not sure what you mean by undiscoverable. If the user knows that the view is interactive, I do not see why changing the cursor shape would do any good. If the user does not know that the widget is interactive, changing the cursor would not make it obvious to most users either, in my opinion.
  • Dark vs. light theme should be automatic, there's some initialization code in applicationman, after which you can probably use Gtk.Settings?

    • I tried a couple of things that did not work. Maybe the Cinnamon theme that I am using is not reporting correctly to the system, hence the manual selection.
  • The waveforms I see look very "flat" and uninformative. Did you look into whether some preprocessing helps when generating the cache data? Such as selecting a particular channel, limiting to particular frequencies, some kind of normalization, clipping, non-linear scale in the visualization etc.

    • We are representing audio with many more samples than pixels, so it will always look "squared". Running IIR or FIR filters and interpolation would not do any good, as a single pixel represents many samples. Because of the decimation, high frequencies are also sometimes misrepresented. I am not sure how we can make it look better.
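For reference, the many-samples-per-pixel situation described in the last reply can be sketched as below. This is an illustration only, not the reduction used in this pull request: the function name `minmax_columns` is hypothetical, and it assumes the samples are available as a plain Python sequence of numbers. Instead of keeping every k-th sample, each pixel column keeps the minimum and maximum of the samples it covers, so short peaks inside a column are not dropped.

```python
def minmax_columns(samples, width):
    """Reduce samples to `width` (min, max) pairs, one per pixel column."""
    if width <= 0:
        raise ValueError("width must be positive")
    n = len(samples)
    columns = []
    for x in range(width):
        # Integer arithmetic splits the samples into near-equal chunks.
        start = x * n // width
        end = (x + 1) * n // width
        chunk = samples[start:end]
        if chunk:
            columns.append((min(chunk), max(chunk)))
        elif n:
            # Fewer samples than pixels: reuse the nearest sample.
            s = samples[min(start, n - 1)]
            columns.append((s, s))
        else:
            # No audio at all: draw a flat line.
            columns.append((0.0, 0.0))
    return columns
```

For the eight samples `[0, 1, -1, 0.5, -0.5, 0.2, 0, 0]` drawn four pixels wide, keeping every second sample gives `[0, -1, -0.5, 0]` and loses the `+1` peak, while the min/max pairs keep it. Each column would then be drawn as a vertical line from its minimum to its maximum.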
