Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dropping all frames (again) #32

Closed
raffaem opened this issue Nov 16, 2022 · 60 comments
Closed

Dropping all frames (again) #32

raffaem opened this issue Nov 16, 2022 · 60 comments

Comments

@raffaem
Copy link

raffaem commented Nov 16, 2022

I have the exact same problem as #29.

I use Hyprland with NVIDIA proprietary drivers and mpvpaper is dropping all frames:

It worked fine with nouveau.

mpv is able to play the video.

I think it may be due to the fact that mpv uses vo=gpu and mpvpaper uses vo=libmpv?

How can I get full debug output from mpv?

@GhostNaN
Copy link
Owner

Did you read through #29 ?
It's either a nvidia driver, mpv or hyperland issue.
You could narrow it down by trying to see if mpvpaper works on sway.
If it works on sway, I would just upstream the issue to mpv

I think it may be due to the fact that mpv uses vo=gpu and mpvpaper uses vo=libmpv?

No, see here as to why: #10

How can I get full debug output from mpv?

I don't think that would be much help in this case.
Other than running mpvpaper in a debugger, you won't find where it's getting hold up.

@raffaem
Copy link
Author

raffaem commented Nov 16, 2022

It's either a nvidia driver, mpv or hyperland issue.

I don't see how.
mpv can play the video just fine, on hyprland with nvidia driver.

@GhostNaN
Copy link
Owner

True and there might be something going on with libmpv.
But there are more moving parts going on here.

For example, it could be an issue with the hyperland if the wl_surface_frame won't callback.
It could be a mpv issue if only the libmpv render is having this issue.
And it also might be a driver issue because changing the driver seems to fix the issue.

But now I'm curious, I wonder if the old mpvpaper render loop works.
It's only other thing I could think of that could be mpvpaper's fault.
This was the last commit before the new render loop was implemented: f65700a
You could clone and compile the files from here: https://github.com/GhostNaN/mpvpaper/tree/f65700a3ecc9ecd8ca501e18a807ee18845f9441

@raffaem
Copy link
Author

raffaem commented Nov 16, 2022

With that commit it says this:

❯ mpvpaper '*' "wallpaper.webm"
[-] :/ sorry about this but we can't seem to find any output.

@GhostNaN
Copy link
Owner

Yes as mpvpaper only supported 1 monitor back then, hence the render loop rewrite.

@raffaem
Copy link
Author

raffaem commented Nov 16, 2022

So we pick one random project amon mpv, hyprland, wlroots, nvidia drivers, and send the bug report there? I don't understand

@GhostNaN
Copy link
Owner

First, what was the result with the old render loop?
Second, I'm not going to lie, I'm not sure.

I'm thinking mpv will be your best bet if you can rule out it's not hyperland's fault by try a different wlroots compositor(probably sway).
Otherwise, I don't have a clue unless I know where mpvpaper is getting caught up on your Nvidia drivers.

@raffaem
Copy link
Author

raffaem commented Nov 16, 2022

Didn't I report the result with the old render loop here? #32 (comment)

@GhostNaN
Copy link
Owner

GhostNaN commented Nov 16, 2022

No, you got that error because you used "*" to select all monitors.
That commit was before that was implemented so you need to specify a monitor manually.
An example from the man page:
mpvpaper DP-2 /path/to/video
If you don't know what your monitor is labeled as use this:
mpvpaper -v a a
And check the not selected warnings.

@raffaem
Copy link
Author

raffaem commented Nov 16, 2022

And how I was supposed to know it was not implemented yet?

It works perfectly fine with the old render loop.

@GhostNaN
Copy link
Owner

Not saying you should of known, perhaps I should of been more clear when I said #32 (comment)

Anyways, Now that we know it was the render loop.
What specifically is causing this behavior for Nvidia GPU drivers is the question.
I have an old Nvidia laptop I'll see if it has the same issue.

@raffaem
Copy link
Author

raffaem commented Nov 16, 2022

Bisected to 0475df8

@GhostNaN
Copy link
Owner

Yeah, that was the first commit toward the new render loop.

@GhostNaN
Copy link
Owner

GhostNaN commented Nov 16, 2022

My laptop was too old to test, only the Intel IGPU is working.
If mpvpaper didn't error out during EGL initialization we can rule out those changes.

I wonder if it's just the polling that's broken.
Go here on the new version and replace this:

mpvpaper/src/main.c

Lines 1052 to 1101 in 5dcdfc7

while (true) {
if (wl_display_flush(state.display) == -1 && errno != EAGAIN)
break;
struct pollfd fds[2];
fds[0].fd = wl_display_get_fd(state.display);
fds[0].events = POLLIN;
fds[1].fd = wakeup_pipe[0];
fds[1].events = POLLIN;
// Wait for MPV to request a new frame to be drawn
int poll_err = 0;
while (poll_err == 0) {
poll_err = poll(fds, sizeof(fds) / sizeof(fds[0]), 10); // 10ms timeout
if (poll_err == -1 && errno != EINTR)
break;
if (halt_info.stop_render_loop) {
halt_info.stop_render_loop = 0;
sleep(2); // Wait at least 2 secs to be killed
}
}
if (fds[0].revents & POLLIN) {
if (wl_display_dispatch(state.display) == -1)
break;
}
// MPV is ready to draw a new frame
if (fds[1].revents & POLLIN) {
// Empty the pipe
char tmp[64];
if (read(wakeup_pipe[0], tmp, sizeof(tmp)) == -1)
break;
mpv_render_context_update(mpv_glcontext);
// Draw frame for all outputs
struct display_output *output;
wl_list_for_each(output, &state.outputs, link) {
// Redraw immediately if not waiting for frame callback
if (output->frame_callback == NULL) {
if (output->egl_window && output->egl_surface)
render(output);
}
else
output->redraw_needed = true;
}
}
}

With this:

    // Main Loop
    while (true) {
        if (wl_display_flush(state.display) == -1 && errno != EAGAIN)
            break;

        if (wl_display_dispatch(state.display) == -1)
            break;

        mpv_render_context_update(mpv_glcontext);

        // Draw frame for all outputs
        struct display_output *output;
        wl_list_for_each(output, &state.outputs, link) {
            // Redraw immediately if not waiting for frame callback
            if (output->frame_callback == NULL) {
                if (output->egl_window && output->egl_surface)
                    render(output);
            } else {
                output->redraw_needed = true;
            }
        }
    }

Then recompile.
This will remove all the polling and mpvpaper will render as fast as your pc can.

@raffaem
Copy link
Author

raffaem commented Nov 16, 2022

No, it's still bugged.

Can you send .patch files?

@GhostNaN
Copy link
Owner

Ok, so it's not the polling. That's good. Here's the modified main.c Polling removed main.tar.gz

For my next guess, maybe it has to with implementing wl_output v4 a5a4750
Try this version before the wl_output v4 commit: https://github.com/GhostNaN/mpvpaper/tree/18c2f4131cd1ac00aa35f82c0651af17af71fa76

Or it could be that I updated the GLAD libraries 6b0f7a8
Try this version before the GLAD update: https://github.com/GhostNaN/mpvpaper/tree/32b9f9a1121b4a14afd646a5cdbfdf9924ab12e3

I'm just throwing darts and seeing what sticks.

@raffaem
Copy link
Author

raffaem commented Nov 17, 2022

With the code here there is no "dropped" error anymore, but the screen is completely black

@raffaem
Copy link
Author

raffaem commented Nov 17, 2022

Try this version before the wl_output v4 commit: https://github.com/GhostNaN/mpvpaper/tree/18c2f4131cd1ac00aa35f82c0651af17af71fa76

No. Black screen and "Dropped" error message

@raffaem
Copy link
Author

raffaem commented Nov 17, 2022

Try this version before the GLAD update: https://github.com/GhostNaN/mpvpaper/tree/32b9f9a1121b4a14afd646a5cdbfdf9924ab12e3

No. Black screen and "Dropped" error message.

I think proper debug logging capabilities should be implemented in mpvpaper.

@raffaem
Copy link
Author

raffaem commented Nov 17, 2022

I have obtained a debug log from mpv by issuing mpvpaper -o "--log-file=output.txt".

There are a bunch of

[   0.428][v][vo/libmpv] mpv_render_context_render() not being called or stuck.

output.txt

@GhostNaN
Copy link
Owner

I think proper debug logging capabilities should be implemented in mpvpaper.

I don't disagree, it's always been a bit lacking. It's just tedious to implement.
I'm going to add a new level of verbose that will make this process easier in the future.

#32 (comment) Great idea.
Although, I'm not surprised about that. It wouldn't be dropping frames if this was being called:

mpvpaper/src/main.c

Lines 155 to 159 in 5dcdfc7

// Render frame
int err = mpv_render_context_render(mpv_glcontext, render_params);
if (err < 0) {
cflp_error("Failed to render frame with mpv, %s", mpv_error_string(err));
}

With the code here there is no "dropped" error anymore, but the screen is completely black

Interesting, try also with just 1 monitor.
It's calling the render() function, but it seems it doesn't want to display.
The only thing added to the render() function that could matter is this:

mpvpaper/src/main.c

Lines 150 to 153 in 5dcdfc7

if (!eglMakeCurrent(egl_display, output->egl_surface, output->egl_surface, egl_context)) {
cflp_error("Failed to make output surface current 0x%X", eglGetError());
}
glViewport(0, 0, output->width * output->scale, output->height * output->scale);

Remove/comment out these^ lines and see if it displays onto just 1 monitor.

It could also be that Nvidia doesn't like having multiple EGL Surfaces.

@raffaem
Copy link
Author

raffaem commented Nov 17, 2022

nope, doesn't work

@GhostNaN
Copy link
Owner

Here's what I've come up to improve debugging:
verbose main

While it's not a full on debug mode, I added a bunch more error printing and verbose messages.
And there's more I could add, but this is what I have for now.

Let's see what messages you get with -v and the new -vv flag.

@raffaem
Copy link
Author

raffaem commented Nov 18, 2022

❯ ./build/mpvpaper -vv -o "--loop=inf" HDMI-A-1 "/home/raffaele/Videos/wallpaper.webm"
[*] Verbose Level 2 enabled
[+] Connected to Wayland compositor
[*] Output HDMI-A-1 (Samsung Electric Company C24F390 H4LT305839) selected
[*] OpenGL 4.6 EGL context created
[+] EGL initialized
 (+) Video --vid=1 (*) (vp9 3840x2160 25.000fps)
 (+) Audio --aid=1 --alang=eng (*) (opus 2ch 48000Hz)
[*] Loaded /home/raffaele/Videos/wallpaper.webm
[+] MPV initialized
[+] HDMI-A-1 setup is complete, ready to start rendering
[*] HDMI-A-1 is ready for MPV to render the next frame
AO: [pipewire] 48000Hz stereo 2ch floatp
VO: [libmpv] 3840x2160 yuv420p
[*] MPV is ready to render the next frame for HDMI-A-1
[*] HDMI-A-1 is ready for MPV to render the next frame
[*] HDMI-A-1 is ready for MPV to render the next frame
AV: 00:00:04 / 00:02:00 (3%) A-V:  0.000 Dropped: 97

[*] Exiting mpvpaper

@GhostNaN
Copy link
Owner

No errors...
I thought for sure there would be an issue with the EGL surface.
And it's getting held up in the main while(true) {} render loop.
Which we already knew with #32 (comment)

But there is 1 new thing here I didn't think would occur.
It made it the render() function 3 times with when it said [*] HDMI-A-1 is ready for MPV to render the next frame
from the frame_handle_done() function.
And once with [*] MPV is ready to render the next frame for HDMI-A-1 in the main while(true) {} render loop.
Although, that still doesn't point the exact source of the issue.

I've looked a lot at the diff between mpvpaper version 1.2.1 and 1.3
The only thing I could think of that's broken is the multi EGL surfaces needed for multi monitor (despite no errors)

Commits:

  1. 19f7129
  2. 0475df8
  3. 781320f

Are the main suspects.
The first was when the EGL surfaces was split among the outputs
The second reworked how the EGL contexts was used
The third added the polling to save processing time.

If you don't want to do this next step, I don't blame you.
You would have to clone and compile from each commit tree and find the 1 that breaks for you.
Either way, thank you so much for your patience!

@raffaem
Copy link
Author

raffaem commented Nov 18, 2022

Bisected to 0475df8

Didn't I bisect it?

I don't understand

@GhostNaN
Copy link
Owner

GhostNaN commented Nov 18, 2022

Didn't I bisect it?

I don't know what you mean by "bisect", like "fork"?
Are you saying 0475df8 is when it started dropping frames/breaking?
If so, this narrows down the issue substantially!

@raffaem
Copy link
Author

raffaem commented Nov 18, 2022

@GhostNaN
Copy link
Owner

Thanks, I learned something today.
I'll go narrow down the possibilities and give you a modified main.c again later.

@GhostNaN
Copy link
Owner

Here it is reordered_egl_main.tar.gz
All I did was reorder the EGL initialization similarly to the way it was before the new render loop.
I'm thinking this is were the Nvidia driver screws up.
As a proof of concept, this version only works with 1 monitor so don't specify '*'.

@raffaem
Copy link
Author

raffaem commented Nov 19, 2022

Can you at least make a branch?

It's difficult to test like this

@raffaem
Copy link
Author

raffaem commented Nov 19, 2022

It's not pretty(code wise) but this might fix the multi-monitor support: hybrid_main.tar.gz I just kept/forced the EGL initialization load order the same and added back multi EGL surfaces.

Still working.

What was the problem with EGL initialization order?

GhostNaN added a commit that referenced this issue Nov 20, 2022
@GhostNaN
Copy link
Owner

Can you at least make a branch?

Sure thing https://github.com/GhostNaN/mpvpaper/tree/nvidia-fix

Still working.

Excellent, we got multi-monitors working now!
Now, let's see if I can rework the polling to fix the issue completely.

What was the problem with EGL initialization order?

Honestly, I have no damn clue.
This is some GPU driver voodoo magic we got here.

@GhostNaN
Copy link
Owner

GhostNaN commented Nov 20, 2022

Still working.

Wait, this is with multi monitors right? Like:
mpvpaper '*' /path/to/video
The coded changes the order and that really only effects the first monitor, the rest is like the old way.

@raffaem
Copy link
Author

raffaem commented Nov 20, 2022

Still working.

Wait, this is with multi monitors right? Like: mpvpaper '*' /path/to/video The coded changes the order and that really only effects the first monitor, the rest is like the old way.

I have a laptop with an external monitor.

I call my compositor after exporting:

export WLR_DRM_DEVICES=/dev/dri/card0

in order to use only the external monitor, for the compositor to see only the external monitor and not the laptop one, as suggested here.

Given this premise,

./build/mpvpaper -vv -o "--loop=inf" '*' "/home/raffaele/Videos/wallpaper.webm"

works

@GhostNaN
Copy link
Owner

in order to use only the external monitor, for the compositor to see only the external monitor and not the laptop one, as suggested here.

So you don't have a multi-monitor setup. This is a dead end.
There might be a way to create virtual monitors, but I'm not sure how that would work.

I think I also over engineered the solution with the EGL init.
I'm hoping the simple fix is just having EGL surface and window created before mpv.
Try this commit here e21891a
If that works, just as a sanity check try the following commit here: 33e03d3

@raffaem
Copy link
Author

raffaem commented Nov 20, 2022

Nope, commit 33e03d3 doesn't work. Says dropped frames.

commit e21891a doesn't work either. Doesn't print dropped frames, but doesn't show the video either, just a black screen.

commit 7eb7b70 work correctly.

@GhostNaN
Copy link
Owner

Anymore changes to the code will basically lead us back to this point: f65700a
And I won't be mainlining 7eb7b70 because I have a feeling it will break more than it fixs for everyone else that doesn't use a Nvidia card.
As it stands, for your setup and anyone else using the Nvidia drivers, I would just use mpvpaper from this commit: f65700a
You won't loose any functionality(besides no multi-monitor support) and it won't hammer your GPU like 7eb7b70 will.

I am uncertain where I can take this issue from here.
And I'm still not convinced this is entirely mpvpaper's fault here.
Whether this is a Nvidia driver or a MPV issue is still unclear.

I really appreciate the effort you have put in to diagnosing this issue.
But unless you have any other suggestions, this is where it ends.

@jd1t25
Copy link

jd1t25 commented Dec 1, 2022

Hey just heads up.
The branch nvidia-fix doesn't work for me.
But 7eb7b70 worked

@GhostNaN
Copy link
Owner

GhostNaN commented Dec 1, 2022

I'm aware, I will most likely just delete that branch as it was a failure.
Just use f65700a going forward for now.

@raffaem
Copy link
Author

raffaem commented Dec 1, 2022

@GhostNaN what about crowdfunding a NVIDIA laptop for you? 🧐

@GhostNaN
Copy link
Owner

GhostNaN commented Dec 1, 2022

@raffaem Very funny... how about we crowdfund everyone getting an AMD laptop instead?

@raffaem
Copy link
Author

raffaem commented Dec 1, 2022

@raffaem Very funny... how about we crowdfund everyone getting an AMD laptop instead?

I was not joking

@GhostNaN
Copy link
Owner

GhostNaN commented Dec 1, 2022

That seems a bit extreme for a wallpaper program, don't you think?

@raffaem
Copy link
Author

raffaem commented Dec 1, 2022

@raffaem Very funny... how about we crowdfund everyone getting an AMD laptop instead?

That would cost much more ... too high a target to fund 😅

@raffaem
Copy link
Author

raffaem commented Dec 1, 2022

Another thing: I found out it's possible to carry out remote GDB debugging, but I never did that, I don't know how it works

@raffaem raffaem closed this as completed Dec 1, 2022
@raffaem raffaem reopened this Dec 1, 2022
@GhostNaN
Copy link
Owner

GhostNaN commented Dec 1, 2022

Another thing: I found out it's possible to carry out remote GDB debugging, but I never did that, I don't know how it works

You're just ssh-ing into a remote PC at that point.

This isn't necessarily a "I don't have Nvidia hardware, so I can't fix" although troubleshooting would be much faster.
Again, I'm fairly certain at this point this issue isn't solvable on my end anymore.
There is something very wrong going on with the Nvidia driver which isn't surprising due to Nvidia's history with Wayland.

@GhostNaN
Copy link
Owner

GhostNaN commented Dec 1, 2022

I would be just be developing a workaround at best at this point.

@raffaem
Copy link
Author

raffaem commented Jan 18, 2023

I changed laptop and I now use the nvidia-open-dkms drivers.

mpvpaper works wonders, except for a small glitch when a looped video ends and is restarted.

Closing this as technically I can't reproduce it anymore.

@raffaem raffaem closed this as completed Jan 18, 2023
@GhostNaN
Copy link
Owner

except for a small glitch when a looped video ends and is restarted.

Same here buddy, it's like it drops a frame or 2.
Although, I find the same behavior with vanilla mpv.

@raffaem
Copy link
Author

raffaem commented Jan 28, 2023

Now this also works with the proprietary NVIDIA drivers.

I don't think we need the nvidia-fix branch anymore.

@GhostNaN
Copy link
Owner

I don't think we need the nvidia-fix branch anymore.

I just removed it. Besides, it was a failure anyways...
I'm glad this finally sorted itself out and I didn't have to make a terrible/hack workaround.

@TropicLegend
Copy link

Asking out of curiosity - Are they working well?
I'm thinking about switching to them as well (rtx 3080 ti).
I can't really find much resources on the web about this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants