Consumer shows countdown clock *skipping* times in the output #1441

Closed
pbelbin opened this issue Oct 24, 2022 · 13 comments · Fixed by #1499

@pbelbin
Contributor

pbelbin commented Oct 24, 2022

I created an HTML template that uses three.js to produce a countdown clock with 3D text.

The template is written so that, rather than re-rendering on every frame, it renders only when the time changes (i.e. once per second), as a way of reducing the GPU load.
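The actual template is in clock.zip and is not reproduced here. As a rough illustration of the render-on-change idea only, here is a minimal sketch; `createSecondGate` and `countRenders` are hypothetical names, not functions from the template.

```javascript
// Hypothetical helper (not taken from clock.zip): returns a function that
// reports whether the whole second has changed since the last render.
function createSecondGate() {
  let lastSecond = null;
  return function shouldRender(nowMs) {
    const second = Math.floor(nowMs / 1000);
    if (second === lastSecond) {
      return false; // same second as the last render: skip this frame
    }
    lastSecond = second;
    return true; // new second: the caller should re-render
  };
}

// Feed the gate one timestamp per frame and count how often it asks for a render.
function countRenders(frameIntervalMs, durationMs) {
  const shouldRender = createSecondGate();
  let renders = 0;
  for (let t = 0; t < durationMs; t += frameIntervalMs) {
    if (shouldRender(t)) renders += 1;
  }
  return renders;
}
```

In a real template the gate would presumably sit inside a `requestAnimationFrame` callback with `Date.now()` as its input, so that only one frame per second triggers an actual render.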

The problem: the consumer output appears to skip some times entirely!

Expected behaviour

Every second of the countdown should appear in the output

Current behaviour

Some seconds of the countdown output are not shown in the consumer output


Steps to reproduce

  1. Install the clock template to your CasparCG server: clock.zip
  2. Make sure your CasparCG configuration file has HTML graphics enabled for access to the GPU.
  3. Create a rundown that uses the template
  4. Play the countdown.
  5. Your consumer (whatever you configured it to go to) should start showing you a countdown to a distant date & time.
  6. Observe the consumer output, and you will probably see, occasionally, that the output skips over some times being shown
  7. This behavior does not happen in a regular Chrome browser with graphics acceleration enabled.
  8. The template outputs to the CasparCG log the time that it's rendering, so you'll be able to see that it's not skipping over any times.

Environment

  • Commit: abc3582
  • Server version: 'master'
  • Operating system: Windows 10

Note: this is not a new problem introduced with the above commit, but, since the CEF has been updated, I figured I'd mention that I've tested it with the above version too, and find it's skipping the output too.

Also, I've seen this behavior happen where the template is much simpler, and it's just doing 2D text output.


Screenshots

Screen recording of a web browser (top), then the screen consumer (bottom) and the CasparCG log in the background,
showing that it's not skipping over any times, but, watch long enough, and you'll see when the CasparCG output does not appear, and the time sticks for an extra second.

Check out the clock at 00:07 of the recording, and you'll see what I'm talking about!

Screen.Recording.2022-10-23.at.09.39.36.73.PM.mp4
@JonFranklin301

JonFranklin301 commented Nov 2, 2022

Have experienced this issue too when displaying a count up timer with Lottie.
The console log shows the correct time, but it does not visually update on the CasparCG consumer.

I had been pulling my hair out over this! Temporary workaround is to re-render every 200ms instead of every 1 second, not ideal but better than missing a number completely!

Windows 10, casparcg-server-v2.3.1-lts-stable

@pbelbin
Contributor Author

pbelbin commented Nov 3, 2022

Thank you for confirming! Good at least, to see it's not just me!

I've also seen other examples where CSS animations that fade things to fully opaque or fully transparent do not run to completion, which is even more problematic in some ways, and which I suspect is rooted in the same problem.

Would be really good if this issue can be fixed properly. Regardless of whether the underlying issue is CasparCG or CEF, it's eating into the reliability reputation (and therefore usability) of CasparCG, unfortunately!

@pbelbin
Contributor Author

pbelbin commented Nov 3, 2022

If I understand correctly, CEF is also used by OBS, so it might be interesting to see if the issue shows up in OBS.

If you're interested in doing that, there's a small tweak to make regarding making the clock immediately start playing.

Edit the .html file (in the zip), and in the <body> element, where it has onload="postLoad();", instead have it do: onload="postLoad();play();", so that the clock will start once OBS has opened the web page in a browser scene.

I'm giving this a test locally, and so far, it seems that OBS (latest 27.x.x) is managing to show the clock without skipping times, but it would be good to have this confirmed by others.

Of course, if this is confirmed, then it would suggest that something is up regarding how CasparCG is using CEF.

@pbelbin
Contributor Author

pbelbin commented Nov 3, 2022

It's possible #1431 is another manifestation of the same issue, as, it seems, in both cases, the updates being asked for are somewhat infrequent, and somehow, the expected result is not appearing in the output.

@filiphanes
Contributor

> I had been pulling my hair out over this! Temporary workaround is to re-render every 200ms instead of every 1 second, not ideal but better than missing a number completely!

If the timer is refreshed/sampled only once every 1s, I would expect this behaviour, though I would not consider it much of a problem.
The refresh code doesn't run exactly every 1s; there can be slight delays, so when the timer is ticking very near a second boundary, a sample can land on the far side of the next boundary and a number is skipped. A simple solution is to double the sampling rate, to every 500ms. With a 500ms timer, in the worst case it would show the next number half a second late, but not skip it completely.

Visual explanation of skipping 1:


seconds:   1                        2                        3                        4
-----------|------------------------|------------------------|------------------------|-----
samples:  ^ 0                       ^ 2                       ^ 3                       ^ 4
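
To make the diagram concrete, here is a small simulation of the sampling argument. It is a sketch under simplifying assumptions: the delay is modelled as a constant `lateMs` added to every tick, and `sampledSeconds` and `skippedSeconds` are hypothetical names used only for illustration.

```javascript
// Return the whole seconds a sampler would display if it is meant to fire
// every `intervalMs` but each tick arrives `lateMs` later than intended
// (a simplified, constant model of timer delay).
function sampledSeconds(startMs, intervalMs, lateMs, ticks) {
  const shown = [];
  let t = startMs;
  for (let i = 0; i < ticks; i += 1) {
    shown.push(Math.floor(t / 1000));
    t += intervalMs + lateMs;
  }
  return shown;
}

// List the seconds between the first and last sample that never appeared.
function skippedSeconds(shown) {
  const seen = new Set(shown);
  const missing = [];
  for (let s = shown[0]; s <= shown[shown.length - 1]; s += 1) {
    if (!seen.has(s)) missing.push(s);
  }
  return missing;
}

// 1000 ms sampling, starting 10 ms before a second boundary, 20 ms late per tick.
const oncePerSecond = sampledSeconds(990, 1000, 20, 4); // [0, 2, 3, 4]
// 500 ms sampling with the same start and the same delay.
const twicePerSecond = sampledSeconds(990, 500, 20, 8); // [0, 1, 2, 2, 3, 3, 4, 4]

console.log(skippedSeconds(oncePerSecond));  // [1]
console.log(skippedSeconds(twicePerSecond)); // []
```

With these numbers the 1000 ms sampler reproduces the 0, 2, 3, 4 sequence from the diagram, while the 500 ms sampler shows every second at least once.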

@pbelbin
Contributor Author

pbelbin commented Jan 23, 2023

I agree that some things can be done to help mitigate the issue that CasparCG seems to be having.

But, have you tried using OBS to do the same job?

I did some testing, and it appears that OBS does not have the issue we're talking about.

Since OBS and CasparCG are doing conceptually similar things, and both use the Chromium embedded browser to do it, I have to wonder whether something else in CasparCG is causing the issue, or whether a configuration issue is causing this errant behavior in CasparCG's use of the Chromium embedded browser.

Give OBS a try with something that causes issues in CasparCG and see how it goes, and report back.

It would be good to have confirmation of what I've observed, and this might help point to something that CasparCG is doing differently and that can be fixed.

@Julusian
Member

Julusian commented Feb 8, 2023

I think we have finally been seeing this issue and I have an idea of the cause.
What we have seen is that some graphics are playing frame sequences such as A B B D E F. I have verified that this is not Caspar repeating frames, but that each frame we displayed was a different frame we were given by CEF.

I think (do not know how to verify) this is caused by the changes OBS made, to output a single texture. They are issuing an async copy to this texture, then CEF tells us about the new frame without waiting for this to complete.
We are then making our own copy from this texture so that we can buffer frames, and what I think is happening is that their D3D copy is not happening before our OpenGL copy is done. Or maybe it could be something with caching due to the interop. There are no mutexes being used as part of their texture copy, so it could also be a case of the copies happening simultaneously.

In theory the answer is simple, rebuild CEF without those OBS patches and hopefully we will go back to the previous behaviour.
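
The suspected race can be illustrated with a deterministic toy model. This is only a sketch of the hypothesis above, not CEF or CasparCG code: `consumeSharedTexture` and `copyLandsInTime` are hypothetical names, and real GPU copies are not scheduled this simply.

```javascript
// Toy model of the suspected race. `frames` is what the producer renders;
// `copyLandsInTime[i]` says whether the producer's asynchronous copy of
// frame i has completed by the time the consumer reads the shared texture.
function consumeSharedTexture(frames, copyLandsInTime) {
  let sharedTexture = null; // contents currently visible to the consumer
  const consumed = [];
  frames.forEach((frame, i) => {
    if (copyLandsInTime[i]) {
      sharedTexture = frame;        // copy finished before the consumer reads
      consumed.push(sharedTexture); // consumer copies the fresh frame
    } else {
      consumed.push(sharedTexture); // consumer copies stale contents
      sharedTexture = frame;        // producer's copy lands too late
    }
  });
  return consumed;
}

// One late copy (frame C) is enough to produce the observed pattern.
const seen = consumeSharedTexture(
  ["A", "B", "C", "D", "E", "F"],
  [true, true, false, true, true, true]
);
console.log(seen.join(" ")); // A B B D E F
```

In this model every notification is distinct, yet the consumer's copy for frame C picks up the contents of frame B, which matches the A B B D E F sequence described above.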

@pbelbin
Contributor Author

pbelbin commented Feb 11, 2023

Hey Julian,

What you've described sounds like a very reasonable explanation of how CasparCG might end up not seeing all rendered frames.

Have you had a chance to try out your possible solution? I'm very keen to see this issue fixed.

@Julusian
Member

I have tried but have not had any luck producing a build of CEF that works.
CEF 95 produced some weird compiler errors, which I assume come from some kind of tooling update and might be fixable, but only with a lot of guesswork or trial and error.
CEF 103 compiles, but doesn't send any frames.

This is taking a while as a full compile of CEF takes about an hour on my i9-10920x, which needs to be done when switching between version branches.

I am considering attempting an even newer version of CEF, but the shared-texture patch we use needs completely rewriting for this, which may not be a viable route to go for now.

I will be continuing on this during the week.

@pbelbin
Contributor Author

pbelbin commented Feb 18, 2023

Hey Julian,

I saw the update you provided a handful of hours after you made it. Thank you for the update! I appreciate the effort involved in fixing this problem.

Any additional news to add?

I did have a bit of a tinker with the CEF project, trying out their sample project, and managing to get their bare-bones client to work, seemingly, with the off-screen-rendering. I was reading about some switches, and the ability to use D3D on windows, but OpenGL on other platforms, regarding doing the buffer copies when rendering off-screen.

I'm guessing you're fully aware of that sample though...

I saw some info suggesting that on Windows, the D3D buffer copy is faster than the equivalent OpenGL way of doing it, but I really don't know much about that. It makes sense that using Windows' own API would be faster on a Windows system, but whether the difference is significant is another story. And I guess you need a platform-agnostic way to do it anyhow.

@PeterAkakpo

Any progress?

@Julusian
Member

Not much has progressed on this.
The NRK builds of 2.4 have reverted the CEF update to avoid this issue; I am wondering if the same should be done here until there is a proper solution.

The core of the issue is that the upstream CEF builds have a broken shared texture implementation, so we are using some OBS builds of CEF which have fixed it.
However, those builds are incompatible with the way we process and buffer frames, causing this issue. OBS has a newer version of CEF now too, which only produces black for us, as they have further embraced this flow.
The problems with their builds are a lack of mutexes, which makes it unsafe to use their textures with D3D to OpenGL interop, and, in the later build, the callback being called only when there is a new texture rather than for every frame (so we would have to hope our timer and theirs were in good sync).

The bigger problem is that the patches OBS are using are incompatible with any newer versions of Chromium. Chromium has finished migrating drawing subsystems, so the whole approach to sharing textures needs rewriting from scratch. This is a very big and complex task.
At the same time, hopefully this new implementation can also work on Linux, an area which has been missing until now.

Maybe we could get something working with the current CEF builds, but so far I have not had any success.
Even those builds are a little old now, and going newer requires large changes in CEF, which I have started prototyping, as has someone else.

Alternatively, I am considering whether using QtWebEngine instead of CEF is a better approach, as they seem to have this all figured out. This could introduce new system requirements, and looks likely to require other internal changes to let Qt play well with the rest of Caspar. This is something I am playing with, but I don't have anything to share yet.

@Julusian
Member

I have a PR which updates CEF and makes some changes to how frames move between CEF and caspar. I used this issue as a test case and I believe it no longer occurs.
Performance may or may not be the same as before though. You can read more and find a build in the PR #1499
