Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make high resolution scrolling more responsive and configurable #1129

Merged
merged 2 commits into from
Nov 11, 2018

Conversation

maximbaz
Copy link
Contributor

@maximbaz maximbaz commented Nov 9, 2018

I was testing high resolution scrolling under Wayland and found that scrolling becomes acceptable if I multiply yoffset by 10 and awesome if I multiply it by 20.

Since wheel_scroll_multiplier is not used under high resolution scrolling at all, I thought we could put it to good use here, so that responsiveness becomes configurable.

Under Wayland, my viewport_y_ratio is 2. I added it to this patch because it was suggested by @Luflosi in #1112, if you want to keep it, 82f9aec probably needs to be reverted.

With this formula, scrolling is acceptable with default settings and I can further improve it if I change wheel_scroll_multiplier to 10.

If we remove viewport_y_ratio from the formula, the default value of wheel_scroll_multiplier (5) is too slow, and I would need to change it to at least 10, but better to 20.

@Luflosi
Copy link
Contributor

Luflosi commented Nov 9, 2018

The default for wheel_scroll_multiplier is still 5, right? This will make kitty scroll five times as fast apter applying this patch. Is it possible to change the default to 1 when high resolution scrolling is supported?

@maximbaz
Copy link
Contributor Author

maximbaz commented Nov 9, 2018

Yes, by setting wheel_scroll_multiplier to 1 in kitty.conf 🙂 But comparing to all other apps, kitty is scrolling much slower, and with this patch it scrolls approximately as fast as others (and by setting wheel_scroll_multiplier to 10 I get the speed of sakura).

UPDATE: I misunderstood what you are suggesting, sorry, you are proposing to change default value to 1. Still think 5 is good for a default value, based on a personal experience comparing scrolling speed with other apps as I described above.

@Luflosi
Copy link
Contributor

Luflosi commented Nov 9, 2018

At least on macOS a value of 1 makes kitty scroll as fast as all other apps.

@maximbaz
Copy link
Contributor Author

maximbaz commented Nov 9, 2018

Just to confirm, are you testing with 82f9aec reverted? Because both that commit and this PR increase the speed on macOS.

In any case, I don't really care about the default value, I'm totally fine changing it in my local config as long as it's configurable 😉

@Luflosi
Copy link
Contributor

Luflosi commented Nov 9, 2018

That commit doesn't make a difference for me since yscale is 1 for me since I don't have a retina display. And I didn't suggest changing the default value for everyone, only where high precision scrolling is supported. I think that 5 lines per scroll event are pretty reasonable when scrolling with a mouse that doesn't support high precision scrolling.

@maximbaz
Copy link
Contributor Author

maximbaz commented Nov 9, 2018

Maybe it wasn't a good idea then to re-use wheel_scroll_multiplier option and we better have a different one for high precision scrolling...

@Luflosi
Copy link
Contributor

Luflosi commented Nov 9, 2018

I think the semantics of that option make perfect sense for low and high precision scrolling and we should not create a new option.

@maximbaz
Copy link
Contributor Author

maximbaz commented Nov 9, 2018

If the option will have different values based on whether the device is "high" or "low" precision, users would need to have a way to override both these values, and this will be impossible I think...

In other words, I would like to override wheel_scroll_multiplier to 10 to get the speed as in sakura with touchpad, but I don't want to change the default 5 for when I use a mouse.

@kovidgoyal
Copy link
Owner

That is not the right place for this patch. It belongs in glfw/wl_window.c just like the fix for macOS that I committed recently.basically the backend should be responsible for providing the correct number of scrolled pixels as reported by the OS. kitty should not be messing with that value.

As for adding a configurable multiplier, I agree it should be a separate option, after all the current one is named "wheel_scroll_multiplier" so the new one could be named "touch_scroll_multiplier" or similar, and it should default to 1.

@kovidgoyal
Copy link
Owner

It would be interesting to see what VTE does on wayland. Why is it scrolling faster than kitty? Does it multiply the pixels by something and if so where does the value of that something come from? @egmontkob you know the answer?

@egmontkob
Copy link

GTK+ gives us a smooth scrolling delta which hides from us the X11 vs. Wayland differences, and is probably already scaled by some global GTK+ or GNOME setting (I'm not sure about it). I'm not aware of these details.

Then in VTE it's scaled according to ceil(height / 10.0) here, which may not be a good approach as per 748012. See also 769696.

@kovidgoyal
Copy link
Owner

@egmontkob thanks for the pointers

@maximbaz
Copy link
Contributor Author

Alright, I don't think I should modify glfw/wl_window.c at this point as I don't really know if it does report wrong number of pixels on HiDPI screen or not, I don't have other Wayland apps to compare to (sakura is x20 faster, and Chromium and Firefox run in X-compatibility mode on Wayland so I can't take them as an example), and I don't want to make backend arbitrarily respond with twice as large numbers.

Let's start with implementing touch_scroll_multiplier, I'll get on it.

@kovidgoyal kovidgoyal merged commit 5e27c21 into kovidgoyal:master Nov 11, 2018
@ducis
Copy link

ducis commented Mar 11, 2020

Hello, I am wondering, in the current master branch, when the last line is updated (or a new line is added to the bottom), does kitty redraw the whole window? like if there is an "A" in the second line from the bottom, then the first line from the bottom changed, does kitty go through the process of rasterizing the letter "A" again? I assume if it does not need to (by holding a buffer of some previously rasterized parts of the terminal window), both high-prec scrolling would be easy and less computation would be needed?

@ducis
Copy link

ducis commented Mar 11, 2020

So far as I know, a terminal program either is only able to change some lines from the bottom (normal terminal program, and tab completion) or takes control of the whole (virtual) terminal (like vim, tmux, or clear), it does not change lines from the top or in the middle. In the latter case you don't have scrolling anyway. So we can always assume a terminal program changes a certain number of lines from the bottom. Is it correct?

@kovidgoyal
Copy link
Owner

kitty only ever rasterizes a character once, that's part of the magic
behind its performance. And no nay terminal program
can make changes to any part of the screen at any time, for example by
simply changing the background color, or directly by moving the cursor
around the screen.

@ducis
Copy link

ducis commented Mar 11, 2020

Thanks. Then where shall I start looking into the scrolling/rendering stuff? In particular, which variables in the source refer to the previously rasterized content?

@kovidgoyal
Copy link
Owner

there is no previously rasterized content. rendering happens on the GPU.
If you want to get previously rendered content you have to render into a
framebuffer. There is code that does that for other reasons in shaders.c
but you are going to need familiarity with OpenGL to make it work

@ducis
Copy link

ducis commented Mar 11, 2020

So by "rasterizes a character once" you meant there are caches like many images with the size of individual characters in the graphics memory?

I guess by "does that for other reasons in shaders.c" you mean draw_cells_interleaved_premult().
Is this function triggered in, say, a fresh install of kitty without customized configuration file or passed commandline options? If not, how should I enable it so that I can test if it affects the latency first?

@ducis
Copy link

ducis commented Mar 11, 2020

Does the framebuffer to render into live in the graphics memory or the main memory?
Usually for this kind of things it should live in the graphics memory and the application has pointer-like stuff with which the app can tell the GPU to do things like memcpy() purely in the graphics memory.
But I am not sure if you meant we have to fetch the rendered results into the main memory by "you have to render into a framebuffer", which would be much worse performance-wise.

@kovidgoyal
Copy link
Owner

kovidgoyal commented Mar 11, 2020 via email

@kovidgoyal
Copy link
Owner

kovidgoyal commented Mar 11, 2020 via email

@ducis
Copy link

ducis commented Mar 11, 2020

Hi, I just tested with typometer.
The environment is compiz (0.8.*, not the Ubuntu one) 60Hz frame rate globally,
both delays turned to 0, VSync enabled in compiz but disabled in kitty (so no tearing).
Transparency within kitty roughly doubles the latency from ~17ms to ~31ms,
with or without icatted images.
But transparency with compiz plugins (which can make any window translucent) results in latency ~26ms.
Would you think that the 17ms->31ms is caused by going through draw_cells_interleaved_premult()?
Am I missing anything?

@kovidgoyal
Copy link
Owner

Put a printf() in draw_cells_interleaved_premult() to check if it is going through that. IIRC without images it will not be used, see line 636 onwards of shaders.c for details. Either you need transparency + any image or non-transparent and negative z-index images.

@ducis
Copy link

ducis commented Mar 11, 2020

I put in the printf() and tested again. Yes, draw..premult() is only triggered by both background_opacity and icat.
How transparency affects latency really depended on what windows were below it.
It seems that the more complex stuff are below a translucent window the more likely the computation will take more than one frame.
But in general transparency,
as well as draw_cells_interleaved_premult() when triggered by icat, do not force an additional frame.

@kovidgoyal
Copy link
Owner

You should be able to trigger it with just

kitty -o background_image=whatever.png

and no transparency (assuming you are running from master)

@ducis
Copy link

ducis commented Mar 11, 2020

No, I am running from 0.16.0 which does not seem to have background_image,
but I just forced draw_cells_interleaved_premult() at the end of draw_cells().
Somehow it made the font bolder and much uglier when background_opacity=1,
so there was something wrong this way.
But regardless, the speed was just as fast as unmodded whether background_opacity = 1 or not,
so long as there was nothing below the window to be actually blended.

Maybe there should be an option to explicitly force rendering to a framebuffer
regardless of transparency or whatever?
So everyone can benchmark. After some time we may find out that
we can use an intermediate framebuffer with negligible penalty in performance,
which would then enable more fancy stuff.

I am even using Intel graphics, an additional copy would be even more forgiving on a real GPU.

@kovidgoyal
Copy link
Owner

Additional copies may have negligible performance penalty, in terms of
render time, but that's not my philosophy. My goal is to reduce energy
consumption, not just render time. That is why kitty has various
settings that actually insert latency into the render loop. And
performing unnecessary operations on all renders just for smooth
scrolling is not a worthwhile tradeoff.

I dont really see why that would be needed anyway. You could just switch
rendering to using a framebuffer while scrolling and turn it off again
after. It would mean writing a bit more code which you would have to do
anyway, since to actually implement scrolling you need to render stuff
above and below the window limits.

@ducis
Copy link

ducis commented Mar 11, 2020

It is hard to say without actual implementation and benchmarking, but keeping a large buffer of results painted in the last frame might actually reduce computation at the expense of more VRAM.

But whatever, what is the current situation of smooth scrolling in the master branch?
I saw that you merged something from this issue and the Luflosi branch in
#1454,
but I couldn't figure out whether we can enable (some sort of) smooth scrolling in the current master.

@kovidgoyal
Copy link
Owner

there is no smooth scrolling #1454

@Luflosi
Copy link
Contributor

Luflosi commented Mar 11, 2020

@ducis We can work on this together, if you like.

@ducis
Copy link

ducis commented Mar 11, 2020

@Luflosi
Once I switch completely from tmux.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants