Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 3d shadowcasting 4 #42835

Merged
merged 18 commits into from
Sep 28, 2020

Conversation

kevingranade
Copy link
Member

Summary

SUMMARY: Performance "Overhaul shadowcasting to be iterative instead of recursive."

Purpose of change

The semi-recursive shadowcasting implementation we're using has extremely poor cache behavior.
It's also extremely complicated and difficult to reason about.

Describe the solution

Instead of recursing when new spans are discovered, stash the newly discovered span(s) and iterate over them in a deterministic (and predictable to the memory fetcher) order.

Testing

This PR adds a number of unit tests, we need some extensive in-game testing to verify that these act as expected.

Additional context

This is one of the longest-lived feature branches we've ever had, I started it but then stalled, @ifreund picked it up and carried it forward a great deal, and just now I've rebased it on the current master and am attempting to complete it.

@lgtm-com
Copy link

lgtm-com bot commented Aug 10, 2020

This pull request introduces 1 alert when merging 752ab21 into 7ba9f97 - view on LGTM.com

new alerts:

  • 1 for Use of c-style math functions

@kevingranade
Copy link
Member Author

kevingranade commented Aug 11, 2020

I discussed this with isaac, and he reminded me of the pending issue we discovered before he had to put this down.

Basically when splitting spans, we need to reference both the current row and either the previous or next row in order to split them correctly, otherwise view artifacts can appear.

This truth table outlines how splitting should proceed.

B C D E B->C B->D C->E D->E
O O O T X X Near Near
O O T O X Near X Far
O O T T X Near Near ?
O T O T Near X ? Near
O T T O Near Near Far Far
O T T T Near Near ? ?
T O O O Far Far X X
T O O T Far Far Near Near

The current proposal for achieving this is to split the current row from the preceding and following rows (if any), and store a bit string encoding which cells are opaque or transparent in the current row. This bit string is consulted on the pass through the next row to make splitting decisions.

@ifreund
Copy link
Contributor

ifreund commented Aug 12, 2020

One thing we overlooked last time we discussed this is that transparency is not binary. We can (and do) have tiles which are more/less transparent than others and need to split spans in these cases as well. Thus, a bit string is not sufficient.

@kevingranade
Copy link
Member Author

We didn't disscuss it, but I believe it's sufficient, though I'll be re-checking.
If I remember correctly, Previous row opacity is used to differentiate between near and far splitting in D and E, it doesn't induce new splitting if it's different shades of transparent.

@kevingranade kevingranade force-pushed the fix-3d-shadowcasting-4 branch from 752ab21 to 85d53e5 Compare August 18, 2020 07:49
@lgtm-com
Copy link

lgtm-com bot commented Aug 18, 2020

This pull request introduces 1 alert when merging 85d53e5 into 0f7a804 - view on LGTM.com

new alerts:

  • 1 for Use of c-style math functions

@stale
Copy link

stale bot commented Sep 17, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Please do not 'bump' or comment on this issue unless you are actively working on it. Stale issues, and stale issues that are closed are still considered.

@stale stale bot added the stale Closed for lack of activity, but still valid. label Sep 17, 2020
@stale stale bot removed the stale Closed for lack of activity, but still valid. label Sep 19, 2020
@lgtm-com
Copy link

lgtm-com bot commented Sep 19, 2020

This pull request introduces 1 alert when merging 8935e42 into d7d885f - view on LGTM.com

new alerts:

  • 1 for Use of c-style math functions

@esotericist
Copy link
Contributor

based on my testing, this PR addresses all of the flaws i know of in the previous 3d fov implementation.

there are a couple of oddities that i think are worth pursuing in the future, but are not actually new to this implementation:

when on a roof, you can see all enemies on the ground below, and they can see you.
when you are on the ground, you cannot see enemies on the roof, and they cannot see you.

this is fortunately not exploitable, but it is very disconcerting to know "i can see down from a rooftop and spot zombies that can see me back, but i can't tell if there's zombies on a roof before i climb up there".

but again, that was present in the prior implementation.

another item: curtains in vehicles block floodlights. this is a flaw that exists even with 3d fov disabled, but it's pretty absurd, and i had just previously thought it was a 3d fov bug (but no, it's just a generalized lolwat).

near as i can tell, this is ready from a does-it-work-right standpoint.

@anothersimulacrum anothersimulacrum added 0.F Feature Freeze <Enhancement / Feature> New features, or enhancements on existing [C++] Changes (can be) made in C++. Previously named `Code` Code: Performance Performance boosting code (CPU, memory, etc.) labels Sep 25, 2020
@esotericist
Copy link
Contributor

this should be considered to fix #31952

Copy link
Contributor

@jbytheway jbytheway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a review of just the test code to begin with. Plan to review the algorithm next.

tests/shadowcasting_test.cpp Outdated Show resolved Hide resolved

const int upper_bound = fov_3d ? OVERMAP_LAYERS : 12;
for( int z = fov_3d ? 0 : 11; z < upper_bound; ++z ) {
caches[z] = new level_cache();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid raw new? Use unique_ptr? I guess it's a bit annoying because you need the array of pointers, so you'd need a separate container for the unique_ptrs...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the way I'm mapping between the contents of the level cache and the relatively large type definitions and establishing aliases for them, this is really touchy to adjust, so I'd rather not open that can of worms for an isolated invocation of new in some test code.
I 100% agree that I don't want any avoidable new in the main game code.

tests/shadowcasting_test.cpp Outdated Show resolved Hide resolved
tests/shadowcasting_test.cpp Outdated Show resolved Hide resolved
tests/shadowcasting_test.cpp Outdated Show resolved Hide resolved
tests/shadowcasting_test.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@jbytheway jbytheway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking promising. I've only looked at the code, not tested in-game.

src/shadowcasting.cpp Outdated Show resolved Hide resolved
src/shadowcasting.cpp Outdated Show resolved Hide resolved
src/shadowcasting.cpp Show resolved Hide resolved
src/shadowcasting.cpp Outdated Show resolved Hide resolved
src/shadowcasting.cpp Outdated Show resolved Hide resolved
src/shadowcasting.cpp Outdated Show resolved Hide resolved
src/shadowcasting.cpp Outdated Show resolved Hide resolved
bool floor_block = false;
if( current.z < offset.z ) {
if( z_index < ( OVERMAP_LAYERS - 1 ) &&
( *floor_caches[z_index + 1] )[current.x][current.y] ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a problem with this floor logic. Imagine a building with floors but no walls. When looking from the side towards such a building from far away you will only be able to see the tiles on the same level as you. But really you should be able to see some amount of tiles on higher and lower levels. Of course, to do this correctly we would need to distinguish "seeing the floor of the tile" and "seeing stuff (like monsters) in the tile", because you shouldn't be able to see the floor on tiles above. But I'd say in the medium term I'd be happy with players seeing the tiles above even if it allows them to see floors they shouldn't.

I don't think this needs to be fixed in this PR. Might be worth a TODO, though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea put in a TODO, this needs some serious thought about how to handle this, and it's not a can of worms I want to open right now.

src/shadowcasting.cpp Show resolved Hide resolved
@kevingranade kevingranade marked this pull request as ready for review September 26, 2020 20:14
@kevingranade kevingranade force-pushed the fix-3d-shadowcasting-4 branch 2 times, most recently from 5f1f318 to 72ebbe4 Compare September 26, 2020 21:41
Copy link
Contributor

@jbytheway jbytheway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One tiny niggle, but LGTM.

// See definition of trailing_edge_major and leading_edge_major for clarification.
const slope trailing_edge_minor( delta.x * 2 - 1, delta.y * 2 + 1 );
const slope leading_edge_minor( delta.x * 2 + 1, delta.y * 2 - 1 );
static half_open_rectangle<point> bounds( point_zero, point( MAPSIZE_X, MAPSIZE_Y ) );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static half_open_rectangle<point> bounds( point_zero, point( MAPSIZE_X, MAPSIZE_Y ) );
static constexpr half_open_rectangle<point> bounds( point_zero, point( MAPSIZE_X, MAPSIZE_Y ) );

@kevingranade
Copy link
Member Author

Wow, this is more impactful than I even expected.
It shows up a bit with CPU profiling, but then I remembered the main benefit we expected is better cache behavior, so I ran the shadowcasting benchmarks to get the full picture.

shadowcasting_3d_2d_performance
Old code:
Note this first execution was much longer than the others.
cast_zlight() executed 100000 times in 1735470291 microseconds.
cast_zlight() executed 10000 times in 220941276 microseconds.
cast_zlight() executed 10000 times in 151328801 microseconds.
cast_zlight() executed 10000 times in 150332814 microseconds.
Total 2258073182 microseconds over 130000 executions.
Average 17.36ms / execution

New Code:
cast_zlight() executed 10000 times in 134063135 microseconds.
cast_zlight() executed 10000 times in 69909670 microseconds.
cast_zlight() executed 10000 times in 124742772 microseconds.
cast_zlight() executed 10000 times in 97593569 microseconds.
cast_zlight() executed 10000 times in 112892918 microseconds.
cast_zlight() executed 10000 times in 66551564 microseconds.
Total 605753628 microseconds over 60000 executions.
Average 10.09ms / execution

kevingranade and others added 17 commits September 27, 2020 03:51
Replace floats with new zlight_slope struct.

This now passes 2 of the tests that the old recursive algorithm was failing,
but encounters an infinite loop in some cases without the fuzzing that seems
to have been hiding a major bug.

Fix a few other minor bugs.
This is the best way I could find to fix the infinite loop without
creating tons of artifacts.
- Add casting direction diagrams
- Remove unnecessary declaration
This commit adds a new cast_vertical_zlight_segment function with
different iteration logic than the now renamed
cast_horizontal_zlight_segment.

This duplicates a lot of code, but imo is worth it for the readablity
gains over adding a lot more contitonals to the iteration logic to handle
zx and zy template parameters properly.
and remove the code that allows disabling it
It somehow took me until now to realize that y coordinates are counted
from the "top" of the screen not the bottom, so fix the comment
diagrams
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[C++] Changes (can be) made in C++. Previously named `Code` Code: Performance Performance boosting code (CPU, memory, etc.) <Enhancement / Feature> New features, or enhancements on existing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants