Optimize allocation patterns in item and elsewhere #70423
Merged: Maleclypse merged 12 commits into CleverRaven:master from akrieger:itemizing_item_optimizing on Jan 6, 2024
akrieger commented Dec 24, 2023
github-actions bot added labels Dec 24, 2023:
- Map / Mapgen (Overmap, Mapgen, Map extras, Map display)
- [C++] (Changes (can be) made in C++. Previously named `Code`)
- Fields / Furniture / Terrain / Traps (Objects that are part of the map or its features.)
- Code: Performance (Performance boosting code for CPU, memory, etc.)
Great, I managed to make every compiler mad at me somehow.
github-actions bot added the astyled label (astyled PR, label is assigned by github actions) Dec 24, 2023
inogenous reviewed Dec 24, 2023
prharvey reviewed Dec 24, 2023
akrieger force-pushed the itemizing_item_optimizing branch from 8216003 to ec501ee, December 26, 2023 20:44
github-actions bot added the json-styled label (JSON lint passed, label assigned by github actions) Dec 26, 2023
akrieger commented Dec 26, 2023
github-actions bot added the BasicBuildPassed label (This PR builds correctly, label assigned by github actions) Dec 26, 2023
Clang has many errors. Do they belong to you?
Yep I made it mad. I'll have to fix.
akrieger force-pushed the itemizing_item_optimizing branch from ec501ee to 211e1f0, January 4, 2024 21:09
Hm all my stuff together, not separately, seems to be hitting an issue. Let me verify I didn't mess up a merge somewhere.
akrieger added the 0.H Backport label (PR to backport to the 0.H stable release candidate) Jan 6, 2024
kevingranade added a commit that referenced this pull request May 14, 2024
Procyonae added the 0.H Backported label and removed the 0.H Backport label May 15, 2024
Labels: 0.H Backported, astyled, BasicBuildPassed, [C++], Code: Performance, Fields / Furniture / Terrain / Traps, json-styled, Map / Mapgen
Summary
Performance "Optimizing mostly item to reduce load-time allocation by over 20%"
Purpose of change
Memory profiling CDDA is extremely painful because just getting through load can take a stupid amount of time. Reducing the number of allocations done during load helps with that specifically, and optimizing for allocations is also generally beneficial for performance.
Describe the solution
There are a few things going on here.
- `item` contains several `std::` container types. These types are not all `noexcept` movable, so `item` can't be `noexcept` movable either. This means containers of items like vectors have to copy items when they grow and relocate their elements, instead of efficiently moving them, which is a horrendous waste of allocations. Introduce a `heap<>` template wrapper which we can use to put these nested containers on the heap. Although this is another level of indirection, these aren't hotly accessed members, so the penalty is negligible. But by wrapping the std containers in the `heap<>` type, we can make `item` `noexcept` movable, which is vastly more performant. This by itself eliminates many millions of allocations.
- Because the `heap<>` type operates as `unique_ptr` on move, we don't pay the penalty of reinitializing the containers when they are moved-from. These reinitializations are typically wasted allocations that are immediately freed. This does mean a moved-from `item` is definitely not reusable anymore, but the typical pattern is that moved-from objects aren't reused anyway.
- `utf8_display_split`. The original creates a separate string per non-zero-width UTF-8 character in mapgen palettes, afaict. We can instead write a version that pushes `string_view`s into a by-ref `vector` argument. That `vector` of `string_view`s can then be statically allocated in mapgen and cached across calls, resulting in effectively constant allocation space. (Alternatively, using a `cata::small_literal_vector` to stack allocate may work because `string_view` is trivial and qualifies for that helper class.)
- Applying `lazy<>` to a few more members, such as `field::_field_type_list` and, notably, `item`'s `safe_reference_anchor` member, also saves noticeable amounts of allocs.
Describe alternatives you've considered
Testing
Use a GUI-based `heapprof`-like program to instrument the game loading.
Before:
After:
The number is 'number of allocations made', as in calls to malloc, not bytes allocated. Most of the savings are in `Item_factory::check_definitions`.
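For readers without a heap profiler at hand, the "number of allocations" metric and the `string_view` idea from the solution description can be approximated in a small standalone program. The sketch below is not the PR's code: `split_owning`, `split_views_into`, `allocations_owning` and `allocations_views` are hypothetical names, the split only follows UTF-8 lead bytes (it ignores the zero-width handling that `utf8_display_split` does), and allocations are counted by replacing the global `operator new` rather than by instrumenting `malloc`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <string_view>
#include <vector>

// Counts every call to the global allocator, similar in spirit to the
// "number of allocations" figure a heap profiler reports.
static std::size_t g_allocations = 0;

void *operator new( std::size_t size )
{
    ++g_allocations;
    if( void *p = std::malloc( size ? size : 1 ) ) {
        return p;
    }
    throw std::bad_alloc();
}
void operator delete( void *p ) noexcept { std::free( p ); }
void operator delete( void *p, std::size_t ) noexcept { std::free( p ); }

// Byte length of the UTF-8 sequence introduced by lead byte c.
// Simplified assumption: the input is well-formed UTF-8.
static std::size_t utf8_len( unsigned char c )
{
    if( c >= 0xF0 ) { return 4; }
    if( c >= 0xE0 ) { return 3; }
    if( c >= 0xC0 ) { return 2; }
    return 1;
}

// Owning variant (hypothetical): builds a fresh vector of strings per call.
std::vector<std::string> split_owning( const std::string &s )
{
    std::vector<std::string> out;
    for( std::size_t i = 0; i < s.size(); ) {
        const std::size_t n = std::min( utf8_len( static_cast<unsigned char>( s[i] ) ),
                                        s.size() - i );
        out.emplace_back( s, i, n );
        i += n;
    }
    return out;
}

// View variant (hypothetical): refills a caller-owned vector. The views
// point into s, so s must outlive any use of out.
void split_views_into( std::string_view s, std::vector<std::string_view> &out )
{
    out.clear(); // clear() keeps the capacity reserved by earlier calls
    for( std::size_t i = 0; i < s.size(); ) {
        const std::size_t n = std::min( utf8_len( static_cast<unsigned char>( s[i] ) ),
                                        s.size() - i );
        out.push_back( s.substr( i, n ) );
        i += n;
    }
}

// Allocations made while splitting every row with the owning variant.
std::size_t allocations_owning( const std::vector<std::string> &rows )
{
    const std::size_t before = g_allocations;
    for( const std::string &row : rows ) {
        split_owning( row );
    }
    return g_allocations - before;
}

// Allocations made while splitting every row into the shared cache.
std::size_t allocations_views( const std::vector<std::string> &rows,
                               std::vector<std::string_view> &cache )
{
    const std::size_t before = g_allocations;
    for( const std::string &row : rows ) {
        split_views_into( row, cache );
    }
    return g_allocations - before;
}
```

Calling `allocations_views` twice with the same cache shows the effect described above: the first pass pays for growing the cache, and later passes over rows of the same length should allocate nothing. Note that the per-character strings in the owning variant are short enough to fit most implementations' small-string buffers, so in this toy version the allocations come mainly from building a new vector on every call.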
Additional context
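The `noexcept`-move argument from the solution description can be checked in isolation. The following is a minimal sketch under stated assumptions, not the `heap<>` implementation from this PR: `heap_sketch`, `loud_container`, `item_direct`, `item_wrapped` and `copies_while_growing` are hypothetical names, and `loud_container` stands in for a std container whose move constructor is not `noexcept` on some standard library implementations.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a container with a potentially-throwing move
// constructor. It counts how many times it is copied.
struct loud_container {
    inline static int copies = 0;
    std::vector<int> data;

    loud_container() = default;
    loud_container( const loud_container &other ) : data( other.data ) {
        ++copies;
    }
    // Deliberately not noexcept.
    loud_container( loud_container &&other ) : data( std::move( other.data ) ) {}
};

// Sketch of a heap<>-style wrapper: the wrapped object lives on the heap
// and the wrapper moves like std::unique_ptr.
template <typename T>
class heap_sketch
{
    public:
        heap_sketch() : ptr_( std::make_unique<T>() ) {}
        heap_sketch( const heap_sketch &other ) {
            if( other.ptr_ ) {
                ptr_ = std::make_unique<T>( *other.ptr_ );
            }
        }
        // Moving only transfers the pointer, so it cannot throw. The
        // moved-from wrapper is left empty and must not be dereferenced.
        heap_sketch( heap_sketch && ) noexcept = default;

        T &operator*() { return *ptr_; }
        const T &operator*() const { return *ptr_; }
    private:
        std::unique_ptr<T> ptr_;
};

// An item-like type holding the container directly, and one holding it
// through the wrapper.
struct item_direct {
    loud_container contents;
};
struct item_wrapped {
    heap_sketch<loud_container> contents;
};

static_assert( !std::is_nothrow_move_constructible_v<item_direct> );
static_assert( std::is_nothrow_move_constructible_v<item_wrapped> );

// Grows a vector one element at a time and reports how many container
// copies the reallocations caused.
template <typename Item>
int copies_while_growing( int count )
{
    loud_container::copies = 0;
    std::vector<Item> items;
    for( int i = 0; i < count; ++i ) {
        items.emplace_back();
    }
    return loud_container::copies;
}
```

`std::vector` relocates elements with `std::move_if_noexcept`, so `copies_while_growing<item_direct>( 100 )` should report a non-zero number of copies, while `copies_while_growing<item_wrapped>( 100 )` should report none. The cost is one extra heap allocation per wrapped member and one extra pointer hop on access, which matches the trade-off the description accepts for members that are not hotly accessed.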