Add a complexity scoring class for Metal and OpenGL #31417

Merged
merged 15 commits into from
Feb 18, 2022

Contributor

@gw280 gw280 commented Feb 11, 2022

This adds an initial implementation of a complexity scoring class for Metal. Some notes:

  • There are a lot of magic numbers in this file. They come from my processing of benchmark data, and I have written long descriptions in the comments to give the reader an idea of where each number came from.
  • drawTextBlob and drawVertices are incomplete right now, because we are unable to get the glyph count or the vertex count respectively. In their place, I've put in estimates where I can.
  • The scores are based on a baseline in which 0.0005 ms corresponds to a score of 100. This is a very rough estimate. With a 32-bit unsigned integer, this lets us score up to approximately 21 seconds before overflowing.
  • Throughout the file I reference the constants m and c. These come from y = mx + c and describe the line of best fit for the benchmark data in that particular use case. Important: these constants are from before the benchmark data is divided by the number of draw calls made.
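
As a rough check of the score scale in the notes above, here is a minimal C++ sketch. The constant and function names are hypothetical and are not taken from the PR; only the baseline (0.0005 ms maps to a score of 100) and the 32-bit unsigned limit come from the description.

```cpp
#include <cstdint>
#include <limits>

// From the PR description: 0.0005 ms (500 ns) of benchmark time is a score of 100.
// All names below are hypothetical, chosen for illustration only.
constexpr uint64_t kBaselineNanoseconds = 500;
constexpr uint64_t kBaselineScore = 100;

// One score unit therefore represents 5 ns.
constexpr uint64_t kNanosecondsPerScoreUnit =
    kBaselineNanoseconds / kBaselineScore;

// Converts a complexity score to the approximate time it represents.
constexpr uint64_t ScoreToNanoseconds(uint64_t score) {
  return score * kNanosecondsPerScoreUnit;
}

// Largest time a 32-bit unsigned score can represent before overflowing.
constexpr uint64_t kMaxRepresentableNanoseconds =
    ScoreToNanoseconds(std::numeric_limits<uint32_t>::max());
```

Under these assumptions the largest representable time is 4,294,967,295 × 5 ns, which is a little under 21.5 seconds and agrees with the "approximately 21 seconds" figure.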

flutter/flutter#86728

Tests to come.

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the Flutter Style Guide and the C++, Objective-C, Java style guides.
  • I listed at least one issue that this PR fixes in the description above.
  • I added new tests to check the change I am making or feature I am adding, or Hixie said the PR is test-exempt. See testing the engine for instructions on writing and running engine tests.
  • I updated/added relevant documentation (doc comments with ///).
  • I signed the CLA.
  • All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

@gw280 gw280 requested a review from flar February 11, 2022 23:10
@flutter-dashboard

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

display_list/display_list_complexity_metal.cc (outdated)
return;
}
// The performance penalties seem fairly consistent percentage-wise
float non_hairline_penalty = 1.0f;
Member

Right now this suggestion is a premature optimization, but just want to throw it out there in case it's useful later:

Consider avoiding floating point. You can do that by multiplying these factors by 10, and then dividing by 10 later when you need the final result.

Other strategies might be needed below, but we can delay thinking about them until they're needed.
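
A minimal sketch of the scaled-integer idea above, assuming penalty factors are stored as tenths (so 10 means 1.0x). The function name, the constant, and the example factor values are hypothetical and do not come from the PR.

```cpp
#include <cstdint>

// Penalty factors are stored in tenths: 10 means 1.0x, 14 means 1.4x.
// The scale and all names here are hypothetical.
constexpr uint32_t kPenaltyScale = 10;

// Applies two penalty factors to a base complexity using integers only.
// The product is widened to 64 bits so the intermediate value does not
// overflow, and the scale is divided out once at the end. Integer division
// truncates, so the result is rounded down. The final cast assumes the
// scaled result fits in 32 bits.
constexpr uint32_t ApplyPenalties(uint32_t base_complexity,
                                  uint32_t first_penalty_tenths,
                                  uint32_t second_penalty_tenths) {
  return static_cast<uint32_t>(
      static_cast<uint64_t>(base_complexity) * first_penalty_tenths *
      second_penalty_tenths / (kPenaltyScale * kPenaltyScale));
}
```

For example, `ApplyPenalties(1000, 14, 13)` applies 1.4x and then 1.3x to a base of 1000 without any floating-point arithmetic.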

Contributor

@flar flar left a comment

A bunch of questions asked, so I'll leave this as a "Comment" review rather than an approve or request changes review.

display_list/display_list_complexity_metal.cc (outdated)
// m = 1/2
// c = 1
save_layer_complexity = (save_layer_count_ + 2) * 100000;
}
Contributor

So if I have 200 saveLayers then I can save time by adding one more? If the slope changes at 200, then perhaps this can be handled in saveLayer by doing something like:

  if (++count > 200) {
    accumulate(M1 * x + B1);
  } else {
    accumulate(M2 * x + B2);
  }

Also, is this depth based or sequential based or both?
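
One possible reading of the per-call idea above, as a minimal sketch: if each saveLayer call adds its own marginal cost, and only that marginal cost changes past the breakpoint, the running total can never drop when one more call is added. The per-call costs below are hypothetical placeholders, not the fitted constants from the PR; only the breakpoint of 200 is taken from the discussion.

```cpp
#include <cstdint>

// Breakpoint taken from the discussion; the two per-call costs are
// hypothetical placeholders, NOT the PR's fitted values.
constexpr uint32_t kSlopeChangeCount = 200;
constexpr uint64_t kCostPerCallBelow = 40000;
constexpr uint64_t kCostPerCallAbove = 100000;

// Cost added by the n-th saveLayer call (n is 1-based).
constexpr uint64_t MarginalSaveLayerCost(uint32_t n) {
  return n > kSlopeChangeCount ? kCostPerCallAbove : kCostPerCallBelow;
}

// Total after `count` calls, accumulated one call at a time, the way a
// visitor incrementing a counter in saveLayer would build it up.
constexpr uint64_t TotalSaveLayerCost(uint32_t count) {
  uint64_t total = 0;
  for (uint32_t n = 1; n <= count; ++n) {
    total += MarginalSaveLayerCost(n);
  }
  return total;
}
```

With this shape the total is continuous at the breakpoint and only its slope changes, so 201 calls always cost more than 199.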

Contributor

Plugging in 200 for the count in both equations shows a huge difference in the values. That doesn't sound right. Is my comment about 201 saveLayers taking less time than 199 saveLayers true?

Contributor Author

Depth vs. sequential might make a difference but there's no clear trend. Looking at the benchmark data, it varies from -30% to +18%.

Bizarrely, the benchmarking data shows a decrease in overall time at around 128 saveLayer calls, then it starts to spike upwards at a much higher rate starting at around 256 calls. I see this dip in both the nested and the unnested benchmark runs. See the attached screenshot - for the blue line, the X axis values are the saveLayer count; for the orange line, multiply them by 8. That being said, the data is very hard to actually fit to a trend so this is really just a very rough approximation.

In any case, any saveLayer right now is going to hit the threshold for caching. At 200 saveLayer calls, a cost of 20,200,000 (very roughly a time cost of 101 milliseconds) and a cost of 8,200,000 (roughly 41 ms) both far exceed that threshold. I think we're overthinking this.
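
The figures above can be reproduced with a short sketch. It assumes the baseline from the PR description (0.0005 ms per 100 score units, so 5 ns per unit) and uses the one formula quoted in the review snippet, (save_layer_count_ + 2) * 100000. The function names are hypothetical. The formula behind the 8,200,000 figure is not shown in this thread, so that score is only converted, not derived.

```cpp
#include <cstdint>

// 0.0005 ms per 100 score units (from the PR description) is 5 ns per unit.
constexpr uint64_t kNanosecondsPerUnit = 5;

// The branch quoted in the review snippet: (count + 2) * 100000.
constexpr uint64_t QuotedSaveLayerScore(uint64_t save_layer_count) {
  return (save_layer_count + 2) * 100000;
}

// Converts a score to whole microseconds (hypothetical helper).
constexpr uint64_t ScoreToMicroseconds(uint64_t score) {
  return score * kNanosecondsPerUnit / 1000;
}
```

Here `QuotedSaveLayerScore(200)` is 20,200,000, which converts to 101,000 microseconds (101 ms), and a score of 8,200,000 converts to 41,000 microseconds (41 ms).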

[Attached screenshot: Screen Shot 2022-02-12 at 12 29 26 PM]

display_list/display_list_complexity_metal.cc (outdated)
// one and a less expensive one. Both scale linearly with area.
//
// Expensive: All filled style, symmetric w/AA
bool expensive =
Contributor

Symmetric RRects are more expensive than non-symmetric versions? Did you mention this to Skia?

Contributor Author

Not yet, but yes, symmetric does seem to be more expensive.

@gw280 gw280 changed the title from "Add a complexity scoring class for Metal" to "Add a complexity scoring class for Metal and OpenGL" on Feb 15, 2022
@gw280
Contributor Author

gw280 commented Feb 15, 2022

Big update:

  • Added a base class called DisplayListComplexityHelper, which holds much of the code common to the two complexity calculators
  • Added an OpenGL implementation
  • Fixed some minor bugs that were present before

Tests to come.

Member

@zanderso zanderso left a comment

small nit: Comments should end with a '.'.

The license check is just asking for the new files to be listed in the golden file.

@zanderso
Member

/cc @iskakaushik

@gw280 gw280 removed the needs tests label Feb 18, 2022