Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PR: Improve CFI 2017 and TM30-18 Performance #1120

Merged
merged 33 commits into from
Jul 16, 2023
Merged

Conversation

tjdcs
Copy link
Contributor

@tjdcs tjdcs commented Mar 14, 2023

Summary

This PR implements several performance improvements, again, that should improve the speed and quality of the entire code base. These improvements mainly revolve around caching in certain places during sd_to_XYZ in order to prevent excessive copies of CMFs etc...

Because these features change caching behavior there is a small risk that there will be regressions, despite the tests passing so it is requested that it is tried in some other developer applications before merging.

2X performance improvement in CFI 2017 workloads

Preflight

Code Style and Quality

  • Unit tests have been implemented and passed.
  • Pyright static checking has been run and passed.
  • Pre-commit hooks have been run and passed.
  • New transformations have been added to the Automatic Colour Conversion Graph.
  • New transformations have been exported to the relevant namespaces, e.g. colour, colour.models.

Documentation

  • New features are documented along with examples if relevant.
  • The documentation is Sphinx and numpydoc compliant.

@tjdcs tjdcs requested a review from KelSolaar March 14, 2023 03:09
@tjdcs
Copy link
Contributor Author

tjdcs commented Mar 14, 2023

Looks like I did actually mess up the tests some how... I was really hoping that was my local environment quarks.

@tjdcs tjdcs force-pushed the performance/cfi2017 branch from 6d4e98a to 283e544 Compare March 14, 2023 15:26
@tjdcs
Copy link
Contributor Author

tjdcs commented Mar 14, 2023

@KelSolaar looks like the only issue left is the PCA issue. We'll have to figure out a plan for those PCA tests. FYI three of them are failing on my local machine. I suspect that small changes in the code base are affecting floating point errors and that's feeding forward into PCA results.

@tjdcs tjdcs force-pushed the performance/cfi2017 branch from ee2899b to 8045c1c Compare April 14, 2023 19:02
@coveralls
Copy link

coveralls commented Apr 14, 2023

Coverage Status

coverage: 99.773% (-0.008%) from 99.781% when pulling 074721b on performance/cfi2017 into 6b8b5d4 on develop.

@tjdcs
Copy link
Contributor Author

tjdcs commented Apr 24, 2023

Status: around 42x faster than v.0.4.2 release

@tjdcs
Copy link
Contributor Author

tjdcs commented Apr 25, 2023

@KelSolaar we have about an 100x performance improvement for TM30 in this branch now. I think it should be merged. I'd like to get another 2x faster than this... but that will require more significant refactoring. First targets if anyone wants to pick it up are the Ohno 2013 code and the CAM code...

I might introduce / write a paper on a highly compute optimized CAM. That could be nice in the future.

Anyway... The only failing tests look like some of those precision errors. The tests pass on my local machine. Let me know if you want me to fix the tests, or what you'd like to do to pass this.

And of course... looking forward to your review.

Copy link
Member

@KelSolaar KelSolaar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @tjdcs!

A lot of good and exciting stuff there. I did a first review pass. I would be keen to clean up the commits a bit as discussed on Slack.

@@ -231,6 +230,7 @@ def XYZ_to_CIECAM02(
"Average"
],
discount_illuminant: bool = False,
compute_HQ: bool = True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major

Wondering if we should not generalise which correlates to compute. Having modified the CIECAM02 signature begs for modifying other CAMs for consistency. We could also have an inlined version of the model that is optimised for speed. We have a Hellwig et al. (2022) implementation that is inlined here for example: https://github.com/colour-science/colour/blob/feature/flattened_hellwig2022/colour/appearance/hellwig2022.py#L154

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal preference actually is that all color models are flattened, or close to it. Even models that are substantially similar.

As for generalizing a "which components to compute" that sounds like a great feature... but I don't want to address that in this PR. I like the proposal though.

colour/appearance/ciecam02.py Outdated Show resolved Hide resolved
colour/appearance/ciecam02.py Outdated Show resolved Hide resolved
colour/appearance/ciecam02.py Outdated Show resolved Hide resolved
colour/appearance/hunt.py Outdated Show resolved Hide resolved
colour/algebra/common.py Outdated Show resolved Hide resolved
colour/temperature/tests/test_ohno2013.py Outdated Show resolved Hide resolved
colour/temperature/ohno2013.py Outdated Show resolved Hide resolved
-------
:class:`numpy.ndarray`
*CIE UCS* colourspace *uv* chromaticity coordinates.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor

Ditto for example.

.gitignore Outdated Show resolved Hide resolved
@tjdcs tjdcs force-pushed the performance/cfi2017 branch from b1ca2a0 to a7d8c37 Compare May 4, 2023 02:00
@tjdcs
Copy link
Contributor Author

tjdcs commented May 4, 2023

@KelSolaar This is greatly cleaned up now if you want to give it another pass.

@tjdcs tjdcs force-pushed the performance/cfi2017 branch 2 times, most recently from c832828 to ad67a6b Compare May 4, 2023 05:49
@tjdcs
Copy link
Contributor Author

tjdcs commented May 4, 2023

ad67a6b adds a dependency on xxhash which is an extremely fast hashing algorithm specifically suited for dictionary keys. https://github.com/Cyan4973/xxHash#quality

@tjdcs tjdcs force-pushed the performance/cfi2017 branch from 382a8c9 to 009700b Compare May 9, 2023 12:34
@tjdcs tjdcs force-pushed the performance/cfi2017 branch 2 times, most recently from 44dc34a to f83b0f1 Compare May 17, 2023 22:19
@KelSolaar KelSolaar changed the title Performance/cfi2017 PR: Improve CFI 2017 and TM30-18 Performance May 26, 2023
@KelSolaar
Copy link
Member

@tjdcs : Where are we with this one?

@tjdcs
Copy link
Contributor Author

tjdcs commented May 26, 2023

Bizarre corner cases in the initialization of SpectralDistribution and Signal appear to be failing tests. And there is a mountain of type errors that I can't wade through because my development environment and the CI environment are different. CI complains there are only a few type errors, but pyright on my machine reports more than 600.

It's going to be a small while before I can address these on my own. :/ i have to prep for cinegear, and then infocomm. I have urgent work projects taking the few days I have in between.

Probably the end of June is when I will have serious time to sit down with this again.

@KelSolaar
Copy link
Member

Thanks, sounds good! I'm going to update Pyright on our side so that we run the latest and then we can rebase your branch on top of develop.

@KelSolaar
Copy link
Member

develop is up-to-date against the latest pyright and a few fixes have been rolled for static analysis.

@tjdcs tjdcs force-pushed the performance/cfi2017 branch 3 times, most recently from 9dc7021 to 36d9255 Compare June 27, 2023 07:41
@KelSolaar
Copy link
Member

I looked quickly why the docs build is failing and I think that we are destroying the blackbody spectral distributions that we defined here :

blackbody_sds = [
, there:
if sd.shape.boundaries != cmfs.shape.boundaries:

This is one of the reason why reshape_sd was always operating on a copy, now we have introduced those kind of hard to debug side effects. I think we need to use a context manager or expose the copy argument in the definitions that use reshape_sd. Food for thoughts!

I would be keen to drop the Poetry commit from this PR as it is a separate problem and just created a huge mess on my end, took me a while to realise what had happened.

tjdcs added 6 commits June 29, 2023 14:51
Repair documenation build error

fix test due to default interpolation change


[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

fix peculier init case in MaxOS environment

fix type checking

Auto stash before revert of "Improve default interpolation performance"
fix domain / range copy on signal init

fix doctests
@tjdcs tjdcs force-pushed the performance/cfi2017 branch from b9ad315 to 0558836 Compare June 29, 2023 18:51
@tjdcs
Copy link
Contributor Author

tjdcs commented Jun 29, 2023

@KelSolaar ready for a final review, assuming that you are OK if I hurt coverage by 3 lines...

@KelSolaar
Copy link
Member

Just pushed a small varnishing pass, hopefully we can merge right after if everything builds.

@KelSolaar
Copy link
Member

LGTM, happy to merge?

@tjdcs
Copy link
Contributor Author

tjdcs commented Jul 16, 2023

Make it so!

@KelSolaar KelSolaar merged commit ca0bba3 into develop Jul 16, 2023
@KelSolaar KelSolaar deleted the performance/cfi2017 branch July 16, 2023 07:32
@KelSolaar
Copy link
Member

Merged, thanks for the great work @tjdcs! Do you remember roughly the performance improvements for the release notes?

@KelSolaar KelSolaar added this to the v0.4.3 milestone Jul 16, 2023
@tjdcs
Copy link
Contributor Author

tjdcs commented Jul 16, 2023

45x from pre-merge commit. From when the work started I think it was around 100x I can check that tomorrow. Some other performance stuff got merged in early or there were other changes.

I backed out some of the hashing improvements which is the next big thing, but in order to do it with good code style we need to make xxhash a default install. (need to access private signal variables in multisignal __hash__)

@tjdcs
Copy link
Contributor Author

tjdcs commented Jul 16, 2023

colour_fidelity_index_ANSIIESTM3018 100x compared to v0.4.2

uv_to_CCT_Ohno2013 5x faster than v.0.4.2 which might also be of interest in a release note

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants