Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

various optimisations #1

Closed
wants to merge 7 commits into from
Closed

various optimisations #1

wants to merge 7 commits into from

Conversation

slowriot
Copy link
Contributor

@slowriot slowriot commented Nov 7, 2014

No description provided.

@ghost
Copy link

ghost commented Nov 7, 2014

@slowriot I went through your patch and I am just curious — why do you prefer pre-incrementing and not-equal operator instead of usual post-incrementing and lower-than check?

for(i = 0; i < 3; i++) ;
for(i = 0; i != 3; ++i) ;

@lvandeve
Copy link
Owner

lvandeve commented Nov 7, 2014

Hi, thanks, some very useful patches in there, thanks for finding the code duplication that could be eliminiated and the optimizations.

I'd prefer to not pull in the postincrement to preincrement conversions though, it should not affect optimization because it are primitive types, where this makes no difference, and postincrement is the style chosen to be used in this code.

Thanks!

@slowriot
Copy link
Contributor Author

@petrkutalek on many architectures != is noticeably faster than < at the machine instruction level. As for pre-increment, that's just good practice to do in every situation when you aren't using the return value, as post-increment requires a copy. At best, you're counting on your compiler optimising away these inefficiencies for you, but it's better practice to just write the code the way you want it.

@lvandeve of course it's up to you if you want to use postincrement, and there should be no difference here as long as they remain primitive types (in any case the compiler will optimise away any issues since you aren't using the return values in most cases) - but if code were added with objects in place of primitives in future, then inefficiencies could creep in.

@ghost
Copy link

ghost commented Nov 10, 2014

@slowriot Thanks for your explanation! I definitely use these tips in my projects, I like such C optimisations. Well, I'm getting smarter every day.

@lvandeve
Copy link
Owner

Alright, nice. I'm just going to run a tests on this later at home.

The code needs to be able to compile with C89, where variable declarations must be at the top of the scope so I think the places where they moved into a for will not work.

@slowriot
Copy link
Contributor Author

@lvandeve Is there any particular reason you want to stick with 25 year outdated C89 when the current C standard is C11 - especially if there is an associated performance cost?

It just seems paradoxical when the library itself is packaged in C++ format with files named .cpp needing to be renamed to .c by the user for C support. Are there really any users of this library at all who require strict C89 conformity?

@lvandeve
Copy link
Owner

Yes, it was requested. Originally it was all C++, no C.

On 11 November 2014 11:07, slowriot [email protected] wrote:

@lvandeve https://github.com/lvandeve Is there any particular reason
you want to stick with 25 year outdated C89 when the current C standard is
C11 - especially if there is an associated performance cost?

It just seems paradoxical when the library itself is packaged in C++
format with files named .cpp needing to be renamed to .c by the user for C
support. Are there really any users of this library at all who require
strict C89 conformity?


Reply to this email directly or view it on GitHub
#1 (comment).

@slowriot
Copy link
Contributor Author

@lvandeve then it seems to me that the way to get best performance from this library would be to maintain separate C89 and C++ branches, with the C++ branch making use of C++11 features to improve performance... although of course I could see that being more work to maintain, but it sounds like a number of compromises are being made in performance just in order to support a very minor part of the userbase still relying on a hugely outdated standard.

@lvandeve
Copy link
Owner

Yes I'm sorry, while performance is definitely important, it is not worth
such tradeoff here :)

On 11 November 2014 12:24, slowriot [email protected] wrote:

@lvandeve https://github.com/lvandeve then it seems to me that the way
to get best performance from this library would be to maintain separate C89
and C++ forks, with the C++ fork making use of C++11 features to improve
performance... although of course I could see that being more work to
maintain, but it sounds like a number of compromises are being made in
performance just in order to support a very minor part of the userbase
still relying on a hugely outdated standard.


Reply to this email directly or view it on GitHub
#1 (comment).

@lvandeve
Copy link
Owner

I added in the first two changes. Thanks for these.

There must be something wrong in one of the next changes, because the unit test fails:

g++ lodepng.cpp lodepng_util.cpp lodepng_unittest.cpp -Wall -Wextra -Wshadow -pedantic -ansi -O3 && ./a.out
codec test 1 1
Error: problem while processing dynamic deflate block
Error: Not equal! Expected 0 got 15. Message: decoder error C: problem while processing dynamic deflate block
error!

Also, here is how I test that strict C89 works:

mv lodepng.cpp lodepng.c ; gcc lodepng.c example_decode.c -ansi -pedantic -Wall -Wextra -O3 ; mv lodepng.c lodepng.cpp

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants