various optimisations #1

slowriot · 2014-11-07T03:36:49Z

No description provided.

ghost · 2014-11-07T09:33:19Z

@slowriot I went through your patch and I am just curious: why do you prefer pre-increment and the not-equal operator over the usual post-increment and less-than check?

for(i = 0; i < 3; i++) ;
for(i = 0; i != 3; ++i) ;

lvandeve · 2014-11-07T15:33:47Z

Hi, thanks, some very useful patches in there. Thanks for finding the code duplication that could be eliminated, and for the optimizations.

I'd prefer not to pull in the post-increment to pre-increment conversions though. They should not affect optimization because these are primitive types, where it makes no difference, and post-increment is the style chosen for this code.

Thanks!

slowriot · 2014-11-10T17:24:53Z

@petrkutalek on many architectures != is noticeably faster than < at the machine instruction level. As for pre-increment, that's just good practice to do in every situation when you aren't using the return value, as post-increment requires a copy. At best, you're counting on your compiler optimising away these inefficiencies for you, but it's better practice to just write the code the way you want it.

@lvandeve of course it's up to you if you want to use postincrement, and there should be no difference here as long as they remain primitive types (in any case the compiler will optimise away any issues since you aren't using the return values in most cases) - but if code were added with objects in place of primitives in future, then inefficiencies could creep in.
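For readers following along, here is a minimal sketch of the two loop styles being discussed, applied to a primitive `size_t` counter. The function names `sum_post_lt` and `sum_pre_ne` are hypothetical and do not come from lodepng. The sketch only shows that both forms compute the same result for primitive counters; it does not measure performance.

```c
#include <stddef.h>

/* Hypothetical helper: post-increment with a less-than check,
   the style already used in the project. */
unsigned long sum_post_lt(const unsigned char* data, size_t n)
{
  unsigned long total = 0;
  size_t i;
  for(i = 0; i < n; i++) total += data[i];
  return total;
}

/* Hypothetical helper: pre-increment with a not-equal check,
   the style proposed in the patch. The result of ++i is unused,
   so for a primitive type it behaves exactly like i++ here. */
unsigned long sum_pre_ne(const unsigned char* data, size_t n)
{
  unsigned long total = 0;
  size_t i;
  for(i = 0; i != n; ++i) total += data[i];
  return total;
}
```

Note that the `!=` form is only safe when the counter is guaranteed to land exactly on the bound, as it does here with a step of 1 starting from 0. With a larger step, or a counter that can start past the bound, `<` terminates where `!=` would not.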

ghost · 2014-11-10T23:33:36Z

@slowriot Thanks for your explanation! I will definitely use these tips in my projects; I like such C optimisations. Well, I'm getting smarter every day.

lvandeve · 2014-11-11T09:46:22Z

Alright, nice. I'm just going to run some tests on this later at home.

The code needs to be able to compile with C89, where variable declarations must be at the top of the scope, so I think the places where they were moved into a for will not work.
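As an illustration of the C89 constraint, here is a minimal sketch with the loop counter declared at the top of the block. The function `count_zero_bytes` is a hypothetical example and is not taken from lodepng.

```c
#include <stddef.h>

/* Hypothetical example: counts zero bytes in a buffer, written so that
   it is valid C89 (declarations precede statements in the block). */
size_t count_zero_bytes(const unsigned char* data, size_t n)
{
  size_t i;          /* C89: declared at the top of the block */
  size_t count = 0;

  /* The C99 form "for(size_t i = 0; i < n; i++)" declares the counter
     inside the for statement, which is not valid C89. */
  for(i = 0; i < n; i++)
  {
    if(data[i] == 0) count++;
  }
  return count;
}
```

If a narrower scope for the counter is wanted while staying within C89, one option is to open a nested block and declare the counter at the top of that block.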

slowriot · 2014-11-11T10:07:35Z

@lvandeve Is there any particular reason you want to stick with C89, which is 25 years out of date, when the current C standard is C11 - especially if there is an associated performance cost?

It just seems paradoxical when the library itself is packaged in C++ format with files named .cpp needing to be renamed to .c by the user for C support. Are there really any users of this library at all who require strict C89 conformity?

lvandeve · 2014-11-11T10:23:33Z

Yes, it was requested. Originally it was all C++, no C.

slowriot · 2014-11-11T11:24:31Z

@lvandeve then it seems to me that the way to get best performance from this library would be to maintain separate C89 and C++ branches, with the C++ branch making use of C++11 features to improve performance... although of course I could see that being more work to maintain, but it sounds like a number of compromises are being made in performance just in order to support a very minor part of the userbase still relying on a hugely outdated standard.

lvandeve · 2014-11-11T12:09:10Z

Yes, I'm sorry, while performance is definitely important, it is not worth such a tradeoff here :)

lvandeve · 2014-11-15T17:05:07Z

I added in the first two changes. Thanks for these.

There must be something wrong in one of the next changes, because the unit test fails:

g++ lodepng.cpp lodepng_util.cpp lodepng_unittest.cpp -Wall -Wextra -Wshadow -pedantic -ansi -O3 && ./a.out
codec test 1 1
Error: problem while processing dynamic deflate block
Error: Not equal! Expected 0 got 15. Message: decoder error C: problem while processing dynamic deflate block
error!

Also, here is how I test that strict C89 works:

mv lodepng.cpp lodepng.c ; gcc lodepng.c example_decode.c -ansi -pedantic -Wall -Wextra -O3 ; mv lodepng.c lodepng.cpp

slowriot added 7 commits November 7, 2014 02:44

do not check unsigned windowsize for < 0

3b22b26

collapsing duplicate branch

4547bfa

reducing scope of loop counters, and micro-optimising loops by using != where safe and pre-increment wherever return value is not used

daac255

optimisation: replacing all unnecessary post-increments and post-decrements with their pre-increment and pre-decrement equivalents wherever the return value isn't used

a85479d

oops, typo fix

8a77a04

another typo fix

0696a29

using size_t iterator when comparing to size_t

e141374

lvandeve closed this Nov 21, 2014

seungwoos mentioned this pull request Jan 3, 2020

pngdetail hangs at lodepng_strlen() #123

Closed

bhaller mentioned this pull request Oct 13, 2020

RGB converts to grayscale by taking the R channel, in contradiction to what the doc states #139

Open

Cvjark mentioned this pull request May 24, 2022

vulnerability Discover #165

Closed

This was referenced Oct 21, 2022

Memory leaks in function benchmark. #176

Open

SEGV on unknown address in function pngdetail. #177

Closed
