Python optimization #20
https://bugs.gentoo.org/615412 solution: https://forums.gentoo.org/viewtopic-p-8030692.html#8030692 |
"Speaking about python, can be build with pgo by default. Well 3.5 version take A LOT more time, but lto, graphite and pgo speed gain is ~25%." By the way, what about profile-guided optimization? It would be interesting to enable it for as many packages as possible. Edit:
According to this, PGO should also speed up gcc by up to 15% when compiling other packages. That sounds awesome. |
Does |
Python-2.7.13/configure.ac:
|
to avoid question about
|
A 25% speedup with Python PGO! Wow! Would certainly help my emerge dependency resolution times :) PGO is something I have been keeping my eye on and I would like to support it if at all possible. I'm going to guess that only packages with test suites are eligible as the profiling information has to come from somewhere--either that or we would require users to use whichever PGOed package for a day to collect such info. I know Firefox also supports PGO, although I haven't had much luck in building with it. What do you think about doing PGO on a per-package basis until we iron out the details about how best to do it? I'm also not opposed to including modified ebuilds in the repo that offer PGO as a USE flag for certain packages. I expect we may end up having to do such anyways to fix broken ebuilds before they are merged upstream. Perhaps to start with we could look at GCC and Python, per the comments in this issue? |
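For readers new to PGO, here is a minimal sketch of the three-step cycle being discussed, written as a small Python helper that only assembles the commands. The file name `demo.c`, the binary name `demo`, and the `--benchmark` training argument are hypothetical placeholders; `-fprofile-generate` and `-fprofile-use` are the standard GCC flags. This is not how any ebuild does it, just the general shape.

```python
import shlex


def pgo_build_steps(source, binary, training_args):
    """Return the three commands of a basic GCC PGO cycle (illustrative only)."""
    return [
        # 1. Build an instrumented binary that records execution counts.
        ["gcc", "-O2", "-fprofile-generate", source, "-o", binary],
        # 2. Run a representative workload; the instrumented binary writes
        #    its profile data (*.gcda files) when it exits.
        [f"./{binary}", *training_args],
        # 3. Rebuild with the same output name so GCC can find the profile.
        ["gcc", "-O2", "-fprofile-use", source, "-o", binary],
    ]


# Hypothetical file names and training arguments, for illustration.
for step in pgo_build_steps("demo.c", "demo", ["--benchmark"]):
    print(shlex.join(step))
```

Step 2 is why a test suite (or real usage) matters: the quality of the final build depends on how representative that training run is.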
That's exactly what I thought: portage is sooooo slow!
I completely agree.
When possible, I would rather live-patch the ebuilds instead. I used to do this when I needed to carry additional patches, but I'm not sure how extensible this approach is. |
Yeah I just switched my own Portage over to git to make it easier to upstream patches. Working on this one right now: gentoo/gentoo#5741 I want to get a "pgo" USE in GCC as I had a successful build today with PGO per your comment above |
I'll be able to take a look at Python more in depth over the next couple of days. It doesn't look super involved thankfully. Unfortunately for me, Glibc 2.26-r1 is actually preventing my rebuild of Python (any version) because it doesn't have the "rpc" stuff in it anymore (it was deprecated and isn't in the USE anymore). I may file a bug upstream about it, as Python seems to hard depend on it. So I would not be able to test any ebuild modifications I make locally. If anyone does manage to patch the ebuild and submit it upstream, would you mind linking the PR here? |
Great! Can you please benchmark it against a couple of packages, measuring how much time they need to be compiled with both versions of gcc (gcc has been compiled with and without PGO)? |
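A rough sketch of how such a comparison could be scripted, assuming Python 3 is available. The helper names `time_command` and `percent_faster` are made up for this example, and the workload in the `__main__` block is a trivial placeholder rather than a real build command.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def percent_faster(baseline_s, optimized_s):
    """Percentage of the baseline wall time saved by the optimized build."""
    return (baseline_s - optimized_s) / baseline_s * 100.0


if __name__ == "__main__":
    # Placeholder workload; swap in the real build command for each compiler.
    samples = time_command([sys.executable, "-c", "pass"])
    print(f"median wall time: {statistics.median(samples):.3f}s")
```

Running the same package build once with each gcc and feeding the two medians to `percent_faster` gives a single number to compare against the claimed speedup.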
Sure! Do you have any packages in mind? I was thinking Firefox would be a good one. |
Firefox nowadays compiles a lot of Rust, so that’s perhaps not the best idea. I was thinking about the GCC itself. |
True--I hadn't considered Rust. I'll do a simple |
Possibly a package that scales very well with the number of cores, meaning that it spends most of its time doing real compilation instead of linking etc. For example, LibreOffice takes a long time to compile no matter how many cores you throw at it: the vast majority of the cores will sit semi-idle most of the time. |
Even better: the linux kernel. Just run |
Wow--barely any difference in GCC. Method:
I'll try the linux kernel next |
Disappointing: this is just 0.5%, very far from the 15% stated in that thread :( |
Indeed! I'm wondering if perhaps GCC is pathological due to the bootstrapping it does. I think the linux kernel will be a better test for sure. Anyone want to try that out? |
I can't, my dual core laptop isn't powerful enough for my tastes to be able to run Gentoo so I switched to Arch. I plan to buy a Threadripper in a couple of months, it will be fun then. My goal is to be able to rebuild all the packages (with O3, graphite, LTO and PGO) during the night (so in less than 7-8 hours). Probably a Ryzen 7 would be enough for that, but I will make good use of more cores anyway. |
python2.7 PGO preview. Not sure if it builds correctly, since it was too fast for a profiled build; I need to check the output for any errors.
|
🐎 |
|
Work-around is simply adding: |
@pchome Nice work! It looks like the python startup times improved with PGO, but many other tests actually were slightly slower--is that within the range of statistical error? The only way I can think of PGO being slower would be if the training set didn't have good exemplar data. @ThGravo Thanks! I'll give that a shot today. I was wondering if that |
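On the "statistical error" question, one crude way to check is to repeat each benchmark a few times and compare the gap between the means with the run-to-run spread. The sketch below is only a heuristic under that assumption, not a proper significance test; the function name and the threshold `k` are invented for illustration.

```python
import statistics


def difference_exceeds_noise(baseline, candidate, k=2.0):
    """Heuristic: is the gap between means larger than k times the larger sample stdev?

    Both inputs are lists of repeated timings (at least two values each).
    """
    gap = abs(statistics.mean(baseline) - statistics.mean(candidate))
    noise = max(statistics.stdev(baseline), statistics.stdev(candidate))
    return gap > k * noise
```

If this returns `False` for a test that looked slightly slower with PGO, the difference is probably just noise between runs.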
No, actually I found the error:
so I believe only startup was profiled. I've had no luck fixing this with different flag combinations; maybe it's a GCC profiler error. |
Linux kernel results are in! I used the Method:
Results:
30 seconds shaved off. Not bad! Also, I edited the |
@pchome Great catch--I'll hopefully have some time to look at making a similar patch for Python today or tomorrow. I'll submit that upstream and link the PR here as usual. Very cool findings everyone. |
Actually it's a lot: it's 12% faster! |
I expect it'll probably vary from package to package. GCC for example does bootstrapping which will wipe out the PGO benefits early on, resulting in what we saw before. Still, probably worth it for a Gentoo-er :) It looks like there may be a bug in the GCC buildsystem that does not respect AR, NM, and RANLIB for some packages. I can build it with LTO if I use the linker plugin, but not without. Not something you'd notice unless you removed |
Ahh yes, found the problem with GCC LTO--details have been posted to the symlink thread to keep things clean in here. |
Of potential interest here: https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01692.html https://gcc.gnu.org/wiki/AutoFDO/Tutorial It looks like this isn't quite ready for prime time yet, but it could potentially make PGO accessible to a wider range of programs (even if it's not as good as explicit PGO). Neat! |
Meanwhile
all tests take nearly 1 hour on my system |
Did a quick test with Method:
My own Fish shell records the wall time it takes. Results:
I suspect the emerge time is dominated by the SAT stuff taking place inside it. Still, shaving 12 seconds off isn't bad! I expect the best results would probably come from something like |
PR created upstream for all Python versions in Portage: |
Wait... Does portage work with pypy? |
I have read that some have had success with pypy2, but no word on pypy3. Would you be willing to try? |
As I said in another thread, unfortunately my dual core laptop isn't fast enough to run Gentoo, so until I buy a faster desktop I'm stuck with Arch Linux. |
GCC PR accepted: gentoo/gentoo@0591d59 Go ahead and set your gcc USE="pgo" and enjoy everyone :) |
I encourage everyone to check out the bashrc.d thread as it is indeed related to general PGO! |
How exactly does it support PGO? |
See the README file--there's an entire section on PGO in there. https://github.com/vaeth/portage-bashrc-mv/blob/master/bashrc.d/README |
Can we have a POC so we can see if it fits into the workflow easily? |
As of HEAD, we now have PGO-enabled Python ebuilds. PGO is off by default, but can be enabled by adding |
Sorry to revive, I'd rather not open another issue for this. I've noticed that this occurs in the build log when emerging
So I've done:
Now build.log shows
...However, since the lto-overlay ebuild has PGO, does it even matter to enable I'll just be adding |
When you have a look at the configure output of python there is potential:
checking for --enable-optimizations... no
and further down the line
If you want a release build with all optimizations active (LTO, PGO, etc), please run ./configure --enable-optimizations
Do you see a way to incorporate that in your overlay?
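To see whether an already installed interpreter was configured this way, a small check along these lines may help. It relies on `sysconfig.get_config_var("CONFIG_ARGS")`, which on typical POSIX builds holds the arguments passed to `./configure`; the helper name `configured_with` is made up here. A build can also be optimized by other means, so a missing flag is not proof that PGO was skipped.

```python
import sysconfig


def configured_with(flag, config_args=None):
    """Return True if `flag` appears in the interpreter's recorded configure arguments."""
    if config_args is None:
        # May be None on platforms that do not record configure arguments.
        config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    return flag in config_args


print("--enable-optimizations:", configured_with("--enable-optimizations"))
```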