Support for AVX512? #137

Closed
count0 opened this issue May 27, 2024 · 19 comments
Labels
question Further information is requested

Comments

@count0
Contributor

count0 commented May 27, 2024

Comment:

Some parts of graph-tool relating to network reconstruction benefit a lot from vectorization, but this relies on AVX512 being available.

Currently the binaries produced aim for the broadest possible support.

Is there a mechanism in conda to ship binaries for specific architectures at a finer level of granularity than amd64, etc? Or any other approach that might be helpful to distribute optimized binaries?

One obvious option is to create separate packages that work for some types of CPU and newer, but maybe there's already an infrastructure that I'm not aware of.

@stuarteberg

count0 added the question label May 27, 2024
@stuarteberg
Contributor

stuarteberg commented May 28, 2024

I found the following notes from a meeting last October:
https://conda-forge.org/community/minutes/2023-10-18/

Apparently that very same day, the feature was implemented in a PR.

It's documented here:
https://conda-forge.org/docs/maintainer/knowledge_base/#microarch

...but I can't seem to find any actual examples of it being used in other feedstocks.

There's also this comment in the microarch-level feedstock README:

> When building packages on CI, level=4 will not be guaranteed, so you can only use level<=3 to build.

...but I think that just means you can't require level 4 in your build environment. It doesn't forbid you from cross-compiling a level-4 executable.

@count0
Contributor Author

count0 commented May 28, 2024

That's actually quite nice!

But the “not guaranteed” language is a bit ambiguous. If we enable micro-architectures, will they get built by the CI or not?

@stuarteberg
Contributor

My interpretation is that the CI will build every micro-arch you list.

But the docs want you to be aware that the CI machine that is actually performing that build is not guaranteed to have a CPU that supports level 4, even though it is producing level 4 binaries (thanks to cross-compilation). So you won’t be able to execute AVX-512 instructions during the build (including tests). I guess you would have to selectively disable testing for that arch level and just cross your fingers. Or build the feedstock on your own machine and upload the results to your own channel.

I think the only way to know for sure is to try it. Since I can find no other feedstocks using this yet, we will be the guinea pig, so to speak.
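For anyone who wants to check whether a given Linux machine (a CI runner or a local box) could actually execute the level-4 code, here is a minimal sketch. It assumes that "level 4" corresponds to x86-64-v4, whose AVX-512 subsets are F, BW, CD, DQ and VL, and that the flags are readable from `/proc/cpuinfo`. The function names are made up for illustration.

```python
# AVX-512 subsets required by the x86-64-v4 microarchitecture level.
V4_AVX512_FLAGS = {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_v4_avx512(cpuinfo_text):
    """True if every AVX-512 subset needed by x86-64-v4 is listed."""
    return V4_AVX512_FLAGS <= cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX-512 (v4 subsets) available:", has_v4_avx512(f.read()))
    except OSError:
        # Not Linux, or /proc is unavailable.
        print("Could not read /proc/cpuinfo")
```

This only inspects the AVX-512 flags, not the rest of what a level-4 CPU must support, so treat a positive result as a hint rather than a guarantee.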

@count0
Contributor Author

count0 commented May 28, 2024

Not being able to run the tests is not an issue... In fact, the tests we have are so trivial that I would be surprised if they would fail because of AVX or some other missing instruction. But in such a case we can just disable them.

Let's give it a try!

@count0
Contributor Author

count0 commented May 28, 2024

Mmm... there are dozens of microarchitectures. The builds are going to take days to complete. I'll start with a few to get a sense of it.

@count0
Contributor Author

count0 commented May 28, 2024

I'm trying with #140... but I don't see anything different in the CI. Maybe I'm missing something.

@stuarteberg
Contributor

Closed via #140

count0 reopened this May 29, 2024
@count0
Contributor Author

count0 commented May 29, 2024

I just tried to install graph-tool from a clean conda install, and it does not seem to automatically select the microarch... I suppose I need to set it up somehow, but I didn't find that in the documentation. Can you point me in the right direction?

@stuarteberg
Contributor

Oh crap. Before we embarked on this microarch quest, I had created a new build of graph-tool so I could pin boost to 1.82. At that time, I chose a special build number for that branch: 1000.

IIUC, when the conda solver has two equally satisfactory packages to choose from, it selects the one with the higher build number. So when I try this:

conda create -n test-gt graph-tool

It proposes to give me this:

  graph-tool         conda-forge/linux-64::graph-tool-2.68-py312hf63df81_1000

That's my non-micro-optimized boost-1.82 package. (Note that #141 has NOT been merged yet.)

If I deliberately avoid that special build:

conda create -n test-gt graph-tool 'boost>=1.84'

...then I get the appropriate package:

  graph-tool         conda-forge/linux-64::graph-tool-2.68-py311h2cebea5_402

So the question now is how to resolve this unfortunate issue related to build numbers and package precedence.
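To make the precedence problem concrete, here is a toy model of the tie-break described above. It is not conda's actual solver, which weighs many more criteria. It only assumes that, among otherwise acceptable candidates of the same version, the higher build number wins. The `Candidate` type and `pick_preferred` function are hypothetical names.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    version: str
    build_string: str
    build_number: int


def version_key(version):
    """Turn '2.68' into (2, 68) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_preferred(candidates):
    """Toy tie-break: newest version first, then highest build number."""
    return max(candidates, key=lambda c: (version_key(c.version), c.build_number))


# The two builds mentioned in this thread.
candidates = [
    Candidate("graph-tool", "2.68", "py312hf63df81", 1000),  # boost-1.82 branch
    Candidate("graph-tool", "2.68", "py311h2cebea5", 402),   # microarch build
]

chosen = pick_preferred(candidates)
print(chosen.build_string, chosen.build_number)
```

Under this simplified rule the 1000 build always shadows the 402 build unless an extra constraint (such as `'boost>=1.84'`) removes it from the candidate list.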

@count0
Contributor Author

count0 commented May 29, 2024

Wait... it selects builds that have not been merged? How is that possible?

@count0
Contributor Author

count0 commented May 29, 2024

If it's just a matter of build number, we can make it 1001 and then reset it at the next release, but I just don't quite understand how an unmerged PR can influence anything.

@stuarteberg
Contributor

It's not selecting an unmerged PR.

I had already merged #138, before we started working on microarch optimizations. That produced a package with build number 1000.

Then, earlier today, I created (but did not merge) #141. Once merged, it will produce packages with build numbers 1001, 1101, 1301, 1401.
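Those four numbers look consistent with a simple scheme: the base build number plus 100 times the microarch level. That is an inference from the numbers in this thread, not something confirmed here, and treating the unoptimized build as "level 0" is my own labeling. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def microarch_build_number(base_build, level):
    """Hypothetical helper: offset the base build number by 100 per level."""
    return base_build + 100 * level


# Level 0 stands for the generic (non-optimized) build in this sketch.
levels = [0, 1, 3, 4]
numbers = [microarch_build_number(1001, lvl) for lvl in levels]
print(numbers)
```

If the scheme holds, it also accounts for the `_402` build seen earlier: a base build number of 2 at level 4. A side effect is that higher microarch levels automatically get higher build numbers, so the most optimized build a machine can run is preferred.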

@count0
Contributor Author

count0 commented May 29, 2024

Oh, you used a different branch. But it is still surprising to me that it works like this; I didn't know that multiple branches were allowed.

So, what precedence do we want the boost-1.82 branch to have?

It seems to me it should be the opposite of what we have now.

So we should have the main branch starting from 1000... (Momentarily from 2000 until the next upstream version).

@stuarteberg
Contributor

> So, what precedence do we want the boost-1.82 branch to have?
> It seems to me it should be the opposite of what we have now.

I think it doesn't matter too much... currently, the conda-forge distro has a global default pinning to boost-1.82, but a "migration" is underway to bring all feedstocks up to boost-1.84. Once that migration is complete, they will update the global default pinning to 1.84. (We already had a version of graph-tool that used boost-1.82, but due to a bug in our recipe, I needed to revise that package even though our feedstock had already migrated to 1.84.)

Once #141 is merged, then our boost-1.82-based packages will also have the new microarch variants. I doubt most people will care which version of boost they get, as long as the solver is happy.

> So we should have the main branch starting from 1000... (Momentarily from 2000 until the next upstream version).

That would be fine, but probably unnecessary if #141 is merged. Mind if I merge it now?

@count0
Contributor Author

count0 commented May 29, 2024

Sure, go ahead...

@stuarteberg
Contributor

stuarteberg commented May 30, 2024

Now that #141 is merged and built on our boost-1.82 branch, conda gives me the correct microarch variant by default. If that doesn't work on your machine, then make sure you're using the latest version of conda. I'm using 24.5.0. (And mamba does not seem to work in this case, at least not yet.)

Here's what I see on Linux:

$ conda create -n test-gt graph-tool

...

  graph-tool         conda-forge/linux-64::graph-tool-2.68-py312hf67fd38_1401
  graph-tool-base    conda-forge/linux-64::graph-tool-base-2.68-py312h00ebab2_1401

...

And on Mac:

$ conda create -n test-gt graph-tool

...

  graph-tool         conda-forge/osx-64::graph-tool-2.68-py312h9c88843_1301
  graph-tool-base    conda-forge/osx-64::graph-tool-base-2.68-py312h1cb44a0_1301

...

@stuarteberg
Contributor

@count0 Are you now able to obtain microarch-optimized packages on your own machine?

If not, let me know what version of graph-tool conda wants to give you, and what conda info says on your machine (particularly the __archspec virtual package).
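If it helps, here is a rough sketch for pulling the detected architecture out of conda's machine-readable output. It assumes that `conda info --json` exposes a `virtual_pkgs` list of `[name, version, build]` entries and that the relevant virtual package is named `__archspec`. Please check both against your own conda version. The function name and the sample values are made up for illustration.

```python
def archspec_from_info(info):
    """Return the build string of the __archspec virtual package, or None.

    `info` is the dict parsed from `conda info --json`; the "virtual_pkgs"
    layout used here is an assumption about that output.
    """
    for entry in info.get("virtual_pkgs", []):
        if len(entry) >= 3 and entry[0] == "__archspec":
            return entry[2]
    return None


# Illustrative sample only, not real output from any machine.
sample_info = {
    "virtual_pkgs": [
        ["__glibc", "2.35", "0"],
        ["__archspec", "1", "x86_64_v4"],
    ]
}
print(archspec_from_info(sample_info))
```

To try it on a real machine, parse the output of `conda info --json` with `json.loads` and pass the resulting dict to the function.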

@count0
Contributor Author

count0 commented Jun 4, 2024

Yes, it's working fine for me now! And I already advertised this feature to anyone I could. :-)

@stuarteberg
Contributor

Excellent.
