Provide option to disable -march=native #204

Closed
mjp41 opened this issue May 27, 2020 · 5 comments

Comments

@mjp41
Member

mjp41 commented May 27, 2020

With Clang and GCC, the option -march=native lets the compiler specialise code for the current machine's CPU capabilities. This improves performance.

However, if the build and deployment machines are different, this option should not be used: the build machine may have features, such as AVX512, that are not supported on the target machine.

We should provide an option to specify the target architecture features.

Options

  1. Provide an option to disable -march=native
  2. Provide option to specify AVX, AVX2, AVX512, and -march=native
  3. Something else?

@davidchisnall, @nwf, @SchrodingerZhu any thoughts on what we should do here? The option will also need exposing through the Rust Crate.

Looking at the codegen, -march=native doesn't make much difference and does not affect the fast paths. So I would lean towards option 1. This will allow us to build the fast code for benchmarking, and if someone needs something more specific we can add it later.

@nwf
Collaborator

nwf commented May 27, 2020

Can you expose -march as a cmake parameter defaulting to "native"?
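A minimal sketch of what such a parameter could look like. The cache variable name `SNMALLOC_MARCH` is made up for illustration and is not an existing snmalloc option; the sketch also assumes a GCC- or Clang-style compiler.

```cmake
# Hypothetical cache variable (name assumed for illustration); defaults to "native".
set(SNMALLOC_MARCH "native" CACHE STRING
    "Value passed to -march; set to an empty string to omit the flag")

# Only pass -march to GCC/Clang-style compilers, and only when a value is set.
if(NOT "${SNMALLOC_MARCH}" STREQUAL "" AND CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang")
  add_compile_options(-march=${SNMALLOC_MARCH})
endif()
```

With something like this, a packager could configure with `cmake -DSNMALLOC_MARCH=haswell ..` to target a fixed baseline, or `cmake -DSNMALLOC_MARCH= ..` to drop the flag entirely.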

@davidchisnall
Collaborator

I seem to recall that -march=native implies -mtune=native unless otherwise specified. The former specifies the set of instructions to enable, the latter the pipeline structure to optimise for. This isn't great for reproducible builds (you will get different instruction selection and scheduling decisions based on the CPU of the builder), but aside from that -mtune=native isn't particularly harmful, unless tuning for one CPU makes the code significantly worse on another. That's pretty rare for x86, but ARM Cortex-A53 vs A78 will likely give very different decisions.

I'd be inclined to add an optimisation flags CMake option that defaults to -march=native for release builds and empty for debug builds. If people want to aggressively tune for a particular target, they can.
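Roughly, and only as a sketch: the option name `SNMALLOC_OPT_FLAGS` is hypothetical, and the build-type check assumes a single-config generator (Makefiles or Ninja) where `CMAKE_BUILD_TYPE` is set at configure time.

```cmake
# Hypothetical option name, assumed for illustration only.
# Default: -march=native for Release builds, empty otherwise.
if(CMAKE_BUILD_TYPE STREQUAL "Release")
  set(default_opt_flags "-march=native")
else()
  set(default_opt_flags "")
endif()

# The default is only applied on the first configure; afterwards the cached
# value wins, so users can override it with -DSNMALLOC_OPT_FLAGS=...
set(SNMALLOC_OPT_FLAGS "${default_opt_flags}" CACHE STRING
    "Extra optimisation flags, e.g. -march=... -mtune=...")

# Split the space-separated string into a list before passing it on.
separate_arguments(opt_flags_list NATIVE_COMMAND "${SNMALLOC_OPT_FLAGS}")
add_compile_options(${opt_flags_list})
```

Someone tuning for a particular target could then pass, for example, `-DSNMALLOC_OPT_FLAGS="-march=skylake -mtune=skylake"`, which also avoids the implicit -mtune=native.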

@plietar
Contributor

plietar commented May 27, 2020

-march=native also interacts badly with distcc. When I was in Cambridge last year I always had the relevant line of CMakeLists.txt commented out because of that (obviously not for benchmarks though).
So I’m definitely in favor of doing something about it.

I’m almost inclined to have -march=native disabled by default, if that avoids surprises such as non-portable binaries. Especially if, as Matt says, the fast path codegen is similar. I’d hope that anyone doing benchmarking / using snmalloc seriously would read the README, where we can explain how to turn it on.

@mjp41
Member Author

mjp41 commented May 27, 2020

Fast path codegen is identical up to jump addresses for malloc/calloc/free. Realloc has a couple of minor differences for working out if it needs to do reallocation.

There are a few places where zeroing gets switched to vector instructions from memset, and there are a few uses of BMI instructions like shlx, lzcnt that mildly improve codegen.

I am going to move to disabled by default, and have an option to turn it on.
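An off-by-default switch of this kind might look like the sketch below. The option name `SNMALLOC_NATIVE_ARCH` is a placeholder chosen for illustration, not necessarily the name used in the actual change.

```cmake
# Hypothetical option name (assumed for illustration); OFF keeps binaries portable.
option(SNMALLOC_NATIVE_ARCH "Compile with -march=native for the build machine" OFF)

if(SNMALLOC_NATIVE_ARCH)
  include(CheckCXXCompilerFlag)
  # Only add the flag if the compiler actually accepts it.
  check_cxx_compiler_flag(-march=native HAS_MARCH_NATIVE)
  if(HAS_MARCH_NATIVE)
    add_compile_options(-march=native)
  endif()
endif()
```

For benchmarking, the flag would then be enabled explicitly, e.g. `cmake -DCMAKE_BUILD_TYPE=Release -DSNMALLOC_NATIVE_ARCH=ON ..`.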

@mjp41
Member Author

mjp41 commented May 29, 2020

Closed by #206

@mjp41 mjp41 closed this as completed May 29, 2020