
numa_affinity broken ? #223

Closed
mathiaswagner opened this issue Apr 15, 2015 · 41 comments

@mathiaswagner
Member

I have been getting the following message quite often:

Failed to determine NUMA affinity for device 0 (possibly not applicable)

Looking at the code, it searches for a path of the form

"/proc/driver/nvidia/gpus/%d/information", my_gpu

where my_gpu is the device id (i.e., an integer).

I checked on some machines and see

[mwagner@gdev1 ~]$ ls /proc/driver/nvidia/gpus/
0000:02:00.0  0000:03:00.0  0000:83:00.0  0000:84:00.0
...
[mwagner@cream ~]$  ls /proc/driver/nvidia/gpus/
0000:02:00.0  0000:03:00.0  0000:83:00.0  0000:84:00.0

It looks like the directory names changed from device ids to PCI bus ids with newer drivers. I don't know when that happened, but we should update numa_affinity.cpp to reflect this.

@mathiaswagner
Member Author

cudaDeviceGetPCIBusId should come to the rescue here; see: http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__DEVICE_g548a98d8f123f7b0053f3913b733d07c.html#g548a98d8f123f7b0053f3913b733d07c

That would allow us to skip looking up the bus location in the files under /proc/driver/nvidia/gpus/%d/information altogether.

Or am I missing something here?
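
For illustration, here is a minimal sketch (not the QUDA code) of how the runtime call could be used to build the new-style path; the buffer sizes and error handling are my own choices:

// Sketch only: query the PCI bus id of a CUDA device and build the
// corresponding /proc/driver/nvidia/gpus/<busid>/information path.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int dev = 0;       // device id as used by the CUDA runtime
  char bus_id[32];   // filled with something like "0000:02:00.0"
  if (cudaDeviceGetPCIBusId(bus_id, (int)sizeof(bus_id), dev) != cudaSuccess) {
    fprintf(stderr, "Failed to query PCI bus id for device %d\n", dev);
    return 1;
  }
  // The runtime may return upper-case hex digits while the /proc and /sys
  // entries are lower case, so the string may need to be lower-cased.
  char path[256];
  snprintf(path, sizeof(path), "/proc/driver/nvidia/gpus/%s/information", bus_id);
  printf("device %d -> %s\n", dev, path);
  return 0;
}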

@alexstrel
Member

Also, I have noticed that the current implementation does not allow true multi-threading within an MPI process (e.g., all OpenMP threads end up pinned to a single core). Don Holmgren wrote a patch for numa_affinity.cpp to fix the issue, which I tried in early versions of the eigCG solver. We need to revisit this.

@mathiaswagner
Member Author

Can you create a branch with that patch?



@alexstrel
Member

Done, in hotfix/numa branch

@mathiaswagner
Member Author

Thanks. I will look into it.


@mathiaswagner
Member Author

hotfix/numa does not compile for me.

numa_affinity.cpp: In function ‘int setNumaAffinity(int)’:
numa_affinity.cpp:196: error: ‘numa_node_of_cpu’ was not declared in this scope
numa_affinity.cpp:199: error: ‘numa_allocate_nodemask’ was not declared in this scope
numa_affinity.cpp:205: error: ‘numa_bitmask_clearall’ was not declared in this scope
numa_affinity.cpp:206: error: ‘numa_bitmask_setbit’ was not declared in this scope
numa_affinity.cpp:207: error: cannot convert ‘bitmask*’ to ‘const nodemask_t*’ for argument ‘1’ to ‘void numa_bind(const nodemask_t*)’
numa_affinity.cpp:208: error: ‘numa_bitmask_free’ was not declared in this scope

It might depend on having the right 'numa.h'?

Also, this version still looks for the no-longer-existing path
sprintf(nvidia_info_path,"/proc/driver/nvidia/gpus/%d/information", my_gpu);

Reading /sys/class/pci_bus/%.7s/cpulistaffinity directly, using the PCI bus id from the CUDA runtime instead of the numeric gpu id, might be the better choice. But even with that hack I noticed that it sets the affinity to a single core instead of all cores in the right NUMA domain.
It might be easier to read /sys/class/pci_bus/%.7s/cpuaffinity, which is already a hex affinity mask.
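
A hedged sketch of that sysfs route (again not the actual QUDA code): given the bus-id string from cudaDeviceGetPCIBusId, read cpulistaffinity for the corresponding PCI bus and pin the process to every core in the list; the path layout and the "0-7,16-23" file format are as observed on the machines above:

// Sketch only: pin the calling process to all cores listed in
// /sys/class/pci_bus/<domain:bus>/cpulistaffinity for the GPU's bus.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   // CPU_SET / sched_setaffinity with glibc
#endif
#include <cstdio>
#include <cstring>
#include <sched.h>

bool setAffinityFromSysfs(const char *bus_id /* e.g. "0000:02:00.0" */) {
  char path[128];
  // %.7s keeps only "domain:bus", matching the entries under /sys/class/pci_bus
  snprintf(path, sizeof(path), "/sys/class/pci_bus/%.7s/cpulistaffinity", bus_id);

  FILE *f = fopen(path, "r");
  if (!f) return false;
  char list[256] = {0};
  if (!fgets(list, sizeof(list), f)) { fclose(f); return false; }
  fclose(f);

  cpu_set_t mask;
  CPU_ZERO(&mask);
  // Parse a comma-separated list of ranges, e.g. "0-7,16-23"
  for (char *tok = strtok(list, ","); tok; tok = strtok(NULL, ",")) {
    int lo, hi;
    int n = sscanf(tok, "%d-%d", &lo, &hi);
    if (n == 1) hi = lo;
    else if (n != 2) continue;
    for (int c = lo; c <= hi && c < CPU_SETSIZE; c++) CPU_SET(c, &mask);
  }
  return sched_setaffinity(0, sizeof(mask), &mask) == 0;  // pid 0 = this process
}

Binding to the whole list rather than a single core would also avoid the single-core pinning mentioned above.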

@alexstrel
Member

Oops, yes, you need an older numa library for it to compile. I have asked Don to revisit this.

@mathiaswagner
Member Author

http://www.open-mpi.org/projects/hwloc/ might be worth a look.

There is also example code related to GPUs at https://www.open-mpi.org/faq/?category=runcuda

@maddyscientist
Member

Where does this stand at the moment? Is there an easy fix to tide us over? Did Don provide an updated variant that compiles with a more recent numa library?

@mathiaswagner
Member Author

I have not heard of an update from Don.

So, my summary:

  • The only system on which I have observed the current code to work is the Bloomington Cray with CUDA 5.5.
  • On other systems (newer CUDA drivers, probably other kernel versions) it is broken.

I am not sure which of the files under /sys or /proc are created by the CUDA driver (and depend on its version) and which depend on the kernel version. But right now the NUMA affinity code is pretty fragile.

@alexstrel
Member

I'm communicating with Don about the topic. Will let you know.

@maddyscientist
Member

Before I delve into this internally, does anyone know whether NUMA affinity was working with v340 of the driver? I'm trying to track down a performance regression in the driver since then, and affinity working on v340 but not since would explain it.

@mathiaswagner
Member Author

Which CUDA version corresponds to driver 340?

As I wrote, it worked on a Cray with CUDA 5.5, but I don't know the driver version.

For a single GPU you can use taskset to force the affinity when launching.

@maddyscientist
Member

Around CUDA 6.5, but that doesn't mean that a user running 6.5 isn't running a more recent driver. I guess I'll have to look into this in more detail.

@mathiaswagner
Member Author

For the issue of parsing cpulistaffinity / cpuaffinity I found something hopefully useful:
https://github.com/karelzak/util-linux/blob/master/include/cpuset.h

and

https://github.com/karelzak/util-linux/blob/master/lib/cpuset.c

It seems to provide functions to parse the files mentioned above (a sketch of how it might be used follows below).
I have not tried the code yet, but if it works we might use it.
The code is LGPL, which I hope is not an issue for us.
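
If those helpers do work, usage might look roughly like the sketch below. This is only a guess based on the declarations in include/cpuset.h at the time (I am assuming cpulist_parse(str, set, setsize, fail) returns 0 on success); the exact signatures and return conventions would need to be checked against the actual headers:

// Hedged sketch only: bind the calling process to a kernel-style CPU list
// (e.g. the contents of cpulistaffinity) using util-linux's parser.
#include <sched.h>
#include "cpuset.h"   // from util-linux (LGPL -- see the license discussion below)

static int bindToCpuList(const char *list /* e.g. "0-7,16-23" */) {
  cpu_set_t set;
  CPU_ZERO(&set);
  // Assumed signature and return convention; check against util-linux itself
  if (cpulist_parse(list, &set, sizeof(set), 0) != 0) return -1;
  return sched_setaffinity(0, sizeof(set), &set);  // pid 0 = this process
}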

@mathiaswagner
Member Author

I have given that a first try in hotfix/numa_quickfix_cpuset. Ugly proof of concept.

@mathiaswagner
Member Author

The proof of concept is somewhat cleaned up and somewhat tested now.

@maddyscientist
Member

Thanks for pushing on this, Mathias. Can you give a rundown of the implications of including LGPL code in QUDA?

@mathiaswagner
Member Author

I believe we only need to mention the license. QUDA is open source anyway, so this should not be an issue.

But I will have to read the LGPL, and it may be a good idea to ask the maintainer about it.

@maddyscientist
Member

My only concern is that QUDA must never be bound to GPL, so I want us to be very careful.

@mathiaswagner
Member Author

I get that. I'll check, and once I come to a conclusion I will let you know.
Then it is probably a good idea to have someone double-check it.

@rbabich
Member

rbabich commented Jun 3, 2015

Unfortunately, I think we want to stay away from the LGPL.

Since QUDA is itself a library, I'm pretty sure that bundling an LGPL-licensed source file would be tantamount to relicensing all of QUDA under the LGPL; see section 2 of LGPLv2.1: https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.html

I'm being generous here and assuming that the author of "cpuset" intended to license it under v2 or v2.1 (and would be willing to edit the two source files to say as much), since LGPLv3 imposes additional restrictions that make it incompatible with GPLv2, among other problems.

I liked the idea of using hwloc. I guess that didn't pan out?

@mathiaswagner
Member Author

I have not yet followed the hwloc path.

For now I will remove the code from GitHub (force-delete the branch), just to be on the safe side.

If we gain further insight we can re-add it, provided we are sure it does not have any undesirable licensing implications.

@alexstrel
Member

@mathiaswagner
Member Author

Thanks, @alexstrel !

This http://docs.nvidia.com/deploy/nvml-api/group__nvmlDeviceQueries.html#group__nvmlDeviceQueries_1g1c52b8aefbf804bcb0ab07f5d872c2a1 does indeed look promising. We might just need to check which requirements (NVML version, etc.) come with it. I will have a look.

@mathiaswagner
Member Author

Looks like we need at least a CUDA 6.5 driver: http://docs.nvidia.com/deploy/nvml-api/change-log.html#change-log

I know that our old code somewhat worked on 5.5; I am not sure about 6.0.
So we could try to use NVML by default and fall back to our old code otherwise? Anyway, as the big machines like Titan and Blue Waters now move to 6.5, I think the issue with older CUDA versions becomes less severe. But I think we still support them. What is the oldest CUDA version we want to support anyway?

@alexstrel
Member

No problem on the pi0g cluster at FNAL; we have the latest driver (CUDA 7.0).

@nmrcardoso
Contributor

I "think" that we will have great news about cuda in the next couple months.
However, the time between new releases and the installation of these can be
an issue.


@mathiaswagner
Member Author

I am also aware that some machines, like the Cray in Bloomington, will probably not get CUDA 7.0 at all (at least not supported by Cray, due to the management contract …).
Given that 6.5 has been available for quite a while and is supported on the Cray etc., we might think about dropping support for older versions. In this respect it would be nice to know who is using QUDA and which GPUs and CUDA versions are actively in use.
As we plan to drop support for pre-Fermi devices with 0.8, it might be a good time to drop support for old CUDA versions at the same time.

Anyway, this issue is about NUMA and not CUDA support; it is just that the NVML functions for CPU affinity require a v340 driver.

Mathias


@maddyscientist
Member

I think that the only machine that will need 5.5 support in the near future is Titan. All other machines of note have been upgraded to 6.5. So I think we should be ok with the strategy of using NVML in general, and drop back to the old way if running on CUDA 5.5.

I'm not even sure if this is an issue on Titan anyway, since it has only one GPU and one CPU per node. Though I guess it could be an issue, since the Opteron processors used by Titan do have a NUMA architecture.

@mathiaswagner
Member Author

I looked into NVML, and using NVML to set the CPU affinity is actually pretty easy (a minimal sketch follows at the end of this comment). However, using it for QUDA brings up some questions.

nvml.h is part of the GPU Deployment Kit. It might not be available on all systems. For my tests I just installed the kit and copied the header to the include path.

  1. Can we include that header in QUDA (without running into any licensing issues)?
  2. Which version should we use? We need a version that has the CPU affinity calls.
  3. The application using QUDA must then be linked against the nvidia-ml library. No issue for the tests, but we need to educate developers about this.
  4. If the library is too old (before CUDA 6, I think) the CPU affinity calls are not in the library, and I assume we get errors when linking. Can we detect this earlier? Or do we have to rely on the people using QUDA?

We should definitely add a configure option to choose between the new behavior and the old version (which should still be somewhat functional for CUDA 5.5).
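
Here is the minimal sketch mentioned above (the helper name is just illustrative, not the final QUDA code); it needs linking against libnvidia-ml and a >= v340 driver for the affinity call:

// Sketch only: let NVML pin the calling process to the CPUs local to the GPU.
#include <nvml.h>

static int setNumaAffinityNVML(int dev) {
  if (nvmlInit() != NVML_SUCCESS) return -1;
  int ret = -1;
  nvmlDevice_t handle;
  // Caveat: NVML enumerates devices independently of the CUDA runtime, so with
  // CUDA_VISIBLE_DEVICES the indices may differ; matching by PCI bus id
  // (nvmlDeviceGetHandleByPciBusId) would be more robust.
  if (nvmlDeviceGetHandleByIndex((unsigned int)dev, &handle) == NVML_SUCCESS &&
      nvmlDeviceSetCpuAffinity(handle) == NVML_SUCCESS) {
    ret = 0;
  }
  nvmlShutdown();
  return ret;
}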

@mathiaswagner
Member Author

The code works on a local machine with CUDA 7 and on the Cray with CUDA 6.5 and 5.5. However, on the Cray the nvidia-ml library is located under /opt/cray/nvidia/default/lib64, so we need to include that in the library search path.

Right now the code uses a compile-time check of the CUDA version, i.e.

#ifdef NUMA_AFFINITY
  if(numa_affinity_enabled){
#if (CUDA_VERSION >= 6000)
    setNumaAffinityNVML(dev);
#else
    setNumaAffinity(dev);
#endif
  }
#endif

so compilation and linking also work on systems with older CUDA / driver versions, apart from the library path issue. But that may need more checks.

Maybe we can use a configure option like --with-nvml, but on the other hand the application that uses QUDA has to link against nvidia-ml ...
The discussion above applies mostly to the internal tests.

@mathiaswagner
Member Author

@mikeaclark @rbabich I think the NVML solution looks good, but before going further: do you know whether we can distribute 'nvml.h' with QUDA?
I checked on several systems and the header is not present, only the library.
I just need to get the configure options done, and we need to run tests on different systems.

@mathiaswagner mathiaswagner added this to the QUDA 0.7.2 milestone Jun 19, 2015
@rbabich
Member

rbabich commented Jun 19, 2015

There's an ongoing thread about this, but it tentatively looks like we can distribute nvml.h with QUDA. We should just add another blurb to the LICENSE file, incorporating the text at the top of the header.

It's a bit unfortunate that the user will have to link against the nvml shared library. There are two options there:

  1. Build on a machine with a sufficiently recent display driver installed (which ships with NVML), and specify the path with --with-nvml=...
  2. Install the "GPU Deployment Kit" (overriding the default install path to point to a local directory, if the user doesn't have root), and point --with-nvml there. The GDK includes a stub library that should suffice for compiling.

Option 2 is useful since it doesn't require that the build machine (e.g., the head node of a cluster) have the driver installed. If we wanted to support option 2 exclusively, we wouldn't have to distribute nvml.h at all, since it's included with the GDK. In that case, we'd want to define "--with-nvml" such that it takes the parent directory of the one where the library is installed (e.g., --with-nvml=/usr on most non-Cray systems). Then the Makefile would look under $(NVML_HOME)/lib for the library and under $(NVML_HOME)/include/nvidia/gdk for the header. We would also be able to check the NVML API version in the header, to warn the user if they try compiling against a version of the GDK that's too old (which is a potential problem otherwise).

It's probably more user-friendly to support both options, though. In that case, we have to distribute the header anyway, so we might as well adopt the convention that --with-nvml points directly to the library path, e.g., --with-nvml=/usr/lib.
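
For the API-version check, something along the lines of the sketch below might be enough, assuming nvml.h defines NVML_API_VERSION (the minimum value shown is a placeholder, not a verified number):

// Hedged sketch of a compile-time guard against too-old GDK/NVML headers.
#include <nvml.h>

#if !defined(NVML_API_VERSION) || (NVML_API_VERSION < 7)  /* 7 is a placeholder */
#error "The installed NVML headers are too old for the CPU-affinity calls"
#endif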

@mathiaswagner
Member Author

The last paragraph is roughly what I had in mind. But of course that might break on a head node without GPUs. It does work on the Cray nodes. For non-Cray systems it is not even necessary to specify the location of the library, as it is in a default location.

I think we can include the header and use the configure option to point to the library (full or stub). Then we have a solution that works in most cases without any user configuration, works on Cray by just specifying the path, and on GPU-less head nodes requires the user to install the GDK.
I believe having NUMA affinity working by default is desirable. That is the way it was until CUDA 6 broke the 'hack' included in the current release (and we fall back to that for old CUDA versions).

@mathiaswagner
Member Author

I now have a solution that adds the following configure options:

 --enable-nvidia-ml      Use the nvidia-ml library to set NUMA affinity (default: enabled)
 --with-nvidia-ml-lib     Specify location of the nvidia-ml library if not in the default library locations (default: blank)

If NUMA is not enabled, neither of them applies.
If NUMA is enabled, it will use NVML (unless disabled) for CUDA >= 6.0 and fall back to the old code otherwise.

At some point I would like to get rid of the old code.

I am not sure about the naming in configure: I think NVML is the correct abbreviation, but the library itself is named libnvidia-ml.so. So if a user looks for a non-default installation location, find will only help with nvidia-ml, not nvml.
Anyhow, I would now prefer to use nvml rather than nvidia-ml in the option names and add a note to the README.

I will put in a pull request once the license questions have finally settled.

@mathiaswagner
Member Author

@rbabich Just wanted to follow up on whether the internal discussion reached a final conclusion on whether we may include the header file in QUDA.

@rbabich
Member

rbabich commented Jul 8, 2015

Kate has pinged the relevant folks again... will hopefully know tomorrow. I'm 90% sure that we're okay, though.

@mathiaswagner
Member Author

I have to ask here again, as I am doing some benchmarks with MILC and affinity improves things somewhat.

Also, I don't like having this lying around on my desk / in the issue tracker forever.

I can bug the people who decide this if you give me their contact information.

@mathiaswagner
Member Author

If we go ahead with CMake, I believe we can probably integrate https://github.com/jirikraus/scripts/blob/master/NVML/cmake/FindNVML.cmake and avoid adding nvml.h.
