Eigen (CPU) version and Other Improvements

@lightvector lightvector released this 23 Aug 19:42

If you're a new user, don't forget to check out this section for getting started and basic usage!
The latest and strongest neural nets are still those from the previous release: https://github.com/lightvector/KataGo/releases/tag/v1.4.5

KataGo's Eigen implementation has been improved and is now a reasonably well-optimized pure-CPU version! It will of course still be much slower than running on a good GPU, but it should often reach 5 to 20 playouts per second, particularly with smaller nets (20 blocks or 15 blocks). All of the versions listed below are available as pre-compiled executables in this release.

Versions available

  • OpenCL - Use this if you have a modern GPU.
    This continues to be the general GPU version of KataGo and should work on a variety of GPUs. GPUs older than the last few years may not work, and AMD and minor vendors often have driver issues in their OpenCL implementations.

  • CUDA - Test this if you have a top-end NVIDIA GPU, are willing to do some more technical setup work, and care about getting every bit of performance.
    Requires an NVIDIA GPU, and requires installing CUDA 10.2 (not CUDA 11 yet) and CUDNN from NVIDIA. For most users there is little reason to use this version: the OpenCL version will often be faster even on NVIDIA's own GPUs! The CUDA version may be a little faster on some very top-end GPUs that have FP16 tensor cores, but even then not always, so you should benchmark both versions on your specific hardware to see the difference in practice.

  • Eigen AVX2 - Use this if you don't have a GPU or your GPU is too old to work, but you have an Intel or AMD CPU from the last several years.
    This is a pure CPU version of KataGo, compiled to use AVX2 and FMA operations, which will roughly double the speed compared to not using them. However, it will completely fail to run on older or weaker CPUs that don't support these operations.

  • Eigen - Use this if you don't have a GPU or your GPU is too old to work, and your CPU turns out not to support AVX2 or FMA.
    This is the pure CPU version of KataGo, with no special instructions, which should hopefully run just about anywhere.
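
If you are unsure which of the two Eigen builds your CPU can run, the feature flags reported by the operating system can tell you. The following is a minimal sketch, not part of KataGo: it assumes an x86 Linux machine where /proc/cpuinfo lists "avx2" and "fma" on its "flags" line, and the helper names cpu_flags_from_cpuinfo and pick_eigen_build are made up for illustration. On other systems, check your CPU's specifications instead.

```python
def cpu_flags_from_cpuinfo(text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux reports features on lines whose key is "flags".
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def pick_eigen_build(flags):
    """Suggest which CPU build to try, given a set of feature flags."""
    # The Eigen AVX2 build needs both AVX2 and FMA; otherwise fall back.
    if {"avx2", "fma"} <= flags:
        return "Eigen AVX2"
    return "Eigen"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(pick_eigen_build(cpu_flags_from_cpuinfo(f.read())))
    except OSError:
        print("Could not read /proc/cpuinfo; check your CPU's specifications.")
```

If the script suggests "Eigen AVX2" but that executable still fails to start, the plain Eigen version remains the safe fallback.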

Major New Stuff This Release:

Performance

  • Massive optimizations for the Eigen implementation thanks to kaorahi, making it now usable.
  • Reduced OpenCL code overhead, which may make it able to run on a small number of older GPUs where it couldn't before.
  • Worked around a major OpenCL issue with NVIDIA GPUs that prevented KataGo from using more than one GPU effectively. It should now scale on a multi-GPU machine, whereas previously it didn't scale at all.

For Analysis Tool Devs

  • Implemented allow and avoid options for both the JSON analysis engine and for GTP lz-analyze and kata-analyze, whose precise semantics should be documented in these links. These options restrict the search to only specific moves or specific regions of the board. I'm not entirely sure if they match Leela Zero's semantics, since I could not find any precise specification for them beyond the raw source code and scattered descriptions in GitHub issues.

  • Added pvVisits option for both the JSON analysis engine and for GTP lz-analyze and kata-analyze. This option causes KataGo to also report the number of visits for every move in each of the principal variations it reports. These values might be useful for estimating, or for informing users about, the reliability of the moves as you get deeper into a variation.

  • Improved some of the logging options available for the analysis engine. The new options are in https://github.com/lightvector/KataGo/blob/master/cpp/configs/analysis_example.cfg, commented out with their default values.
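
For tool developers, here is a rough sketch of how a query to the JSON analysis engine might combine the allow/avoid and pvVisits options described above. The field names used here (allowMoves, avoidMoves, untilDepth, includePVVisits, and the other query fields) are written from memory of the analysis engine documentation and should be treated as assumptions to verify against that documentation. The helpers restriction and build_query are hypothetical names for this example only.

```python
import json


def restriction(player, moves, until_depth):
    """One allow/avoid entry: which player, which moves, how many ply deep.

    Field names are assumptions; check the analysis engine docs.
    """
    return {"player": player, "moves": list(moves), "untilDepth": until_depth}


def build_query(query_id, moves, allow=None, avoid=None, pv_visits=False):
    """Return a single-line JSON query string for the analysis engine."""
    query = {
        "id": query_id,
        "moves": moves,  # e.g. [["B", "Q16"], ["W", "D4"]]
        "rules": "tromp-taylor",
        "komi": 7.5,
        "boardXSize": 19,
        "boardYSize": 19,
        "maxVisits": 100,
    }
    if allow is not None:
        # Restrict the search to ONLY these moves (assumed field name).
        query["allowMoves"] = [allow]
    if avoid is not None:
        # Forbid the search from exploring these moves (assumed field name).
        query["avoidMoves"] = [avoid]
    if pv_visits:
        # Ask for per-move visit counts along each reported variation.
        query["includePVVisits"] = True
    return json.dumps(query)


if __name__ == "__main__":
    # After Black plays Q16, only consider two White replies at the root.
    print(build_query("example", [["B", "Q16"]],
                      allow=restriction("W", ["D4", "C3"], 1),
                      pv_visits=True))
```

The printed line would be written to the engine's standard input, one query per line. For GTP, the equivalent allow, avoid, and pvVisits arguments to lz-analyze and kata-analyze follow their own syntax, which this sketch does not cover.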

Other

  • Fixed some interface-related bugs and made a variety of changes in the build config to produce more friendly messages and hints in CMakeGUI.