vdbergh edited this page Oct 10, 2024 · 6 revisions

Get username/password for fishtest

If you have not done so already, create an account by registering on fishtest:

https://tests.stockfishchess.org/signup

Note that after registration you are automatically redirected to the login page. The message "Permission Required. Creating or modifying tests requires you to be logged in. If you don't have an account, please Register." is informational, not an error.

Install the worker

If you have not yet installed the worker on your computer, follow the installation instructions on these pages:

Launch the worker

To launch the worker with the default parameters (e.g. using 3 CPU cores), open a console window in the worker directory and run the following command:

python3 worker.py

Enter your username and password when the worker asks for them. Alternatively you can specify username and password on the command line:

python3 worker.py username password

Add the --concurrency option to control the number of CPU cores allocated to the worker. The suggested safe maximum is the number of physical cores minus one, which leaves one core for the OS.

On a PC dedicated only to fishtest, it is possible to set the concurrency to the number of virtual cores minus one (again leaving one core for the OS), but given the number of workers currently contributed to the framework this is not strictly necessary.

python3 worker.py --concurrency MAX-1

If you don't set the concurrency parameter then the worker will use 3 cores.
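
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration, not part of the worker: the function name suggested_concurrency is hypothetical, and threads_per_core=2 is an assumption (typical for CPUs with SMT/hyper-threading) because the standard library only reports logical CPUs.

```python
import os


def suggested_concurrency(logical_cpus, threads_per_core=2, dedicated=False):
    """Return a conservative value for --concurrency.

    logical_cpus     : number of logical (virtual) CPUs, e.g. os.cpu_count()
    threads_per_core : ASSUMPTION, 2 on typical SMT machines, 1 without SMT
    dedicated        : True if the machine runs nothing but fishtest
    """
    if logical_cpus < 1 or threads_per_core < 1:
        raise ValueError("logical_cpus and threads_per_core must be >= 1")
    # Estimate the physical core count from the logical CPU count.
    physical = max(1, logical_cpus // threads_per_core)
    # Dedicated machines may use virtual cores, others only physical cores.
    cores = logical_cpus if dedicated else physical
    # Leave one core for the OS, but never go below 1.
    return max(1, cores - 1)


logical = os.cpu_count() or 1
print(f"python3 worker.py --concurrency {suggested_concurrency(logical)}")
```

On a machine with 16 logical CPUs and 2 threads per core, this suggests 7 for a shared machine and 15 for a dedicated one.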

The worker writes the parameters to the configuration file fishtest.cfg, so next time you can start the worker with simply:

python3 worker.py

On Linux and macOS, you can use the nohup command to run the worker as a background task.

nohup python3 worker.py &

If this is your only background python3 process then you can stop the worker quickly and cleanly with killall python3. Otherwise you have to find the PID of the worker process via ps and use kill [PID].

Worker parameters

python3 worker.py [<username> <password>] [-h] [-P <URL protocol>] [-n <URL domain>] [-p <URL port>] [-c <number of cores>] [-m <memory>] [-u <uuid>] [-t <test min thread number>] [-f <fleet flag>] [-C <compiler>] [-w] [-v]

<username>          : your username on fishtest, first argument
<password>          : your password on fishtest, second argument
-h / --help         : show the help message and exit
-P / --protocol       PROTOCOL     : protocol of the fishtest server URL (string {https; http}, default: https)
-n / --host           HOST         : address of fishtest server URL (string, default: tests.stockfishchess.org)
-p / --port           PORT         : port of fishtest server URL (number, default: 443)
-c / --concurrency    CONCURRENCY  : the number of cores allocated to the worker (number or string {MAX; expression}, default: 3, MAX to use all cores)
-m / --max_memory     MAX_MEMORY   : memory used by the worker (number or string {MAX; expression}, default: system memory/2 expressed in MB)
-u / --uuid_prefix    UUID_PREFIX  : first part of the worker UUID (string {_hw; alphanumeric with length>1}, default: _hw, _hw to use internal algorithm)
-t / --min_threads    MIN_THREADS  : minimum number of threads that a test must have to be assigned to the worker (number, default: 1)
-f / --fleet          FLEET        : stop the worker when fishtest has no task to do (boolean, default: False)
-C / --compiler       COMPILER     : compiler used to build binaries (string {g++; clang++}, default: g++)
-w / --only_config                 : write the config file and exit (no parameter)
-v / --no_validation               : skip the username/password validation with the server (no parameter)

CONCURRENCY and MAX_MEMORY can be set with an expression, which may also involve MAX, e.g.:
--concurrency "min(max(8, MAX/3-1), MAX/2)"  --max_memory "MAX/2 if MAX>16384 else MAX*3/5" 

See the available expressions.
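
To see what the two example expressions yield for a given machine, they can be evaluated by hand with MAX bound to a concrete value. The sketch below is not the worker's own evaluator (which is not shown here); evaluate_setting is a hypothetical helper that uses plain Python eval with only MAX, min and max available, and it does not model how the worker rounds non-integer results.

```python
def evaluate_setting(expression, max_value):
    """Evaluate a CONCURRENCY/MAX_MEMORY style expression for illustration.

    ASSUMPTION: the expression is ordinary Python syntax using only MAX,
    min() and max(). Do not use eval() like this on untrusted input.
    """
    allowed_names = {"MAX": max_value, "min": min, "max": max}
    # An empty __builtins__ keeps everything else out of reach.
    return eval(expression, {"__builtins__": {}}, allowed_names)


concurrency_expr = "min(max(8, MAX/3-1), MAX/2)"
memory_expr = "MAX/2 if MAX>16384 else MAX*3/5"

# Worked examples for a few hypothetical machines.
for cores in (16, 48):
    print(cores, "cores ->", evaluate_setting(concurrency_expr, cores))
for memory_mb in (8192, 32768):
    print(memory_mb, "MB ->", evaluate_setting(memory_expr, memory_mb))
```

For example, with MAX = 48 cores the first expression evaluates to min(max(8, 15), 24) = 15, and with MAX = 32768 MB the second evaluates to 16384.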

Generate a sri.txt to open a PR for the fishtest worker code

The fishtest worker writes the hash of the worker files to the sri.txt file, and the GitHub fishtest Continuous Integration stops if the hash is invalid. To generate a new sri.txt before opening a PR for the fishtest worker code, run:

python3 worker.py a a --only_config --no_validation

Worker limitations for systems with a large number of cores

Currently, the game-playing program (cutechess-cli) used by the worker will flag games as lost on time in case of a very large (>32) concurrency for very fast games. Hardware with more than 32 cores will therefore be assigned to tests running at long time control (LTC). In very rare cases, no LTC tests run on fishtest, in which case the worker might be idle.

Workers with very many threads (e.g. 100) might exceed some of the default limits on a Linux server. Ensure that max user processes (ulimit -u) and open files (ulimit -n) are sufficiently large.

The Stockfish Testing Framework server (fishtest) is configured so that machines with fewer than 8 cores do not run the SMP test, machines with more than 32 cores do not run the STC test, and machines with low RAM (below 4GB) do not run the LTC test (please refer to the Testing Methodology for definitions).
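
The three rules above can be written down as a small lookup. This is a sketch of the rules as stated on this page, not the server's actual scheduling code; the function name eligible_test_types and the idea of describing a machine by just (cores, ram_gb) are assumptions made for illustration.

```python
def eligible_test_types(cores, ram_gb):
    """Return the set of test types a machine may run under the stated rules.

    Only the three categories named in the text (STC, LTC, SMP) are modeled.
    """
    types = {"STC", "LTC", "SMP"}
    if cores < 8:
        types.discard("SMP")   # fewer than 8 cores: no SMP tests
    if cores > 32:
        types.discard("STC")   # more than 32 cores: no STC tests
    if ram_gb < 4:
        types.discard("LTC")   # below 4GB of RAM: no LTC tests
    return types


print(sorted(eligible_test_types(cores=4, ram_gb=8)))
print(sorted(eligible_test_types(cores=64, ram_gb=64)))
```

Note that a machine with more than 32 cores and less than 4GB of RAM would be left with SMP tests only under these rules.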

Running the worker on CPUs with different core types

Some CPUs (e.g., Apple M1 or those based on Intel Alder Lake microarchitecture) feature cores of different strengths (named P and E cores in case of Intel). To avoid playing matches between engines running on different hardware, one has to restrict the fishtest worker to a subset of identical cores. Under Linux, this is achieved with the taskset command (for an 8 P + 8 E setup like i9-12900K):

taskset -c 0-15 python3 fishtest/worker/worker.py foo bar --concurrency 16
taskset -c 16-23 python3 fishtest/worker/worker.py foo bar --concurrency 8 (or 7)

The same requirement applies to Apple M1 CPUs with fast and slow cores. Since there is no taskset command on macOS, set the --concurrency parameter to at most the number of fast cores so that the worker does not run on slow cores, which would lead to inconsistent results.
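
For the Linux taskset case, the CPU ranges in the example follow from the core counts. The sketch below reproduces them under a loudly stated assumption: logical CPUs are numbered with all P-core threads first (two threads per P core), followed by the E cores with one thread each. That matches the 8 P + 8 E example above but should be checked on the actual machine (for instance with lscpu). The helper names are hypothetical.

```python
def taskset_ranges(p_cores, e_cores, p_threads_per_core=2):
    """Return (p_range, e_range) strings suitable for `taskset -c`.

    ASSUMPTION: P-core threads occupy the lowest logical CPU numbers and
    each E core contributes exactly one logical CPU after them.
    """
    if p_cores < 1 or e_cores < 1 or p_threads_per_core < 1:
        raise ValueError("core counts must be >= 1")
    p_logical = p_cores * p_threads_per_core
    p_range = f"0-{p_logical - 1}"
    e_range = f"{p_logical}-{p_logical + e_cores - 1}"
    return p_range, e_range


def worker_command(cpu_range, concurrency, worker_path="fishtest/worker/worker.py"):
    """Build the argument list for one pinned worker (credentials omitted)."""
    return ["taskset", "-c", cpu_range, "python3", worker_path,
            "--concurrency", str(concurrency)]


p_range, e_range = taskset_ranges(p_cores=8, e_cores=8)
print(" ".join(worker_command(p_range, 16)))
print(" ".join(worker_command(e_range, 8)))
```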

Stop the worker

To stop the worker gracefully, create a file named fish.exit next to the worker.py script. The worker will then stop after finishing the current batch of games; note that it will still start new games from the current batch until all games of that batch are finished. On Linux, simply run the command touch fish.exit. The worker deletes the fish.exit file on exit.
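
On systems without a touch command, the same file can be created from Python. This is a minimal sketch: request_graceful_stop is a hypothetical helper, and it assumes that fish.exit belongs in the same directory as worker.py.

```python
from pathlib import Path


def request_graceful_stop(worker_dir):
    """Create fish.exit in worker_dir and return its path.

    ASSUMPTION: worker_dir is the directory that contains worker.py.
    """
    worker_dir = Path(worker_dir)
    # Guard against creating the file in an unrelated directory.
    if not (worker_dir / "worker.py").is_file():
        raise FileNotFoundError(f"no worker.py found in {worker_dir}")
    exit_file = worker_dir / "fish.exit"
    exit_file.touch()  # equivalent to `touch fish.exit`
    return exit_file
```

Calling request_graceful_stop with the path of the worker directory is then equivalent to running touch fish.exit there.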

You can also stop the worker quickly by killing the process; the framework will still use most of the completed games played by your worker. It is strongly preferable to terminate it gracefully, though, e.g. by pressing Ctrl+C in the command window or by using the kill -15 command on Linux to send the SIGTERM signal. Do not use SIGKILL (kill -9)!

GitHub API requests rate limit

The worker makes some GitHub API requests. GitHub sets these rate limits:

  • 60 requests/hour for single IP address for unauthenticated requests
  • 5000 requests/hour for authenticated requests

The lower rate is perfectly fine for the majority of CPU contributors. To switch to the higher rate:

  1. Log in to your GitHub account (or sign up for a free account on GitHub)
  2. From the token creation page, create a new authorization token with the "public_repo" scope enabled and copy the <personal-access-token> value
  3. Create a text file:
  • Linux and Windows Subsystem for Linux: touch ${HOME}/.netrc && chmod 600 ${HOME}/.netrc
  • Windows: C:\Users\<Your_Username>\.netrc or C:\Users\<Your_Username>\_netrc (check your $HOME). Make sure to delete the .txt extension (_netrc, not _netrc.txt)
  • Write this content:
    machine api.github.com
    login <personal-access-token>
    password x-oauth-basic
    
    or
    machine api.github.com
    login <your_github_username>
    password <personal-access-token>
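
After writing the file, it can be useful to check that it parses and contains an entry for api.github.com. The sketch below uses Python's standard netrc module for that; whether the worker reads the file in exactly the same way is not stated here, and github_credentials is a hypothetical helper name.

```python
import netrc


def github_credentials(netrc_path, host="api.github.com"):
    """Return (login, password) for host from a netrc file, or None.

    Raises netrc.NetrcParseError if the file is malformed.
    """
    entry = netrc.netrc(netrc_path).authenticators(host)
    if entry is None:
        return None  # no `machine api.github.com` entry (and no default)
    login, _account, password = entry
    return login, password
```

A return value of None means that the file has no matching machine entry, so requests would still be counted against the unauthenticated limit.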
    

Cloud and container considerations

  • When storing a config file (fishtest.cfg) in an image, please make sure that it does not contain a [private] section. Otherwise the internal algorithm may assign the same name to all workers (something like username-3cores-aeb3dfcd) and only one will be able to connect. Whether this will actually happen depends on the details of the setup, but it is better to be safe than sorry!
  • If you are using a container that is frequently recreated from scratch (e.g. an EC2 spot instance), please consider running the worker with a hand-crafted UUID prefix (via the -u option). This makes it possible for the server to track this worker even if the container is recreated. The script for running the worker on EC2 already does this for you (see Running-the-worker-in-the-Amazon-AWS-EC2-cloud). Note that if you run multiple workers in this way, each of them should have a different UUID prefix.
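
Before baking fishtest.cfg into an image, the presence of a [private] section can be checked automatically. The sketch below assumes that fishtest.cfg is an INI-style file readable by Python's configparser, which the section syntax suggests but which is not confirmed here; has_private_section is a hypothetical helper name.

```python
import configparser


def has_private_section(cfg_path):
    """Return True if the config file contains a [private] section.

    ASSUMPTION: the file is INI-style and parseable by configparser.
    """
    # interpolation=None avoids treating '%' in values specially.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(cfg_path):
        raise FileNotFoundError(f"could not read {cfg_path}")
    return parser.has_section("private")
```

An image build script could call this on fishtest.cfg and abort when it returns True.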