Skip to content

Latest commit

 

History

History
51 lines (29 loc) · 7.95 KB

README.md

File metadata and controls

51 lines (29 loc) · 7.95 KB

AI Snakes 2021: NegaSnake

This repository contains the source code of NegaSnake, the bot I developed in association with Serpentine for AI Snakes 2021. It finished 1st, with a win rate of 87%! More detailed results can be found in the log file (where NegaSnake is named "Serpentine").

For setting up your workspace, refer to the original instructions.

Rules

In this competition, bots play a 1v1 variant of Snake on a 14x14 grid. Summarized, the rules are as follows.

  • Snakes move simultaneously and must choose their next move within 1 second.
  • Matches end after at most 3 minutes, or when a snake crashes into the border, its own body, or the opponent's body. If both snakes crash simultaneously, the longer snake wins.
  • Snakes grow by eating apples. At any time, exactly one apple is available on the grid, which repositions randomly if not eaten after 11 moves (note that the official rule is imprecise).

Strategy & implementation

The core of the NegaSnake bot is a (heuristic) negamax search with alpha-beta pruning. Note that this algorithm requires the game to be turn-based. Since the snakes actually move simultaneously, we need to make a 'paranoid' assumption: during the search, we move first, giving the opponent an advantage.

During the search, as few objects as possible are created. For example, instead of cloning the provided Snake class, preallocated data structures in the State class are updated. In the same spirit, bitwise operations are used throughout the code (e.g. see isFinal). This improves performance and makes it less volatile (since the GC is run less often), allowing more states to be searched. Note, however, that these optimizations make it harder to understand, modify, and debug the code.

Depending on the current state, the time it takes to search to a certain depth is highly variable. For this reason, we use iterative deepening. Additionally, the search is run in a separate thread, which is interrupted when the 1 second time limit is approached. Together, these techniques maximize the search depth (30 ply is commonly reached) while preventing losing by timeout.

Considering 'good' moves first during the search results in more alpha-beta cuts. Therefore, using results from shallower searches, getMovesOrdered sorts moves from best to worst. These results are stored in a transposition table, that maps states (which are uniquely represented by the sequence of moves that lead to them; check append for details) to their score. For this we use a custom HashTable, which has much less overhead than Java's HashMap. Note that we never check for actual transpositions while searching, since they are exceedingly rare in Snake.

The score mentioned above is the (heuristic) evaluation of the current state. Whereas a deeper search improves tactical play ("Can I force-crash the opponent?"), a better evaluation results in stronger positional play. It is a hand-crafted linear combination of the following features, from most to least heavily weighted.

  • Whether the current state is a win/loss/draw. A draw is considered a marginally better loss. Earlier wins and later losses/draws are scored slightly higher.
  • The (difference in the) snakes' length.
  • The distance to the current apple. If one snake can reach the apple in time but its opponent cannot, that snake is rewarded additionally.
  • The positioning directly after the apple repositions. In particular, for both snakes, we consider the number of unoccupied squares within 11 steps (the time-to-live of an apple) that are strictly closer to the snake's head than its opponent's. This is calculated using two relatively expansive BFSs. The score is then scaled according to the ratio of 'controlled space' (e.g. it is increased by 75% if one snake controls 4x as many squares). Note that this is the only feature that is calculated during the search instead of at leaf states. Furthermore, before the apple repositions, this heuristic is approximated by the snakes' distance to the center (see prepareScores). This acts as a backup if the search is interrupted early.

A final noteworthy aspect of NegaSnake's strategy (unrelated to the search) is its deliberate prolonging of moves. If it has a significant lead (≥4 apples) or crashing is unavoidable even though it's leading by length, it spends the full second to output its next move by sleeping the leftover time. Respectively, this minimizes the time for the opponent to catch up, and maximizes the probability of winning by timeout.

Future improvements

  • Currently, the transposition table only caches scores from depths of ≤12 ply (note that a ternary tree of this height can already contain ~800k nodes). Consequently, at greater depths, moves cannot be ordered. Instead of storing all scores up to some depth, it should store the 'most important' scores from the whole search tree. This can be achieved using replacement strategies.
  • Alpha-beta pruning can be replaced by a more efficient search algorithm, such as PVS. The principal variation is easily determined from the (improved) transposition table. Another option is completely replacing the search by a Monte Carlo tree search, possibly in combination with a neural network (see last point).
  • Instead of immediately evaluating the position if depth == 0, you could first perform a quiescence search that, for example, continues while either snake must make a forced move. This extends the search in dangerous or promising situations.
  • If the apple recently repositioned, the positioning feature is calculated at a relatively great depth, where there are many nodes. Since BFSs are quite costly, the search is usually interrupted before having finished, leaving the bot vulnerable at specific moments during a match. This weakness can be mitigated by using a fast approximation in these scenarios. One option is to approximate the number of squares using Voronoi cells centered at the snakes' heads, ignoring obstacles. Their area is efficiently calculated using the shoelace formula.
  • The weights (and features themselves!) used in the evaluation function are based on intuition. Optimizing these would increase NegaSnake's strength. This can be done either manually, by having bots with different parameters compete and keeping the best, or using machine learning. Alternatively, the evaluation can be completely computed using a neural network, à la Stockfish.

Acknowledgments

NegaSnake was developed within E.S.A.I.V. Serpentine. Special thanks to my teammates Boris Muller, Tunahan Sarı, and Imre Schilstra. Parts of this text are excerpts from sections I wrote for a report published by Serpentine.

License

NegaSnake is licensed under GPLv3 (or any later version). The framework and example bots by Luiz Araújo et al. are licensed under APLv2.