- Long-Range: Also known as Long Range, this player's main goal is to increase the distance between itself and the opponent. It backs away at a speed of 1 to escape both the opponent and any incoming projectile. It fires its arrow once every 30 updates and uses its shield when its health drops below 50 percent and it has only one shield left.
- Mid-Range: Also known as Mid, this player's main goal is to keep a distance of 250 from the opponent. Once in this "sweet spot" it throws fireballs (which only last for a range of 200) every 30 updates. It moves at a speed of 1, positioning itself so that the opponent cannot circle away from it. It only closes in on the opponent when it has more than 70 percent health and is farther away than 250; once its health drops below 70 percent it first dodges any projectiles headed its way, then tries to maintain (or reach) the sweet spot. It uses its shield when its health drops below 50 percent and it has only one shield left.
- Close-Range: Also known as Short, this player's main goal is to get as close as possible to the opponent; once within a distance of 50 it uses its knife every 30 updates. It takes the fastest possible path toward the opponent. It only closes in when it has more than 45 percent health; once its health drops below 45 percent it first dodges any projectiles headed its way, then moves closer. It uses its shield when its health drops below 50 percent and it has only one shield left.
Currently the players are balanced so that, on average, each wins around 50% of its games against the others: short ≈ mid ≈ range. A sketch of the shared per-update decision logic follows.
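To make the shared behavior concrete, here is a minimal sketch of one FSM's per-update decision loop, using the close-range rules above. The names (`health_fraction`, `dodge`, `fire_knife`, and so on) are hypothetical illustrations, not the project's actual API:

```python
ATTACK_RANGE = 50        # close-range FSM attacks inside this distance
FIRE_COOLDOWN = 30       # all three FSMs attack once every 30 updates
HEALTH_RETREAT = 0.45    # below 45% health, dodging takes priority
SHIELD_THRESHOLD = 0.50  # shared rule: shield below 50% health with one shield left

def close_range_update(player, opponent, projectiles, tick):
    # Shared shield rule across all three FSMs.
    if player.health_fraction() < SHIELD_THRESHOLD and player.shields == 1:
        player.use_shield()

    # Below 45% health, dodging incoming projectiles comes first.
    if player.health_fraction() < HEALTH_RETREAT and player.incoming(projectiles):
        player.dodge(projectiles)
    else:
        player.move_toward(opponent, speed=player.max_speed)  # fastest path in

    # Use the knife once every 30 updates when within attack range.
    if player.distance_to(opponent) <= ATTACK_RANGE and tick % FIRE_COOLDOWN == 0:
        player.fire_knife(opponent)
```

The long- and mid-range loops follow the same shape with their own thresholds (backing away at speed 1, or holding the 250 sweet spot) and attack types.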
The Dynamic controllers are made up of two groups of weights: shooting weights and movement weights. The shooting weights select the angle and type of shot; there are 7 candidate angles for each of the three attacks (long, mid, and short). The movement weights are further grouped: the first 7 weights correspond to the distances at which a player will not move away from the enemy, and the second 7 correspond to the distances at which a player will move toward the enemy. The next two weights govern how a player moves (toward or away from the opponent), the two after that govern how a player dodges an incoming projectile in the x direction, and the last two govern how it dodges in the y direction.
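As an illustration, that layout could be represented as follows; the array names and grouping are assumptions inferred from the description above, not the project's actual data structure:

```python
import numpy as np

# Shooting weights: 7 candidate angles for each of the three attack types.
shooting_weights = np.ones((3, 7))  # rows: long, mid, short; columns: angles

# Movement weights, grouped as described above (7 + 7 + 2 + 2 + 2 = 20 total).
movement_weights = {
    "hold_distance":     np.ones(7),  # distances at which the player will not retreat
    "approach_distance": np.ones(7),  # distances at which the player moves in
    "toward_or_away":    np.ones(2),  # whether to move toward or away from the opponent
    "dodge_x":           np.ones(2),  # dodge direction for projectiles along x
    "dodge_y":           np.ones(2),  # dodge direction for projectiles along y
}
```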
The Dynamic controller updates the weights in two situations. If there is a total health loss of 100 between the two players, the weights are updated and new weights are chosen. If a loss of 100 health does not occur within 300 updates, the players choose new weights but the current weights are not updated; this lets a bad combination be skipped without penalizing the weights, since there was no fair assessment of their ability.
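A minimal sketch of that update schedule, assuming hypothetical `update_weights` and `choose_new_weights` methods on the controller:

```python
HEALTH_LOSS_EPISODE = 100  # combined health loss that ends a "fair" episode
STALL_LIMIT = 300          # updates before a stalled episode is abandoned

def maybe_reroll(controller, total_health_lost, updates_since_reset):
    """Returns True when the current weight combination is retired."""
    if total_health_lost >= HEALTH_LOSS_EPISODE:
        controller.update_weights()      # fair assessment: score the combination
        controller.choose_new_weights()
        return True
    if updates_since_reset >= STALL_LIMIT:
        controller.choose_new_weights()  # skip a bad combination, no penalty
        return True
    return False
```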
- Master: The Master was trained against the 3 FSMs with a set of pre-trained weights, tracking individual weights and looking for any interactions between the movement and shooting weights. When the Master plays it does not update the weights; it simply uses the weights it already has to fight its opponent.
- Average: The Average was trained against the 3 FSMs with the same set of pre-trained weights, again tracking individual weights and looking for any interactions between the movement and shooting weights. When the Average plays it does update the weights, learning from the opponent it is currently facing and adapting to the strategies that help that FSM win. It starts with the same weights the Master uses but adapts them over the course of the game.
- Random: The Random starts out with equal weights across all categories (no training at all) and learns from the opponent it is currently playing against.
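The three variants differ only in their starting weights and in whether they keep learning. In sketch form, with hypothetical constructor and helper names:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class DynamicController:
    weights: dict  # shooting + movement weights, laid out as described above
    learn: bool    # True: run the in-game update schedule; False: frozen

def load_pretrained() -> dict:
    # Placeholder for the pre-trained weights shared by Master and Average.
    return {"shooting": np.ones((3, 7)), "movement": np.ones(20)}

def uniform_weights() -> dict:
    # Random starts untrained, with equal weight in every category.
    return {"shooting": np.ones((3, 7)), "movement": np.ones(20)}

master  = DynamicController(weights=load_pretrained(), learn=False)  # frozen weights
average = DynamicController(weights=load_pretrained(), learn=True)   # adapts in-game
random_ = DynamicController(weights=uniform_weights(), learn=True)   # untrained start
```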
The GENN is a neural network trained with a genetic algorithm, built using TensorFlow and various supporting tools. It was trained for 999 generations, and every network was kept, bred, or mutated after each set of 9 games. Nine games were chosen because a network's training consisted of playing 3 games against each of the finite state machines (short, mid, and range). Networks were initialized with a random number of hidden layers between 1 and 100, up to 100 nodes in each layer, and random weights on every connection. The activation functions deemed best were sigmoid for the desired movement output and tanh for the probabilities over the firing methods (short, mid, or range).

Each network played its 9 training games against the FSMs, and the remaining health (including any unused shields) was tallied at the end of each game; these totals served as the fitness scores used to train the networks. At the end of each 9-game iteration, the top 10% of networks were kept to compete in the next set of 9 games, while the remaining 90% were bred from the returning top 10%, inheriting their characteristics. Of the networks bred from the top 10%, 10% were then mutated to a different number of layers and nodes using a bit-flip mutation. This process was run with 100 GENNs at a time.

After the final run the weights were saved and written out to preserve the best network configurations. The top 5 networks were uploaded to GitHub; their performance varied across the FSMs. The network with the highest fitness score was saved and is used whenever the GENN is called within the game. Training was performed on a 16-core server; running 100 networks through 999 generations (totaling 99,900 games) took around six days.
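For reference, the generational loop described above can be condensed into a sketch like the following. The helper functions (`make_random_network`, `play_match`, `breed`, `mutate`) are hypothetical stand-ins for the project's actual TensorFlow code:

```python
import random

POP_SIZE = 100            # 100 GENNs trained at a time
GENERATIONS = 999
ELITE_FRACTION = 0.10     # top 10% survive each generation
MUTATION_FRACTION = 0.10  # 10% of offspring get a bit-flip mutation

population = [make_random_network(max_layers=100, max_nodes=100)
              for _ in range(POP_SIZE)]
best_net, best_score = None, float("-inf")

for generation in range(GENERATIONS):
    # Fitness: remaining health (plus unused shields) summed over 9 games,
    # 3 against each FSM (short, mid, range).
    scored = []
    for net in population:
        score = sum(play_match(net, fsm)
                    for fsm in ("short", "mid", "range")
                    for _ in range(3))
        scored.append((score, net))
        if score > best_score:
            best_net, best_score = net, score

    # Keep the elite; breed the remaining 90% from them.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    elite = [net for _, net in scored[:int(ELITE_FRACTION * POP_SIZE)]]
    offspring = [breed(random.choice(elite), random.choice(elite))
                 for _ in range(POP_SIZE - len(elite))]

    # Bit-flip mutation on 10% of the offspring (alters layer/node counts).
    for child in random.sample(offspring, int(MUTATION_FRACTION * len(offspring))):
        mutate(child)

    population = elite + offspring

best_net.save("genn_best")  # hypothetical save call for the winning network
```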