Tic-tac-toe with reinforcement learning, made at Hack McWiCS 2024.
The tic-tac-toe bot is implemented with Q-learning, a reinforcement learning algorithm. It is trained against an opponent that plays randomly selected moves.
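As a rough illustration of the idea, here is a minimal sketch of tabular Q-learning for tic-tac-toe. It is not the project's actual code: the board encoding (a 9-character string), the function names (`legal_moves`, `choose_action`, `update`), and the hyperparameter values are all assumptions made for this example. `next_state` is assumed to be the board the agent sees after the opponent has replied.

```python
import random
from collections import defaultdict

# Hypothetical hyperparameters; the project's actual values are not stated.
ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.1  # exploration rate

# Q-table: maps (state, action) to an estimated value, defaulting to 0.0.
# Assumed encoding: a state is a 9-character string such as "X.O......",
# and an action is a cell index from 0 to 8.
q_table = defaultdict(float)


def legal_moves(state):
    """Return the indices of the empty cells."""
    return [i for i, cell in enumerate(state) if cell == "."]


def choose_action(state, epsilon=EPSILON):
    """Epsilon-greedy: explore with probability epsilon, else take the best-known move."""
    moves = legal_moves(state)
    if random.random() < epsilon:
        return random.choice(moves)
    return max(moves, key=lambda a: q_table[(state, a)])


def update(state, action, reward, next_state, done):
    """Apply the Q-learning update for one observed transition."""
    if done:
        best_next = 0.0  # terminal states have no future value
    else:
        best_next = max(
            (q_table[(next_state, a)] for a in legal_moves(next_state)),
            default=0.0,
        )
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])
```

For example, calling `update(".........", 4, 1.0, "....X....", done=True)` on a fresh table moves the value of playing the center from an empty board toward the reward, after which `choose_action(".........", epsilon=0.0)` picks cell 4.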
After 600,000 episodes of training, we obtain the following results:
- 100% win rate when playing first against random moves
- 92% win rate and 8% draw rate when playing second against random moves
- 100% draw rate when playing against itself

Possible future improvements:
- a more polished UI (web app)
- PvP (player vs. player) mode
- 4x4 and 5x5 tic-tac-toe
- games with higher state-space complexity (e.g., Connect Four) using deep Q-learning