Game AI with Reinforcement Learning

Industry Day 2025

Project Overview

This project explores the use of Reinforcement Learning (RL) to create intelligent gameplay strategies in an Auto Battler game environment. Using Stable Baselines3, Gymnasium, and the Deep Q-Network (DQN) algorithm, the AI learns to optimally place units on a board turn by turn.

Technologies and Frameworks Used

Python C++ Stable Baselines3 SFML Gymnasium

How the Game Works

Each player places a unit on the board each turn. After placement, the units automatically battle each other. The health points of surviving units are subtracted from the opponent’s total health. The game continues for 9 turns or until one player's health reaches zero. The player with the highest health at the end wins.

Reinforcement Learning Approach

The AI learns by interacting with a game mechanic that randomly places opponent units on the board. No human intervention is required — the AI gathers data independently through repeated gameplay. Each action (placing a unit) affects the evolving game state, and rewards are shaped based on combat outcomes and end-game results.

AI Learning Cycle

AI Learning Cycle Diagram

Figure 1: Flowchart showing the AI's training cycle in the Auto Battler environment, including gameplay, data gathering, and learning phases.

Current and Future Work

Work is ongoing to refine the AI's performance by improving the action space and reward functions. Teaching the AI to play against itself (self-play) is a key next step for deeper strategy development.


Additionally, I am exploring the implementation of an AlphaZero-style agent, using Monte Carlo Tree Search (MCTS) combined with reinforcement learning. This will allow for comparisons between different algorithms (DQN vs AlphaZero) to evaluate their effectiveness and learning speed in the Auto Battler environment.